# Outlier Detection

Tutorials and code for outlier detection.

- Data Source
- Variables in Data
- Model-specific methods
  - Cook's Distance
  - Pareto
- Multivariate methods
  - Mahalanobis Distance
  - Details about Mahalanobis Distance
  - Robust Mahalanobis Distance
  - Minimum Covariance Determinant (MCD)
  - Robust Tolerance Ellipsoid (RTE)
  - Invariant Coordinate Selection (ICS)
  - OPTICS
  - Isolation Forest
  - Local Outlier Factor
- `check_outliers` function in the {performance} R package
  - Threshold specification
- Reference

- Types of Unusual Observations
  - Regression Outliers
  - Leverage
  - Influential Observations
  - Good vs. Bad Leverage
- Detecting Influential Observations
  - Graphic diagnostics
    - A scatter plot with Confidence Ellipse
    - Quantile Comparison Plots (QQ-Plot)
      - Rule of Thumb
    - Added-variable plots
  - Numerical diagnostics
    - Hat Matrix
      - Rule of Thumb
    - Standardized Residuals
      - Rule of Thumb
    - Studentized Residuals
      - Rule of Thumb
    - Studentized Residuals: the Bonferroni adjustment
    - DFBeta and DFBetas
      - Rule of Thumb
    - Robust Distance
    - Mahalanobis Distance
      - Rule of Thumb
    - Cook's Distance
      - Rule of Thumb
    - DFITS
      - Rule of Thumb
- Summary

- Method 1: Sorting Your Datasheet to Find Outliers
- Method 2: Graphing Your Data to Identify Outliers
  - Histogram
  - Boxplot
  - Adjusted boxplot (Hubert and Vandervieren, 2008)
- Method 3: Using Z-scores to Detect Outliers
  - Z-Score pros
  - Z-Score cons
- Method 4: Using the Interquartile Range (IQR) to Create Outlier Fences
- Method 5: Percentiles
  - `scores` function from the {outliers} package
- Method 6: Hampel filter
- Method 7: Finding Outliers with Hypothesis Tests
  - Grubbs' test
  - Dixon's test
  - Rosner's test
  - Challenges of Using Outlier Hypothesis Tests: Masking and Swamping

- What are Outliers?
- Causes for Outliers
- Types of Outliers
- Philosophy about Finding Outliers
- General Rules

