R

9 Join Function Example with the R {dplyr} Package

Simple Example Data; Load {dplyr} package; Function 1: inner_join; Function 2: left_join; Function 3: right_join; Function 4: full_join; Function 5: semi_join; Function 6: anti_join; Complex Example 1: Join Multiple Data Frames; Complex Example 2: Join by Multiple Columns; Complex Example 3: Join Data & Delete ID

Outliers-Part 4: Finding Outliers in a multivariate way

Series: Outlier Detection

Data Source; Variables in Data; Model-specific methods; Cook’s Distance; Pareto; Multivariate methods; Mahalanobis Distance; Details about Mahalanobis Distance; Robust Mahalanobis Distance; Minimum Covariance Determinant (MCD); robust tolerance ellipsoid (RTE); Invariant Coordinate Selection (ICS); OPTICS; Isolation Forest; Local Outlier Factor; ‘check_outliers’ function in {performance} R package; Threshold specification; Reference

Outliers-Part 3: Outliers in Regression

Series: Outlier Detection

Types of Unusual Observations; Regression Outliers; Leverage; Influential Observations; Good vs. Bad Leverage; Detecting Influential Observations; Graphic diagnostics; A scatter plot with Confidence Ellipse; Quantile Comparison Plots (QQ-Plot); Rule of Thumb; Added-variable plots; Numerical diagnostics; Hat Matrix; Rule of Thumb; Standardized Residuals; Rule of Thumb; Studentized Residuals; Rule of Thumb; Studentized Residuals: the Bonferroni adjustment; DFBeta and DFBetas; Rule of Thumb; Robust Distance; Mahalanobis Distance; Rule of Thumb; Cook’s Distance; Rule of Thumb; DFITS; Rule of Thumb; Summary

Outliers-Part 2: Finding Outliers in a univariate way

Series: Outlier Detection

Method 1: Sorting Your Datasheet to Find Outliers; Method 2: Graphing Your Data to Identify Outliers; Histogram; Boxplot; Adjusted boxplot (Hubert and Vandervieren, 2008); Method 3: Using Z-scores to Detect Outliers; Z-Score pros; Z-Score cons; Method 4: Using the Interquartile Range (IQR) to Create Outlier Fences; Method 5: Percentiles; scores function from {outliers} package; Method 6: Hampel filter; Method 7: Finding Outliers with Hypothesis Tests; Grubbs’ test; Dixon’s test; Rosner’s test; Challenges of Using Outlier Hypothesis Tests: Masking and Swamping

Outliers-Part 1: Causes, Philosophy and General Rules

Series: Outlier Detection

What are Outliers? Causes for Outliers; Types of Outliers; Philosophy about Finding Outliers; General Rules

Introducing Tidyverse-Part 2: %>%, the Forward Chaining

Figure 1: Pipe Operator

Introducing Tidyverse-Part 1: Tidy Data

What is Tidy data? Why is it important? Fixed variable vs. Measured variable; Is such a data set tidy? Example 1; Example 2; Example 3; Code Examples; Traditional measurement testing dataset

Parallel Analysis: Determining the Dimensionality of Data

WHAT IS PARALLEL ANALYSIS; METHODOLOGY; PARALLEL ANALYSIS IN R

Recently, a colleague asked me to review a state assessment tech report. One section of the report, “Parallel Analysis”, really caught my eye. I have done parallel analysis multiple times in the past, but I have never thought about the topic in a systematic way. It is always a good opportunity to refresh my memory.

First blog and some words

Yihui Xie; Keith McNulty; David Robinson; Useful links

This is the first blog post I created after putting my personal website online. This is the second time I’ve built a personal website. When I built the first one, I was still in graduate school, fighting through my dissertation. It feels like déjà vu, but the two websites serve different purposes. My first website was mainly used for advertising and job hunting.

Write Readable Code

Write readable code: Simple and Practical Techniques for Better Statistical Programming.