Data Analysis I (DRAFT)
Chapter 7
Causal Effects and Experiments