Introduction to Programming with R
Andrea S Foulkes
This module gives an introduction to programming with R at an elementary level. Additional sub-topics include creation of vectors and matrices, object manipulation and basic data summaries. There are three labs designed to be completed using R/RStudio.
Confidence Intervals Around a Mean
Eric A Cohen
This module focuses on understanding and calculating confidence intervals around a sample mean. This includes understanding the concept of a confidence interval; understanding when the normal distribution, the t-distribution, or neither is appropriate; writing R code to calculate confidence intervals; and writing code to test coverage of the true population mean by confidence intervals. Finally, it introduces the native R functions for calculating confidence intervals around a mean.
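As a rough illustration of the kind of exercise described above, the sketch below computes a t-based interval by hand, compares it with the interval from `t.test`, and estimates coverage by simulation. The helper name `mean_ci`, the sample size, and the normal population are assumptions made for this example, not material taken from the module.

```r
set.seed(2024)

# Hypothetical helper: t-based confidence interval around a sample mean.
mean_ci <- function(x, conf.level = 0.95) {
  n <- length(x)
  t_crit <- qt(1 - (1 - conf.level) / 2, df = n - 1)
  half_width <- t_crit * sd(x) / sqrt(n)
  c(lower = mean(x) - half_width, upper = mean(x) + half_width)
}

# Assumed example data: 25 draws from a normal population with mean 50.
mu <- 50
x <- rnorm(25, mean = mu, sd = 10)
manual <- mean_ci(x)
builtin <- t.test(x)$conf.int  # native R interval for comparison

# Coverage check: how often does the interval contain the true mean?
covered <- replicate(2000, {
  ci <- mean_ci(rnorm(25, mean = mu, sd = 10))
  ci[["lower"]] <= mu && mu <= ci[["upper"]]
})
coverage <- mean(covered)
```

With normal data the estimated coverage should land close to the nominal 95%; swapping in a skewed population is one way to see when the t-based interval stops being appropriate.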
Introduction to Simple Linear Regression
This module was taken from my Statistical Modeling and Data Visualization course at UMass-Amherst. This module provides four classes on simple linear regression targeted for upper-level undergraduate or entry-level graduate students.
Multiple Linear Regression: Introduction
This module was adapted from the Statistical Modeling and Data Visualization course at UMass-Amherst. This module provides materials for six to seven classes on multiple linear regression targeted for upper-level undergraduate or entry-level graduate students. Topics covered include: basic MLR notation (including matrix notation), interpretations of continuous and categorical covariates, F-testing, and confidence intervals. There is one lab included, which provides a hands-on comparison of looking at multiple tests of coefficients and of doing global F-tests.
Eric A Cohen
This module concentrates on graphic methods available in R for presentation and exploratory analysis. These include stripcharts, histograms, stem-and-leaf plots, kernel density plots, boxplots, pie charts, stacked bar charts, and scatterplots. Other graphic presentations are briefly mentioned. Explanation is devoted to the purposes and uses of graphic techniques, the motivation for each type of display, and guidelines for appropriate use of each technique. For most of these techniques, the user is taught how to create the plot using R and how to read essential information from the plot.
Confidence Intervals Around a Proportion
Eric A Cohen
This module focuses on understanding and calculating confidence intervals around a sample proportion. This includes understanding the concept of a confidence interval; writing R code to calculate Wald (normal approximation) confidence intervals; and writing code to test coverage of the true population proportion by confidence intervals. Next, it introduces the prop.test function for calculating CIs around a proportion; has the user write code to compare the results and coverage proportion of their CI code with the results of prop.test; and briefly mentions the difference between Wald and Wilson CIs to explain the different results and the more accurate coverage proportion of Wilson CIs.
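The Wald versus Wilson comparison described above might look something like the following sketch. The helper `wald_ci`, the counts (12 successes in 40 trials), and the true proportion used in the simulation are all assumptions chosen for illustration. `prop.test` reports a Wilson score interval; `correct = FALSE` switches off its continuity correction.

```r
# Hypothetical helper: Wald (normal-approximation) CI for a proportion.
wald_ci <- function(x, n, conf.level = 0.95) {
  p_hat <- x / n
  z <- qnorm(1 - (1 - conf.level) / 2)
  se <- sqrt(p_hat * (1 - p_hat) / n)
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

# Assumed example data: 12 successes in 40 trials.
x <- 12
n <- 40
wald <- wald_ci(x, n)
wilson <- prop.test(x, n, correct = FALSE)$conf.int

# Simulated coverage of each interval for an assumed true proportion.
set.seed(1)
p_true <- 0.1
sims <- rbinom(5000, size = n, prob = p_true)
wald_cover <- mean(sapply(sims, function(s) {
  ci <- wald_ci(s, n)
  ci[["lower"]] <= p_true && p_true <= ci[["upper"]]
}))
wilson_cover <- mean(sapply(sims, function(s) {
  ci <- prop.test(s, n, correct = FALSE)$conf.int
  ci[1] <= p_true && p_true <= ci[2]
}))
```

The Wald interval is centered on the sample proportion, while the Wilson interval is pulled toward 0.5, which is one reason the two disagree most for small samples and proportions near 0 or 1.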
Introduction to Resampling Inference
Nicholas G Reich
This module provides an introduction to resampling inference. A dataset with zero-inflated observations is provided and used as an example to compare group means. For drawing inference, a standard parametric approach (i.e., the t-test) is compared with a permutation test and a bootstrap confidence interval for the observed mean difference between groups. In a second lab, students are asked to downsample the large dataset and use it to compare inferences between parametric and non-parametric approaches. A bonus assignment asks students to try to stochastically simulate data in such a way as to match the zero-inflated dataset. This exercise has been turned into a class contest with great success.
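A minimal sketch of the three approaches named above, run on simulated data rather than the module's dataset. The zero-inflation mechanism in `sim_zi` (a point mass at zero mixed with an exponential), the group sizes, and the number of resamples are assumptions for illustration only.

```r
set.seed(123)

# Hypothetical generator for a zero-inflated outcome.
sim_zi <- function(n, p_zero, mean_pos) {
  ifelse(rbinom(n, 1, p_zero) == 1, 0, rexp(n, rate = 1 / mean_pos))
}
a <- sim_zi(60, p_zero = 0.6, mean_pos = 10)
b <- sim_zi(60, p_zero = 0.4, mean_pos = 10)
obs_diff <- mean(b) - mean(a)

# Parametric comparison: Welch two-sample t-test.
t_p <- t.test(b, a)$p.value

# Permutation test: shuffle group labels and recompute the mean difference.
pooled <- c(a, b)
perm_diffs <- replicate(2000, {
  idx <- sample(length(pooled), length(a))
  mean(pooled[-idx]) - mean(pooled[idx])
})
perm_p <- mean(abs(perm_diffs) >= abs(obs_diff))

# Percentile bootstrap CI: resample with replacement within each group.
boot_diffs <- replicate(2000, {
  mean(sample(b, replace = TRUE)) - mean(sample(a, replace = TRUE))
})
boot_ci <- quantile(boot_diffs, c(0.025, 0.975))
```

Rerunning the same code on a small random subset of each group is one way to approximate the downsampling lab, since the parametric and resampling answers tend to diverge more as the sample shrinks.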
Eric A Cohen
This module concentrates on the ideas behind evaluating a (binary) diagnostic test: the gold standard, false positives and negatives, sensitivity and specificity, and positive and negative predictive values. It has the user create R code to calculate the above values for a given cutoff level of a test in a simulated population, and then do this repeatedly to see the tradeoff between sensitivity and specificity in a test. Finally, it has the user apply their code to evaluate these measures for the same test on a simulated population with high prevalence, and then with low prevalence, to see that PV+ and PV- depend on the population as well as the test.
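The calculations described above can be sketched as follows. The helper names, the normally distributed marker, the cutoffs, and the two prevalence values are assumptions made for this example and are not taken from the module itself.

```r
# Hypothetical helper: accuracy measures for a binary test vs. a gold standard.
diag_measures <- function(test_pos, diseased) {
  tp <- sum(test_pos & diseased)
  fp <- sum(test_pos & !diseased)
  fn <- sum(!test_pos & diseased)
  tn <- sum(!test_pos & !diseased)
  c(sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    ppv = tp / (tp + fp),   # PV+
    npv = tn / (tn + fn))   # PV-
}

# Hypothetical simulated population: the marker runs higher in diseased people.
simulate_pop <- function(n, prevalence) {
  diseased <- rbinom(n, 1, prevalence) == 1
  marker <- rnorm(n, mean = ifelse(diseased, 2, 0), sd = 1)
  data.frame(diseased = diseased, marker = marker)
}

set.seed(42)
high <- simulate_pop(100000, prevalence = 0.30)
low <- simulate_pop(100000, prevalence = 0.01)

# Tradeoff: raising the cutoff lowers sensitivity and raises specificity.
cutoffs <- c(0, 1, 2)
tradeoff <- sapply(cutoffs, function(k) {
  diag_measures(high$marker > k, high$diseased)
})
colnames(tradeoff) <- paste0("cutoff_", cutoffs)

# Same test and cutoff, different prevalence.
res_high <- diag_measures(high$marker > 1, high$diseased)
res_low <- diag_measures(low$marker > 1, low$diseased)
round(rbind(high = res_high, low = res_low), 3)
```

Sensitivity and specificity stay roughly the same across the two populations because they are properties of the test, while PV+ falls sharply when prevalence is low.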
Discrete Random Variables
Harrel E Blatt
This module gives instruction on the concept of discrete random variables. Additional sub-topics include simulation, expected value, variance, and pmfs. There are three labs designed to be completed using RStudio.
Power Error and FDR
Harrel E Blatt
This module gives instruction on the concept of statistical power. Additional topics include adjusting p-values in a multiple hypothesis setting. There are two labs that the student should complete using RStudio.
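For a sense of what adjusting p-values involves, the sketch below applies two of the methods built into R's `p.adjust` to a made-up vector of p-values. The values and the choice of Bonferroni and Benjamini-Hochberg (FDR) are assumptions for illustration; the module's labs may use other methods or data.

```r
# Made-up p-values from ten hypothetical tests.
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216)

# Bonferroni controls the family-wise error rate.
p_bonf <- p.adjust(p_raw, method = "bonferroni")
# Benjamini-Hochberg controls the false discovery rate (FDR).
p_bh <- p.adjust(p_raw, method = "BH")

# Number of tests declared significant at 0.05 under each approach.
n_sig <- c(raw = sum(p_raw <= 0.05),
           bonferroni = sum(p_bonf <= 0.05),
           BH = sum(p_bh <= 0.05))
n_sig
```

Bonferroni is the more conservative of the two, so it never rejects more hypotheses than Benjamini-Hochberg at the same threshold.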