After teaching myself how to use learnR to teach students how to apply statistics to real problems using R and RStudio, I started teaching some of my colleagues as part of a Resident Digital Fellows Pilot Program. I divided the content into four sessions, with a project due at the fourth session. For the project, each participant was asked to design a tutorial using R that they would use in their classes. The content covered in each of the sessions is as follows:
Session 1 covered an introduction to R and various coding paradigms that would be useful from a pedagogical perspective. We covered base R, the tidyverse, and MOSAIC (an NSF-funded program designed for teaching statistics using as few functions as possible). Participants were asked to come up with an idea for their project by the next session.
Session 2 covered an introduction to R Markdown. We covered how to intersperse chunks of code with formatted text and spent half the session creating a non-interactive version of the tutorial that they would eventually implement.
Session 3 covered the learnR package. We covered how to create interactive coding exercises and different types of questions for students to answer. The second half of the session was devoted to converting the non-interactive document created in the last session into an interactive learnR-based tutorial.
Session 4 covered how to upload learnR tutorials to an RStudio Connect Server so that students could access and work through them using only a web browser. We then had each participant upload their tutorials and did a bit of show-and-tell.
I will leave you with each participant's project described in their own words below (with light editing from me):
Ryan King, Associate Professor of Biology: For my tutorial I wanted to have students walk through some next-generation sequencing data to look at the overall quality of the data and whether there were differences in gene expression between samples exposed to different conditions. The tutorial starts out by having the students open up a table with columns representing different measurements and rows representing different genes. The next exercise shows students how to use ggplot to generate a scatter plot with a regression line between different samples. From playing around with the plotting, students can see that biological replicates have similar gene expression profiles while gene expression profiles differ more between samples from different experimental treatments. Pairwise plotting is cumbersome, so the next exercise shows students how to filter the data frame, perform principal component analysis, and plot the results to show how biological replicates cluster while different treatments have significant differences in gene expression profiles. The final exercise has students set up a volcano plot to examine gene expression profile differences between samples by plotting the p-value and fold change difference for each gene.
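To give a feel for the principal component step Ryan describes, here is a minimal sketch in base R. It is not the code from his tutorial: the expression matrix is simulated, the sample names (control_1, treated_1, and so on) and gene counts are made up for illustration, and the filtering step is omitted.

```r
# Simulated data only: 200 hypothetical genes, two treatments, three replicates each.
set.seed(1)
n_genes <- 200
baseline <- rnorm(n_genes, mean = 8, sd = 2)            # shared expression level per gene
effect <- c(rnorm(20, mean = 3), rep(0, n_genes - 20))  # first 20 genes respond to treatment

# Rows are genes, columns are samples, matching the table layout described above.
expr <- cbind(
  control_1 = baseline + rnorm(n_genes, sd = 0.3),
  control_2 = baseline + rnorm(n_genes, sd = 0.3),
  control_3 = baseline + rnorm(n_genes, sd = 0.3),
  treated_1 = baseline + effect + rnorm(n_genes, sd = 0.3),
  treated_2 = baseline + effect + rnorm(n_genes, sd = 0.3),
  treated_3 = baseline + effect + rnorm(n_genes, sd = 0.3)
)
rownames(expr) <- paste0("gene", seq_len(n_genes))

# prcomp() expects observations in rows, so transpose to samples x genes.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Keep the first two components and recover the treatment from the sample name.
scores <- as.data.frame(pca$x[, 1:2])
scores$treatment <- sub("_[0-9]+$", "", rownames(scores))

# Share of total variance captured by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Replicates should cluster together; treatments should separate along PC1.
plot(scores$PC1, scores$PC2,
     col = ifelse(scores$treatment == "control", "steelblue", "firebrick"),
     pch = 19, xlab = "PC1", ylab = "PC2")
```

With real sequencing data the separation is rarely this clean, which is part of what makes the exercise a useful discussion point for students.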
Jonathan Dunbar, Associate Professor of Mathematics: I used learnR to create a tutorial for my MATH 123 class, Applications of Contemporary Mathematics, which teaches mathematical concepts simultaneously with societal truths. This tutorial would offer my students support as they learn to construct time series graphs from which they can infer socioeconomic trends in income inequality in the United States. I made abundant use of learnR’s “quiz” feature to check in with students as they interpret the visualizations they create in ggplot2. MATH 123 does not have a programming component or prerequisite, so the use of R in this tutorial only scratches the surface of R and ggplot2’s capabilities. My goal in using R to make the tutorial is twofold. First, it provides the student with a sense of the power and possibilities that coding can offer them if they are open and interested in learning more about it. And second, but possibly more importantly, it allows students to complete problems using real data, about real people, with real conclusions and consequences. Given the ease I had in creating this first one, I see obvious next lessons in the same course that would benefit from their own tutorials.
Carrie Kissman, Associate Professor of Biology and Environmental Science: I created a tutorial using a hypothetical scenario of woodchuck weight data from the Fox and Wolf Rivers to practice developing an ecological hypothesis and prediction, implementing a t-test to analyze the woodchuck weight data, and drawing the appropriate statistical and ecological conclusions given the results of the statistical test. This tutorial takes students through 15 multiple-choice and 2 short-answer questions, each with instant feedback, to prepare for an upcoming exam.
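For readers who have not run a t-test in R, a sketch of the kind of comparison Carrie describes might look like the following. This is not her tutorial code: the weights are simulated, the column names river and weight_kg are hypothetical, and I am assuming a two-sample comparison between the two rivers.

```r
# Simulated data only: 15 hypothetical woodchucks per river.
set.seed(42)
woodchucks <- data.frame(
  river = rep(c("Fox", "Wolf"), each = 15),
  weight_kg = c(rnorm(15, mean = 3.6, sd = 0.5),
                rnorm(15, mean = 4.1, sd = 0.5))
)

# Two-sample t-test of weight by river.
# By default t.test() does not assume equal variances (Welch's test).
result <- t.test(weight_kg ~ river, data = woodchucks)

result$estimate   # mean weight in each river
result$p.value    # evidence against "no difference in mean weight"
result$conf.int   # 95% confidence interval for the difference in means
```

The statistical conclusion comes from the p-value and confidence interval; the ecological conclusion is the step students have to supply themselves.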
Rachel McCoy, Assistant Professor of Biology: The tutorial I created for this project was aimed at analyzing and visualizing data in R. During the tutorial, students learn how to calculate averages and standard deviations by categorical group. They also learn to make a variety of graphs for both discrete and categorical variables. Finally, the tutorial begins to show students the power of ggplot2 by demonstrating some of the customization available, including adding color to plots and renaming axes and titles.
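As a rough illustration of the skills Rachel lists, the sketch below computes a mean and standard deviation per group and then customizes a ggplot2 graph. It uses R's built-in iris data rather than anything from her tutorial, and the plot section only runs if ggplot2 is installed.

```r
# Mean and standard deviation of sepal length for each species (base R).
means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
sds <- aggregate(Sepal.Length ~ Species, data = iris, FUN = sd)

# Combine into one table; suffixes distinguish the two summary columns.
summary_tbl <- merge(means, sds, by = "Species", suffixes = c("_mean", "_sd"))
print(summary_tbl)

# Optional plot: color by group and rename the axes and title.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
    geom_boxplot() +
    labs(title = "Sepal length by species",
         x = "Species",
         y = "Sepal length (cm)")
  print(p)
}
```

The same summary could be written with the tidyverse's group_by() and summarise(), which is the style we covered in Session 1.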