📚 Collection of Data Analytics Tutorials for R & Python

Jan 1, 2023 · Yuxiao (Rain) Luo, PhD · 8 min read

This post collects free, high-quality R and Python tutorials for anyone interested in data analytics and programming.

Learning R

From 2021 to 2023, I was a PhD student and involved with the Digital Fellowship at The Graduate Center (CUNY). During that time, our digital fellows created and published a variety of digital tutorials online, ranging from text analysis in R to building games in Python. These resources are free, accessible, and perfect for building your data skills.

Integrated R Book

I compiled all our R-related tutorials into a single resource:
📖 Data Analytics in Digital Research with R. This book begins with the fundamentals of R, covering topics such as data types, functions, and data structures, which are essential for beginners with no prior experience. It then moves on to intermediate and advanced topics, including R Markdown, Shiny, text analysis, predictive modeling, and more, making it a comprehensive one-stop resource for learning R in the context of data analytics and digital research.

R Workshops

We created some workshops focusing on building up specific skills in R. All these workshops are beginner-friendly.

  • Intro to R: This workshop will help you start learning R from scratch.

  • Data Wrangling in R: This workshop teaches (1) cleaning and transforming Spotify datasets, (2) identifying patterns or trends within the data, and (3) creating visualizations to surface those insights.

  • Predictive modeling in R (PDF) & Code: The workshop covers building predictive models in R, from data preparation to model training and evaluation. It demonstrates methods like logistic regression and tree-based models with cross-validation and performance metrics. The accompanying R code implements the full workflow for training, predicting, and assessing models.

  • Intro to regression analysis in R: The tutorial introduces simple and multiple linear regression in R using lm(). It covers checking model assumptions with diagnostic plots and interpreting key statistics. It also demonstrates visualizing regression results with ggplot2.

  • Intro to text analysis in R: The tutorial introduces text analysis in R using tidyverse, tidytext, and visualization libraries. It covers tokenizing text, counting word frequencies, removing stopwords, and creating word clouds with wordcloud and wordcloud2. It also demonstrates filtering by artist, visualizing word frequency with ggplot2, and applying the workflow to another dataset.

  • Qualitative data analysis with R: RQDA: The workshop trains participants to use RQDA, a free R package for qualitative data analysis. It covers installation, importing and coding text data, writing memos, organizing codes, and running searches, combining a GUI with R scripting. Participants learn to manage and analyze qualitative data efficiently without costly proprietary software.

Tutorials from R User Group

We created a series of R tutorials from the bi-weekly R User Group (RUG) meetings, covering topics from text analysis to R Markdown and Shiny applications. You can find a more organized stack in the GitHub repository. All these tutorials are beginner-friendly.

I’m listing some useful R tutorials below.

  • Intro to string processing in R: The tutorial introduces the stringr package for consistent, easy string manipulation in R. It covers functions for changing case, measuring length, extracting/modifying substrings, concatenating, trimming, and padding. It also demonstrates pattern matching with regular expressions using functions like str_detect(), str_replace(), and str_split().

  • Intro to regular expression in R: The tutorial introduces regular expressions in R for defining text patterns, using stringr functions built on stringi. It explains key regex elements like metacharacters, quantifiers, character classes, and anchors. It demonstrates finding, replacing, and viewing matches for tasks like pattern detection and text cleaning.

  • Intro to R Markdown: The tutorial introduces R Markdown as a format combining Markdown text with R code to create dynamic, reproducible reports. It covers installing the package, creating .Rmd files, adding code chunks, and embedding plots or text. Finally, it explains rendering documents to HTML, PDF, or Word via RStudio’s Knit button or render().

  • Intro to R Shiny: The tutorial introduces R Shiny as an R package for building interactive web apps with a user interface and server logic. It covers creating layouts with fluidPage(), adding input widgets, and rendering outputs that respond to user input. Finally, it explains running apps via shinyApp() or RStudio’s “Run App” button using Shiny’s reactive programming model.

Other blog posts about using R

  • Quick and easy mapping with R

    • The tutorial introduces the sf package for spatial data handling in R.
    • It shows how to load map data with rnaturalearth and plot it using geom_sf() in ggplot2.
    • It highlights the benefit of doing GIS analysis and mapping entirely within R.
  • Using RQDA for Qualitative Data Analysis

    1. The 2-hour workshop recording is available here.

    2. Context & Motivation: A PhD student in Business Information Systems, drawing on projects that analyzed ERP system implementation interviews and CEO speeches, highlights the limitations of purely quantitative methods. Such approaches can oversimplify social complexity, obscure theory-building, and “present an illusion of precision,” ultimately losing perspectives vital to qualitative understanding.

    3. Why RQDA? The student chose RQDA, an open-source R package, for three key reasons:

      • It’s free and open source; no expensive licenses are required.
      • It integrates seamlessly with R/RStudio, allowing both GUI interactions and script-based manipulation.
      • It offers standard CAQDAS features comparable to proprietary software.

    4. Capabilities & Limitations:

      • RQDA supports plain-text input, qualitative coding (including multiple levels and concept development), memos, file and code organization, and both general and conditional search, all via a GUI with the option to extend functionality via R scripting.
      • However, its development became inactive as of early 2020, and it’s not guaranteed to receive future updates, though existing versions remain functional for most research needs.

    5. Table summary of RQDA:

      Aspect   | Details
      Tool     | RQDA (R package for CAQDAS)
      Pros     | Free; integrates with R; full-featured (coding, memos, search, organization)
      Cons     | Development halted; may require working with R version 3.6.3 for compatibility
      Use Case | Suitable for mixed-method qualitative research, particularly when avoiding proprietary licenses

Additional R Resources

  • R for Data Science
  • RStudio Education
  • tidytuesday github
    • TidyTuesday is a weekly data project in the R community where participants explore, clean, and visualize a shared dataset, typically using the tidyverse. It’s aimed at improving data science skills through open, reproducible analysis and sharing work with others, often via GitHub or social media.
  • tidytuesday youtube

Learning Python

The best way to learn a programming language is to take structured courses on the subject and pair them with consistent, hands-on practice. I have taught two undergraduate Python courses, both core requirements for the Information Systems major, at an AACSB-accredited business school and developed open-source materials for them.

Intro to Programming in Python

  • All materials for this beginner-level Python course are freely available here: https://github.com/YuxiaoLuo/Intro_Python. This 14-week, full-semester course is thoughtfully designed and includes lectures, hands-on labs, and quizzes.

  • Course Description: This course introduces how to program in Python and write basic algorithms. It covers the fundamental principles and concepts required for problem formulation and problem solving, not just programming. The goal is to equip students with the basic ability to use computational principles such as iteration, abstraction, recursion, and functional decomposition. To support these principles, the course introduces basic programming constructs such as control statements and data structures. This is an introductory course intended for students with little or no programming background.
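
To give a flavor of the computational principles named in the course description, here is a minimal sketch. It is not taken from the course repository, and every function name in it is made up for illustration. It computes a factorial with iteration and with recursion, then uses functional decomposition to build a small text helper out of two simpler functions.

```python
from __future__ import annotations

# Illustrative only: these functions are assumptions, not course materials.


def factorial_iterative(n: int) -> int:
    """Iteration: multiply 1 * 2 * ... * n in a loop."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def factorial_recursive(n: int) -> int:
    """Recursion: n! = n * (n - 1)!, with 0! = 1! = 1 as the base case."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)


def split_words(text: str) -> list[str]:
    """Split text into lowercase words, dropping surrounding punctuation."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def mean(values: list[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def average_word_length(text: str) -> float:
    """Functional decomposition: compose the two helpers above."""
    return mean([len(w) for w in split_words(text)])


if __name__ == "__main__":
    print(factorial_iterative(5), factorial_recursive(5))
    print(average_word_length("Python is fun."))
```

Both factorial versions return the same value for any non-negative integer; the recursive one simply mirrors the mathematical definition, while the iterative one avoids deep call stacks for large inputs.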

Programming for Analytics using Python

  • All materials for this intermediate-level Python course are freely available here: https://github.com/YuxiaoLuo/Analytics_Python. In this course, I covered modules such as data handling, web scraping, APIs, analysis, and visualization, all delivered through Jupyter Notebooks with real datasets. The focus is on building reusable code and applying analytics workflows to realistic data scenarios.

  • Following experiential learning principles, the course design follows this sequence: learning outcomes → weekly activity → artifact → feedback rubric.

  • Based on my observations in the class, prior knowledge/experience with Python or programming (such as Java, R, etc.) is strongly recommended before taking this course.

  • Course Description: This course introduces the aspects of programming that can support business analytics. The course covers hands-on issues in programming for analytics, which include accessing data, creating informative data graphics, writing functions, debugging, and organizing and commenting code. This course covers several analytics topics, including data wrangling, data analysis, web scraping, APIs, etc. Basic knowledge of Python is required.
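
As a rough picture of the "access data, wrangle, analyze" workflow described above, the sketch below uses only the Python standard library. The course itself works in Jupyter Notebooks with real datasets; the JSON payload, field names, and function names here are hypothetical stand-ins chosen for illustration.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical payload standing in for an API response; the fields are made up.
RAW = """
[
  {"region": "East", "sales": "120.5"},
  {"region": "West", "sales": "98"},
  {"region": "east ", "sales": null},
  {"region": "West", "sales": "102"}
]
"""


def load_records(payload: str) -> list:
    """Parse a JSON string into a list of dictionaries."""
    return json.loads(payload)


def clean(records: list) -> list:
    """Normalize region labels and drop rows with missing sales."""
    cleaned = []
    for row in records:
        if row.get("sales") is None:
            continue  # skip incomplete rows
        cleaned.append({
            "region": row["region"].strip().title(),
            "sales": float(row["sales"]),
        })
    return cleaned


def mean_sales_by_region(records: list) -> dict:
    """Group cleaned rows by region and average the sales values."""
    groups = defaultdict(list)
    for row in records:
        groups[row["region"]].append(row["sales"])
    return {region: mean(values) for region, values in groups.items()}


summary = mean_sales_by_region(clean(load_records(RAW)))
print(summary)
```

Splitting the work into small, commented functions keeps each step easy to test and reuse, which is the habit the course description emphasizes.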

Python Workshop

  • Building your first game using Pygame: The tutorial introduces Pygame for building simple interactive games in Python. It covers setting up the display, handling events, and running a game loop with drawing and updates. It also demonstrates using sprites for movement, collisions, and basic game mechanics.

  • Creating simulations in Python: The tutorial teaches building simulations in Python using object-oriented programming and custom functions. It models “Critters” with attributes like age, food, and reproduction, adding events such as environmental disasters. The focus is on replicating real-world processes to explore different outcomes.
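
The simulation idea can be sketched in a few lines of object-oriented Python. This is not the tutorial's code: the Critter class below, its food and age thresholds, the reproduction cost, and the disaster rule are all assumptions picked to keep the example small.

```python
import random
from dataclasses import dataclass


@dataclass
class Critter:
    """A hypothetical critter with the attributes the tutorial mentions."""
    age: int = 0
    food: int = 5

    def step(self, rng: random.Random) -> None:
        """Advance one time step: get older, forage 0-2 units, then eat 1."""
        self.age += 1
        self.food += rng.randint(0, 2) - 1

    def is_alive(self, max_age: int = 10) -> bool:
        """A critter dies when it runs out of food or reaches max_age."""
        return self.food > 0 and self.age < max_age

    def can_reproduce(self, threshold: int = 6) -> bool:
        """Reproduction requires a food reserve (assumed threshold)."""
        return self.food >= threshold


def simulate(steps: int = 20, start: int = 10,
             disaster_chance: float = 0.1, seed: int = 0) -> list:
    """Run the simulation and return the population size after each step."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    population = [Critter() for _ in range(start)]
    history = []
    for _ in range(steps):
        # Environmental disaster: every critter loses 3 units of food.
        if rng.random() < disaster_chance:
            for critter in population:
                critter.food -= 3
        newborns = []
        for critter in population:
            critter.step(rng)
            if critter.can_reproduce():
                critter.food -= 3  # assumed cost of producing offspring
                newborns.append(Critter(food=3))
        # Keep only the survivors, including newborns.
        population = [c for c in population + newborns if c.is_alive()]
        history.append(len(population))
    return history


if __name__ == "__main__":
    print(simulate(seed=1))
```

Changing disaster_chance or the seed and comparing the returned population histories is one simple way to explore different outcomes, in the spirit of the tutorial.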

Additional Python Resources


These materials and tutorials are open resources. Feel free to explore them at your own pace and integrate them into your projects, learning, or teaching.
