Reproducible Workflows

Fundamentals of Data Science

MTH-391D | Spring 2025 | University of Portland

January 22, 2025

Objectives

Previously… (1/2)

R Script File Example

Previously… (2/2)

Running R Commands in Different Ways

RMarkdown, an Introduction

What is RMarkdown?

RMarkdown for Reproducibility

Why use RMarkdown?

RMarkdown in R Studio

What does RMarkdown look like in R Studio?

RMarkdown Viewed in R Studio

Basic Parts of RMarkdown (1/2)

RMarkdown Basic Parts 1, 2, & 3

  1. The YAML header: Placed at the very beginning of the file, between an opening and a closing line of “---”. This header is required, and the information it contains can affect the entire output.
  2. Levels of Headers: They give titles to sections and subsections. Each level is marked by one or more “#” symbols.
  3. Basic Paragraph: Normal text. Leave a blank line between blocks of text to start a new paragraph.
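
The three parts above might look like this in a small .Rmd file. This is only a minimal sketch: the title, author, and section names are placeholders, and html_document is just one common output format.

```markdown
---
title: "My First Report"
author: "Your Name"
output: html_document
---

# Main Section

This is a basic paragraph. It can span
several lines in the source file.

## Subsection

This is a new paragraph under a lower-level header.
```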

Basic Parts of RMarkdown (2/2)

RMarkdown Basic Parts 4 & 5

  4. Code Chunks: The heart of RMarkdown, where you can run code and present both the code and its output alongside the rest of the document. An R chunk starts with “```{r}” and ends with “```”.
  5. Blocked Paragraph: A special paragraph that allows customization, such as changing the text color of its output.
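
As a sketch, the body of an R code chunk could contain lines like the ones below. In the .Rmd file they would sit between an opening line of three backticks followed by {r} and a closing line of three backticks. The `scores` vector is made-up data used only for illustration.

```r
# Hypothetical scores, invented for this example
scores <- c(82, 91, 77, 68, 95)

# Both the code and its printed output appear in the knitted document
mean(scores)
summary(scores)
```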

Code Chunks in RMarkdown

Creating an R Code Chunk

Running Code Chunks

Running R on RMarkdown vs Knitting

\(\star\) Understanding this difference is key to fixing the majority of errors and undesired outputs.
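
One way to picture the difference, as a rough sketch: running chunks interactively uses your current R session, so objects you created earlier are still available, while the Knit button in R Studio starts from a clean session and runs the document from top to bottom. The snippet below only imitates this with two environments. The names `session` and `fresh` are illustrative and are not part of RMarkdown.

```r
# Imitate an interactive session that already holds an object
session <- new.env()
assign("scores", c(82, 91, 77), envir = session)

# Imitate the clean session used when knitting
fresh <- new.env()

# Running interactively: the object is found
found_interactive <- exists("scores", envir = session, inherits = FALSE)

# Knitting: the object is missing unless the document itself creates it
found_when_knitting <- exists("scores", envir = fresh, inherits = FALSE)

print(found_interactive)
print(found_when_knitting)
```

If a chunk relies on an object that was created only in the Console, it may run fine interactively and still fail when knitting. Creating every object inside the document itself avoids this.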

Transferable Skills

Why learn any of these things?

Data Science

  1. Data processing and analysis
  2. Critical thinking and problem-solving
  3. Data visualization and communication
  4. Domain knowledge adaptability
  5. Project management and team collaboration

R and Python

  1. Logical reasoning
  2. Automating repetitive tasks
  3. Programming is an in-demand skill in both industry and academia
  4. Learning R or Python is a gateway for more advanced programming languages
  5. Learning how to code is a versatile skill and opens up opportunities

Expectations on Assignment Deliverables

Mini-Assignments

In-class activities and mini-assignments must be submitted either physically or through Moodle. Depending on the lesson, the submission format will be a physical copy, .pdf, .Rmd, or .html.

Project Phases

All project phase reports must be submitted as both .Rmd and .html files through Moodle.

\(\star\) Mini-assignments and the Project will have special R Studio spaces.

Tips for Success

How can I be successful in this class?

In-Class

Assignments

\(\star\) Help hours (walk-in and by appointment) exist!

Activity: Modify a RMarkdown document

The purpose of this activity is for you to get a sense of how mini-assignments and the project are done.

  1. Log in to Posit Cloud and open the R Studio assignment MA2: Modify a RMarkdown Document.

  2. Make sure you are in the current working directory. Rename the .Rmd file by replacing [name] with your name using the format [First name][Last initial]. Then, open the .Rmd file.

  3. Change the author in the YAML header.

  4. Read the provided instructions.

  5. Answer all exercise problems in the designated sections.