Project Milestone: Project Proposal

Fundamentals of Data Science

MTH-391A | Spring 2025 | University of Portland

February 3, 2025

Objectives

Previously…

The guiding principle of data science is the data science life cycle.

The Data Science Life Cycle

Project Expectations

Purpose of Project Phases

Presentation Requirements

Best Practices

Project Milestone vs Phase

Project Phases

Project Milestones

\(\star\) Key Difference: Phases are ongoing work while Milestones are completion markers

Data Sets (1/10): Survivor

Source: survivoR

Data Sets (2/10): Flights

Source: anyflights

Data Sets (3/10): Census

Source: tidycensus

Data Sets (4/10): Jeopardy

Source: 200,000+ Jeopardy! Questions

Data Sets (5/10): Spotify Songs

Source: Spotify Million Song Dataset (Kaggle)

Data Sets (6/10): Fake News

Source: Fake News detection dataset (Kaggle)

Data Sets (7/10): Spam Emails

Source: E-Mail classification NLP (Kaggle)

Data Sets (8/10): Apps

Source 1: Apple AppStore Apps (Kaggle) Source 2: Google Play Store Apps (Kaggle)

Data Sets (9/10): Movies

Source: IMDb Non-Commercial Datasets

Data Sets (10/10): Twitter

Source: Twitter Hashtag 94 Data

Project Proposal

Frame your data science project, establish objectives, and outline the approach to solving it. Requirements:

While you think about your proposal, here are the key components your overall project must include:

Activity: Getting Started with your Project

  1. Log in to Posit Cloud and open the RStudio assignment Project AlexQ.

  2. Rename the project from Project AlexQ to your own name, using the format Project [First name][Last initial].

  3. Create a new R Markdown file and replace the author field with your name. Remove the default text.

  4. Choose one data set from the given list. In your R Markdown file, provide the following information:

    • Define your problem or research question
    • Specify the dataset(s) to be used and their relevance
    • Outline the methods or techniques to be employed

  5. When finished, knit your .Rmd to .html, create a 3-slide presentation that summarizes your report, and submit your work to Moodle.
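The file-creation and knitting steps above can also be done from the R console. The following is a minimal sketch, not a required workflow: the file name `proposal.Rmd`, the title, the author placeholder, and the three section headings are assumptions chosen to mirror the three bullet points in step 4. Rendering is skipped if the rmarkdown package or pandoc is not available.

```r
# Sketch: create a proposal skeleton and knit it to HTML.
# The file name, title, author, and headings are placeholders (assumptions),
# not course requirements. Replace them with your own.
rmd_file <- "proposal.Rmd"

skeleton <- c(
  "---",
  'title: "Project Proposal"',
  'author: "Your Name"',
  "output: html_document",
  "---",
  "",
  "## Research Question",
  "",
  "## Data Set and Relevance",
  "",
  "## Methods",
  ""
)

# Write the skeleton to disk (overwrites an existing file of the same name).
writeLines(skeleton, rmd_file)

# rmarkdown::render() does the same job as the Knit button in RStudio.
# It only runs here if both rmarkdown and pandoc are available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, output_format = "html_document")
}
```

Clicking Knit in RStudio on Posit Cloud produces the same .html file. The console version is mainly useful if you want to re-render without opening the file.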

\(\star\) Your report for this milestone (.Rmd, .html, and the 3-slide presentation) is due at the next project milestone. The phases in between will serve as check-ins.