Project description

Important

Project information is still being finalized, and will be completed by June 22, 2026.

Timeline

Project Proposal due Sun, June 28

Draft report due Fri, July 17

Peer review due Wed, July 22

Final report due Wed, July 29

Presentation on Thu, July 30

Introduction

Write a research question, pick a data set, and analyze it. That is your final project.

The goal of the final project is for you to apply the skills you have been learning throughout the quarter to real-life data to answer a question YOU HAVE MADE.

The data will come from a curated list of data sets from various sources, including public research institutions and survey firms. As a group, you will select a data set based on a research question that you have made. Remember, the goal of this project is for you to demonstrate the skills we have covered in this class by applying them to the selected data set and analyzing it in a meaningful way.

YOU DO NOT NEED TO GO BEYOND THE SKILLS IN THIS CLASS. There should not be any reason to use AI in this course, because you will be provided with all the code and information you need to do the analysis. If anyone in your group is caught using AI, you will receive a 0 on this project.

All analyses must be done in RStudio, and all components of the project must be reproducible. This means that when I run the code, I do not get any errors.

Logistics

You will work on the project with your assigned team.

The pieces for the final project are the following:

  • A written, reproducible report detailing your analysis.
  • The presentation for the project.
  • A summary of the edits made following the draft.
  • A summary of each contribution, written by each member of the group.
  • The original Quarto file with the code.

Topic ideas

I have curated a list of data sets from the following sources:

  1. United Nations Data: https://data.un.org/
  2. Pew Research: https://www.pewresearch.org/datasets/
  3. Election Studies: https://electionstudies.org//
  4. Latin American Statistics:

As a group, select a data set that you are interested in and that relates to your research question. You will be required to submit your group’s research question and the selected data set.

Find the data sets here.

Project Proposal submission

Your Project Proposal submission should be no more than half a page and will be graded on completion. This means that as long as you make a genuine attempt and include all requirements, you will receive full credit.

The proposal should include the following:

Introduction and data (1 paragraph)

  • State the source of the data set: who made the data?
    • Remember to provide a citation for the dataset
  • Describe the observations and the general characteristics being measured in the data
    • Which questions from the data does your team expect to use?
    • Use the code book to find what variables were collected and how they are measured.

Research question (1 paragraph)

  • State the research question your team wants to answer.
  • Tell me why your team is interested in answering it using this data.
    • What motivated your team to ask this question?

Glimpse of data

  • Use the head function to provide an overview of each data set
    • head()
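
As a minimal sketch of the glimpse step, the example below uses mtcars, a data set that ships with R, as a hypothetical stand-in for your team's data. The object name project_data is an assumption for illustration only; replace it with whatever you load from the curated list.

```r
# Hypothetical stand-in: mtcars ships with R; replace it with your team's data set.
project_data <- mtcars

# head() returns the first six rows by default.
head(project_data)

# Pass n to show a different number of rows.
head(project_data, n = 10)
```

If your team uses more than one data set, call head() once on each.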

Download the template here. Render as a Word document (.docx) and submit on Canvas.

Project Report Draft #1

Draft report

The purpose of project report draft #1 is to get your team started on the project early, so you are not doing it at the last minute. It is graded on completion: as long as all parts are included, your group will receive full credit.

Include the following in the draft:

Section 1 - Introduction

The introduction should include the following (1 page):

  • You should state the motivation for the research question.
    • Focus on these two key elements:
      • What is the observed problem?
      • What does this contribute to?
        • Does this contribute to more knowledge? (Intellectual)
        • Does this contribute to policy? (Broader)
    • In your introduction section, focus on what problem you are seeing and why it needs to be addressed. Is there a current problem in the world? Did something happen?
    • https://www.sjsu.edu/writingcenter/docs/handouts/Introduction%20of%20Research%20Papers.pdf
    • Example:
      • “The number of women running for political office has skyrocket since the introduction of gender quota laws and revisions were introduced to reduce the loopholes of the laws. The most recent election in Mexico during 2024 was the largest election in its history and saw the most women running for office. However, while the number of women entering political office is increasing, so is the level of criminal violence, especially against political leaders. The 2024 election was one of the most violent elections in recent history (Data Cívica, México Evalúa y Animal Político, 2024A). Yet, little research has examined the impact of criminal violence on the candidate selection process and how this impacts women, and as violence increases it is critical to analyze the relationship between criminal violence and candidate selection.” (Daarstad, N.D.)
      • “While previously polarization was primarily seen only in issue-based terms, a new type of division has emerged in the mass public in recent years: Ordinary Americans increasingly dislike and distrust those from the other party. Democrats and Republicans both say that the other party’s members are hypocritical, selfish, and closed-minded, and they are unwilling to socialize across party lines. This phenomenon of animosity between the parties is known as affective polarization. We trace its origins to the power of partisanship as a social identity, and explain the factors that intensify partisan animus.” (Iyengar et al., 2019)

Section 2 - Literature Review

Section 3 - Hypotheses (1/2 page)

In this section, you will tell me your hypotheses!

Section 4 - Data Description (1/2 page)

In this section, you will describe the data. This includes:

  • A description of the observations in the data set:
    • What is your independent variable?
    • What is your dependent variable?
    • How was it measured?
    • How was it collected?

Section 5 - Data Summary + Regression Model (1-2 pages)

In this section, you will provide basic summary stats, visualizations, and a simple regression model of your data. This includes:

  • Visualization and summary statistics for the response variable.
  • Simple Regression model
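
As a rough sketch of what this section involves, the example below uses base R and the built-in mtcars data set as a hypothetical stand-in. The choice of mpg as the response variable and wt as the predictor is an assumption made only for illustration; your own variables come from your research question, and you may use the plotting tools covered in class instead of hist().

```r
# Hypothetical stand-in: mtcars ships with R; replace it with your team's data set.
project_data <- mtcars

# Summary statistics (min, quartiles, median, mean, max) for the response variable.
summary(project_data$mpg)

# A histogram of the response variable.
hist(project_data$mpg,
     main = "Distribution of miles per gallon",
     xlab = "Miles per gallon")

# Simple regression: one response variable, one predictor.
simple_model <- lm(mpg ~ wt, data = project_data)

# Coefficients, standard errors, p-values, and R-squared.
summary(simple_model)
```

In the report, interpret the slope in words (how the response changes for a one-unit change in the predictor) rather than only pasting the output.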

Section 6 - Conclusion (1 page)

  • Tell me what you found. Does it support your hypotheses?

Submission

Submit as a Word document on Canvas.

Project Draft grading

Total 100 pts
Introduction 20 pts
Lit Review 20 pts
Hypotheses 10 pts
Data Description 10 pts
Data Summary + Regression 20 pts
Conclusion 20 pts

Peer review

Critically reviewing others’ work is a crucial part of the scientific process. Each team will be randomly assigned two other teams’ drafts to review and will need to provide detailed comments by Wed, July 22.

The peer review will be graded on the extent to which it comprehensively and constructively addresses the components of the partner team’s report: the research context and motivation, exploratory data analysis, modeling, interpretations, and conclusions.

Pairings

Team being reviewed Reviewer 1 Reviewer 2
Team #1
Team #2
Team #3
Team #4
Team #5
Team #6

Process and questions

Spend about 30 minutes reviewing each team’s project.

  • Find your team name in the Reviewer 1 and Reviewer 2 columns.
  • For each of those columns, find the name of the team to review in the Team being reviewed column.
  • As a team, you will submit the following document (1-2 pages):
    • Peer review by: [NAME OF TEAM DOING THE REVIEW]

    • Names of team members that participated in this review: [FULL NAMES OF TEAM MEMBERS DOING THE REVIEW]

    • State the research question and hypothesis.

    • Describe the data used.

    • Is there anything that is unclear in the draft?

    • Provide constructive feedback on how the team might be able to improve their project. Make sure your feedback includes at least one comment on the data summary and regression section of the project.

    • What aspect of this project are you most interested in and would like to see highlighted in the presentation?

Final Project Report

Your final project report must include the following files to be considered complete:

  • final_report_team-name.docx

  • final_pres_team-name.pdf

  • team_contributions_team-name.docx

  • final_edits_team-name.docx

  • final_project_team-name.qmd

Submit these files on Canvas.

Each piece is worth the following points:

Total 200 pts
Final Report 100 pts
Presentation 25 pts
Team Contributions 25 pts
Final Edits 25 pts
Final Report Code 25 pts

Final Report

The final report will follow the same format as the draft report, but will be the final version.

Introduction (20 points)

The introduction will be worth 20 points and needs to include the following:

  • The Motivation/Significance (10 points)

    • What is the observed problem?

    • What does this contribute to?

  • Research Question (10 points)

Lit Review (20 points)

The lit review will be worth 20 points and needs to include the following:

  • A comprehensive review of the literature on your topic (10 points)

    • You need to cite a minimum of 5 academic articles

      • You will lose 2 points for every article below the minimum
  • You need to address the gap: where does your question fit in the literature? (10 points)

    • You do not need to provide a huge or original contribution, but think about what answering your question adds that you think is missing.

Hypotheses (10 points)

  • You need to include at least 1 hypothesis, and it needs to be testable with the data you have selected.

Data Description (10 points)

  • A description of the observations in the data set:

    • What is your independent variable? (3 points)
      • How was it measured? (1 point)
      • How was it collected? (1 point)
    • What is your dependent variable? (3 points)
      • How was it measured? (1 point)
      • How was it collected? (1 point)

Data Summary + Regression (20 points)

  • Data Summarization (10 points)

    • Two different data visualizations (5 points; 2.5 points each)

    • Description of averages, medians, etc. (5 points)

  • Regression Model (10 points)

    • One simple regression model (5 points)

    • One multivariate regression model (5 points)
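
The sketch below shows one way the second visualization and the two required models might look in base R. It uses the built-in mtcars data set as a hypothetical stand-in, and the variables mpg, wt, and hp are assumptions chosen only for illustration; substitute the variables from your own hypotheses.

```r
# Hypothetical stand-in: mtcars ships with R; replace it with your team's data set.
project_data <- mtcars

# A scatterplot of the dependent variable against the main independent variable.
plot(project_data$wt, project_data$mpg,
     main = "Miles per gallon by weight",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon")

# Simple regression: one predictor.
simple_model <- lm(mpg ~ wt, data = project_data)

# Regression with more than one predictor: add further variables with `+`.
multi_model <- lm(mpg ~ wt + hp, data = project_data)

# Print both sets of results so they can be compared in the report.
summary(simple_model)
summary(multi_model)
```

When discussing the second model, note how the coefficient on the main independent variable changes once the additional variable is included.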

Conclusion (20 points)

  • Discussion of the results (10 points)

    • Provides a detailed and accurate discussion of the results from the models.
  • Summary of everything (10 points)

    • Provides a summary of the full report

    • Essentially, wrap up the paper and put a nice bow on it

Total 100 pts
Introduction 20 pts
Lit Review 20 pts
Hypotheses 10 pts
Data Description 10 pts
Data Summary + Regression 20 pts
Conclusion 20 pts

Slides (25 points)

The final report will include a PDF version of the team’s presentation of the project.

This will be graded for completion: if it is submitted complete, full points will be awarded.

Team Contributions (25 points)

The final report will include a summary of all the team contributions to the project. Each member is required to submit a paragraph on their contributions to the overall project.

IMPORTANT: NO TEAM MEMBER CAN BE SOLELY RESPONSIBLE FOR THE CODING

This will be graded for completion: if it is submitted complete, full points will be awarded.

Edit Report (25 points)

The final report will include a summary of all the edits the team made to the final project. This should be about a page, no more.

This will be graded for completion: if it is submitted complete, full points will be awarded.

Final Report Code (25 points)

The final report will include a reproducible document of your code. Submit the .qmd file with the code.

This will be graded for completion and reproducibility: if it is submitted complete and runs with no errors, full points will be awarded. If it is submitted complete but runs with errors, half points will be awarded.

Click here for a PDF of the full Final Project Report rubric.

Presentation

Slides

In addition to the written report, your team will be required to create a presentation and present it in class to summarize and showcase your project. You will need to introduce your research question and data set, showcase visualizations, and discuss the primary conclusions. The presentation should be around 10 to 12 minutes long. Every group member needs to speak at least once. If there is a complication with this, please come to me as soon as possible.

For submission, convert these slides to a .pdf document, and submit the PDF of the slides on Canvas.

The slide deck should have no more than 6 content slides + 1 title slide. Here is a suggested outline as you think through the slides; you do not have to use this exact format for the 6 slides.

  • Title Slide
  • Slide 1: Introduce the topic and motivation
  • Slide 2: Introduce the data
  • Slide 3: Summary of Data
  • Slide 4: Final model
  • Slide 5: Interesting findings from the model
  • Slide 6: Conclusions

Late work policy

There is no late work accepted on the final project report. Be sure to turn in your work early to avoid any technological mishaps.