Assignment 4: Tell a "Real" Story About Data
Released: Friday, April 24 | Due: Monday, April 27
Overview
For this assignment, you'll tell a notebook-based data story using real data from The Collaborative, our social impact practicum partner, or another real-world dataset of your choosing.
Your story should be analogous to the ones you put together for Assignments 1 and 2, but this time you must:
- Use real data — download, clean, and analyze a real dataset
- Write code — your analysis should be implemented in a Jupyter/Colab notebook
- Generate your own figures — create data visualizations programmatically
- Answer a specific question — frame your story around a clear research question
You may work on this assignment individually or in groups of any size.
Story Format
Use the demo project as a template for putting your story together. Your project should comprise:
1. Jupyter Notebook
A single Google Colaboratory-compatible notebook containing your story and code. Name the notebook groupname.ipynb, where groupname is a unique identifier for you or your group.
The notebook should include:
- Your research question and motivation
- Code for downloading/loading the data
- Data exploration and cleaning
- Analysis and visualizations
- Interpretation and narrative
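As a rough illustration of how the load, clean, analyze, and visualize steps above might fit together in a notebook, here is a minimal sketch using pandas and Matplotlib. The inline CSV, the column names (`town`, `year`, `meals_served`), and the helper function names are all invented for illustration; they are not taken from The Collaborative's dataset or the demo project. In your own notebook you would point `load_data` at your real file or URL and adapt the cleaning rules to your data.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical inline sample standing in for a real download.
# In a real notebook, pass a file path or URL to load_data() instead.
RAW_CSV = """town,year,meals_served
Town A,2022,120
Town A,2023,150
Town B,2022,
Town B,2023,210
Town B,2023,210
"""


def load_data(source):
    """Load a CSV from a path, URL, or file-like object into a DataFrame."""
    return pd.read_csv(source)


def clean_data(df):
    """Drop exact duplicate rows and rows with a missing measurement."""
    df = df.drop_duplicates()
    df = df.dropna(subset=["meals_served"])
    return df.reset_index(drop=True)


def summarize(df):
    """Compute the total of the (hypothetical) measurement per town."""
    return df.groupby("town")["meals_served"].sum()


def plot_summary(summary):
    """Draw a labeled bar chart of the totals and return the Axes."""
    fig, ax = plt.subplots(figsize=(5, 3))
    summary.plot(kind="bar", ax=ax)
    ax.set_xlabel("Town")
    ax.set_ylabel("Meals served")
    ax.set_title("Meals served by town (illustrative data)")
    fig.tight_layout()
    return ax


raw = load_data(io.StringIO(RAW_CSV))
clean = clean_data(raw)
totals = summarize(clean)
ax = plot_summary(totals)
```

Keeping each step in its own small function makes it easier to rerun a single stage as you iterate, and returning the Axes lets you adjust labels or styling in a later cell without redrawing from scratch.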
2. README
A README file, written in Markdown and based on the README template. Include:
- Project description and overview
- Link to your YouTube video
- Links to the data you analyzed
- Instructions for replicating your results
- How someone could contribute to your project
3. YouTube Video
A video (up to 5 minutes) telling your data story. Suggested formats:
- Narrated screencast scrolling through your notebook
- Narrated slideshow with figures from your notebook
- Any creative format that effectively communicates your findings
Submission
Submit your assignment by opening a pull request that adds your project files to the course repository's data-stories/ folder.
The Collaborative
During week 4, we met with representatives from The Collaborative to learn about their work and the data they've shared with us. You're encouraged (but not required) to use their dataset for this assignment. You may also use any other real-world dataset.
This part of the course is supported by Dartmouth's Social Impact Practicum program.
Tips
- Start with a question, not just a dataset. What do you want to know?
- Use vibe coding to help with data wrangling and visualization — describe what you want and let AI help you build it
- Iterate on your figures — the first plot is rarely the best one
- Tell a story, don't just present results — guide your audience through your reasoning
- Keep it focused — a 5-minute story should make one or two clear points
Resources
- Python Data Science Handbook (NumPy, Pandas, Matplotlib)
- Dartmouth AI tools — for vibe coding assistance
- Kaggle Datasets — if you want to explore other datasets
- Google Dataset Search