IDS 701 Final Project
For your final project in IDS 701, you and your group should choose a substantive problem you care about, identify a Causal Question which, if answered, would help solve that problem, and then set about answering that question to the degree possible.
Project Proposal
The first element of your project will be a project proposal. Your project proposal must follow the approach laid out in your Solving Problems with Data readings. In particular, you must specify:
1. The substantive problem that motivates your project, and which you wish to help solve.
2. The Causal Question you will seek to answer, and how answering that question will help solve your motivating problem. In specifying your Causal Question, you should be very explicit about your causal factor of interest (your \(D\)) and your outcome of interest (your \(Y\)).
3. What you expect the answer to your question to look like. Please be concrete: what figure or table do you imagine being in your final report? If it is a figure (and a data visualization is usually an important part of the presentation of results), what will the axes be? Be sure the figure or table will answer your question!
   When imagining this result, think about the ideal result you could imagine. At this stage, don’t worry too much about whether you will be able to find a data source that will allow you to create this figure. This is an important exercise for clarifying what you are trying to achieve in principle. All too often, data scientists get bogged down in the details of what’s in a specific dataset. There’s a time for that, but to ensure you have a clear understanding of what you are trying to achieve (the question you are asking and what the platonic ideal of an answer to that question would look like), start without considering feasibility or the practicalities of data availability.
4. Given what you have for (3), what variables do you need in your final dataset, and what sample do you need your data to cover? For example, if you were interested in studying how changes in opioid regulations affect overdose death rates, you’d need:
   - data on opioid regulations at different points in time, overdose counts, and population data; and
   - observations with and without opioid regulations, and data covering enough time to allow for changes in regulations to be implemented and have an effect.
5. Where might you get the data you need? Now you get to think about this. What are some initial thoughts you have on where you might be able to find data?
   If you can’t yet name a data source that would allow you to create the figure you imagined in (3), that doesn’t mean you should go back and change your answer to (3)! Your instructor and TAs are likely to know about possible data sources you don’t know about; this is just a place to write YOUR initial thoughts and ideas.
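To make the opioid example from item (4) concrete, here is a minimal sketch of what one row of such a dataset might contain, and how the outcome \(Y\) (an overdose death rate) could be derived from counts and population. Everything in it is an assumption made for illustration: the `StateYear` record, its field names, the state-by-year unit of observation, the `has_required_variation` check, and all of the numbers, which are made-up placeholders and not real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateYear:
    """One row of a hypothetical state-by-year panel (illustrative only)."""
    state: str
    year: int
    regulation_in_effect: bool  # the causal factor D
    overdose_deaths: int        # raw count
    population: int

    @property
    def death_rate_per_100k(self) -> float:
        """The outcome Y: overdose deaths per 100,000 residents."""
        return self.overdose_deaths * 100_000 / self.population


# Placeholder rows; these numbers are invented and are NOT real data.
panel = [
    StateYear("A", 2010, False, 500, 5_000_000),
    StateYear("A", 2014, True, 450, 5_000_000),
    StateYear("B", 2010, False, 300, 2_000_000),
    StateYear("B", 2014, False, 320, 2_000_000),
]


def has_required_variation(rows):
    """Rough check of the sample requirement (one possible reading of it):
    the data include observations with and without the regulation, and at
    least one state is observed both before and after the regulation."""
    statuses_by_state = {}
    for row in rows:
        statuses_by_state.setdefault(row.state, set()).add(row.regulation_in_effect)
    all_statuses = set().union(*statuses_by_state.values()) if statuses_by_state else set()
    some_state_switches = any(len(s) == 2 for s in statuses_by_state.values())
    return all_statuses == {True, False} and some_state_switches


# Outcome for each state-year, keyed by (state, year).
rates = {(row.state, row.year): row.death_rate_per_100k for row in panel}
```

Writing the row structure down this way is only a planning aid: it forces you to name \(D\), \(Y\), and the unit of observation before you start looking for data sources.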
Data Sources
As you are thinking about sources of data, you may find this list of data sources helpful.
Final Deliverable: Data Science Memo
Your final deliverable should take the form of a 4-6 page Data Science Memo, following the guidelines laid out here. I won’t recapitulate the linked guidelines, except to emphasize one point: we are not looking for a document that demonstrates all the hard work you did by recounting every technical detail of your project. We are interested in a document written for a stakeholder that communicates what you’ve learned and why it matters.
Due Dates
Project Proposal: Your project proposal is due by end of day April 2nd.
Project Rough Draft: You are strongly encouraged (though it is not required) to submit a rough draft of your project by April 20th.
Final Submission: Your final memo is due May 3rd.