Assignment 2: Exploratory Question Report

The second project assignment is to answer the Exploratory Questions laid out in your initial proposal. The report should be written for an imaginary but specified stakeholder.

It should answer at least three Exploratory Questions. Of the three Exploratory Questions you seek to answer, at least one must be answered through your own analysis (i.e., you must load data into Python/R and generate your answers). You may answer up to two by citing reputable sources (if someone else has already found the answer, why re-invent the wheel?!).

The report should be written according to the principles in this document (which will be familiar to those of you who took Practical Data Science).

Please start your document with a quick note on who you imagine as your stakeholder (a CEO, a public policy expert, etc.).

Finding Data#

To help you along your way, here’s a compilation of a huge number of public data sources. It’s far from exhaustive, but a good place to browse if you’re unsure how to answer one of your questions.

Still stuck? Let me know and we can brainstorm together.

Length#

I’m relatively flexible on length. I imagine most reports will fall in the three-to-six-page range, including any figures and appendices that provide additional information.

Structure#

My philosophy on document structure is that the way a document is organized should always be purely instrumental — that is, you should always organize your document in the way that best serves the goals of the document. As the goal of this report is to communicate what you have learned by answering your Exploratory Questions to your stakeholder, the way you organize your report should reflect that goal (again, review the Writing to Stakeholders reading for guidance on how best to accomplish that goal).

That means that your report will probably not have a structure that mirrors the “state a problem and three questions you plan to answer” structure of the last assignment you completed. That structure was chosen to help you organize your thoughts at the problem and question generation stage. But you’ve now answered those questions and (presumably!) learned a lot in the process. Since both your goal and what you know have changed, so too should the structure of your report.

Writing a Coherent Report#

A final suggestion: there are some tasks in data science and in life that can be easily parallelized, and others that cannot. Writing a good, coherent report, in my experience, is very hard to do in a fully parallelized manner.

The “problem and three questions” structure that you used for the last assignment naturally lends itself to a divide-and-conquer approach in which different people answer and write up different questions. And for the first draft of your report, that’s a reasonable strategy. But that approach also tends to create a very fragmented report without a clear message.

With that in mind, I would recommend that, after pulling together a first draft of the various answers your team has collected, you get together and discuss the take-away you want to communicate to your stakeholder. Then, with that in hand, move to a serial workflow in which one person takes a crack at editing the entire report. This person should feel free to cut or move things around. Then let someone else read it and make edits. Since the report will be read from start to finish by one person, it makes a lot of sense that it be written (or at least edited) from start to finish by one person at a time.

Due Date#

Your Exploratory Question Report is due February 13th.