Project Proposals (initial and revised)

Published

January 1, 2022

Proposals

The first stage of your final project will be a topic proposal. The proposal outlines your idea for the regression model, as well as the relationships you expect to observe.

Picking the topic is perhaps the most challenging piece of this assignment, so dedicate some time in your group to brainstorming ideas. Plan to brainstorm at least half a dozen serious ideas, much the way we did in the “question generation” exercise, before you pick one to develop into a mature proposal.

For the most part, the choice of topic is left up to you. Try to pick something that’s interesting yet substantial and worth studying, and aim for a topic that you think nobody has tried before; remember that part of your overall grade will be based on originality.

It is easiest to write a proposal with data in hand, but data is not required for the proposal. If you can’t find a dataset on the topic you are interested in, spend some time imagining your ideal dataset. What would the observations (rows) be? What variables would it contain? Consider whether it will be possible to find data like this: if your imagined rows are students in a school or patients in a hospital, it will be difficult to find data because of issues with anonymity. Sometimes you can find datasets with observations that are individual people, but those datasets are typically from randomized studies or opinion surveys.

Once I respond to your initial proposal, you will revise it (perhaps starting with a different topic/dataset), then submit a new proposal that addresses my feedback. When you produce your revised proposal, you will supply essentially the same information required for the initial proposal, but give a bit more detail.

Content

Your initial and revised proposals should contain the following content:

  1. Group Members and Group Letter: List the members of your group, and include the letter that has been assigned to your group.

  2. Title: Your working title. This should describe your most basic question/hypothesis in a few words.

  3. Purpose: Describe the general topic/phenomenon you want to study, as well as some focused questions that you hope to answer and specific relationships you expect to observe. What variables do you think will have a positive relationship? A negative relationship? Do you expect to observe any interaction effects?

  4. Data: Briefly describe the data that you plan to use, and specify where it can be found (such as a URL). If you do not have data yet, describe your ideal dataset. Eventually, you may want to combine data from multiple sources into one file. We will discuss data management techniques in the coming weeks, but for now you should simply list multiple sources if you have them.

  5. Population: Specify what the observational units are (i.e. the rows of the data frame), and (if appropriate) describe the larger population/phenomenon to which you’ll try to make inference. In some circumstances, you may have all the data (a census), or you may not have a random sample. In those cases, you cannot make inference to a larger population, and will use regression as a descriptive technique. If this is the case for your project, explain why you can’t make inference to a larger population.

  6. Response Variable: Your project should focus on one response variable that you want to predict. What is the response variable you are interested in? What are its units? Estimate the range of possible values that it may take on.

  7. Explanatory Variables: Describe the variables you’ll examine as predictors (i.e. the other columns of the data frame). Carefully define each variable and describe how it was measured. For categorical variables, list the possible categories; for quantitative variables, specify the units of measurement. You may want to add more variables later on, but you should have at least five variables already.

Format

You will turn in a single document per group, in HTML or PDF format. I recommend writing your proposals in RMarkdown (you don’t need to include code, just text) to practice for the final report, but if you prefer to work in Word or another technology, that is fine. If you work in RMarkdown, knit the document to HTML or PDF and just submit the knitted document. If you work in Word, Save As PDF.
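If you choose RMarkdown, one possible starting point is a skeleton like the one below. It is only a sketch: the section headings simply mirror the seven content items listed above, and the title, author line, and bracketed text are hypothetical placeholders to replace with your own. No code chunks are needed.

```markdown
---
title: "Working title (placeholder)"
author: "Group [letter]: [member names]"
output: html_document
---

## Purpose

[General topic, focused questions, and the relationships you expect to observe.]

## Data

[Short description and where the data can be found, or a description of your ideal dataset.]

## Population

[Observational units, and the larger population (if any) you will make inference to.]

## Response Variable

[The variable, its units, and its likely range of values.]

## Explanatory Variables

[Each predictor, how it was measured, and its units or categories.]
```

Changing `output: html_document` to `output: pdf_document` produces a PDF instead, which assumes a LaTeX installation is available. In RStudio the Knit button builds the document; from the R console, `rmarkdown::render("proposal.Rmd")` does the same, where `proposal.Rmd` is a hypothetical file name.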

Most proposals are 1-2 pages in length, although RMarkdown makes it difficult to assess length until you knit. I will not be grading on length (or any other aspects of writing), simply on content, so just make sure you include all the necessary components.