Assignment
Instead of traditional problem sets, this course has a single four-part assignment in which you build on your previous work each week using new material from the course. You will explore property assessment in Cook County, Illinois, and create an assessment model. After completing the assignment, you will wrap your model into a report that analyzes the model's effectiveness using the ethical and other frameworks from class, and you will give a brief presentation to the class.
Submissions
Each week you will submit two files on Blackboard: your code (Rmd file) and the knitted output of your code. Blackboard will not accept HTML files, so you must zip the two files together.
Part 2
Objective: Now that you have a decent understanding of the landscape in Cook County, create a new file (part_2.Rmd) that builds on part_1 and follows the Rmarkdown HTML report style.
Submission: Submit both your code and the knitted Rmarkdown output to Blackboard.
For this assignment (and the rest of the project), you have been assigned a subset of Cook County in which to run your analysis and modeling. Subsets are defined at the township level (there are 38 townships), and I have grouped some townships together. The list of groups is below. Each subset will present different challenges depending on the composition of its property market; we will discuss these in class. ASSIGNMENTS HERE
Group | Townships |
---|---|
12 | Bloom (12) |
13 | Bremen (13) |
18 | Barrington (10), Hanover (18) |
20 | Norwood Park (26), Leyden (20) |
21 | Stickney (36), Lyons (21) |
22 | Maine (22) |
23 | Evanston (17), New Trier (23) |
24 | Niles (24) |
25 | Northfield (25) |
27 | Riverside (34), Oak Park (27), River Forest (33), Berwyn (11), Cicero (15) |
28 | Lemont (19), Palos (30), Orland (28) |
29 | Palatine (29) |
31 | Proviso (31) |
32 | Rich (32) |
35 | Schaumburg (35), Elk Grove (16) |
37 | Thornton (37) |
38 | Wheeling (38) |
39 | Calumet (14), Worth (39) |
70 | Hyde Park (70) |
71 | Jefferson (71) |
72 | Lake (72) |
73 | Rogers Park (75), North Chicago (74), Lake View (73), South Chicago (76) |
77 | West Chicago (77) |
Part A
Create an ‘introduction’ to your report. Generally, only include stylized output (do not use base R print). This could mean using `stargazer` to show regressions, `DT::datatable` to show data.frames, and adding titles/labels to plots. Your introduction should include:
- Brief background (2-3 sentences) on assessments
- 3 to 4 graphs with descriptive captions which include information on sale price, assessment accuracy, and outliers. Generally focus on single family homes and arm’s length transactions. Are there data you would exclude?
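As one illustration of a labeled plot with a descriptive caption, the sketch below draws a histogram of sales ratios. It runs on simulated data, and the column names (`sale_price`, `est_market_value`, `ratio`) are hypothetical stand-ins, not the names in the course data. Treat it as a starting point for styling, not as the expected graph.

```r
library(dplyr)
library(ggplot2)

set.seed(42)

# Simulated stand-in for sales joined to assessments.
# Column names are hypothetical; replace them with those in your data.
sales <- tibble(
  sale_price = exp(rnorm(500, mean = 12.5, sd = 0.5))
) %>%
  mutate(
    est_market_value = sale_price * exp(rnorm(n(), mean = 0, sd = 0.2)),
    ratio = est_market_value / sale_price
  )

# Histogram of sales ratios with a title, axis labels, and a caption.
p <- ggplot(sales, aes(x = ratio)) +
  geom_histogram(bins = 40) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  labs(
    title = "Sales ratios (simulated data)",
    x = "Estimated market value / sale price",
    y = "Number of sales",
    caption = "Dashed line marks a ratio of 1, where the estimate equals the sale price."
  )

p
```

In an Rmarkdown chunk, the `fig.cap` chunk option is another way to attach a caption to the knitted figure.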
Part B
We have two separate (but closely related) problems we want to model. First, we want to identify whether a home is likely to be overassessed in a given year (i.e., was the assessment ‘correct’?). This involves comparing actual market values (e.g., the prices of homes that sold) to the assessor’s assessed values. We will analyze homes and assessments from 2022, using tidymodels to create a workflow.
- Create your workflow
- Add to your workflow a classification model
- Add to your workflow a recipe of preprocessing steps. Use 2022 sales and assessments with the parcels' property characteristics (note that we only know whether a home was overassessed if it sold). Create a classification metric of overassessment based on properties that sold and use this as your dependent variable. Explain how you decided to construct this metric and how many classes it has.
- Create testing/training data and evaluate your model using the classification metrics from tables 8.3 and 8.4 of the textbook, along with ROC curves, which are based on the predicted class probabilities.
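The steps above can be sketched end to end as follows. This is a minimal example under stated assumptions: the data are simulated, all column names are hypothetical, the two-class "ratio above 1" definition of overassessment is only one possible choice, and logistic regression stands in for whichever classifier you select. The metrics shown are a few common ones; check tables 8.3 and 8.4 for the set you are asked to report.

```r
library(tidymodels)

set.seed(2022)

# Simulated stand-in for 2022 sales joined to assessments and characteristics.
# All column names are hypothetical.
n_homes <- 1000
homes <- tibble(
  sqft = runif(n_homes, 800, 3500),
  age = runif(n_homes, 1, 100),
  sale_price = exp(rnorm(n_homes, 12.5, 0.4))
) %>%
  mutate(
    # Simulated pattern: smaller homes tend to be valued high relative to price.
    est_market_value = sale_price *
      exp(0.4 - 0.0002 * sqft + rnorm(n_homes, 0, 0.15)),
    # Assumed two-class metric: overassessed if the estimate exceeds the price.
    overassessed = factor(
      if_else(est_market_value > sale_price, "yes", "no"),
      levels = c("yes", "no")
    )
  )

# Training/testing split, stratified on the outcome.
home_split <- initial_split(homes, prop = 0.8, strata = overassessed)
train <- training(home_split)
test <- testing(home_split)

# Recipe: predictors are property characteristics only.
rec <- recipe(overassessed ~ sqft + age, data = train) %>%
  step_normalize(all_numeric_predictors())

# Model: logistic regression as a placeholder classifier.
mod <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

# Workflow bundles the model and the recipe.
wf <- workflow() %>%
  add_model(mod) %>%
  add_recipe(rec)

wf_fit <- fit(wf, data = train)

# Predicted classes and class probabilities on the held-out data.
preds <- augment(wf_fit, new_data = test)

class_metrics <- metric_set(accuracy, sensitivity, specificity)
class_metrics(preds, truth = overassessed, estimate = .pred_class)
conf_mat(preds, truth = overassessed, estimate = .pred_class)

# ROC metrics use the probability of the first factor level ("yes").
roc_auc(preds, truth = overassessed, .pred_yes)
autoplot(roc_curve(preds, truth = overassessed, .pred_yes))
```

Putting "yes" first in the factor levels matters here because yardstick treats the first level as the event of interest by default.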
Part C
Second, building on the workflow from Part B, create a second model to generate your own 2023 assessments. This involves using market information (property sales) to generate market values. The assessor’s assessments are not used in creating this model, but we will use them as a comparison for model performance in the coming weeks.
- Create your workflow
- Add to your workflow a model
- Add to your workflow a recipe of preprocessing steps. Use sales from before 2023 with the parcels' property characteristics.
- Create testing/training data and evaluate your model using the numeric metrics RMSE and MAPE.
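A regression version of the same workflow pattern might look like the sketch below. It assumes simulated data, hypothetical column names (`sale_year`, `sqft`, `age`, `sale_price`), and a plain linear model as a placeholder; your own recipe and model will likely differ.

```r
library(tidymodels)

set.seed(2023)

# Simulated stand-in for sales joined to property characteristics.
# All column names are hypothetical.
n_sales <- 1000
sales <- tibble(
  sale_year = sample(2019:2023, n_sales, replace = TRUE),
  sqft = runif(n_sales, 800, 3500),
  age = runif(n_sales, 1, 100)
) %>%
  mutate(
    sale_price = 50000 + 120 * sqft - 300 * age + rnorm(n_sales, 0, 20000)
  )

# Keep only sales from before 2023.
model_data <- filter(sales, sale_year < 2023)

sales_split <- initial_split(model_data, prop = 0.8)
train <- training(sales_split)
test <- testing(sales_split)

# Recipe: sale price modeled from characteristics; no assessments used.
rec <- recipe(sale_price ~ sqft + age + sale_year, data = train)

# Model: ordinary linear regression as a placeholder.
mod <- linear_reg() %>%
  set_engine("lm")

wf <- workflow() %>%
  add_model(mod) %>%
  add_recipe(rec)

wf_fit <- fit(wf, data = train)

# Predictions on the held-out data, scored with RMSE and MAPE.
preds <- augment(wf_fit, new_data = test)

reg_metrics <- metric_set(rmse, mape)
results <- reg_metrics(preds, truth = sale_price, estimate = .pred)
results
```

RMSE is reported in dollars and MAPE in percent, so the two give complementary views of the error: one absolute, one relative to each home's price.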
Grading Overview
For each assignment, you will be graded on substantial completion of the assignment (demonstrated by an attempt at all parts). When submitting parts 2, 3, and 4, you will additionally be graded on your incorporation of feedback, new concepts from the course, or the correction of any flagged issues.
The assignment will culminate in a final submission of code/report and a presentation. Code will be graded on reproducibility, conceptual understanding, and accuracy. The report will be an Rmarkdown file that knits together graphs, tables, and ethical frameworks. It should be concise (include only relevant information from Parts 1-4). This report will be used to give a five-minute presentation to the class on your model and the ethical/technical issues with Cook County property assessment.
Asg. | Points | Category | Notes |
---|---|---|---|
1 | 5 | Substantial Completion (attempted all parts) | |
2 | 5 | Substantial Completion (attempted all parts) | |
2 | 5 | Incorporation of Feedback/New Concepts | From Part 1 |
3 | 10 | Substantial Completion (attempted all parts) | |
3 | 10 | Incorporation of Feedback/New Concepts | From Part 2 |
4 | 30 | Final Code | Reproducible (10), Concepts (10), Accurate (10) |
4 | 20 | Final Report | Via Rmarkdown HTML, contextualized analysis and ethics |
4 | 15 | Final Presentation | 3-5 minute presentation on model and insights |