
Cook Part 1

Assignment

Instead of traditional problem sets, this course has a single four-part assignment in which you build on your previous work each week using new material from the course. You will explore property assessment in Cook County, Illinois, and create an assessment model. After completing the assignment, you will wrap your model into a report that analyzes its effectiveness using the ethical and other frameworks from class, and you will give a brief presentation to the class.

Submissions

Each week you will submit two files on Blackboard: your code (Rmd) file and the knitted output of your code. Blackboard will not accept HTML files, so you must zip the two files together.

Part 1 (Due 2/13, 11:59pm)

You have been tasked with undertaking a multi-part analysis of homes in Cook County, Illinois. You are provided with a database to facilitate this analysis. This database was constructed from the Cook County Open Data portal. More information is included in the database section below. Note that the database must be downloaded.

R Markdown Requirements

Please include code_folding: hide as a YAML option so that your code can be seen in your knitted output but is initially hidden, and include knitr::opts_chunk$set(warning = FALSE) in your setup chunk so that warnings are left out of the output.
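A minimal sketch of where the two settings go, assuming the document is knitted with html_document output. The YAML header is shown as comments because it belongs at the top of the Rmd file rather than in R code, and the title is a placeholder.

```r
# YAML header at the top of the .Rmd file (shown here as comments):
# ---
# title: "Cook Part 1"   # placeholder title, an assumption
# output:
#   html_document:
#     code_folding: hide
# ---

# Body of the setup chunk: suppress warnings in every later chunk
knitr::opts_chunk$set(warning = FALSE)
```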

Database

The data is provided as a SQLite database. Broadly, it is the most appropriate data currently available for creating 2023 assessments. It can be found on OneDrive.

Four tables are provided:

geospatial_universe

Information on latitude/longitude and neighborhood code from tax year 2022 (released on a delay). Only a subset of columns is selected.

See: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Universe/nj4t-kc8j/about_data

sales

Information on sales from 2021 to present (current as of mid-September 2023).

See: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Sales/wvhk-k5uv/about_data

dbplyr

Starter code is below; replace PATH with the location where you saved the database within your R project. Note that if you provide an incorrect path, you will get a confusing error message (table is missing/not found). To verify that your connection is working correctly, run DBI::dbListTables(con); you should see 4 tables.

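library(dplyr)  # provides %>%, count(), collect(), show_query()

# connect to the database (replace PATH)
con <- DBI::dbConnect(RSQLite::SQLite(), "PATH")

# sales tbl (a lazy reference; no rows are pulled into R yet)
dplyr::tbl(con, 'sales')

# convert to tibble
# dplyr::tbl(con, 'sales') %>% dplyr::collect()

# sql query (dbplyr translates the dplyr verbs to SQL)
dplyr::tbl(con, 'sales') %>% count(year(sale_date))

# show the generated SQL instead of running it
# dplyr::tbl(con, 'sales') %>% count(year(sale_date)) %>% show_query()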
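The confusing "table not found" error comes from RSQLite creating a new, empty database when the file at the given path does not exist. One way to catch this earlier is to check the path before connecting. The helper below is a hypothetical sketch, not part of the starter code, and the name connect_checked is an assumption.

```r
# Hypothetical helper (not part of the starter code): fail early on a bad path.
connect_checked <- function(db_path) {
  # RSQLite creates a new, empty database when the file does not exist,
  # so check for the file before connecting.
  if (!file.exists(db_path)) {
    stop("Database file not found: ", db_path)
  }
  con <- DBI::dbConnect(RSQLite::SQLite(), db_path)
  tables <- DBI::dbListTables(con)
  if (length(tables) == 0) {
    DBI::dbDisconnect(con)
    stop("Connected, but the database contains no tables: ", db_path)
  }
  message("Tables found: ", paste(tables, collapse = ", "))
  con
}

# Usage, replacing PATH as in the starter code:
# con <- connect_checked("PATH")
```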

Assignment

  • Section A: Conduct an exploratory data analysis. Offer an overview of relevant trends in the data and data quality issues. Contextualize your analysis with past issues and recent improvements by the Assessor.
  • Section B: Use cmfproperty to conduct a sales ratio study across the relevant time period. Note that cmfproperty is designed to produce R Markdown reports, but you should use the documentation to insert the relevant graphs/figures into your own report. Aim to make this reproducible, since you’ll need these methods to analyze your assessment model later on.
  • Section C: Explore trends and relationships with property sales using simple regressions.
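To make Sections B and C concrete, here is a rough base R sketch on toy data. It does not use cmfproperty; it computes by hand three statistics that a sales ratio study commonly reports (median ratio, coefficient of dispersion, price-related differential) and then fits one simple regression. Every column name and value is hypothetical and will not match the database.

```r
# Toy data: every column name and value here is hypothetical.
toy <- data.frame(
  sale_price     = c(100000, 150000, 200000, 250000, 400000),
  assessed_value = c( 95000, 150000, 180000, 240000, 320000),
  sale_year      = c(2021, 2021, 2022, 2022, 2023)
)

# Section B idea: sales ratio = assessed value / sale price
toy$ratio <- toy$assessed_value / toy$sale_price

sales_ratio_stats <- function(ratio, assessed, sale) {
  med <- median(ratio)
  c(
    median_ratio = med,
    # COD: mean absolute deviation from the median ratio, as a percent of the median
    cod = 100 * mean(abs(ratio - med)) / med,
    # PRD: mean ratio divided by the sale-weighted mean ratio
    prd = mean(ratio) / (sum(assessed) / sum(sale))
  )
}

stats <- sales_ratio_stats(toy$ratio, toy$assessed_value, toy$sale_price)
print(round(stats, 3))

# Section C idea: a simple regression of log sale price on sale year
fit <- lm(log(sale_price) ~ sale_year, data = toy)
print(coef(fit))
```

Wrapping the statistics in a function keeps the calculation reproducible, so the same call can later be pointed at the values produced by your own assessment model.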

Grading Overview

For each assignment, you will be graded on substantial completion of the assignment (demonstrated by an attempt of all parts). When submitting parts 2, 3, and 4, you will be additionally graded on your incorporation of feedback, new concepts from the course, or the correction of any flagged issues.

The assignment will culminate in a final submission of code, report, and presentation. Code will be graded based on reproducibility, conceptual understanding, and accuracy. The report will be an R Markdown file which knits together graphs, tables, and ethical frameworks. It should be concise (include only relevant information from Parts 1-4). This report will be used to give a five-minute presentation to the class on your model and ethical/technical issues with Cook County property assessment.

| Asg. | Points | Category | Notes |
| --- | --- | --- | --- |
| 1 | 5 | Substantial Completion | (attempted all parts) |
| 2 | 5 | Substantial Completion | (attempted all parts) |
| 2 | 5 | Incorporation of Feedback/New Concepts | From Part 1 |
| 3 | 10 | Substantial Completion | (attempted all parts) |
| 3 | 10 | Incorporation of Feedback/New Concepts | From Part 2 |
| 4 | 30 | Final Code | Reproducible (10), Concepts (10), Accurate (10) |
| 4 | 20 | Final Report | Via R Markdown HTML, contextualized analysis and ethics |
| 4 | 15 | Final Presentation | 3-5 minute presentation on model and insights |