Create a Program to Implement NYC Datasets in Python Assignment Solution

July 04, 2024

Dr. Andrew

🇨🇦 Canada

Python

Dr. Andrew Taylor, a renowned figure in the realm of Computer Science, earned his PhD from McGill University in Montreal, Canada. With 7 years of experience, he has tackled over 500 Python assignments, leveraging his extensive knowledge and skills to deliver outstanding results.


Instructions

Objective

Write a Python program that cleans, merges, and visualizes two datasets from NYC Open Data.

Requirements and Specifications

Step 1 (20 pts.): Select two datasets from NYC Open Data (https://opendata.cityofnewyork.us/). Write a paragraph about how they might be related and why looking at the data might be helpful. Be sure to check the data dictionary, which explains each column/attribute. NYC School Datasets are an example of a category that is easy to merge, as the datasets share a common key. Alternative datasets may be used, but must have my approval.

Step 2 (50 pts.): Write a Python program that cleans and merges the data. Get rid of the columns you aren't using. I expect to see head() used on your data where relevant, so that I can see what is going on in your dataframes. Use comments to explain how you dealt with missing data and other issues in the dataset.
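The cleaning and merging that Step 2 asks for could be sketched as follows. The two small frames and their column names (`date`, `cases`, `visits`, `notes`) are made-up stand-ins, not the NYC Open Data schemas, and the `clean` helper is hypothetical. An inner merge followed by dropping rows with missing counts is only one reasonable choice; filling or interpolating gaps may suit other data better.

```python
import pandas as pd

# Hypothetical stand-ins for two daily datasets that share a date key.
cases = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-02", "2020-03-03"],
    "cases": [1, None, 5],        # one missing count
    "notes": ["a", "b", "c"],     # a column we do not need
})
visits = pd.DataFrame({
    "date": ["2020-03-02", "2020-03-03", "2020-03-04"],
    "visits": [120, 135, 150],
})


def clean(df, keep):
    """Keep only the listed columns and parse the shared date key."""
    out = df[keep].copy()  # copy so later assignments do not touch the source frame
    out["date"] = pd.to_datetime(out["date"])
    return out


cases_clean = clean(cases, ["date", "cases"])
visits_clean = clean(visits, ["date", "visits"])

# Inner merge: keep only the dates present in both datasets.
merged = cases_clean.merge(visits_clean, on="date", how="inner")

# Missing data: drop rows where either count is absent instead of guessing a value.
merged = merged.dropna(subset=["cases", "visits"]).reset_index(drop=True)

print(merged.head())
```

Changing `how="inner"` to `how="outer"` would keep dates that appear in only one dataset, at the cost of more missing values to handle.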

Step 3 (30 pts.): Use seaborn (regression plot) or other plots/graphs to graphically illustrate the relationship between some variables in each dataset. Write another paragraph on what insights you've gained.
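A seaborn regression plot of the kind Step 3 mentions might look like the sketch below. The data are synthetic and the column names `cases` and `visits` are assumptions for illustration only; with real data, the merged dataframe would take the place of `df`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, roughly linear data (illustration only, not NYC figures).
rng = np.random.default_rng(0)
x = np.arange(50)
df = pd.DataFrame({
    "cases": x,
    "visits": 2.0 * x + 10 + rng.normal(0, 5, size=50),
})

fig, ax = plt.subplots()
# regplot draws the scatter points plus a fitted linear regression line.
sns.regplot(data=df, x="cases", y="visits", ax=ax)
ax.set_xlabel("cases")
ax.set_ylabel("visits")
ax.set_title("Synthetic example: visits vs. cases")
```

In a notebook the figure renders inline once the cell finishes; in a script, `fig.savefig(...)` writes it to disk.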

Upload all Jupyter notebooks, along with your code saved as a .pdf. Save your notebooks with output! Otherwise, I might need to get your data from you to test your work.

.py files are not accepted as a substitute for Jupyter notebooks. Any submission that is only a .py file will receive a grade of 0.

Source Code

# Datasets

### COVID-19 Daily Counts of Cases, Hospitalizations, and Deaths

https://data.cityofnewyork.us/Health/COVID-19-Daily-Counts-of-Cases-Hospitalizations-an/rc75-m7u3

**Download link:** https://data.cityofnewyork.us/api/views/rc75-m7u3/rows.csv?accessType=DOWNLOAD

### Emergency Department Visits and Admissions for Influenza-like Illness and/or Pneumonia

https://data.cityofnewyork.us/Health/Emergency-Department-Visits-and-Admissions-for-Inf/2nwg-uqyg

**Download link:** https://data.cityofnewyork.us/api/views/2nwg-uqyg/rows.csv?accessType=DOWNLOAD

The first dataset records the number of COVID-19 cases, hospitalizations, and deaths reported in New York City per day. The second contains the number of emergency department visits for pneumonia, influenza, or similar symptoms. The plan is to check for a direct relationship between the two datasets, starting from the date the first COVID-19 cases were reported in New York City.

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

## Step 1: Download Datasets

    data1 = pd.read_csv('https://data.cityofnewyork.us/api/views/rc75-m7u3/rows.csv?accessType=DOWNLOAD')
    data1.head()

    data2 = pd.read_csv('https://data.cityofnewyork.us/api/views/2nwg-uqyg/rows.csv?accessType=DOWNLOAD')
    data2.head()

## Step 2: Clean Datasets

### For the first dataset, select only the columns of interest

    columns_of_interest = ['DATE_OF_INTEREST', 'CASE_COUNT', 'HOSPITALIZED_COUNT', 'DEATH_COUNT']
    # .copy() avoids pandas' SettingWithCopyWarning when a column is reassigned below
    df1 = data1[columns_of_interest].copy()

### Convert 'DATE_OF_INTEREST' to datetime

    df1['DATE_OF_INTEREST'] = pd.to_datetime(df1['DATE_OF_INTEREST'])
    df1.head()

### Do the same for the second dataset

    columns_of_interest = ['extract_date', 'date', 'total_ed_visits', 'ili_pne_visits']
    df2 = data2[columns_of_interest].copy()
    df2['extract_date'] = pd.to_datetime(df2['extract_date'])
    df2['date'] = pd.to_datetime(df2['date'])
    df2.head()

### Find the date of the first reported COVID-19 case

    # Take the earliest parsed date rather than the raw string in row 0 of data1
    start_date = df1['DATE_OF_INTEREST'].min()
    print(start_date)

### Keep the rows of the second dataset from start_date onward

    df2 = df2[df2['date'] >= start_date]
    df2.head()

## Step 3: Plots

### Plot the number of COVID-19 cases and the number of ER visits per day

**NOTE:** Both series are min-max normalized to the range 0 to 1, because we only want to compare the shapes of the curves, not their values.

    # Sum the counts per day; groupby returns the dates in ascending order
    df1_grouped = df1.groupby('DATE_OF_INTEREST').sum()
    df1_grouped = (df1_grouped - df1_grouped.min()) / (df1_grouped.max() - df1_grouped.min())
    df1_grouped.head()

    # extract_date is a datetime column and cannot be summed, so select the count columns explicitly
    df2_grouped = df2.groupby('date')[['total_ed_visits', 'ili_pne_visits']].sum()
    df2_grouped = (df2_grouped - df2_grouped.min()) / (df2_grouped.max() - df2_grouped.min())
    df2_grouped.head()

### Plot

    # Draw both normalized series on the same axes
    ax = df1_grouped.plot(y='CASE_COUNT', label='COVID-19 Cases')
    df2_grouped.plot(y='ili_pne_visits', label='Hospital Visits', ax=ax)
    plt.legend()
    plt.show()

There is a clear correlation between the curves. At the beginning the curves are very similar, likely because when the pandemic began people were frightened and went to the ER at the first symptom, however minor. As time passed and social distancing measures were introduced, the number of COVID-19 cases fell, as did visits to the ER. By October 2020 the number of cases rose again (the second wave) and ER visits rose as well, though by less, likely because people were no longer as frightened.

## Correlation maps for each dataset

    # numeric_only=True restricts the correlation to numeric columns (requires pandas 1.5 or later)
    corr1 = data1.corr(numeric_only=True)
    corr2 = data2.corr(numeric_only=True)

### Dataset 1

    sns.heatmap(corr1)
    plt.show()

### Dataset 2

    sns.heatmap(corr2)
    plt.show()
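The correlation between the two curves is judged by eye in the notebook. One way to put a number on it is a Pearson coefficient, sketched below on made-up values (the `cases` and `visits` series and the `min_max` helper are illustrative assumptions, not the NYC data or the author's code). The sketch also shows that min-max normalization changes the scale of the curves for plotting but leaves the Pearson coefficient unchanged.

```python
import pandas as pd


def min_max(series):
    """Rescale a numeric Series to the range [0, 1]."""
    return (series - series.min()) / (series.max() - series.min())


# Made-up daily counts for illustration only.
cases = pd.Series([0, 10, 40, 20, 5], dtype=float)
visits = pd.Series([100, 130, 220, 160, 110], dtype=float)

cases_norm = min_max(cases)
visits_norm = min_max(visits)

# Pearson correlation before and after normalization.
r_raw = cases.corr(visits)
r_norm = cases_norm.corr(visits_norm)

print(f"raw r = {r_raw:.3f}, normalized r = {r_norm:.3f}")
```

With the real data, the two series would first need to be aligned on the same dates (for example by merging on the date column) before calling `corr`.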
