Hi guys, nice to meet you and I hope this finds you all in good spirits.
I would appreciate some help with Exercise 3: Remove Missing Values and Correct Data Types.
Before analyzing the data, you need to ensure your data is complete and correctly formatted.
Directions
- Remove all rows with missing values (code is given).
- Convert the data type of all numeric columns, which is all columns except "county_state", to "Int64".
Extra hints (if you are stuck):
To convert column types, you can iterate over all the column names you want to change and use the .astype() method to convert each column's data type to "Int64".
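As far as I understand the hint, the pattern would look something like the sketch below on a toy DataFrame. Note that toy_df, toy_clean, toy_numeric_columns and all the values are made up for illustration and are not the course data; the loop variable is a real name (column) rather than None, since None cannot be used as an assignment target.

```python
import pandas as pd

# Hypothetical toy data standing in for census_df (values are made up).
# The missing value makes "population" a float64 column at first.
toy_df = pd.DataFrame({
    "county_state": ["County A, State X", "County B, State Y", "County C, State Z"],
    "population": [100.0, 250.0, None],
    "poverty_count": [10, 25, 3],
})

# Drop rows with missing values and work on an independent copy.
toy_clean = toy_df.dropna().copy()

# Every column except "county_state" should become nullable integer.
toy_numeric_columns = ["population", "poverty_count"]

# Iterate over the column names and convert each one with .astype().
for column in toy_numeric_columns:
    toy_clean[column] = toy_clean[column].astype("Int64")

print(toy_clean.dtypes)
```

If this is right, the same loop shape should apply to numeric_columns and census_df_clean in the exercise cell.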
[13]:
# remove rows with missing values.
census_df_clean = census_df.dropna().copy()
# convert columns to numeric.
# get all the numerical column (all except "county_state" in index 1)
numeric_columns = ["county", "employed_male", "employed_total",
"female_pop_over_75", "female_pop_under_5", "male_pop_over_75",
"male_pop_under_5", "population", "poverty_count",
"poverty_count_female_over_75", "poverty_count_female_under_5",
"poverty_count_male_over_75", "poverty_count_male_under_5", "state",
"total_pop_male"]
### START CODE HERE ###
# iterate over each of the columns and convert to numeric
for None in None:
census_df_clean[None] = None
### END CODE HERE ###
Cell In[13], line 18
    for None in None:
        ^
SyntaxError: cannot assign to None
[14]:
# 🔒This cell is locked. You will not be able to edit it.
# print data types
print("\nData types:")
print(census_df_clean.dtypes)
Data types:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[14], line 5
      1 # 🔒This cell is locked. You will not be able to edit it.
      2
      3 # print data types
      4 print("\nData types:")
----> 5 print(census_df_clean.dtypes)

NameError: name 'census_df_clean' is not defined
Expected output:
Thanks for any help