Pandas Intermediate
Use the Template to explore some further functionality of Pandas.
Create new cells with # %% as necessary.
Use the Data Management section and the Pandas Documentation for help.
Template
# %%
# Import Pandas

# %%
# Read data from the CSV file (read_csv):
# https://gitlab.com/alping/python-data-science/-/raw/main/data/external/heart-disease.csv

# %%
# Inspect the data [.info, .describe]

# %%
# Get the number of males and females [.value_counts]

# %%
# Create a 2x2 table for the variables sex and exang (exercise-induced angina)

# %%
# Inspect the age distribution as a histogram [.hist] and
# adjust the number of bins

# %%
# Inspect the age distribution, stratified by sex

# %%
# Keep only observations with chol > 200 [.query]

# %%
# Change the sex variable to have the values male/female,
# instead of 1/0 [.assign, .replace]

# %%
# Create a new categorical age variable, binning ages in
# decades (0, 10, 20, ...) [.assign, pd.cut]

# %%
# In a new data variable, using method chaining:
# - Change the sex variable and create the age variable as above, but in one assign statement
# - Rename the column "exang" to "exercise_angina" [rename]
# - Keep only those with age between 18 and 50, inclusive [query]
# - Sort by "chol" [sort_values]
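As a rough illustration of the counting and method-chaining steps in the template, here is a minimal sketch on a small made-up table rather than the real CSV. The rows, the names `toy`, `tidy`, and `age_group`, the 0 to 100 bin range, and the mapping 1 = male, 0 = female are all assumptions for illustration; check the coding against the actual data before relying on it.

```python
import pandas as pd

# Made-up rows for illustration only; the exercise itself reads heart-disease.csv.
# Column names follow the template: age, sex (coded 1/0), chol, exang.
toy = pd.DataFrame(
    {
        "age": [17, 29, 45, 50, 63, 38],
        "sex": [1, 0, 1, 0, 1, 0],
        "chol": [180, 250, 199, 230, 310, 205],
        "exang": [0, 1, 0, 0, 1, 1],
    }
)

# Number of observations per sex, and a 2x2 table of sex by exang
sex_counts = toy["sex"].value_counts()
sex_by_exang = pd.crosstab(toy["sex"], toy["exang"])

# One chain: recode sex, bin age, rename, filter, sort
tidy = (
    toy
    .assign(
        # Assumed coding: 1 = male, 0 = female
        sex=lambda d: d["sex"].replace({1: "male", 0: "female"}),
        # Left-closed decade bins: [0, 10), [10, 20), ..., [90, 100)
        age_group=lambda d: pd.cut(
            d["age"], bins=list(range(0, 101, 10)), right=False
        ),
    )
    .rename(columns={"exang": "exercise_angina"})
    .query("age >= 18 and age <= 50")
    .sort_values("chol")
)

print(sex_counts)
print(sex_by_exang)
print(tidy)
```

Passing `right=False` to `pd.cut` makes each bin include its lower edge, so an age of exactly 50 falls in the 50s group rather than the 40s. Using lambdas inside `assign` keeps each step working on the data as it flows through the chain, not on the original variable.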