Seaborn Basics

Use the Template to explore the basics of Seaborn (an extension to Matplotlib). Create new cells with # %% as necessary.

To read Excel files and use Seaborn, install the required packages by running mamba install seaborn xlrd -c conda-forge in the terminal, and add both packages to the environment.yml file.
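After installing, it can help to confirm that the packages are importable from the active environment before starting on the template. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of the course material.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Packages this exercise relies on (xlrd is what pandas uses to read .xls files).
required = ["pandas", "matplotlib", "seaborn", "xlrd"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

If anything is reported as missing, rerun the mamba install command in the same environment that runs the notebook cells.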

Use the Plotting section, the Seaborn Tutorial, and the Seaborn API Reference for help.

Template
# %%
# Import pandas, matplotlib and seaborn

# %%
# Import the stats4life marathon dataset in Excel format
# ('http://www.stats4life.se/data/marathon.xls')
# This dataset was used in the article "Hyponatremia among Runners
# in the Boston Marathon" from NEJM 2005 [.read_excel]

# %%
# Explore the dataset with pandas to learn the names of the 17 columns and
# what type of values they contain [.info and .head]

# %%
# Use seaborn to create a histogram of the frequency of the sodium
# blood values [.histplot]

# %%
# Add a Kernel Density Estimate to smooth the histogram to show the
# distribution [keyword kde]

# %%
# Stratify the values by female and male runners [keyword hue]

# %%
# Make the plot into an object named ax
# Add labels to the axes, and add a title [.set]
# Update the legend of the two strata [.legend]

# %%
# Save the figure

Solution
# %%
# Import pandas, matplotlib and seaborn
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# %%
# Import the stats4life marathon dataset in Excel format
# ('http://www.stats4life.se/data/marathon.xls')
# This dataset was used in the article "Hyponatremia among Runners
# in the Boston Marathon" from NEJM 2005 [.read_excel]
df = pd.read_excel("http://www.stats4life.se/data/marathon.xls")

# %%
# Explore the dataset with pandas to learn the names of the 17 columns and
# what type of values they contain
# [.info and .head]
df.info()
df.head()

# %%
# Use seaborn to create a histogram of the frequency of the sodium
# blood values [.histplot]
sns.histplot(data=df, x="na")

# %%
# Add a Kernel Density Estimate to smooth the histogram to show the
# distribution [keyword kde]
sns.histplot(data=df, x="na", kde=True)

# %%
# Stratify the values by female and male runners [keyword hue]
sns.histplot(data=df, x="na", hue="female", kde=True)

# %%
# Make the plot into an object named ax
# Add labels to the axes, and add a title [.set]
# Update the legend of the two strata [.legend]
ax = sns.histplot(data=df, x="na", hue="female", kde=True)
ax.set(
    xlabel="Sodium values (mmol/L)",
    ylabel="No. of runners",
    title="Sodium values for the participants of the Boston marathon",
)
# The label order passed here depends on the order in which the strata were drawn
ax.legend(title="Sex", loc="upper left", labels=["Female", "Male"])

# %%
# Save the figure
fig = ax.get_figure()
fig.savefig("my-first-seaborn-figure.png")

My First Figure