2018-09-16

0. **[Seaborn]**
In the last post, we plotted a graph from the WorldCup dataframe using Matplotlib, the most basic and common library for data visualization in Jupyter. In this post, we will use another library.

1. **[Import libraries]**
After installing the Seaborn library, we can move on to the Jupyter Notebook.
Open Jupyter and create a new notebook.
Then import the libraries we need:

2. **[Import data]**
There are a few steps to construct a diagram using Seaborn:

3. **[Setting of figure]**
The style passed to `sns.set_style()` is one of {darkgrid, whitegrid, dark, white, ticks}.
Personally, I prefer darkgrid because it makes the diagram easier to read on a white notebook background. So we define:

4. **[Types of plots in Seaborn]**
Seaborn can construct many types of graphs, which makes it useful for data visualization and data analysis. The graphs we can plot include, but are not limited to:

5. **[Example: lmplot]**
Since Seaborn is based on Matplotlib, simple line graphs are plotted using `plt.plot`.

6. **[Example: distplot]**
To show the distribution of the data, we can use `sns.distplot`.

`Seaborn` is also a plotting library, with which we can draw colourful diagrams and various graphs for data analysis.
To begin, we have to install Seaborn. Open a terminal and type:
pip install seaborn

1.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

2.

- Import data
- Setup figure
- Plotting
- Customize the diagram

df = pd.read_csv('WorldCups.csv')
df.head()
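Before plotting, it can help to confirm that the columns used later in this post are present. The sketch below reads a tiny in-memory CSV rather than the real `WorldCups.csv`: the `Year` column and all the values are made-up placeholders, and only the column names `MatchesPlayed`, `GoalsScored` and `QualifiedTeams` come from this post.

```python
import io

import pandas as pd

# Made-up sample rows standing in for WorldCups.csv (placeholder values only).
sample_csv = io.StringIO(
    "Year,MatchesPlayed,GoalsScored,QualifiedTeams\n"
    "2001,10,25,8\n"
    "2005,12,31,8\n"
    "2009,16,40,12\n"
)
df = pd.read_csv(sample_csv)

# Columns that the plotting examples in this post rely on.
required = ["MatchesPlayed", "GoalsScored", "QualifiedTeams"]
missing = [col for col in required if col not in df.columns]

print(df.head())
print("missing columns:", missing)
```

With the real file, the same check would only need `pd.read_csv('WorldCups.csv')` in place of the in-memory sample.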

3.

As with `plt`, before plotting the actual diagram we have to configure the figure. Remember that a `figure` is just like a container for the plot.
To configure it, we use the function `sns.set_style()`, which takes the name of a style as its parameter:
sns.set_style('darkgrid')
Also, we can define the size of the diagram:

plt.figure(figsize=(12,6))
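To check that these settings took effect, one option is to read them back. This is only a minimal sketch: the `Agg` backend line is there so that it also runs outside a notebook, and it is not needed in Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import seaborn as sns

# Apply the style and create the figure that will hold the plot.
sns.set_style("darkgrid")
fig = plt.figure(figsize=(12, 6))

# Read the settings back: the darkgrid style switches the axes grid on.
grid_enabled = sns.axes_style()["axes.grid"]
width, height = fig.get_size_inches()

print("grid enabled:", grid_enabled)
print("figure size in inches:", width, height)
```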

4.

- lmplot
- distplot
- jointgrid
- countplot
- heatmap

5.

Beyond `plt.plot`, what if we want to show the relationship between two columns? `sns.lmplot` draws a scatter plot of X against Y and fits a linear regression line through the points, so we can check whether the two columns are linearly related.
sns.lmplot(x='MatchesPlayed', y='GoalsScored', data=df)
The resulting graph reveals that MatchesPlayed and GoalsScored have no direct linear relationship.
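To put a number on what the regression line suggests, the Pearson correlation between the two columns can be computed alongside the plot. The sketch below uses a small made-up DataFrame in place of `WorldCups.csv`, so its result says nothing about the real data. It passes the columns to `sns.lmplot` by keyword, which recent Seaborn releases expect.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import pandas as pd
import seaborn as sns

# Made-up values for illustration only, not taken from WorldCups.csv.
df = pd.DataFrame({
    "MatchesPlayed": [10, 12, 14, 16, 18, 20],
    "GoalsScored": [30, 28, 41, 35, 50, 47],
})

# Scatter plot plus fitted regression line; lmplot returns a FacetGrid.
grid = sns.lmplot(x="MatchesPlayed", y="GoalsScored", data=df)

# Pearson correlation coefficient between the two columns.
r = df["MatchesPlayed"].corr(df["GoalsScored"])
print(f"Pearson r = {r:.2f}")
```

A value of r close to 0 would agree with there being no linear relationship, while values near 1 or -1 indicate a strong one.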

6.

`sns.distplot` draws both a histogram of the data and its KDE (kernel density estimate).
sns.distplot(df['QualifiedTeams'])
In the resulting diagram, the bars represent the histogram while the line is the KDE. We can choose to show only the KDE or only the histogram with the `kde` and `hist` parameters:

sns.distplot(df['QualifiedTeams'], kde=False, hist=True)
