Raw data is often not directly usable; it usually has to be cleaned and transformed first. The process of converting raw data into a useful format is called **data wrangling**. Many tools are available for data wrangling, and one of the most popular is the pandas package for Python.

### Loading Data

Loading data is the first step in the data wrangling process. Pandas provides many reader functions that load data into a DataFrame. A few commonly used ones are:
![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1629457640191/S_haQMPlc.png)


```
import pandas as pd
# Read the "Songs" sheet of the workbook into a DataFrame
Song_DF = pd.read_excel("data/Hindi_Songs.xlsx", sheet_name="Songs")
``` 
*pd.read_excel* accepts many arguments; the file path and *sheet_name* are the most commonly used.
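
As a rough illustration of how reader arguments control what gets loaded, the sketch below uses *pd.read_csv* on an in-memory string so that it runs without any file on disk. The column names and values are made up for the example and are not taken from the songs workbook. The *usecols* and *nrows* arguments shown here are also accepted by *pd.read_excel*.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a file on disk
csv_text = """Channel,Title,Movie,Views
Channel A,Song 1,Movie X,100
Channel B,Song 2,Movie Y,250
Channel C,Song 3,Movie Z,75
"""

# usecols keeps only the listed columns; nrows limits how many rows are read
sample_df = pd.read_csv(io.StringIO(csv_text), usecols=["Title", "Views"], nrows=2)
print(sample_df)
```

Limiting columns and rows this way can be handy for a quick look at a large file before loading it in full.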

### Data sanity check
Once the data is loaded, a sanity check is an important next step. This involves (but is not limited to):

- looking at the data at a high level

- getting the count of rows and columns

- understanding the data types and null values

#### 1. Data at a high level
Every DataFrame has a ***head()*** method that displays the first 5 rows by default.

```
Song_DF.head()
``` 

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1629458638852/IK4NzhS92.png)

This dataset contains details of songs on YouTube, such as the channel name, song title, and movie.
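
A minimal sketch of peeking at a different number of rows: ***head()*** takes an optional row count, and ***tail()*** does the same from the end of the DataFrame. The small DataFrame here is a hypothetical stand-in for *Song_DF*, with invented column names and values.

```python
import pandas as pd

# Hypothetical stand-in for Song_DF with ten rows
demo_df = pd.DataFrame({
    "Title": [f"Song {i}" for i in range(1, 11)],
    "Views": list(range(10, 110, 10)),
})

first_three = demo_df.head(3)  # first 3 rows instead of the default 5
last_two = demo_df.tail(2)     # last 2 rows
print(first_three)
print(last_two)
```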

#### 2. Dimension of data
The ***shape*** property of a DataFrame returns the dimensions of the data as a tuple, i.e. the number of rows and columns.


```
Song_DF.shape
``` 


![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1629459549031/EpNEcJnec.png)
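
Because ***shape*** is a plain tuple, it can be unpacked into separate row and column counts. The sketch below assumes a small made-up DataFrame rather than the actual songs data.

```python
import pandas as pd

# Hypothetical three-row, two-column DataFrame
demo_df = pd.DataFrame({
    "Title": ["Song 1", "Song 2", "Song 3"],
    "Movie": ["Movie X", "Movie Y", "Movie Z"],
})

# shape is a property, so it is accessed without parentheses
n_rows, n_cols = demo_df.shape
print(f"{n_rows} rows, {n_cols} columns")
```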

#### 3. Understand the data types of columns and null values
The ***info()*** method prints each column's name, non-null count, and data type, along with the DataFrame's memory usage.

```
Song_DF.info()
``` 


![Info.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1629461111137/nuS5KY6qG.png)
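
When the same information is needed as values rather than printed text, one possible approach is to combine the *dtypes* attribute with *isna().sum()*. This is a sketch on invented data with deliberately missing entries, not the output for the songs dataset.

```python
import pandas as pd

# Hypothetical data with one missing value in each column
demo_df = pd.DataFrame({
    "Title": ["Song 1", "Song 2", None],
    "Views": [100.0, None, 75.0],
})

print(demo_df.dtypes)                # data type of each column

null_counts = demo_df.isna().sum()   # number of missing values per column
print(null_counts)
```

Columns with a non-zero count here are the ones that will need attention (filling or dropping) in later wrangling steps.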




