Hey Python learners, we have already learned reading csv and json file in previous tutorials. In this tutorial we will learn reading excel files in python. So Python Reading Excel files tutorial will give you a detail explanation how to read excel files in python.So let’s start.
If you want to learn reading csv and json files in python then check these links.
- Python CSV Reader Tutorial – Reading CSV Files with Python
- Python Read JSON File – Reading JSON in Python
Before proceeding ahead we have to know what is excel so that we can easily understand all things which are going to cover later.
What Is Excel ?
- Excel is a Microsoft software program.
- It is capable of creating and editing spreadsheets that are saved with .xls or .xlsx file formats.
- Excel is supported by windows, macOs, Android and iOS.
- It is a very widely used spreadsheet for these platforms, especially since version 5 in 1993.
General uses of Excel
It is generally used for –
- Graphing tools
- Pivot tables
- Cell based calculation
Although, you can create a monthly budget, business expenses and many more things with excel spreadsheets.
Why do people use Excel?
There are many spreadsheets are available but people prefers excel because of its various features, and it provides various options to manipulate the data.It is still so much popular because it is used by many enterprises.
Python Reading Excel Files Tutorial
Now, we will see how to read excel files in python.You might think reading excel files are arduous but seriously it is not so much difficult.So let’s start to implement it.
Creating A New Project
First of all create a new project and inside this create a python file.
Creating an Excel File
Now we have to create a excel file.It is not necessary to create an excel file, if u have file already then use this. So look how to create an excel file.
For example, I have created a file that stores book’s information.You can take your choice of example. So now our excel file is like this –
Installing Library
Now we have to install library that is used for reading excel file in python.Although some other libraries are available for reading excel files but here i am using pandas library.
- The Pandas library is built on NumPy and provides easy-to-use data structures and data analysis tools for the Python programming language.
- This is a very powerful and flexible library and used frequently by (aspiring) data scientists to get their data into data structures that are highly expressive for their analyses.
To install pandas library ,go to terminal and write the following code.
1 2 3 |
pip install pandas |
Now pandas is successfully installed.
installing xlrd
Now we have to install one another library xlrd.For this run the following code.
1 2 3 |
pip install xlrd |
xlrd is a library for developers to extract data from Microsoft Excel ™ spreadsheet files
Read Excel File
Now we will start reading excel file in python.For this we have to write following code.
1 2 3 4 5 6 7 8 |
import pandas as pd file = "Books.xls" data = pd.read_excel(file) #reading file print(data) |
What we Did ?
- First we have imported pandas module.
- Then initialized a variable file that stores the excel file.Notice that i am not provide the path of excel file because i kept it in the same directory, but if you will keep it in another directory then you have to provide the proper path of file.
- read_excel() method is used to read the excel file in python.And then you have to pass file as an argument.
- print(data) simply prints the data of excel file.
Now on running the above chunks of code we got the output as below.
Conversion of Cell Contents
Some times you want to do conversion of your cell contents from excel.So, here you can see that how it happens ?
For example, if you want to convert the author name of book Python for Beginners.Here author name is Hilary and let’s say you want to convert it as visly.So what is to be done let’s see.
1 2 3 4 5 6 7 8 9 10 11 12 |
import pandas as pd file = "Books.xls" def convert_author_cell(cell): if cell == "Hilary": return 'visly' return cell data = pd.read_excel(file,converters={'Author':convert_author_cell}) print(data) |
What We Did ?
- First of all you have to define a function.
- inside this function, you have to check if the cell is equals to Hilary then return visly otherwise return the cell whatever you got.
- Now, in read_excel() method you have to supply converters argument.
- Converters argument will take basically python dictionary. And in python dictionary you can supply the name of the column that is to be converted.
- Whenever it is reading Author column is gonna call to convert_author_cell function for every single cell in this column.
Now, run the code and see what happens ?
Now you can see that instead of author Hilary it is replaced to the visely.In this way you can convert the cell’s contents.
Python Writing Excel Files
Now further we will see writing excel file.Some times we need to write dataframe into an excel file. And to_excel() function is used to write an excel file.For this we have to do following things –
First of all we have to install a module xlwt.
1 2 3 |
pip install xlwt |
- xlwt is a library for developers to use to generate spreadsheet files compatible with Microsoft Excel versions 95 to 2003.
- The package itself is pure Python with no dependencies on modules or packages outside the standard Python distribution.
Now we have a dataframe and we have to write back to an excel.So let’s see –
1 2 3 4 5 6 7 8 9 |
import pandas as pd file = "Books.xls" data = pd.read_excel(file) #writing excel file data.to_excel("new.xls", sheet_name="Student") |
What We Did ?
- To writing excel file, to_excel() function is used. It takes the name of the file as an input so here i am taking file name as new.xls for example.
- And second thing is that you need to supply sheet name where you want to write to.So i am taking sheet name as Student.
After running this code we see that a new excel file is created successfully. Go to the directory and you can see this file.
Writing Two data frames into same excel file
And now we are going to write two data frames into a single excel file as two different sheets. Let’s see how to do that ?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
import pandas as pd #creating dataframes df_stock = pd.DataFrame({ 'tickers':['GOOGLE','WMT','MSFT'], 'price':[400,600,900], 'eps':[20.80,6.90,90.8] }) df_weather = pd.DataFrame({ 'day':['1/9/17','2/9/17','3/9/17'], 'temperature':[40,60,30], 'event':['Rainy','Sunny','Snow'] }) with pd.ExcelWriter('stocks_weather.xls') as writer: df_stock.to_excel(writer,sheet_name="stocks") df_weather.to_excel(writer,"weather") |
What We Did ?
- First of all we need to create two data frames.That dataframes are stock and weather.
- And now we will write these two data frames into one single excel file as two different sheets.
- For this ExcelWriter class is used.
- And now we use that writer to write an individual data frame. And i am writing my stock data frame in a sheet called stock and weather data frame in a sheet called weather.
When you run this stocks_weather file is created. When you open this file you will notice that this file has two sheets stock where you can see all stock information and weather where you can see all weather information.
stock information :
weather information :
So friends this was all about Python Reading Excel files Tutorial.Please share it with your python learners friends and help them.In upcoming tutorials i will come with some more cool stuffs till then stay tuned.
Hello I have installed pandas and xlrd, however I am not able to read excel file in pycharm.
What’s the error?
How to import RTD from excel to python ?