Are you surprised by how modern devices, which are non-living things, listen to your voice and even respond to it? Yes, it looks like fantasy, but nowadays technology is doing surprising things that were not possible in the past. So guys, welcome to my new tutorial, Speech Recognition Python. This is an awesome tutorial with lots of interesting stuff. In this tutorial we will learn about the concept of speech recognition and its implementation in Python. So let's get started.
Technology is growing rapidly and new features keep emerging; speech recognition is one of them. It has evolved quickly over the past few years and is now one of the most popular features in the computing world. It has numerous applications: it can boost convenience, enhance security, and help law enforcement efforts, to name a few. Let's start by understanding the concept of speech recognition, how it works, and its applications.
What is Speech Recognition?
- Speech Recognition is a process in which a computer or device records human speech and converts it into text.
- It is also known as Automatic Speech Recognition (ASR), computer speech recognition, or Speech To Text (STT).
- Linguistics, computer science, and electrical engineering are some fields that are associated with Speech Recognition.
Working Nature of Speech Recognition
Now we will discuss how it actually works.
The picture above shows the working principle of speech recognition. Now let's understand the concept behind it.
It is based on acoustic modeling and language modeling. So now the question is: what are acoustic and language modeling?
- Acoustic modeling represents the relationship between linguistic units of speech and audio signals.
- Language modeling matches sounds with word sequences to help distinguish between words that sound similar.
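To make the two models a bit more concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the probabilities are invented, ACOUSTIC stands in for an acoustic model, BIGRAM stands in for a language model, and best_sequence is a hypothetical helper. Real recognizers use far larger models.

```python
import math

# Invented acoustic scores, P(audio | word), for each position in the utterance.
ACOUSTIC = [
    {"recognize": 0.9},
    {"speech": 0.5, "beach": 0.5},  # the audio alone cannot tell these two apart
]

# Invented bigram language model, P(word | previous word).
BIGRAM = {
    ("<s>", "recognize"): 0.1,
    ("recognize", "speech"): 0.2,
    ("recognize", "beach"): 0.001,
}


def best_sequence(acoustic, bigram):
    """Return the word sequence with the highest combined log score."""
    candidates = [([], 0.0)]
    for position in acoustic:
        extended = []
        for words, score in candidates:
            prev = words[-1] if words else "<s>"
            for word, p_acoustic in position.items():
                p_lang = bigram.get((prev, word), 1e-6)  # small floor for unseen pairs
                total = score + math.log(p_acoustic) + math.log(p_lang)
                extended.append((words + [word], total))
        candidates = extended
    return max(candidates, key=lambda c: c[1])[0]


print(best_sequence(ACOUSTIC, BIGRAM))  # ['recognize', 'speech']
```

The acoustic scores for "speech" and "beach" are tied, so the language model decides between them.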
Any speech recognition program is evaluated using two factors:
- Accuracy (percentage error in converting spoken words to digital data).
- Speed (extent to which the program can keep up with a human speaker).
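The two factors can be turned into numbers. The sketch below assumes two common conventions that this tutorial does not spell out: word error rate (word-level edit distance divided by the number of reference words) for accuracy, and real-time factor (processing time divided by audio length) for speed. The function names are my own.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev holds the edit distances for the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """Values below 1.0 mean the program keeps up with the speaker."""
    return processing_seconds / audio_seconds


print(word_error_rate("turn on the light", "turn of the light"))  # 0.25
print(real_time_factor(2.0, 4.0))  # 0.5
```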
Applications
The most frequent applications of speech recognition are the following:
- In-car systems.
- Health care – Medical documentation and Therapeutic use
- Military – High-performance fighter aircraft, helicopters, training air traffic controllers.
- Telephony and other domains
- Usage in Education and Daily life
- People with disabilities.
Speech Recognition Python
Have you ever wondered how to add speech recognition to your Python project? If so, then keep reading! It’s easier than you might think.
Implementing speech recognition in Python is simple. Here we will use two libraries: SpeechRecognition and PyAudio.
Creating new project
Create a new project and name it SpeechRecognitionExample (the name doesn't matter; it can be anything). Then create a Python file inside the project. I hope you already know how to create a new project in Python.
Installing Libraries
We have to install two libraries to implement speech recognition.
Installing SpeechRecognition
- Go to terminal and type
pip install SpeechRecognition
SpeechRecognition is a library that helps in performing speech recognition in Python. It supports several engines and APIs, online and offline, e.g. Google Cloud Speech API, Microsoft Bing Voice Recognition, IBM Speech to Text, etc.
Installing PyAudio
- Go to terminal and type
pip install pyaudio
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio, you can easily use Python to play and record audio on a variety of platforms, such as GNU/Linux, Microsoft Windows, and Apple Mac OS X / macOS.
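If you want to confirm that both installs succeeded before moving on, a small check like the one below can help. It is only a sketch: installed_version is a name I made up, and it relies on importlib.metadata, which needs Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a pip distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


for name in ("SpeechRecognition", "PyAudio"):
    print(name, installed_version(name) or "not installed")
```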
Performing Speech Recognition
Now let’s jump into the coding part.
So this is the code for speech recognition in Python. As you can see, it is quite simple.
import speech_recognition as sr  # import the library

r = sr.Recognizer()  # initialize the recognizer
with sr.Microphone() as source:  # the source can be the microphone or an audio file
    print("Speak Anything :")
    audio = r.listen(source)  # listen to the source
    try:
        text = r.recognize_google(audio)  # convert the audio into text
        print("You said : {}".format(text))
    except (sr.UnknownValueError, sr.RequestError):
        # the speech was not understood, or the request to the API failed
        print("Sorry could not recognize your voice")
Explanation of code
So now we will start understanding the code line-by-line.
- First of all we import speech_recognition as sr.
- Notice that the module is imported as speech_recognition, whereas we installed the package as SpeechRecognition, so pay attention to the spelling because names are case sensitive.
- We use the as notation because writing speech_recognition in full every time is tedious.
- Next we initialize r = sr.Recognizer(), which works as the recognizer for our voice.
- Then, with sr.Microphone() as source: sets our source to sr.Microphone. We can also convert audio files into text, but in this tutorial I am using microphone input.
- Next we print a simple statement that prompts the user to speak.
- Now we call r.listen(source), which listens to the source and stores the result in audio.
- Sometimes the audio is not clear and cannot be recognized correctly, so we put the recognition step inside a try and except block.
- Inside the try block we write text = r.recognize_google(audio). There are various other options such as recognize_bing(), recognize_google_cloud(), recognize_ibm(), etc., but here I am using recognize_google(). Lastly, we pass in our audio.
- This converts our audio into text.
- Now we just print print("You said : {}".format(text)), which prints whatever you said.
- In the except block we write print("Sorry could not recognize your voice"), which shows a message if your voice could not be recognized clearly.
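The explanation mentions that an audio file can be used as the source instead of the microphone. Here is a minimal sketch of that variant. The function name transcribe_file and the file name in the usage note are assumptions for illustration; AudioFile, record and recognize_google come from the SpeechRecognition library.

```python
def transcribe_file(path):
    """Return the text recognized in an audio file, or None if nothing was understood."""
    # Imported inside the function so this sketch can be loaded
    # even on a machine where the library is not installed yet.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:  # WAV, AIFF, or FLAC file
        audio = r.record(source)        # read the whole file into memory
    try:
        return r.recognize_google(audio)
    except sr.UnknownValueError:
        return None
```

Calling transcribe_file("sample.wav") with a short recording of your own should return the spoken text. Unlike r.listen(), r.record() does not wait for a pause; it reads the file to the end.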
Output
The output of the above code will be as below.
So, it's working fine. You must have enjoyed it, am I right?
If you are working on a desktop that does not have a mic, you can try Android apps like Wo Mic from the Play Store to use your smartphone as a mic. And if you've got a real mic or headphones with a mic, you can try them too.
Finally, the Speech Recognition Python tutorial is complete. So friends, if you have any questions, leave a comment. If you found this tutorial helpful, please SHARE it with your friends. Thank you 🙂
Errors on
pip install pyaudio
[1]
Easily install SpeechRecognition 3.8.1 with
!pip install SpeechRecognition
The leading ! is there since I am within a cell in a Jupyter Notebook on Microsoft Azure (http://www.notebooks.azure.com).
[2]
Errors on
!pip install pyaudio
Looks like the gcc build failed since there is no portaudio.h
Any hints about pyaudio?
DETAILS:
Collecting pyaudio
Downloading https://files.pythonhosted.org/packages/ab/42/b4f04721c5c5bfc196ce156b3c768998ef8c0ae3654ed29ea5020c749a6b/PyAudio-0.2.11.tar.gz
Building wheels for collected packages: pyaudio
Running setup.py bdist_wheel for pyaudio … error
Complete output from command /home/nbuser/anaconda3_501/bin/python -u -c “import setuptools, tokenize;__file__=’/tmp/pip-install-hgcg4y3h/pyaudio/setup.py’;f=getattr(tokenize, ‘open’, open)(__file__);code=f.read().replace(‘\r\n’, ‘\n’);f.close();exec(compile(code, __file__, ‘exec’))” bdist_wheel -d /tmp/pip-wheel-xnk_drv5 –python-tag cp36:
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
copying src/pyaudio.py -> build/lib.linux-x86_64-3.6
running build_ext
building ‘_portaudio’ extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
gcc -pthread -B /home/nbuser/anaconda3_501/compiler_compat -Wl,–sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/nbuser/anaconda3_501/include/python3.6m -c src/_portaudiomodule.c -o build/temp.linux-x86_64-3.6/src/_portaudiomodule.o
src/_portaudiomodule.c:29:23: fatal error: portaudio.h: No such file or directory
compilation terminated.
error: command ‘gcc’ failed with exit status 1 <<<<<<<<<<<<<<<<<<<< build/lib.linux-x86_64-3.6
running build_ext
building ‘_portaudio’ extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
gcc -pthread -B /home/nbuser/anaconda3_501/compiler_compat -Wl,–sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/nbuser/anaconda3_501/include/python3.6m -c src/_portaudiomodule.c -o build/temp.linux-x86_64-3.6/src/_portaudiomodule.o
src/_portaudiomodule.c:29:23: fatal error: portaudio.h: No such file or directory
compilation terminated.
error: command ‘gcc’ failed with exit status 1
—————————————-
Command “/home/nbuser/anaconda3_501/bin/python -u -c “import setuptools, tokenize;__file__=’/tmp/pip-install-hgcg4y3h/pyaudio/setup.py’;f=getattr(tokenize, ‘open’, open)(__file__);code=f.read().replace(‘\r\n’, ‘\n’);f.close();exec(compile(code, __file__, ‘exec’))” install –record /tmp/pip-record-ftuiec6_/install-record.txt –single-version-externally-managed –compile” failed with error code 1 in /tmp/pip-install-hgcg4y3h/pyaudio/
Which operating system are you using?
You can try this, I think it will help: https://stackoverflow.com/questions/5921947/pyaudio-installation-error-command-gcc-failed-with-exit-status-1 And if you then get something like
unmet dependencies
then you should run
sudo apt-get install -f
and then try to install pyaudio again. Your real problem is with portaudio.h: the build cannot find it, and there is currently no prebuilt PyAudio wheel for Python 3.7. To get rid of that error, downgrade Python to 3.6 and run the same command,
pip install pyAudio
and it will work.
Just install Python 3.6 and pip install PyAudio will work.
This is on some Microsoft server that hosts Microsoft Azure and Jupyter Notebooks.
I am using the Chrome browser on Windows 10, but that should not matter.
I login at https://notebooks.azure.com/
In a Jupyter Notebook, the 2 Python commands:
[1]
os.path
returns
[2]
os.name
returns
‘posix’
Hope that helps.
Thanks.
Edward Bujak
This is an awesome update in Python
Thanks for the post, it is very helpful. I tried and it worked fine for me.
But it converted only the first 4-5s of the audio file. (1 short sentence)
What if I want to convert longer audio files? Do you have any recommendations?
Thanks in advance.
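One possible approach for longer recordings, sketched under a few assumptions: the file is a WAV file, the helper names chunk_offsets and transcribe_long_wav are made up for this example, and 30 seconds is an arbitrary chunk size. Also note that r.listen() stops at the first pause in the audio, which may be why only the first sentence came back; r.record() reads a fixed span instead.

```python
import wave


def chunk_offsets(total_seconds, chunk_seconds=30):
    """Split a duration into (offset, duration) pairs that cover it without overlap."""
    offsets = []
    start = 0.0
    while start < total_seconds:
        offsets.append((start, min(chunk_seconds, total_seconds - start)))
        start += chunk_seconds
    return offsets


def transcribe_long_wav(path, chunk_seconds=30):
    """Transcribe a WAV file piece by piece and join the recognized text."""
    # Imported here so chunk_offsets can be used without the library installed.
    import speech_recognition as sr

    with wave.open(path, "rb") as wav:
        total_seconds = wav.getnframes() / wav.getframerate()

    r = sr.Recognizer()
    pieces = []
    for offset, duration in chunk_offsets(total_seconds, chunk_seconds):
        with sr.AudioFile(path) as source:  # reopen so the offset counts from the start
            audio = r.record(source, offset=offset, duration=duration)
        try:
            pieces.append(r.recognize_google(audio))
        except sr.UnknownValueError:
            pass  # nothing recognizable in this chunk
    return " ".join(pieces)
```

A fixed chunk boundary can cut a word in half, so the joined text may contain small errors at the seams.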
Hello sir, thank you so much. I tried this code and it's working fine. I have one query: this code takes some time to give the response (text) back. Can I add a loop to this code (if so, can you tell me the code), or is there any other way I can improve the speed? Please help me with this, sir. Waiting for your response.
Thanks in advance.
First of all, thanks for your comment. Yes, it takes some time to respond. It may depend on your internet speed or the speaker's quality.
It shows the error message "module 'speech_recognition' has no attribute 'Recognizer'".
Maybe your file is named speech_recognition.py. You simply need to rename your module (file) to something like speech-recog.py.
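A quick way to check for this kind of name clash is to ask Python which file it would import. This is a small sketch using only the standard library; module_origin is a name I made up.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


print(module_origin("speech_recognition"))
```

If the printed path points into your own project folder instead of site-packages, your file is shadowing the library.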
Thanks for sharing it worked for me
If the voice is unclear, how can it eliminate the surrounding noise to get a distinguishable voice for returning text? Do you have any way?
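One thing that may help, as a sketch rather than a complete answer: the SpeechRecognition library has adjust_for_ambient_noise(), which samples the background for a short time and sets the recognizer's energy threshold accordingly. It does not remove noise from the recording. The helper name listen_after_calibration and the one-second calibration time are my own choices.

```python
def listen_after_calibration(recognizer, source, calibration_seconds=1.0):
    """Sample background noise first, then capture one phrase from the source."""
    # Sets the recognizer's energy threshold from the measured ambient level.
    recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
    return recognizer.listen(source)


def main():
    import speech_recognition as sr  # needs SpeechRecognition and PyAudio installed

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Stay quiet for a moment, then speak:")
        audio = listen_after_calibration(r, source)
    try:
        print("You said : {}".format(r.recognize_google(audio)))
    except sr.UnknownValueError:
        print("Sorry could not recognize your voice")
```

Call main() to try it with your microphone.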
Hello sir! I ran the code and it shows no error, but when I try to say something it can't hear me. I tried this on my Sony Vaio Core i3 laptop.
It can't record my voice. I am really in trouble, please help me to solve this shit.
Thanks
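When nothing is heard, it can be worth checking which input devices Python actually sees. The sketch below uses list_microphone_names() from the SpeechRecognition library (it needs PyAudio); describe_microphones is a name I made up, and whether this solves the problem depends on your system.

```python
def describe_microphones():
    """Return (index, name) pairs for the audio devices PyAudio can see."""
    import speech_recognition as sr  # imported here; needs PyAudio as well

    return list(enumerate(sr.Microphone.list_microphone_names()))


def main():
    for index, name in describe_microphones():
        print(index, name)
    # A specific device can then be chosen with sr.Microphone(device_index=index).
```

Call main() to print the list. If your mic is not the default device, pass its index to sr.Microphone.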
Hi, I am unable to install pyaudio. I am getting the following error:
ERROR: Command “‘c:\users\ganesh.marella\appdata\local\programs\python\python37\python.exe’ -u -c ‘import setuptools, tokenize;__file__='”‘”‘C:\\Users\\GANESH~1.MAR\\AppData\\Local\\Temp\\pip-install-afndru1v\\pyaudio\\setup.py'”‘”‘;f=getattr(tokenize, ‘”‘”‘open'”‘”‘, open)(__file__);code=f.read().replace(‘”‘”‘\r\n'”‘”‘, ‘”‘”‘\n'”‘”‘);f.close();exec(compile(code, __file__, ‘”‘”‘exec'”‘”‘))’ install –record ‘C:\Users\GANESH~1.MAR\AppData\Local\Temp\pip-record-lqg1dul4\install-record.txt’ –single-version-externally-managed –compile” failed with error code 1 in C:\Users\GANESH~1.MAR\AppData\Local\Temp\pip-install-afndru1v\pyaudio\
Please help me with this.
I want to use this functionality in a web application using Django. How can I do it? Please reply.
Since we are using a speech-to-text API, is this free of cost?
First install portaudio and then install pyaudio; that works as expected on any OS. On Mac:
brew install portaudio
pip install pyaudio
While installing SpeechRecognition it shows that pip is not an internal or external command. Why is it showing that?
Because you have not installed pip on your system. Search on YouTube for how to install pip for your system type. Thanks
It is easy to write "import speech_recognition", but it only works if you have your system set up to provide it.
The hard part is to tell people precisely how to collect the libraries on all those platforms. It's not just "pip install SpeechRecognition".