Python read number from image

I am using a combination of pyautogui and pytesseract to capture small regions on the screen and then pull the number/text out of the region. I have written script that has read the majority of captured images perfectly, but single digit numbers seem to cause an issue for it. For example small regions of an image containing numbers are saved to .png files the numbers 11, 14, and 18 were pulled perfectly, but the number 7 is just returning as a blank string.

Question: What could be causing this to happen?

Code: Scaled down drastically to make it every easy to follow:

def get_text(image):
    return pytesseract.image_to_string(image)

answer2 = pyautogui.screenshot('answer2.png',region=(727, 566, 62, 48))
img = Image.open('answer2.png')
answer2 = get_text(img)

This code is repeated 4 times, once for each image, it worked for 11,14,18 but not for 7.

Just to slow the files being read here is a screenshot of the images after they were saved through the screenshot command.

https://gyazo.com/0acbf5be2d970abeb29561113c171fbe

here is a screenshot of what I am working from:

https://gyazo.com/311913217a1302382b700b07ad3e3439

Dealing with images is not a trivial task. To you, as a human, it’s easy to look at something and immediately know what is it you’re looking at. But computers don’t work that way.

Photo by Lenin Estrada on Unsplash

Tasks that are too hard for you, like complex arithmetics, and math in general, is something that a computer chews without breaking a sweat. But here the exact opposite applies — tasks that are trivial to you, like recognizing is it cat or dog in an image are really hard for a computer. In a way, we are a perfect match. For now at least.

While image classification and tasks that involve some level of computer vision might require a good bit of code and a solid understanding, reading text from a somewhat well-formatted image turns out to be a one-liner in Python —and can be applied to so many real-life problems.

And in today’s post, I want to prove that claim. There will be some installation to go though, but it shouldn’t take much time. These are the libraries you’ll need:

  • OpenCV
  • PyTesseract

I don’t want to prolonge this intro part anymore, so why don’t we jump into the good stuff now.

OpenCV

Now, this library will only be used to load the images(s), you don’t actually need to have a solid understanding of it beforehand (although it might be helpful, you’ll see why).

According to the official documentation:

OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in the commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code.[1]

In a nutshell, you can use OpenCV to do any kind of image transformations, it’s fairly straightforward library.

If you don’t already have it installed, it’ll be just a single line in terminal:

pip install opencv-python

And that’s pretty much it. It was easy up until this point, but that’s about to change.

PyTesseract

What the heck is this library? Well, according to Wikipedia:

Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License, Version 2.0, and development has been sponsored by Google since 2006.[2]

I’m sure there are more sophisticated libraries available now, but I’ve found this one working out pretty well. Based on my own experience, this library should be able to read text from any image, provided that the font isn’t some bulls*** that even you aren’t able to read.

If it can’t read from your image, spend more time playing around with OpenCV, applying various filters to make the text stand out.

Now the installation is a bit of a pain in the bottom. If you are on Linux it all boils down to a couple of sudo-apt get commands:

sudo apt-get update
sudo apt-get install tesseract-ocr
sudo apt-get install libtesseract-dev

I’m on Windows, so the process is a bit more tedious.

First, open up THIS URL, and download 32bit or 64bit installer:

Python read number from image

The installation by itself is straightforward, boils down to clicking Next a couple of times. And yeah, you also need to do a pip installation:

pip install pytesseract

Is that all? Well, no. You still need to tell Python where Tesseract is installed. On Linux machines, I didn’t have to do so, but it’s required on Windows. By default, it’s installed in Program Files.

If you did everything correctly, executing this cell should not yield any error:

Python read number from image

Is everything good? You may proceed.

Before you leave

Reading text from an image is a pretty difficult task for a computer to perform. Think about it, the computer doesn’t know what a letter is, it only works only with numbers. What happens behind the hood might seem like a black box at first, but I encourage you to investigate further if this is your area of interest.

I’m not saying that PyTesseract will work perfectly every time, but I’ve found it good enough even on some trickier images. But not straight out of the box. Some image manipulation is required to make the text stand out.

It’s a complex topic, I know. Take it one day at a time. One day it will be second nature to you.

Loved the article? Become a Medium member to continue learning without limits. I’ll receive a portion of your membership fee if you use the following link, with no extra cost to you.

How do I read text from an image in Python?

Extract text from a single image using Python.
from PIL import Image..
from pytesseract import pytesseract..
img = Image. open(path_to_image).
text = pytesseract. image_to_string(img).
print(text).

How do I use Tesseract to read text from an image?

Create a Python tesseract script Create a project folder and add a new main.py file inside that folder. Once the application gives access to PDF files, its content will be extracted in the form of images. These images will then be processed to extract the text.

Can Python read images?

Using ImageIO : Imageio is a Python library that provides an easy interface to read and write a wide range of image data, including animated images, video, volumetric data, and scientific formats.