📅  Last modified: 2023-12-03 15:04:06             🧑  Author: Mango
OCR, or Optical Character Recognition, is a common task in computer vision. It recognizes printed or handwritten characters in an image and converts them into machine-readable text. In this article, we will focus on OCR for English letters using Python, OpenCV, and Pytesseract.
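To begin with, we need to set up our environment. Our requirements are Python, OpenCV, and Pytesseract. Pytesseract also needs the Tesseract OCR engine itself to be installed on the system.
Install OpenCV and Pytesseract using pip: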
pip install opencv-python
pip install pytesseract
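Because Pytesseract only wraps the Tesseract engine, it helps to confirm that the `tesseract` executable is reachable before running any OCR. The following is a minimal sketch using only the standard library; the helper name `find_tesseract` is hypothetical and not part of Pytesseract.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract binary, or None if it is not on PATH."""
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    print("Tesseract not found on PATH; install it with your system package manager.")
else:
    print("Tesseract found at", path)
```

If the binary lives in a non-standard location, Pytesseract can be pointed at it by assigning the path to `pytesseract.pytesseract.tesseract_cmd`.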
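Before applying OCR to the image, we need to preprocess it so that the characters are easier to recognize. Preprocessing involves the following steps: converting the image to grayscale, smoothing it with a Gaussian blur, binarizing it with Otsu's threshold, and removing the remaining noise.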
Let's write a function to perform these steps:
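import cv2

def preprocess_image(image):
    # Convert the BGR image (OpenCV's default channel order) to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Smooth with a 5x5 Gaussian kernel to suppress fine noise
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    # Binarize and invert; Otsu's method picks the threshold, so the 0 is ignored
    _, threshold = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Remove remaining speckles with non-local means denoising
    filtered = cv2.fastNlMeansDenoising(threshold, None, 10, 7, 21)
    return filtered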
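To make the thresholding step less of a black box, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every gray level as a threshold and keep the one that maximizes the variance between the two resulting classes. The function name `otsu_threshold` is hypothetical, the input is assumed to be a flat list of 8-bit gray levels, and OpenCV's own implementation may differ in details such as tie-breaking.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance.

    Pixels <= t form one class, pixels > t the other.
    """
    # Histogram of gray levels
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        if weight_high == 0:
            break  # all pixels are in the lower class; nothing left to split
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        # Between-class variance, up to a constant factor
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_var:
            best_t, best_var = t, variance
    return best_t


# A toy "image": half dark pixels, half bright pixels
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
binary = [255 if p > t else 0 for p in sample]
print(t, binary.count(0), binary.count(255))
```

With `cv2.THRESH_BINARY_INV`, as used in `preprocess_image`, the two output values are swapped, so dark pixels become 255 and bright pixels become 0.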
Once the image is preprocessed, we can apply OCR to it. We will use Pytesseract for this task. Pytesseract is a Python wrapper for Google's Tesseract-OCR engine.
import pytesseract

def ocr(image):
    config = '--psm 11'
    text = pytesseract.image_to_string(image, config=config)
    return text
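The --psm option stands for "page segmentation mode". We are using mode 11, "Sparse text", which finds as much text as possible in no particular order ("Sparse text with OSD" is mode 12). Mode 11 suits images where text is scattered around. If the image contains only a single word or a single character, modes 8 and 10 are the dedicated choices.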
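Since the right mode depends on the image layout, it can be convenient to build the config string from a small lookup table. The sketch below is an illustration only: `PSM_MODES` and `build_config` are hypothetical names, the table lists just a subset of the modes printed by `tesseract --help-psm`, and whether the `tessedit_char_whitelist` variable is honored depends on the Tesseract version and engine in use.

```python
import string

# A few page segmentation modes, as listed by `tesseract --help-psm`
PSM_MODES = {
    3: "Fully automatic page segmentation, but no OSD (default)",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    10: "Treat the image as a single character",
    11: "Sparse text: find as much text as possible in no particular order",
    12: "Sparse text with OSD",
}


def build_config(psm=11, letters_only=False):
    """Build a Tesseract config string for the given page segmentation mode."""
    if psm not in PSM_MODES:
        raise ValueError(f"unsupported psm in this sketch: {psm}")
    parts = [f"--psm {psm}"]
    if letters_only:
        # Ask Tesseract to restrict output to English letters (assumes the
        # installed version and engine support character whitelists)
        parts.append(f"-c tessedit_char_whitelist={string.ascii_letters}")
    return " ".join(parts)


print(build_config(8))
print(build_config(10, letters_only=True))
```

The resulting string can be passed as the `config` argument of `pytesseract.image_to_string`, in place of the hard-coded `'--psm 11'`.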
Let's test our OCR function on a sample image:
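import cv2

image = cv2.imread('sample.png')
# cv2.imread returns None instead of raising when the file cannot be read
if image is None:
    raise FileNotFoundError('sample.png could not be read')
preprocessed_image = preprocess_image(image)
text = ocr(preprocessed_image)
print(text)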
This should print the recognized text to the console.
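The raw string returned by Tesseract often contains blank lines, stray punctuation, and a trailing form feed character. Since we only care about English letters here, a small cleanup step can help. This is a minimal sketch; the helper name `keep_english_letters` is hypothetical, and it assumes Python 3.7 or later for `str.isascii`.

```python
def keep_english_letters(text):
    """Drop everything except ASCII letters and whitespace, then collapse whitespace."""
    kept = "".join(
        ch for ch in text
        if (ch.isascii() and ch.isalpha()) or ch.isspace()
    )
    # split() with no arguments also discards newlines and form feeds
    return " ".join(kept.split())


raw = "HELLO\n\nWORLD 42!\n\x0c"
print(keep_english_letters(raw))  # prints "HELLO WORLD"
```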
In this article, we saw how to perform OCR for English letters using Python, OpenCV, and Pytesseract. OCR is a useful technique with applications such as document digitization and number plate recognition.