pytesseract.image_to_string: Saving Text to a File (1)


📅 Last modified: 2023-12-03 15:33:55.848000 🧑 Author: Mango


Introduction

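pytesseract is a Python wrapper around Tesseract, Google's OCR engine, for scanning images and extracting the text they contain. Its image_to_string function is one of several entry points; it converts the text in an image into a Python string.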

Because image_to_string returns an ordinary string, the recognized text can be written straight to a .txt file.

Code Example

The code below shows how to use pytesseract.image_to_string to recognize the text in an image and save it to a .txt file.

import pytesseract
from PIL import Image

# Set the path to the tesseract executable; skip this line if tesseract is already on your PATH
pytesseract.pytesseract.tesseract_cmd = r'/usr/local/Cellar/tesseract/4.1.1/bin/tesseract'

# Open the image
img = Image.open('example.png')

# Recognize the text in the image
text = pytesseract.image_to_string(img)

# Save the text to a .txt file
with open('example.txt', 'w', encoding='utf-8') as f:
    f.write(text)

# Confirm that the file was saved
print('File saved successfully!')
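
The same open, recognize, and save steps can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original example: the function name ocr_to_txt and its recognize parameter are hypothetical, introduced here only for illustration. By default it calls pytesseract.image_to_string, but any callable that maps an image path to a string can be passed in, which makes the saving logic easy to try without Tesseract installed.

```python
from pathlib import Path


def ocr_to_txt(image_path, txt_path=None, recognize=None):
    """Recognize the text in an image and save it as a UTF-8 .txt file.

    ocr_to_txt and recognize are hypothetical names used for this sketch.
    Returns the Path of the text file that was written.
    """
    if recognize is None:
        # Import lazily so the OCR dependencies are only needed for the default path.
        import pytesseract
        from PIL import Image

        def recognize(path):
            with Image.open(path) as img:
                return pytesseract.image_to_string(img)

    image_path = Path(image_path)
    # Default to the image's name with a .txt suffix, e.g. example.png -> example.txt.
    txt_path = Path(txt_path) if txt_path else image_path.with_suffix(".txt")

    text = recognize(image_path)
    txt_path.write_text(text, encoding="utf-8")
    return txt_path
```

Calling ocr_to_txt('example.png') would then write example.txt next to the image, assuming Tesseract is installed and can be found.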

Conclusion

pytesseract.image_to_string makes it easy to convert the text in an image into a string. Writing that string to a .txt file preserves the recognition result for later processing.
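
As one possible form of that later processing, the saved file can be read back and cleaned up. This is a rough sketch under stated assumptions: the helper name load_ocr_lines is hypothetical, and it assumes the file was written as UTF-8 as in the example above. Tesseract output may end with a form feed character used as a page separator, so the sketch removes it before splitting the text into lines.

```python
from pathlib import Path


def load_ocr_lines(txt_path):
    """Read a saved OCR result and return its non-empty lines, stripped.

    load_ocr_lines is a hypothetical helper name used for this sketch.
    """
    text = Path(txt_path).read_text(encoding="utf-8")
    # Drop form feed characters, which Tesseract may emit as page separators.
    text = text.replace("\f", "")
    # Keep only lines that still contain something after trimming whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

For example, load_ocr_lines('example.txt') would return the recognized lines as a list of strings, ready for searching or further parsing.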