Convert PDF File Text to Audio Speech Using Python

Let's see how to read the text of a PDF file and convert it to audio, so that the computer speaks it aloud.

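Packages used:

pyttsx3: a Python library for text-to-speech conversion. It lets the program speak text aloud through the speech engine installed on the system.
PyPDF2: a pure-Python library built as a PDF toolkit. Here it extracts the text from the PDF. It can also extract document information, split documents page by page, merge documents page by page, and more.

Both modules need to be installed: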

pip install pyttsx3
pip install PyPDF2

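You also need to know the open() function, which is used to open the PDF in binary read mode ('rb'). Familiarity with object-oriented programming (OOP) concepts is also recommended.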
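As a small illustration of opening a file in binary read mode, the sketch below checks whether a file begins with the bytes `%PDF-`, which is how PDF files start. The helper name `looks_like_pdf` is hypothetical and is not part of PyPDF2 or pyttsx3.

```python
def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF header bytes."""
    # 'rb' opens the file for reading in binary mode; the with-block
    # closes the file automatically, even if an error occurs.
    with open(path, "rb") as pdf_file:
        return pdf_file.read(5) == b"%PDF-"
```

Calling `looks_like_pdf('file.pdf')` before handing the file to the PDF reader can catch a wrong path or a mislabeled file early.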

Here is the link to the PDF read in the example: https://drive.google.com/file/d/1zhf7-_v6CVUtgd_XMK562mg6ciewi1QR/view?usp=sharing

Approach:

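Import the PyPDF2 and pyttsx3 modules.
Open the PDF file.
Read the PDF with PdfReader() (called PdfFileReader() before PyPDF2 3.0, which removed the old name). It takes the opened file object, or the path of the PDF, as its argument.
Select the page to read through the reader's pages list (formerly the getPage() method). Page indexes start at 0.
Extract the text from the page with extract_text() (formerly extractText()).
Instantiate a pyttsx3 engine with init().
Speak the text with the say() and runAndWait() methods.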
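The steps above can be sketched as one reusable function. This is a minimal sketch, not the article's own code: the names `speak_page` and `main` are hypothetical, and it assumes a recent PyPDF2 release that provides `PdfReader`, `pages`, and `extract_text()` (the older `PdfFileReader`, `getPage()`, and `extractText()` names were removed in PyPDF2 3.0). The function takes a 1-based page number and checks it, because a plain index of 0 minus 1 would silently select the last page.

```python
def speak_page(reader, engine, page_number):
    """Read one page aloud; `page_number` is 1-based, as a reader would count."""
    total = len(reader.pages)
    if not 1 <= page_number <= total:
        raise ValueError(f"page {page_number} is outside 1-{total}")
    # reader.pages is zero-indexed, so page 25 lives at index 24
    page = reader.pages[page_number - 1]
    # a scanned (image-only) page may yield no text at all
    text = page.extract_text() or ""
    engine.say(text)       # queue the text for speaking
    engine.runAndWait()    # block until the speech has finished
    return text


def main():
    # Imported here so speak_page() can be tried without the libraries installed
    import PyPDF2
    import pyttsx3

    # 'file.pdf' is a placeholder path, as in the article's example
    with open("file.pdf", "rb") as pdf_file:
        reader = PyPDF2.PdfReader(pdf_file)
        speak_page(reader, pyttsx3.init(), 25)
```

Running `main()` speaks page 25 of `file.pdf`. Passing the reader and engine in as arguments keeps the page-selection logic separate from the libraries, which makes it easy to check on its own.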

Here is the code:

Python3

# importing the modules
import PyPDF2
import pyttsx3

# opening the PDF file in binary read mode;
# the with-block closes it when we are done
with open('file.pdf', 'rb') as pdf_file:
    # creating a PdfReader object
    # (PdfFileReader was removed in PyPDF2 3.0)
    pdf_reader = PyPDF2.PdfReader(pdf_file)

    # the page you want to read; indexes start at 0,
    # so index 24 is the 25th page
    from_page = pdf_reader.pages[24]

    # extracting the text from the page
    text = from_page.extract_text()

# reading the text aloud
speak = pyttsx3.init()
speak.say(text)
speak.runAndWait()
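The example reads a single page. To read on from a starting page, or to keep the audio, something like the following sketch may help. The names `text_from_pages` and `main`, and the output file name, are hypothetical. It assumes a recent PyPDF2 that provides `PdfReader`, and it uses pyttsx3's `save_to_file()` method; the audio format actually written depends on the platform's speech driver.

```python
def text_from_pages(reader, first_page, last_page=None):
    """Join the text of pages first_page..last_page (1-based, inclusive)."""
    total = len(reader.pages)
    if last_page is None:
        last_page = total          # default: read to the end of the document
    if not 1 <= first_page <= last_page <= total:
        raise ValueError(
            f"page range {first_page}-{last_page} is outside 1-{total}"
        )
    parts = []
    for index in range(first_page - 1, last_page):
        # a scanned (image-only) page may yield no text at all
        parts.append(reader.pages[index].extract_text() or "")
    return "\n".join(parts)


def main():
    # Imported here so text_from_pages() can be tried without the libraries
    import PyPDF2
    import pyttsx3

    with open("file.pdf", "rb") as pdf_file:
        text = text_from_pages(PyPDF2.PdfReader(pdf_file), 25)

    engine = pyttsx3.init()
    # hypothetical output name; queue the text to be written to a file
    engine.save_to_file(text, "output.mp3")
    engine.runAndWait()            # process the queued command
```

Calling `main()` gathers the text from page 25 to the last page and writes the speech to a file instead of playing it. Replacing `save_to_file()` with `say()` speaks the same text aloud.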

Output: