Check if two PDF documents are identical with Python

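Python is an interpreted, general-purpose programming language that supports both the object-oriented and the procedural paradigm. It comes with many modules that can be imported, such as difflib and hashlib.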

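Modules used:

hashlib: a module that provides hash algorithms such as SHA-1; the program uses it to compute a hash of each file.
difflib: a module containing classes and functions for comparing sequences of data.
SequenceMatcher: a difflib class used to compare pairs of input sequences.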
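
As a brief illustration of what SequenceMatcher does, the sketch below compares two short strings. The sample inputs are invented for this example only; ratio() returns a similarity score between 0 and 1.

```python
from difflib import SequenceMatcher

# Sample inputs invented for illustration only.
a = "abcd"
b = "bcde"

# The first argument is an optional "junk" filter; None disables it.
matcher = SequenceMatcher(None, a, b)

# ratio() is 2 * M / T, where M is the number of matching elements
# and T is the combined length of both sequences: 2 * 3 / 8 here.
similarity = matcher.ratio()
print(similarity)  # 0.75

# Identical sequences score exactly 1.0.
same = SequenceMatcher(None, a, a).ratio()
print(same)  # 1.0
```

A score of 1.0 means the sequences match completely, while a hash comparison only answers yes or no.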

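Functions:

hash_file(fileName1, fileName2): a user-defined function that computes and returns the hash values of the two files.
object.hexdigest(): returns the digest of a hash object as a string of hexadecimal digits.
fileObject.read(size): returns at most size bytes read from the file.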
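
To make the behaviour of read(size) and hexdigest() concrete, here is a minimal sketch. It assumes an in-memory io.BytesIO stream as a stand-in for a file opened in binary mode, and the sample bytes are arbitrary.

```python
import hashlib
import io

# An in-memory stream stands in for a file opened with open(path, "rb").
stream = io.BytesIO(b"hello world")

first = stream.read(5)    # at most 5 bytes
rest = stream.read(1024)  # fewer bytes remain, so only those are returned
end = stream.read(1024)   # b'' signals that no data is left

h = hashlib.sha1()
h.update(first)
h.update(rest)
digest = h.hexdigest()

print(first, rest, end)
print(len(digest))  # 40 hexadecimal characters = 160 bits

# Feeding the data in pieces gives the same digest as hashing it at once.
print(digest == hashlib.sha1(b"hello world").hexdigest())
```

Because update() can be called repeatedly, a file of any size can be hashed without loading it into memory in one piece.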

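Approach

Import the modules.
Declare a function that takes 2 parameters, one for each file.
Create two hashlib.sha1() objects, one per file.
Open the files.
Read each file in small chunks, updating the matching hash object with every chunk.
Return the hexdigest() of both hash objects, for example h1.hexdigest(); each SHA-1 digest is 160 bits long.
Call hash_file() and store the hash values of the two files.
Compare the hash values and print the appropriate message.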
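
The steps above can also be sketched with one helper per file, which avoids repeating the reading loop. This is only one possible arrangement: the names sha1_of_file and files_identical are hypothetical, and the temporary files hold placeholder bytes rather than real PDFs.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1024):
    """Return the SHA-1 hex digest of the file at path (hypothetical helper)."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b'' at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def files_identical(path1, path2):
    """Return True when both files have the same SHA-1 digest."""
    return sha1_of_file(path1) == sha1_of_file(path2)


# Demo with throwaway files; the contents are placeholders, not real PDFs.
with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "a.bin")
    p2 = os.path.join(tmp, "b.bin")
    p3 = os.path.join(tmp, "c.bin")
    samples = [(p1, b"x" * 5000), (p2, b"x" * 5000), (p3, b"x" * 4999 + b"y")]
    for path, data in samples:
        with open(path, "wb") as f:
            f.write(data)

    result_equal = files_identical(p1, p2)
    result_changed = files_identical(p1, p3)

print(result_equal)    # True
print(result_changed)  # False
```

Changing a single byte is enough to produce a different digest, which is why the comparison reports such files as not identical.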

Files used

File 1

File 2

Program:

Python3

import hashlib
from difflib import SequenceMatcher
  
  
def hash_file(fileName1, fileName2):
  
    # Create a separate SHA-1 hash object for each file
    h1 = hashlib.sha1()
    h2 = hashlib.sha1()
  
    with open(fileName1, "rb") as file:

        # Read the file in 1024-byte chunks so that a large file
        # never has to be held in memory all at once.
        # file.read() returns b'' at end of file, which ends the loop.
        chunk = file.read(1024)
        while chunk != b'':
            h1.update(chunk)
            chunk = file.read(1024)
              
    with open(fileName2, "rb") as file:

        # Read the second file the same way, in 1024-byte chunks.
        chunk = file.read(1024)
        while chunk != b'':
            h2.update(chunk)
            chunk = file.read(1024)

    # Each SHA-1 digest is 160 bits; hexdigest() returns it
    # as a string of 40 hexadecimal characters.
    return h1.hexdigest(), h2.hexdigest()
  
  
# "pd2.pdf" is an assumed name for the second file (File 2).
msg1, msg2 = hash_file("pd1.pdf", "pd2.pdf")
  
if msg1 != msg2:
    print("These files are not identical")
else:
    print("These files are identical")

Output

These files are not identical
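
Since the hash comparison is really a test of whether the two files hold the same bytes, the standard library's filecmp module offers another route to the same answer. The sketch below is an alternative technique, not the method used in the program, and the temporary files and their contents are placeholders.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "first.bin")
    p2 = os.path.join(tmp, "copy.bin")
    p3 = os.path.join(tmp, "changed.bin")
    samples = [(p1, b"sample data"), (p2, b"sample data"), (p3, b"sample DATA")]
    for path, data in samples:
        with open(path, "wb") as f:
            f.write(data)

    # shallow=False makes filecmp compare the file contents
    # instead of trusting os.stat() signatures alone.
    copy_matches = filecmp.cmp(p1, p2, shallow=False)
    changed_matches = filecmp.cmp(p1, p3, shallow=False)

print(copy_matches)     # True
print(changed_matches)  # False
```

Either approach checks raw bytes only, so it says nothing about whether two different files would look the same when rendered.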
