如何使用python在一个文件夹中处理多个扫描的pdf文件

2024-05-19 • 问答

将文件夹中的多个扫描的PDF文件转换为纯文本格式

import os
from PIL import Image 
import pytesseract 
from pdf2image import convert_from_path
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Admin\Anaconda3\Menu\tesseract.exe'
d = r'‪C:\Users\Admin\Downloads\pdf'
r = d.lstrip('\u202aC:')
for path in os.listdir(r):
    full_path = os.path.join(d,path)
    for i in full_path:
        print(i)    
pages = convert_from_path(i,500) 
image_counter = 1

d21qgu9p 回答：如何使用python在一个文件夹中处理多个扫描的pdf文件

暂时没有好的解决方案，如果你有好的解决方案，请发邮件至：iooj@foxmail.com

directory file file-handling pdf

本文链接：https://www.f2er.com/2936682.html

如何使用python在一个文件夹中处理多个扫描的pdf文件

d21qgu9p 回答：如何使用python在一个文件夹中处理多个扫描的pdf文件

大家都在问