pytesseract installed under Anaconda3 in Python won't run: is it an installation problem or something else?

pytesseract.image_to_string(image)
Traceback (most recent call last):

  File "<ipython-input-14-e63aa9f2dc4e>", line 1, in <module>
    pytesseract.image_to_string(image)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 294, in image_to_string
    return run_and_get_output(*args)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 202, in run_and_get_output
    run_tesseract(**kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 172, in run_tesseract
    raise TesseractNotFoundError()

TesseractNotFoundError: tesseract is not installed or it's not in your path

Help on package pytesseract:

NAME
    pytesseract

PACKAGE CONTENTS
    pytesseract

FILE
    c:\programdata\anaconda3\lib\site-packages\pytesseract\__init__.py

htzhanglong #1

pytesseract is only a Python wrapper. You also need to install Tesseract itself separately and add it to the PATH environment variable before it will work.
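
A quick way to confirm this diagnosis from Python is to ask whether a tesseract executable is visible on PATH at all. This is only a minimal sketch using the standard library; `find_tesseract` is a hypothetical helper name, not part of pytesseract.

```python
import shutil


def find_tesseract(command: str = "tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(command)


path = find_tesseract()
if path is None:
    # This is the situation that leads to TesseractNotFoundError.
    print("tesseract is not on PATH; install the engine or set tesseract_cmd explicitly")
else:
    print(f"tesseract found at {path}")
```

If this prints the "not on PATH" message, the pip install of pytesseract is probably fine and only the engine is missing or not reachable.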

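htzhanglong #2

First check whether the Tesseract-OCR engine itself is missing. pytesseract is just a calling interface, so the underlying engine has to be installed first.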

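1. Make sure the Tesseract engine is installed

Windows: download and run the Tesseract installer from UB-Mannheim, and note the install path (for example C:\Program Files\Tesseract-OCR)
Linux: sudo apt install tesseract-ocr (Ubuntu) or sudo yum install tesseract (CentOS)
Mac: brew install tesseract

2. Point pytesseract at the right path. After installing, locate tesseract.exe and set it in your code like this: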

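import pytesseract
from PIL import Image

# Example path on Windows; adjust it to wherever tesseract.exe was installed
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
# On Linux this is usually /usr/bin/tesseract; with Homebrew on macOS it is
# typically /usr/local/bin/tesseract or /opt/homebrew/bin/tesseract

# Test
text = pytesseract.image_to_string(Image.open('test.png'))
print(text)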

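3. If it still fails, check the environment variables. If the installer offers an "add to PATH" option, tick it; otherwise add the install directory to the system PATH manually.

4. Verify the installation. If running tesseract --version on the command line prints version information, the engine is installed correctly.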
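
Steps 3 and 4 can also be tried from inside Python without touching the system settings. The sketch below assumes the example install directory mentioned above; `prepend_to_path` and `tesseract_version` are hypothetical helper names. The PATH change only affects the current process.

```python
import os
import shutil
import subprocess


def prepend_to_path(directory: str) -> None:
    """Put `directory` at the front of PATH for this process only."""
    os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")


def tesseract_version(command: str = "tesseract"):
    """Return the first line printed by `<command> --version`, or None if it cannot be found."""
    exe = shutil.which(command)
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some builds print the version to stderr rather than stdout, so check both.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None
```

For example, calling `prepend_to_path(r"C:\Program Files\Tesseract-OCR")` and then `print(tesseract_version())` should print a version line if the engine really lives in that directory, and `None` otherwise.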

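Common pitfalls:

Running pip install pytesseract without installing the Tesseract engine itself
An install path containing Chinese or other special characters
Mixing 32-bit and 64-bit builds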
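
The last two pitfalls can be checked roughly with the standard library. This is only a sketch: `path_has_non_ascii` and `describe_install` are hypothetical helpers, the path passed in at the bottom is the example path from this reply, and the platform line is just information to compare against the installer you downloaded.

```python
import platform


def path_has_non_ascii(path: str) -> bool:
    """True if the path contains any character outside plain ASCII."""
    return not path.isascii()


def describe_install(path: str) -> list:
    """Collect notes about a Tesseract install path and the running Python build."""
    notes = []
    if path_has_non_ascii(path):
        notes.append("path contains non-ASCII characters; try an ASCII-only install directory")
    notes.append(f"Python build: {platform.architecture()[0]}, machine: {platform.machine()}")
    return notes


for note in describe_install(r"C:\Program Files\Tesseract-OCR\tesseract.exe"):
    print(note)
```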

Summary: install the Tesseract engine first, then configure the path.

zlyuanteng #3

Just use tesseract directly.

ionicwang #4

After the pip install, the machine still needs the engine itself installed.
On Debian:
apt-get install tesseract-ocr

sinazl #5

#1 is right.

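htzhanglong #6

Thanks, everyone. It runs now, but it does not recognize Chinese, even though the Chinese data is already installed for Tesseract-OCR (the tessdata folder already contains chi_sim.traineddata).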

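ionicwang #7

Adding lang='chi_sim' made Chinese recognition work. I don't know whether there is another parameter for recognizing only one particular area of the image.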
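
One possible approach to the region question, offered as a sketch rather than a pytesseract feature: crop the image with Pillow first and pass only the cropped region to `image_to_string` together with `lang='chi_sim'`. The helper names `to_box` and `ocr_region` are hypothetical, and the file name and coordinates mentioned afterwards are placeholders.

```python
def to_box(left, top, width, height):
    """Convert (left, top, width, height) to Pillow's (left, upper, right, lower) crop box."""
    return (left, top, left + width, top + height)


def ocr_region(image_path, box=None, lang="chi_sim"):
    """OCR the whole image, or only `box` if given, using the given Tesseract language."""
    # Imported here so to_box() stays usable without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        # Image.crop expects pixel coordinates as (left, upper, right, lower).
        region = image.crop(box) if box is not None else image
        return pytesseract.image_to_string(region, lang=lang)
```

For example, `ocr_region('test.png', box=to_box(50, 100, 300, 80))` would run Chinese OCR on a 300x80 pixel rectangle whose top-left corner is at (50, 100), assuming chi_sim.traineddata is present in tessdata.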