Re: A reliable ocr program for Fedora

Tuesday, 15 December 2015

On 12/15/2015 03:00 PM, Tom Horsley wrote:
...
 If you have pdf files with actual characters, the
 pdftotext tool works well for extracting the text
 (though not necessarily the layout).

There is an option: -layout
It does a good job of preserving the layout.
David
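
For anyone who wants to script this, here is a minimal sketch in Python. It assumes pdftotext (from poppler-utils) is installed and on the PATH; the helper names and file names are made up for illustration and are not part of any library.

```python
import shutil
import subprocess


def pdftotext_cmd(pdf_path, txt_path, keep_layout=True):
    """Build the pdftotext argument list (hypothetical helper).

    -layout asks pdftotext to keep the original physical layout
    as closely as it can instead of emitting text in reading order.
    """
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def extract_text(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext; raises if the tool is missing or exits non-zero."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    subprocess.run(pdftotext_cmd(pdf_path, txt_path, keep_layout), check=True)
```

Calling extract_text("report.pdf", "report.txt") would then be equivalent to running "pdftotext -layout report.pdf report.txt" in a shell. This only helps for PDFs that contain real text; scanned pages still need OCR.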
...
 As far as doing OCR from actual image files,
 I always found tesseract to work better than most
 (but it was still pretty feeble).
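
A similar rough sketch for the OCR case, assuming the tesseract command-line tool and its English language data are installed. The helper names and paths are hypothetical, and recognition quality will depend heavily on the scan.

```python
import shutil
import subprocess


def tesseract_cmd(image_path, output_base, lang="eng"):
    """Build the tesseract argument list (hypothetical helper).

    tesseract appends ".txt" to output_base itself, so pass
    "page1" rather than "page1.txt".
    """
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run tesseract and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract not found on PATH")
    subprocess.run(tesseract_cmd(image_path, output_base, lang), check=True)
    return output_base + ".txt"
```

With this, ocr_image("page1.png", "page1") would run "tesseract page1.png page1 -l eng" and leave the recognized text in page1.txt.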