This is a short introduction to PDF to Text OCR Converter Command Line, an application that converts scanned PDF files and images to text files from the command line. This article is divided into three parts:
1. Download and run
2. Basic usage
3. Supported options
PDF to Text OCR Converter Command Line runs without installation. You can download it with this link and then unpack it anywhere on your disk.
To run it, open a command prompt window and change the current directory to the folder where PDF to Text OCR Converter Command Line is saved.
The basic usage is:
pdf2txtocr [options] <input file> <output file>
where <input file> is the original file to convert, and <output file> is the name of the target file in which the converted text is saved.
A simple practical command line is:
pdf2txtocr.exe -noc C:\in.pdf C:\out.txt
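The same call can be scripted. The following is a minimal Python sketch, not part of the product: the helper names `build_command` and `convert` are hypothetical, the executable is assumed to be in the current directory or on the PATH, and it is an assumption that the tool reports failure with a non-zero exit code.

```python
import subprocess


def build_command(input_file, output_file, options=(), exe="pdf2txtocr.exe"):
    """Return the argument list in the documented order:
    executable, options, input file, output file."""
    return [exe, *options, str(input_file), str(output_file)]


def convert(input_file, output_file, options=(), exe="pdf2txtocr.exe"):
    """Run the converter once.

    check=True raises CalledProcessError on a non-zero exit code
    (assumption: the tool uses its exit code to signal failure).
    """
    cmd = build_command(input_file, output_file, options, exe)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Mirrors the command line shown above; only prints, does not run the tool.
    print(build_command(r"C:\in.pdf", r"C:\out.txt", ["-noc"]))
```

Passing the arguments as a list avoids quoting problems when a file path contains spaces.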
PDF to Text OCR Converter Command Line supports these options:
-firstpage <int>: specify the first PDF page to convert.
-lastpage <int>: specify the last PDF page to convert.
-res <int>: set resolution in DPI, default is 300.
-ownerpwd <string>: provide owner password for encrypted PDF.
-userpwd <string>: provide user password for encrypted PDF.
-layout: keep original physical layouts in created textual file.
-noc: do not insert page break "0x0C" between pages in text file.
-bitcount <1, 8 or 24>: set color depth for rendering PDF page to image, default is 8.
-ocr: enable OCR function for scanned PDF.
-lang <string>: specify a language for OCR engine.
-text <string>: add additional text at the end of each text page. This parameter supports the following variables:
    %PageNumber%: current page number.
    %PageCount%: total page count of PDF.
-$ <string>: register the application with a registration key.
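To show how several of these options combine, here is a small Python sketch that assembles an option list from keyword arguments. The function `build_options` and its parameter names are hypothetical and serve only as illustration. The password and registration options are left out for brevity, and the order in which options are emitted is a choice of this sketch, not a documented requirement.

```python
def build_options(first_page=None, last_page=None, resolution=None,
                  bitcount=None, layout=False, no_page_break=False,
                  ocr=False, lang=None, text=None):
    """Map keyword arguments to the option strings listed above."""
    # The documented values for -bitcount are 1, 8 or 24.
    if bitcount is not None and bitcount not in (1, 8, 24):
        raise ValueError("bitcount must be 1, 8 or 24")
    opts = []
    if first_page is not None:
        opts += ["-firstpage", str(first_page)]
    if last_page is not None:
        opts += ["-lastpage", str(last_page)]
    if resolution is not None:
        opts += ["-res", str(resolution)]
    if bitcount is not None:
        opts += ["-bitcount", str(bitcount)]
    if layout:
        opts.append("-layout")
    if no_page_break:
        opts.append("-noc")
    if ocr:
        opts.append("-ocr")
    if lang is not None:
        opts += ["-lang", lang]
    if text is not None:
        opts += ["-text", text]
    return opts


# OCR pages 1 to 5 and append a footer line to every page of text.
options = build_options(first_page=1, last_page=5, ocr=True,
                        text="Page %PageNumber% of %PageCount%")
command = ["pdf2txtocr.exe", *options, r"C:\in.pdf", r"C:\out.txt"]
print(command)
```

The resulting `command` list can be handed to `subprocess.run(command)`. Because no shell is involved, `%PageNumber%` and `%PageCount%` reach the program unchanged. In a batch file the percent signs would have to be doubled (`%%PageNumber%%`) to survive variable expansion.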