Convert PCX to TXT and scanned PDF to TXT through OCR technology in batches
What is OCR technology (Optical Character Recognition)?
OCR stands for Optical Character Recognition. OCR technology answers a problem
users face when documents are scanned to PDF (Portable Document Format): the
text in a scanned PDF is stored as an image and hence cannot be edited. To make
such documents editable, we use OCR technology. How does this work? An OCR
system has an optical scanner, which reads the text and analyses the images.
While advanced OCR technology can read a variety of fonts, it has difficulty
recognizing handwritten text. OCR technology uses two methods to recognize and
read characters. The first of these is matrix matching. Here the system compares
each character it reads with its built-in library of templates and characters.
If a scanned image matches one in the library, the computer labels that image
with the ASCII character it corresponds to. For more about OCR technology,
please see What is OCR technology?
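The matrix matching idea described above can be sketched in a few lines of Python. This is only an illustration and not the converter's actual engine: the 3x3 templates, the `match_score` and `recognize` helpers, and the 0.8 threshold are all hypothetical choices made for the example.

```python
# Minimal matrix-matching sketch: compare a glyph bitmap with stored templates.
# The 3x3 templates are hypothetical stand-ins for a real template library.
TEMPLATES = {
    "I": ("010",
          "010",
          "010"),
    "L": ("100",
          "100",
          "111"),
    "T": ("111",
          "010",
          "010"),
}


def match_score(glyph, template):
    """Return the fraction of pixels that agree between glyph and template."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            same += (g == t)
    return same / total


def recognize(glyph, threshold=0.8):
    """Return the best-matching character, or None if nothing is close enough."""
    best_char, best_score = None, 0.0
    for char, template in TEMPLATES.items():
        score = match_score(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= threshold else None


# An "L" with one pixel missing from its bottom row.
scanned = ("100",
           "100",
           "110")
print(recognize(scanned))
```

A glyph that resembles no template closely enough returns `None`, which is one simple way to model the difficulty with handwritten or unusual characters.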
VeryPDF's PCX to TXT OCR Converter Command Line is a command line application
that converts PCX to TXT and scanned PDF to TXT via OCR technology, one file at
a time or in batches.
VeryPDF's PCX to TXT OCR Converter Command Line also helps you convert TIF,
JPG, BMP, PNG and other images to TXT with OCR technology. In general, it is a
command line tool for Windows users who need to convert images to TXT, e.g.,
PCX to TXT and scanned PDF to TXT, singly or in batches.
Download and Purchase PCX to TXT OCR Converter Command Line
Version | Quantity | Price (USD) | Download
PCX to TXT OCR Converter Command Line | 1 Server License | 195/each |
PCX to TXT OCR Converter Command Line | 1 Developer License | 1495/each |
OCR Language Packs | | Free | Free
Attention: PCX to TXT OCR Converter Command Line includes OCR support for the English language only, but you can easily download more OCR language packs here.
PCX to TXT OCR Converter Command Line has the following features:
PCX to TXT OCR Converter Command Line Options:
-------------------------------------------------------
Usage: pdf2txtocr.exe [options] <PDF> <Text>
  -firstpage <int>   : first PDF page to convert
  -lastpage <int>    : last PDF page to convert
  -res <int>         : set resolution in DPI (default is 300 dpi)
  -ownerpwd <string> : set owner password for encrypted PDF file
  -userpwd <string>  : set user password for encrypted PDF file
  -layout            : maintain original physical layout
  -noc               : don't insert page breaks (0x0C) between pages in text file
  -bitcount <int>    : set color depth used when rendering a PDF page to image
                       data; it can be 1, 8 or 24 (default is 8 bit)
  -ocr               : enable OCR function for scanned PDF file
  -lang <string>     : choose the language for OCR engine
  -text <string>     : add additional text at end of each text page; this
                       parameter supports the following variables:
                         %PageNumber% : current page number
                         %PageCount%  : total page count of PDF file
  -$ <string>        : input your License Key
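When the converter is driven from a script, it can help to assemble the argument list in one place. The following is a minimal Python sketch, assuming `pdf2txtocr.exe` is reachable on the PATH; the `build_command` helper and its parameter names are hypothetical, and it emits only flags that appear in the option list above.

```python
def build_command(src, dst, *, ocr=False, lang=None, res=None,
                  first_page=None, last_page=None, layout=False,
                  no_page_breaks=False, bitcount=None, exe="pdf2txtocr.exe"):
    """Assemble an argument list from the documented options (hypothetical helper)."""
    # The option list documents 1, 8 and 24 as the only valid color depths.
    if bitcount is not None and bitcount not in (1, 8, 24):
        raise ValueError("bitcount must be 1, 8 or 24")
    args = [exe]
    if first_page is not None:
        args += ["-firstpage", str(first_page)]
    if last_page is not None:
        args += ["-lastpage", str(last_page)]
    if res is not None:
        args += ["-res", str(res)]
    if layout:
        args.append("-layout")
    if no_page_breaks:
        args.append("-noc")
    if bitcount is not None:
        args += ["-bitcount", str(bitcount)]
    if ocr:
        args.append("-ocr")
    if lang is not None:
        args += ["-lang", lang]
    # Input and output paths always come last, as in the usage line.
    args += [src, dst]
    return args


cmd = build_command(r"C:\in.pdf", r"C:\out.txt", ocr=True, lang="deu", res=300)
print(cmd)
# On a machine where the tool is installed, the list could be run with:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one string avoids quoting problems with paths that contain spaces.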
Examples:
pdf2txtocr.exe C:\in.pdf C:\out.txt
pdf2txtocr.exe -firstpage 1 -lastpage 1 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -res 300 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ownerpwd 123 -userpwd 456 C:\in.pdf C:\out.txt
pdf2txtocr.exe -layout C:\in.pdf C:\out.txt
pdf2txtocr.exe -noc C:\in.pdf C:\out.txt
pdf2txtocr.exe C:\in.tif C:\out.txt
pdf2txtocr.exe C:\in.jpg C:\out.txt
pdf2txtocr.exe C:\in.bmp C:\out.txt
pdf2txtocr.exe C:\in.png C:\out.txt
pdf2txtocr.exe -ocr -lang eng C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 1 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 8 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 24 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -lang deu C:\in.pdf C:\out.txt
pdf2txtocr.exe -lang deu C:\in.tif C:\out.txt
pdf2txtocr.exe -text "PageText %PageNumber% of %PageCount%" C:\in.pdf C:\out.txt
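Unless -noc is given, the text file contains a page break character (0x0C, form feed) between pages, so the output can be split back into pages afterwards. The sketch below shows one way to do that in Python; the `split_pages` helper is hypothetical, and whether the tool also writes a form feed after the last page is an assumption that the code simply tolerates.

```python
def split_pages(text):
    """Split converter output into pages on the form feed character (0x0C)."""
    pages = text.split("\x0c")
    # A trailing form feed (assumed possible) would leave an empty last entry.
    if pages and pages[-1] == "":
        pages.pop()
    return pages


# Stand-in for the contents of a converted text file.
sample = "first page\n\x0csecond page\n\x0c"
for number, page in enumerate(split_pages(sample), start=1):
    print(number, repr(page))
```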
The following command line will OCR all PDF files in the D:\temp\ folder to
text files:
for %F in (D:\temp\*.pdf) do pdf2txtocr.exe -ocr -lang deu "%F" "%~dpnF.txt"
The following command line will OCR all PDF files in the D:\temp\ folder and
its subdirectories to text files:
for /r D:\temp %F in (*.pdf) do pdf2txtocr.exe -ocr "%F" "%~dpnF.txt"
The following command line will OCR all PDF files in the D:\temp\ folder and
write the text files to the C:\test folder:
for %F in (D:\temp\*.pdf) do pdf2txtocr.exe -ocr "%F" "C:\test\%~nF.txt"
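The same batch runs can be scripted in Python when more control is needed than a `for` loop offers. This is a sketch under assumptions: the helper names are hypothetical, the file names are made up for the example, and only the output path mapping and command assembly are shown, since running the tool requires a Windows machine with `pdf2txtocr.exe` installed.

```python
from pathlib import PureWindowsPath


def output_next_to_input(pdf):
    # Mirrors "%~dpnF.txt": same drive, folder and base name, .txt extension.
    return str(PureWindowsPath(pdf).with_suffix(".txt"))


def output_in_folder(pdf, folder):
    # Mirrors "C:\test\%~nF.txt": base name only, placed in another folder.
    return str(PureWindowsPath(folder) / (PureWindowsPath(pdf).stem + ".txt"))


def batch_commands(pdfs, folder=None, exe="pdf2txtocr.exe"):
    """Build one OCR command (as an argument list) per input PDF."""
    commands = []
    for pdf in pdfs:
        if folder is None:
            target = output_next_to_input(pdf)
        else:
            target = output_in_folder(pdf, folder)
        commands.append([exe, "-ocr", pdf, target])
    return commands


# Hypothetical input files, used only to show the mapping.
pdfs = [r"D:\temp\report.pdf", r"D:\temp\scans\invoice.pdf"]
for command in batch_commands(pdfs, folder=r"C:\test"):
    print(command)
# On Windows, the inputs could be collected with pathlib.Path(r"D:\temp").glob("*.pdf")
# (or .rglob("*.pdf") to include subdirectories) and each command run with
# subprocess.run(command, check=True).
```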
Other tools you may also want to view:
PDF to Vector Converter: Convert PDF files to PS, EPS, WMF, EMF, XPS, PCL, HPGL, SWF, SVG, etc. vector files.
PDF to Image Converter: Convert PDF files to TIF, TIFF, JPG, GIF, PNG, BMP, EMF, PCX, TGA formats.
DocConverter COM Component (+HTML2PDF.exe): Convert HTML, DOC, RTF, XLS, PPT, TXT etc. files to PDF files; depends on PDFcamp Printer.
Image to PDF Converter: Convert 40+ image formats to PDF files.
PDF to Text Converter: Convert PDF files to plain text files.
PDF to HTML Converter: Convert PDF files to HTML documents.
Email: support@verypdf.com
Copyright © 2002- VeryPDF.com, Inc. All rights reserved.