About Acrobat to HTML OCR Converter
VeryPDF's Acrobat to HTML OCR Converter is a command line application that uses OCR to convert scanned PDF files and image files (TIFF, BMP, PNG, JPG, PCX, TGA, etc.) to HTML on Windows platforms. Besides producing basic HTML from the original PDF and image files, it lets you set simple HTML properties through related parameters. Acrobat to HTML OCR Converter does NOT need Adobe Acrobat or the free Acrobat Reader software.
About OCR technology
Often abbreviated OCR, optical character recognition refers to the branch of computer science that involves reading text from paper and translating the images into a form that the computer can manipulate (for example, ASCII codes). An OCR system enables you to take a book or a magazine article, feed it directly into an electronic computer file, and then edit the file using a word processor or a similar tool.
What is OCR?
Optical Character Recognition (OCR) is a visual recognition process that turns printed or written text into an electronic character-based file. When a document is scanned and converted into a PDF, character recognition software can interpret each character image on the page and assign it an electronic character code. The recognized text can then be placed in an editable format, such as a text or Word document.
What is HTML?
HTML is a computer language devised to allow website creation. These websites can then be viewed by anyone else connected to the Internet. It is relatively easy to learn, with the basics being accessible to most people in one sitting, and it is quite powerful in what it allows you to create. It is constantly undergoing revision and evolution to meet the demands and requirements of the growing Internet audience under the direction of the W3C, the organisation charged with designing and maintaining the language.
About Acrobat to HTML OCR Converter Command Line
Acrobat to HTML OCR Converter Command Line is a command line application that uses Optical Character Recognition technology to convert scanned PDF documents and images (TIFF, BMP, PNG, JPG, PCX, TGA, etc.) to HTML files. The default package of Acrobat to HTML OCR Converter Command Line includes support for English only. Additional OCR language packs are available for download.
Download and Purchase Acrobat to HTML OCR Converter Command Line
Version | Quantity | Price (USD) | Download
Acrobat to HTML OCR Converter Command Line | 1 Server License | $195 each |
Acrobat to HTML OCR Converter Command Line | 1 Developer License | $1495 each |
OCR Language Packs | | Free | Free
Note: To use languages other than the default English with Acrobat to HTML OCR Converter Command Line, download additional OCR language packs.
PDF to HTML Converter: Convert PDF files to HTML documents.
Related Products:
PDF to Word OCR Converter: Convert PDF files to Word documents with OCR technology.
PDF to Excel OCR Converter: Convert PDF files to Excel files with OCR technology.
Image to PDF OCR Converter: Convert different kinds of images to PDF files with OCR technology.
-------------------------------------------------------
Usage: pdf2txtocr.exe [options] <PDF-file> <Text-file>
  -firstpage <int>   : first PDF page to convert
  -lastpage <int>    : last PDF page to convert
  -res <int>         : set resolution in DPI (default is 300 dpi)
  -ownerpwd <string> : set owner password for encrypted PDF file
  -userpwd <string>  : set user password for encrypted PDF file
  -layout            : maintain original physical layout
  -noc               : don't insert page break characters (0x0C) between pages in the text file
  -bitcount <int>    : set color depth when rendering a PDF page to image data;
                       can be 1, 8, or 24 (default is 8 bit)
  -ocr               : enable OCR function for scanned PDF file
  -lang <string>     : choose the language for the OCR engine
  -text <string>     : add additional text at the end of each text page;
                       this parameter supports the following variables:
                         %PageNumber% : current page number
                         %PageCount%  : total page count of PDF file
  -$ <string>        : input your License Key
pdf2txtocr.exe C:\in.pdf C:\out.txt
pdf2txtocr.exe -firstpage 1 -lastpage 1 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -res 300 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ownerpwd 123 -userpwd 456 C:\in.pdf C:\out.txt
pdf2txtocr.exe -layout C:\in.pdf C:\out.txt
pdf2txtocr.exe -noc C:\in.pdf C:\out.txt
pdf2txtocr.exe C:\in.tif C:\out.txt
pdf2txtocr.exe C:\in.jpg C:\out.txt
pdf2txtocr.exe C:\in.bmp C:\out.txt
pdf2txtocr.exe C:\in.png C:\out.txt
pdf2txtocr.exe -ocr -lang eng C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 1 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 8 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 24 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -lang deu C:\in.pdf C:\out.txt
pdf2txtocr.exe -lang deu C:\in.tif C:\out.txt
pdf2txtocr.exe -text "PageText %PageNumber% of %PageCount%" C:\in.pdf C:\out.txt
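When the converter is called from a script rather than typed by hand, it can help to assemble the argument list in one place. The following Python sketch builds a command from the options shown in the usage text above. The function name build_command and its keyword arguments are hypothetical helpers introduced for illustration, not part of the product; only the flags themselves come from the usage listing. The sketch builds the argument list and does not run the converter.

```python
from typing import List, Optional

# Color depths listed for -bitcount in the usage text.
ALLOWED_BITCOUNTS = (1, 8, 24)


def build_command(
    input_file: str,
    output_file: str,
    *,
    exe: str = "pdf2txtocr.exe",  # assumed to be on PATH or given as a full path
    ocr: bool = False,
    lang: Optional[str] = None,
    first_page: Optional[int] = None,
    last_page: Optional[int] = None,
    res: Optional[int] = None,
    bitcount: Optional[int] = None,
    layout: bool = False,
) -> List[str]:
    """Return an argument list: options first, then input and output paths."""
    if bitcount is not None and bitcount not in ALLOWED_BITCOUNTS:
        raise ValueError("bitcount must be 1, 8 or 24")

    args = [exe]
    if first_page is not None:
        args += ["-firstpage", str(first_page)]
    if last_page is not None:
        args += ["-lastpage", str(last_page)]
    if ocr:
        args.append("-ocr")
    if lang is not None:
        args += ["-lang", lang]
    if res is not None:
        args += ["-res", str(res)]
    if bitcount is not None:
        args += ["-bitcount", str(bitcount)]
    if layout:
        args.append("-layout")
    args += [input_file, output_file]
    return args


if __name__ == "__main__":
    # Mirrors the example "pdf2txtocr.exe -ocr -lang deu C:\in.pdf C:\out.txt".
    print(build_command(r"C:\in.pdf", r"C:\out.txt", ocr=True, lang="deu"))
```

On a machine where the converter is installed, the returned list could be passed to subprocess.run(cmd, check=True). Passing a list rather than a single string avoids quoting problems with paths that contain spaces.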
The following command line will OCR all PDF files in the D:\temp\ folder to text files (inside a .bat file, write %%F instead of %F):
for %F in (D:\temp\*.pdf) do pdf2txtocr.exe -ocr -lang deu "%F" "%~dpnF.txt"
The following command line will OCR all PDF files in the D:\temp\ folder and its subdirectories to text files:
for /r D:\temp %F in (*.pdf) do pdf2txtocr.exe -ocr "%F" "%~dpnF.txt"
The following command line will OCR all PDF files in the D:\temp\ folder and write the text files to the C:\test folder:
for %F in (D:\temp\*.pdf) do pdf2txtocr.exe -ocr "%F" "C:\test\%~nF.txt"
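The loops above differ mainly in how the output file name is derived from the input file name: "%~dpnF.txt" keeps the drive, folder and base name, while "C:\test\%~nF.txt" keeps only the base name and places it in another folder. A minimal Python sketch of the same mapping follows. The helper names output_path and batch_commands are hypothetical, and the sketch only builds argument lists; it does not search the disk or run the converter.

```python
from pathlib import PureWindowsPath
from typing import Iterable, List, Optional


def output_path(pdf_path: str, out_dir: Optional[str] = None) -> str:
    """Map an input PDF path to the text file path used in the loops above."""
    src = PureWindowsPath(pdf_path)
    if out_dir is None:
        # Same folder, extension replaced: the "%~dpnF.txt" case.
        return str(src.with_suffix(".txt"))
    # Base name only, placed in out_dir: the "C:\test\%~nF.txt" case.
    return str(PureWindowsPath(out_dir) / (src.stem + ".txt"))


def batch_commands(
    pdf_paths: Iterable[str],
    out_dir: Optional[str] = None,
    exe: str = "pdf2txtocr.exe",  # assumed to be on PATH
) -> List[List[str]]:
    """Build one "-ocr" argument list per input file."""
    return [[exe, "-ocr", p, output_path(p, out_dir)] for p in pdf_paths]


if __name__ == "__main__":
    files = [r"D:\temp\a.pdf", r"D:\temp\sub\b.pdf"]
    for cmd in batch_commands(files, out_dir=r"C:\test"):
        print(cmd)
```

PureWindowsPath is used so that the path handling follows Windows rules even when the script is tested on another operating system. To collect real input files, pathlib.Path(folder).glob("*.pdf") covers a single folder and rglob("*.pdf") also covers subdirectories.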
Other Tools:
PDF to Text Converter: Convert PDF files to plain text files.
PDF to Vector Converter: Convert PDF files to PS, EPS, WMF, EMF, XPS, PCL, HPGL, SWF, SVG, etc. vector files.
PDF to Image Converter: Convert PDF files to TIF, TIFF, JPG, GIF, PNG, BMP, EMF, PCX, TGA formats.
DocConverter COM Component (+HTML2PDF.exe): Convert HTML, DOC, RTF, XLS, PPT, TXT etc. files to PDF files; it depends on the PDFcamp Printer product.
Image to PDF Converter: Convert 40+ image formats to PDF files.
HTML Converter: Convert HTML files to TIF, TIFF, JPG, JPEG, GIF, PNG, BMP, PCX, TGA, JP2 (JPEG2000), PNM, etc. formats.
Copyright © 2002- VeryPDF.com, Inc. All rights reserved.