1. Download and install | 2. PaperTools SDK/COM options |
---|---|
3. PaperTools SDK/COM examples | 4. Screenshots |
VeryPDF PaperTools SDK/COM is a toolkit for developers. Download the package, unpack it to disk, open a Command Prompt window with administrator privileges in Windows, and run the following command to register PaperToolsCom.exe on your system:
PaperToolsCom.exe /regserver
After PaperTools SDK/COM has been registered successfully, you can compile and run the ASP, C#, C++, JavaScript, PHP, VB, VB.NET, VBScript, and other demo projects.
VeryPDF PaperTools SDK/COM is easy to use, and it has these options:
Description: Batch process scanned image files from command line.
Features:
1. Black Lines Removal
2. Dynamic Thresholding
3. Deskew
4. Despeckle
5. Black Border Removal
6. Layout Analysis
7. OCR Scanned Image files to text files
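To make the Despeckle feature in the list above more concrete, here is a minimal C++ sketch of one common approach: find connected groups of black pixels and erase the groups that are too small. It is not VeryPDF's implementation, which this page does not describe. The function name `despeckle`, the 8-connectivity rule, the 0 = black / 255 = white convention, and the reading of "speckle size" as a pixel count are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: removes black (0) connected components that contain
// fewer than `speckleSize` pixels from a binary image stored row by row
// (0 = black, 255 = white). Neighbours are 8-connected.
std::vector<std::uint8_t> despeckle(const std::vector<std::uint8_t>& image,
                                    int width, int height, int speckleSize) {
    assert(image.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> out = image;
    std::vector<bool> visited(image.size(), false);
    std::vector<std::size_t> component;  // pixels of the blob being explored

    for (std::size_t start = 0; start < image.size(); ++start) {
        if (image[start] != 0 || visited[start]) continue;
        component.assign(1, start);
        visited[start] = true;
        // Breadth-first flood fill; `component` doubles as the queue.
        for (std::size_t head = 0; head < component.size(); ++head) {
            int x = static_cast<int>(component[head] % width);
            int y = static_cast<int>(component[head] / width);
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    std::size_t n = static_cast<std::size_t>(ny) * width + nx;
                    if (image[n] == 0 && !visited[n]) {
                        visited[n] = true;
                        component.push_back(n);
                    }
                }
            }
        }
        // Blobs below the size limit are treated as noise and painted white.
        if (component.size() < static_cast<std::size_t>(speckleSize)) {
            for (std::size_t p : component) out[p] = 255;
        }
    }
    return out;
}
```

With a limit of 20, a stray 3-pixel dot would be erased while a character stroke of a few hundred pixels would be left alone; the right limit depends on scan resolution.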
Usage: PaperTools.dll [options] <in-file> [<out-file>]
-bitcount <int> : Set the color depth used when rendering a PDF page to image data; valid values are 1, 8, and 24; default is 8
-rotate <int> : Rotate image file
-threshold <int> : Lightness threshold used to convert the image to B&W, from 1 to 255; 0 is auto; default is -1
-dither <int> : Convert the color image to B&W using the desired method:
-dither 0: Floyd-Steinberg
-dither 1: Ordered-Dithering (4x4)
-dither 2: Burkes
-dither 3: Stucki
-dither 4: Jarvis-Judice-Ninke
-dither 5: Sierra
-dither 6: Stevenson-Arce
-dither 7: Bayer (4x4 ordered dithering)
-width <int> : Scale output to specific width (proportional unless height specified)
-height <int> : Scale output to specific height (proportional unless width specified)
-flip : Flip the image vertically
-mirror : Mirror the image horizontally
-deskew : Enable image deskew options
-skewrange <fp> : Range in which to search for rotation, from -degrees to +degrees rotation. (default: 5.0)
-deskew2 : Enable image deskew using the second algorithm
-despeckle : Enable image despeckle options
-despeckle2 : Enable image despeckle using the second algorithm
-specklesize <int> : Set the speckle size for the -despeckle2 option, default is 20 pixels
-layout : Layout Analysis for page automatically
-boxobjects : Recognize objects from scanned image file automatically
-removelongline <int> : Remove black lines whose length is greater than this value
-removeshortline <int> : Remove black lines whose length is less than this value
-trimmargin : Trim solid color margins from scanned image file automatically
-fuzz <fp> : Colors within this percentage distance of each other are considered equal, from 1.00 to 100.00, default is 80.00
-trimcolor <string> : Trim the border by the specified color:
-trimcolor FF0000: Red color
-trimcolor 00FF00: Green color
-trimcolor 0000FF: Blue color
-trimcolor HexNum: Other colors
-removeborder : Remove black borders from scanned image file automatically
-ocr : OCR image file to text file automatically
-lang <string> : Choose the language for the OCR engine
-boxpic : Output image file with boxes
-skip : Skip existing output files; do not overwrite them
-v : Print copyright and version info
-h : Print usage information
-help : Print usage information
--help : Print usage information
-? : Print usage information
-$ <string> : Input registration key
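As background for the `-threshold` and `-dither 0` options above, the following C++ sketch contrasts a plain global threshold with textbook Floyd-Steinberg error diffusion on an 8-bit grayscale buffer. It illustrates the named techniques only; it is not taken from PaperTools, and the function names, the row-by-row buffer layout, and the choice that values at or above the threshold become white are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Global threshold: values at or above `threshold` become white (255),
// everything else becomes black (0).
std::vector<std::uint8_t> thresholdImage(const std::vector<std::uint8_t>& gray,
                                         int threshold) {
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
        out[i] = gray[i] >= threshold ? 255 : 0;
    return out;
}

// Floyd-Steinberg dithering: each pixel is thresholded, and the rounding
// error is pushed onto the unvisited neighbours with weights 7/16 (right),
// 3/16 (below left), 5/16 (below), and 1/16 (below right).
std::vector<std::uint8_t> floydSteinberg(const std::vector<std::uint8_t>& gray,
                                         int width, int height,
                                         int threshold = 128) {
    assert(gray.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> work(gray.begin(), gray.end());  // holds diffused error
    std::vector<std::uint8_t> out(gray.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::size_t i = static_cast<std::size_t>(y) * width + x;
            float oldValue = work[i];
            float newValue = oldValue >= threshold ? 255.0f : 0.0f;
            out[i] = static_cast<std::uint8_t>(newValue);
            float err = oldValue - newValue;
            if (x + 1 < width) work[i + 1] += err * 7 / 16;
            if (y + 1 < height) {
                if (x > 0) work[i + width - 1] += err * 3 / 16;
                work[i + width] += err * 5 / 16;
                if (x + 1 < width) work[i + width + 1] += err * 1 / 16;
            }
        }
    }
    return out;
}
```

A global threshold turns a mid-gray area into solid black or solid white, while error diffusion produces a pattern of dots whose density approximates the original gray level. That is the usual reason for choosing a dither method for photographs and a plain threshold for text.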
Example:
PaperTools.dll -removeshortline 5 X:\test_despeckle.tif D:\out.png
PaperTools.dll -removeshortline 3 -removelongline 0 X:\test_despeckle.tif D:\out.png
PaperTools.dll -removelongline 100 X:\test_black_border1.jpg D:\out.png
PaperTools.dll -removelongline 200 X:\test_line.png D:\out.png
PaperTools.dll -removelongline 0 X:\test_line.png D:\out.png
PaperTools.dll -removelongline 100 X:\test_table_ocr.tif D:\out.png
PaperTools.dll -removelongline 1000 X:\test_black_border1.jpg D:\out.png
PaperTools.dll -removelongline 0 X:\test_black_border1.jpg D:\out.png
PaperTools.dll -removelongline 0 X:\test_skew.tif D:\out.png
PaperTools.dll -removelongline 0 X:\test_table_ocr.tif D:\out.png
PaperTools.dll -removelongline 30 D:\out014.png D:\out.png
PaperTools.dll -removelongline 0 D:\out009.png D:\out.png
PaperTools.dll -removelongline 0 -boxobjects X:\test_negative.png D:\out.png
PaperTools.dll -deskew -skewrange 45 X:\test_negative.png D:\out.png
PaperTools.dll -deskew2 X:\test_negative.png D:\out.png
PaperTools.dll -despeckle X:\test_despeckle.tif D:\out.png
PaperTools.dll -despeckle2 -specklesize 20 X:\test_despeckle.tif D:\out.png
PaperTools.dll -removeborder D:\out009.png D:\out.png
PaperTools.dll -removeborder -fuzz 50 X:\test_despeckle.tif D:\out.png
PaperTools.dll -removeborder -fuzz 80 X:\test_black_border1.jpg D:\out.png
PaperTools.dll -trimmargin -fuzz 80 X:\test_black_border1.jpg D:\out.png
PaperTools.dll -trimmargin X:\test_black_border2.png D:\out.png
PaperTools.dll -trimmargin -trimcolor FFFFFF -fuzz 100 X:\test_table_ocr.tif D:\out.png
PaperTools.dll -ocr -boxpic X:\test_table_ocr.tif D:\out.png
PaperTools.dll -bitcount 1 X:\test_color.tif D:\out.png
PaperTools.dll -layout D:\out009.png D:\out.png
PaperTools.dll -$ XXXXXXXXXXXXXXXXX
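The examples above all share one shape: options first, then the input file, then an optional output file. The COM samples on this page pass the same kind of string to the `PaperTools` method, with both paths wrapped in double quotes. The C++ sketch below assembles such a string. `quoteArg` and `buildCommand` are hypothetical helper names, not part of the SDK, and the sketch assumes paths never contain a double quote themselves.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: wraps a path in double quotes so that spaces survive.
std::string quoteArg(const std::string& path) {
    return "\"" + path + "\"";
}

// Hypothetical helper: joins options, the input path, and an optional output
// path into one command string in the order shown in the usage line.
std::string buildCommand(const std::vector<std::string>& options,
                         const std::string& inFile,
                         const std::string& outFile = "") {
    assert(!inFile.empty());  // the usage line requires an input file
    std::string cmd;
    for (const std::string& opt : options) {
        cmd += opt;
        cmd += ' ';
    }
    cmd += quoteArg(inFile);
    if (!outFile.empty()) {
        cmd += ' ';
        cmd += quoteArg(outFile);
    }
    return cmd;
}
```

For instance, `buildCommand({"-removeborder", "-fuzz", "50"}, "X:\\test_despeckle.tif", "D:\\out.png")` yields the option string of the corresponding example above with both paths quoted, ready to be handed to the COM object.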
The VeryPDF PaperTools SDK/COM package contains ASP, C#, C++, JavaScript, PHP, VB, VB.NET, VBScript, and other demo projects that you can compile and run.
Sample: C++
int main()
{
    ::CoInitialize(NULL);
    _PaperToolsCom VeryPDFCom;
    try
    {
        // Create the COM object from its registered ProgID.
        VeryPDFCom.CreateDispatch("VeryPDF.PaperToolsCom");
    }
    catch (COleDispatchException* pEx)
    {
        printf("Something is wrong...\n");
        pEx->Delete();
        return 1;
    }
    CString strReturn = "";
    int nFileIndex = 0;
    VeryPDFCom.EnableDebugLog(true);
    // GetParentFolder() and intToString() are helpers that are not shown in this excerpt.
    string strFolder = GetParentFolder();
    string strInFile = strFolder + "\\sample\\test_table_ocr.tif";
    string strOutFile = strFolder + "\\sample\\output\\_output_" + intToString(nFileIndex) + ".png";
    // Paths are wrapped in double quotes so that spaces in folder names are handled.
    string strCmd = "-$ XXXXXXXXXXXXXXXXXX -removelongline 0 -boxobjects \"" + strInFile + "\" \"" + strOutFile + "\"";
    printf("%s\n", strCmd.c_str());
    strReturn = strReturn + VeryPDFCom.PaperTools(strCmd.c_str());
    // A CString must be cast explicitly before it is passed to printf.
    printf("Message: %s\n", (LPCTSTR)strReturn);
    return 0;
}
Sample: C#
// Create the COM object from its registered ProgID.
System.Type VeryPDFType = System.Type.GetTypeFromProgID("VeryPDF.PaperToolsCom");
VeryPDF.PaperToolsCom VeryPDFCom = (VeryPDF.PaperToolsCom)System.Activator.CreateInstance(VeryPDFType);
string appFolder = Path.GetDirectoryName(Application.ExecutablePath);
string strFolder = Directory.GetParent(appFolder).FullName;
string strReturn = "";
int nFileIndex = 0;
VeryPDFCom.EnableDebugLog(true);
string strInFile = strFolder + "\\sample\\test_table_ocr.tif";
string strOutFile = strFolder + "\\sample\\output\\_output_" + nFileIndex.ToString() + ".png";
// Paths are wrapped in double quotes so that spaces in folder names are handled.
string strCmd = "-$ XXXXXXXXXXXXXXXXXX -removelongline 0 -boxobjects \"" + strInFile + "\" \"" + strOutFile + "\"";
strReturn = strReturn + VeryPDFCom.PaperTools(strCmd);
MessageBox.Show(strReturn);
Sample: VB.NET
Dim strFolderDir As String = Application.StartupPath()
Dim VeryPDFCom As Object = CreateObject("VeryPDF.PaperToolsCom")
Dim filesys As Object = CreateObject("Scripting.FileSystemObject")
Dim strFolder As String = filesys.GetParentFolderName(strFolderDir)
Dim strReturn As String = ""
Dim nFileIndex As Integer = 0
VeryPDFCom.EnableDebugLog(1)
Dim strInFile As String = strFolder & "\sample\test_table_ocr.tif"
Dim strOutFile As String = strFolder & "\sample\output\_output_" & CStr(nFileIndex) & ".png"
' Paths are wrapped in double quotes so that spaces in folder names are handled.
Dim strCmd As String = "-$ XXXXXXXXXXXXXXXXXX -removelongline 0 -boxobjects """ & strInFile & """ """ & strOutFile & """"
strReturn = strReturn & VeryPDFCom.PaperTools(strCmd)
MsgBox(strReturn)
Sample: VBScript
Set VeryPDFCom = CreateObject("VeryPDF.PaperToolsCom")
Set filesys = CreateObject("Scripting.FileSystemObject")
strFolder = filesys.GetParentFolderName(WScript.ScriptFullName)
strFolder = filesys.GetParentFolderName(strFolder)
strReturn = ""
nFileIndex = 0
VeryPDFCom.EnableDebugLog(1)
strInFile = strFolder & "\sample\test_table_ocr.tif"
strOutFile = strFolder & "\sample\output\_output_" & CStr(nFileIndex) & ".png"
' Paths are wrapped in double quotes so that spaces in folder names are handled.
strCmd = "-$ XXXXXXXXXXXXXXXXXX -removelongline 0 -boxobjects """ & strInFile & """ """ & strOutFile & """"
strReturn = strReturn & VeryPDFCom.PaperTools(strCmd)
Sample: PHP
<?php
$strFolder = realpath(dirname(dirname(__FILE__)));
$nFileIndex = 0;
// Initialize the result string before appending to it.
$strReturn = "";
$VeryPDFComObject = new COM("VeryPDF.PaperToolsCom");
$strInFile = $strFolder . "\\sample\\test_table_ocr.tif";
$strOutFile = $strFolder . "\\sample\\output\\_output_" . $nFileIndex . ".png";
// Paths are wrapped in double quotes so that spaces in folder names are handled.
$strCmd = "-$ XXXXXXXXXXXXXXXXXX -removelongline 0 -boxobjects \"" . $strInFile . "\" \"" . $strOutFile . "\"";
printf("%s\n", $strCmd);
$strReturn = $strReturn . $VeryPDFComObject->PaperTools($strCmd);
printf("Message: %s\n", $strReturn);
?>
Deskew:
Despeckle:
Locate characters with box:
Locate words with box:
Remove Black Borders: