VeryPDF Table Extractor OCR recognizes characters in an input PDF or image file and then draws a table to suit your needs on Windows or Mac OS X. The operations and the interfaces are the same on both systems. This guide shows how to use VeryPDF Table Extractor OCR on Windows; you can use the application on Mac OS X in the same way. There are five steps to follow:
| 1. Download and install | 2. Load PDF files | 3. Deskew and despeckle input file |
|---|---|---|
| 4. Draw and remove table | 5. Save target file or publish | |
To use VeryPDF Table Extractor OCR on either system, download the Windows version or the Mac version and install the application on the corresponding system.
After opening the application, click the Open button in the toolbar to open the file picker window and choose a PDF or image file. You can also drag and drop files into the application.
The application can deskew a skewed PDF or image file within a range of 15 degrees, using either the auto deskew method or the manual deskew method. It can also despeckle a PDF or image that is full of speckles, using the auto despeckle method. After adding a file to the application, click the arrow button beside OCR to open the Advanced Option window, shown in Figure 1.
Fig 1
There are four group boxes in this window:
One group box is used to adjust how dark or light the pixels in the input file are treated: drag the scroll bar to control the threshold.
Another group box holds the rotation controls. From left to right, its buttons auto deskew the input file, let you draw a line to rotate the file, rotate the file clockwise, and rotate the file anticlockwise. You can also enter a rotation angle in the spin box and click the button beside it to rotate the file page manually.
A third group box holds two buttons. The upper button auto despeckles the input file. The lower button fills some characters with white color.
The last group box is used to adjust the file quality.
To deskew or despeckle a PDF or image, click the Auto Deskew or Despeckle button and then click the Apply button to apply the operation. To cancel all operations, click the Revert button.
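To make the threshold and despeckle ideas above more concrete, here is a minimal Python sketch. It is not the application's own code: the function names `binarize` and `despeckle`, the sample pixel values and the "no black neighbor" rule for a speckle are all assumptions chosen for illustration.

```python
def binarize(gray, threshold):
    """Map each grayscale value (0-255) to 0 (black) or 255 (white)."""
    return [[0 if value < threshold else 255 for value in row] for row in gray]


def despeckle(binary):
    """Turn black pixels that have no black neighbor into white pixels.

    Assumption for this sketch: a "speckle" is a single isolated black pixel.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    cleaned = [row[:] for row in binary]  # work on a copy
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 0:
                continue
            # Look at the (up to) eight pixels surrounding (y, x).
            has_black_neighbor = any(
                binary[ny][nx] == 0
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbor:
                cleaned[y][x] = 255
    return cleaned


# Hypothetical 4 x 5 grayscale page fragment.
page = [
    [250, 240, 245, 250, 250],
    [250,  30, 245, 250, 250],  # lone dark pixel: treated as a speckle
    [250, 240, 245,  20,  25],  # two adjacent dark pixels: kept as a stroke
    [250, 240, 245, 250, 250],
]
binary = binarize(page, threshold=128)
cleaned = despeckle(binary)
```

Raising the threshold turns more pixels black and lowering it turns more pixels white, which is the effect the scroll bar controls in the Advanced Option window.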
To recognize characters in a scanned PDF or image, first click the Draw a table button and then draw a red rectangle in the middle preview panel to select the area to recognize. You can click the Draw vertical lines button to draw vertical lines in the rectangle. When you click the OCR button, all the characters in the selected area are recognized and appear in the lower box, and the table takes shape at the same time, as shown in Figure 2. To remove the rectangle or the vertical lines, click the Remove the table or Remove selected lines button.
Fig 2
If you click any of the recognized characters in the lower box, the related data is highlighted in the input file.
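The role of the vertical lines drawn in the rectangle can be sketched in a few lines of Python. This is only an illustration of the idea, not how the application works internally: the function `build_table`, the word coordinates, the fixed `row_height` and the separator positions are all hypothetical.

```python
from bisect import bisect_right


def build_table(words, column_lines, row_height):
    """Group recognized words into table cells.

    words        -- list of (x, y, text), measured from the rectangle's
                    top-left corner (hypothetical OCR output)
    column_lines -- x positions of the vertical lines drawn by the user
    row_height   -- assumed fixed height of one text row
    """
    separators = sorted(column_lines)
    column_count = len(separators) + 1
    rows = {}
    # Sort by row first, then left to right, so cell text keeps reading order.
    for x, y, text in sorted(words, key=lambda w: (int(w[1] // row_height), w[0])):
        row_index = int(y // row_height)
        # The number of separators to the left of x is the column index.
        col_index = bisect_right(separators, x)
        cells = rows.setdefault(row_index, [[] for _ in range(column_count)])
        cells[col_index].append(text)
    return [[" ".join(cell) for cell in rows[r]] for r in sorted(rows)]


words = [
    (5, 2, "Item"), (120, 3, "Qty"), (200, 2, "Price"),
    (5, 22, "Blue"), (40, 22, "pen"), (125, 23, "3"), (205, 22, "1.50"),
]
table = build_table(words, column_lines=[100, 180], row_height=20)
# table == [["Item", "Qty", "Price"], ["Blue pen", "3", "1.50"]]
```

With two vertical lines the sample produces three columns. Without any lines, every word in a row would fall into a single cell.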
To save the target file, click the Save button and choose a suitable format (.csv, .xls, .xlsx, .html, .pptx, .txt, .rtf or .docx) in the Files of type dropdown list. Choose the output location in the Look in dropdown list, enter a name for the created file in the File name edit box, and then click the Save button.
The application also lets users publish the recognized file to the VeryPDF server. For details, please see How to deskew or despeckle PDF document and publish online in Mac OS X system.
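As a rough illustration of what a .csv export of a recognized table contains, the following Python sketch writes table rows with the standard `csv` module. The function name `table_to_csv` and the sample rows are made up for this example, and the exact output of the application may differ.

```python
import csv
import io


def table_to_csv(rows):
    """Serialize a list of rows (lists of cell strings) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical recognized table; one cell contains a comma.
rows = [["Item", "Qty", "Price"], ["Pen, blue", "3", "1.50"]]
csv_text = table_to_csv(rows)
print(csv_text)
```

A cell that contains a comma is wrapped in double quotes so that spreadsheet programs still read it as one cell.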