VeryPDF PDF to Text OCR SDK for .NET - Supply PDF to Text OCR Converter for Software Development

VeryPDF PDF to Text OCR SDK for .NET is a software component that provides tools and libraries for developers to quickly integrate PDF to Text OCR Converter, or individual functions of it, into other applications. It converts scanned PDF files and images to plain text (TXT) using OCR (Optical Character Recognition) technology.

System Requirements

Operating Systems: Windows 2000, XP, Vista, 7, 10 and 11, and Windows Server 2003, 2008 and later, in both 32-bit and 64-bit editions.

Version: v2.0

Program UI Language: English

Input: text-based PDF file, scanned PDF file, scanned TIFF file, JPEG/JPG, BMP, GIF, PNG

Output: text file with layout, text file in reading order, searchable PDF file with color information, searchable PDF file without color information

Main Features

Convert normal and scanned PDF to text
VeryPDF PDF to Text OCR SDK for .NET converts PDF files to text files with or without the original layout, and it also recognizes and extracts text from scanned PDF files with OCR.

Extract text from scanned TIFF and image
The SDK recognizes the text content of scanned TIFF files and other images with OCR technology and extracts it to text files.

Create searchable PDF
The SDK can create searchable PDF files from scanned TIFF, image and PDF files. What’s more, it can set the open password, owner password, permissions and key length of the output PDF file.

Support OCR
The OCR engine supports more than ten languages and five different modes.

Support various settings
It offers settings such as automatic deskew and despeckle of images, keeping the coordinate information of text in the original PDF, inserting page breaks (0x0C) between pages in the text file, page rotation, and a lightness threshold.

Support various programming languages
It provides a COM interface that can be called easily from VB, VB.NET, C# and ASP.NET. Developers can integrate the program's APIs into their own applications.

Feature List of VeryPDF PDF to Text OCR SDK for .NET

Convert PDF files to text files and keep the original layout;
Convert PDF files to text files in reading order (without the original layout);
Provide a COM interface that can be called easily from VB, VB.NET, C# and ASP.NET;
Convert scanned PDF files to text files;
Convert scanned TIFF and image files to text files;
Support multi-page TIFF and PDF files as input;
Support more than ten languages;
Create searchable PDF files from scanned TIFF files, image files and PDF files;
Create searchable PDF with original color retained;
Create searchable black-and-white PDF without image;
Create searchable black-and-white PDF with image;
Create searchable PDF with a specific color bit count, such as a color or grayscale PDF file;
Create a text file containing the coordinates of text in the original PDF, including [X, Y, Width, Height] for each word recognized by OCR;
Set the open password, owner password, permissions and key length of the output PDF file;
Insert page breaks (0x0C) between pages in the text file;
Rotate pages before OCR;
Control the lightness threshold used to convert color images to black-and-white images;
Deskew and despeckle images automatically;
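Because page breaks can be written as the 0x0C (form feed) character, a multi-page text output can be split back into pages after conversion. The following is a minimal sketch of that idea, not part of the SDK: the class name PageSplitter and the method SplitPages are hypothetical, and it assumes the text file was produced with the page-break setting enabled.

```csharp
using System;

// Hypothetical helper, not part of the SDK.
public static class PageSplitter
{
    // Splits converted text into pages on the form feed character (0x0C),
    // which C# writes as '\f'. Text without any form feed comes back
    // as a single page.
    public static string[] SplitPages(string text)
    {
        if (text == null)
            throw new ArgumentNullException(nameof(text));
        return text.Split('\f');
    }
}
```

For example, passing the result of System.IO.File.ReadAllText for an output text file to PageSplitter.SplitPages would return one array element per page.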

Supported OCR modes include:

-ocrmode <int>: set the OCR mode used when converting text-based PDF files and scanned PDF files to searchable PDF files

-ocrmode 0: output to plain text file

-ocrmode 1: OCR PDF pages and insert a new text layer under original PDF pages

-ocrmode 2: output to plain text based PDF file (pure text based PDF file)

-ocrmode 4: output to OCRed PDF file (Color) with hidden text layer
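The option string passed to the SDK can be assembled in one place so that the mode and the quoted file paths stay consistent. This is a rough sketch under stated assumptions: the class OcrCommandBuilder and its Build method are hypothetical names, the "-$" option followed by a key is copied from the C# sample in this document, and the check accepts only the four modes listed above, so it may need loosening if the SDK documents other values.

```csharp
using System;

// Hypothetical helper, not part of the SDK: it only builds the option string.
public static class OcrCommandBuilder
{
    public static string Build(string licenseKey, int ocrMode, string inFile, string outFile)
    {
        // Assumption: only the modes listed in this document are accepted here.
        if (ocrMode != 0 && ocrMode != 1 && ocrMode != 2 && ocrMode != 4)
            throw new ArgumentOutOfRangeException(nameof(ocrMode),
                "Only modes 0, 1, 2 and 4 are listed in this document.");

        // Paths are wrapped in double quotes so that spaces in them survive.
        return "-$ " + licenseKey + " -ocrmode " + ocrMode
            + " \"" + inFile + "\" \"" + outFile + "\"";
    }
}
```

The returned string has the same shape as strCmd in the C# sample and could be passed to the COM method in the same way.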

Sample: C# Project:

using System;
using System.Reflection;
using System.Windows.Forms;

namespace CSharp_WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Input and output files are resolved next to the executable.
            string strStartupPath = Application.StartupPath + "\\";

            // Look up the COM class by ProgID; null means it is not registered.
            Type pdf2vecName = Type.GetTypeFromProgID("pdfcom.pdfclass");
            if (pdf2vecName != null)
            {
                object pdf2vec = Activator.CreateInstance(pdf2vecName);
                string strInFile = strStartupPath + "test-color.tif";
                string strOutFile = strStartupPath + "_test-color.pdf";

                // The X string after -$ is the placeholder from the original sample.
                // -ocrmode 4 outputs an OCRed color PDF with a hidden text layer.
                string strCmd = "-$ XXXXXXXXXXXXXXXXXXXX -ocrmode 4 \"" + strInFile + "\" \""
                    + strOutFile + "\"";
                MessageBox.Show(strCmd);

                // Late-bound call: the command string is the single argument.
                object[] argn = new object[1];
                argn[0] = strCmd;
                int nRet = (int)pdf2vecName.InvokeMember("com_PDFToTextOCRSDKShell",
                    BindingFlags.InvokeMethod, null, pdf2vec, argn);
                MessageBox.Show("Return Value is: " + string.Format("{0}", nRet));
            }
        }
    }
}

To learn more about using this .NET package, you can download VeryPDF PDF to Text OCR SDK for .NET and try it.

To get the full version of this .NET package, you can buy VeryPDF PDF to Text OCR SDK for .NET here.
