VeryPDF PDF to Text OCR SDK for .NET is a software component that gives developers the tools and libraries to integrate the PDF to Text OCR Converter, or selected functions of it, into their own applications. The converter turns scanned PDF files and images into plain text (TXT) using OCR (Optical Character Recognition) technology.
Operating Systems: Windows 2000, XP, Vista, 7, 10, 11, Windows Server 2003, 2008 and later, in both 32-bit and 64-bit editions.
Version: v2.0
Program UI Language: English
Input: Text-based PDF file, scanned PDF file, scanned TIFF file, JPEG, JPG, BMP, GIF, PNG
Output: Text file with layout, Text file with reading order, Searchable PDF file with color information, Searchable PDF file without color information
Convert normal and scanned PDF to text
VeryPDF PDF to Text OCR SDK for .NET converts PDF files to text files with or without the original layout. It also uses OCR to recognize and extract text from scanned PDF files.
Extract text from scanned TIFF and image
The application can also extract text from scanned TIFF files and other images. It recognizes the text content with OCR and writes it out as plain text.
Create searchable PDF
This application can create searchable PDF files from scanned TIFF, image and PDF files. It can also set an open password, an owner password, permissions and the encryption key length on the output PDF file.
Support OCR
The OCR engine supports more than ten languages and five different modes.
Support various settings
Available settings include automatic deskewing and despeckling of images, keeping the coordinate information of text from the original PDF, inserting page breaks (0x0C) between pages in the text file, rotating pages, setting the lightness threshold, and more.
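Because the page break is the form feed character 0x0C, a program that reads the text output can split on it to recover the text of each page. The following is a minimal sketch, assuming the text file was produced with the page-break setting enabled. `PageSplitter`, `SplitPages` and `ReadPages` are hypothetical names used for illustration and are not part of the SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the SDK: splits OCR text output into pages.
public static class PageSplitter
{
    // 0x0C is the ASCII form feed character, written in C# as '\f'.
    private const char FormFeed = '\f';

    // Splits already-loaded text at each form feed character.
    public static string[] SplitPages(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return text.Split(FormFeed);
    }

    // Reads a text file from disk and returns one string per page.
    public static string[] ReadPages(string path)
    {
        return SplitPages(File.ReadAllText(path));
    }
}
```

If the file happens to end with a form feed, the last array element is an empty string, so callers may want to skip empty trailing pages.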
Support various programming languages
It provides a COM interface that can be called easily from VB, VB.NET, C# and ASP.NET. Developers can integrate the program's code and APIs into their own applications to build software with greater capability, quality and security.
Support more OCR modes, such as:
-ocrmode <int> : set the OCR mode when converting text-based PDF files and scanned PDF files to searchable PDF files
-ocrmode 0: output to plain text file
-ocrmode 1: OCR PDF pages and insert a new text layer under original PDF pages
-ocrmode 2: output to plain text based PDF file (pure text based PDF file)
-ocrmode 3: output to OCRed PDF file (BW) with hidden text layer
-ocrmode 4: output to OCRed PDF file (Color) with hidden text layer
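The mode numbers above can be given readable names when assembling the command string in C#. The sketch below is illustrative only: `OcrMode`, `OcrCommand` and `Build` are hypothetical names, not part of the SDK, and the string layout simply mirrors the `-$ <key> -ocrmode <n> "<input>" "<output>"` pattern used in the sample on this page.

```csharp
using System;

// Hypothetical names; the numeric values mirror the -ocrmode list.
public enum OcrMode
{
    PlainText = 0,               // plain text file
    TextLayerUnderPages = 1,     // new text layer under the original pages
    PureTextPdf = 2,             // plain text based PDF file
    HiddenTextBlackWhitePdf = 3, // OCRed PDF (BW) with hidden text layer
    HiddenTextColorPdf = 4       // OCRed PDF (Color) with hidden text layer
}

// Hypothetical helper, not part of the SDK.
public static class OcrCommand
{
    // Builds the single command string that is passed to the COM method.
    // File paths are wrapped in double quotes so that spaces survive.
    public static string Build(string licenseKey, OcrMode mode,
                               string inFile, string outFile)
    {
        return "-$ " + licenseKey
            + " -ocrmode " + (int)mode
            + " \"" + inFile + "\""
            + " \"" + outFile + "\"";
    }
}
```

For example, `OcrCommand.Build(key, OcrMode.HiddenTextColorPdf, inFile, outFile)` produces the same kind of string as the hand-built `strCmd` in the sample, with `-ocrmode 4`.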
using System;
using System.Reflection;
using System.Windows.Forms;

namespace CSharp_WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Input and output files are expected next to the executable.
            string strStartupPath = Application.StartupPath + "\\";

            // Look up the SDK's COM class by ProgID; null means it is not registered.
            Type pdf2vecName = Type.GetTypeFromProgID("pdfcom.pdfclass");
            if (pdf2vecName == null)
            {
                MessageBox.Show("COM class pdfcom.pdfclass is not registered.");
                return;
            }

            object pdf2vec = Activator.CreateInstance(pdf2vecName);
            string strInFile = strStartupPath + "test-color.tif";
            string strOutFile = strStartupPath + "_test-color.pdf";

            // XXXXXXXXXXXXXXXXXXXX is a placeholder for the registration key.
            // -ocrmode 4 writes a color PDF with a hidden text layer.
            string strCmd = "-$ XXXXXXXXXXXXXXXXXXXX -ocrmode 4 \"" + strInFile + "\" \""
                + strOutFile + "\"";
            MessageBox.Show(strCmd);

            // Late-bound call: the whole command line is passed as one string argument.
            object[] argn = new object[] { strCmd };
            int nRet = (int)pdf2vecName.InvokeMember("com_PDFToTextOCRSDKShell",
                BindingFlags.InvokeMethod, null, pdf2vec, argn);
            MessageBox.Show("Return Value is: " + nRet);
        }
    }
}
To learn more about using this .NET package, download VeryPDF PDF to Text OCR SDK for .NET and try it.
To get the full version of this .NET package, purchase VeryPDF PDF to Text OCR SDK for .NET.