VeryPDF PDF Extract Tool Command Line ($79.95)

  • Extract text with positions from PDF file
  • Extract font from PDF file
  • Extract image from PDF file

VeryPDF PDF Extract Tool Command Line is a program for extracting fonts, images, drawings, text contents, text positions, metadata, document properties, and other information from PDF files. This is a brief user guide for it.

1. Download and install
2. Command line examples
3. Description for extracted files
4. Sample XML Code for PageContents.xml file
5. Sample Code for TextFileWithPosition.txt file
6. Command options

Download and install

VeryPDF PDF Extract Tool Command Line is a portable application and does not need to be installed. Download the package, unpack it to disk, open a command prompt window in Windows, and run the program from there.

Command line examples

The general usage of the program is

pdfextract.exe [options] <input PDF>

Here, "pdfextract.exe" is the executable file, "[options]" is where options are specified, and "<input PDF>" is the input PDF file. The following command line extracts all information from the PDF file "test.pdf" and prints it to the console:

pdfextract.exe test.pdf

You can use the "-outfolder" parameter to set the folder in which the extracted files are saved, e.g.,

pdfextract.exe -outfolder _annotstamp annotstamp.pdf

pdfextract.exe -outfolder _test-long-page test-long-page.pdf

pdfextract.exe -outfolder _test-form test-form.pdf
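The three commands above follow one pattern: each PDF gets an output folder named after the file, with a leading underscore. A minimal Python sketch of that batch pattern is shown here. It assumes "pdfextract.exe" can be found from the working directory or the PATH; the helper names (output_folder_for, build_command, run_extraction) are illustrative and not part of the tool. The guide does not say whether the tool creates a missing output folder, so the sketch creates it first as a precaution.

```python
import os
import subprocess
from pathlib import Path


def output_folder_for(pdf_path):
    """Mirror the naming used in the examples: test-form.pdf -> _test-form."""
    return "_" + Path(pdf_path).stem


def build_command(pdf_path, exe="pdfextract.exe"):
    """Return the argument list for one extraction run."""
    return [exe, "-outfolder", output_folder_for(pdf_path), str(pdf_path)]


def run_extraction(pdf_path):
    """Run the tool for one PDF (requires pdfextract.exe to be available)."""
    # Assumption: creating the folder beforehand is harmless if the tool
    # would have created it anyway.
    os.makedirs(output_folder_for(pdf_path), exist_ok=True)
    return subprocess.run(build_command(pdf_path)).returncode


# Dry run: print the commands instead of executing them.
for pdf in ["annotstamp.pdf", "test-long-page.pdf", "test-form.pdf"]:
    print(" ".join(build_command(pdf)))
```

Calling run_extraction("test-form.pdf") would execute the third command from the examples; the meaning of the tool's exit code is not documented in this guide, so the return value is passed back unchanged.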

Description for extracted files

pdfforms.fdf: Extracted form fields. This file contains all form field names and values, encoded in UTF-8.

*.cff;*.ttf;*.afm files: Extracted fonts. You can reuse them in MS Word, Photoshop, and other drawing or editing applications.

*.ppm;*.pbm;*.jpg;*.tif;*.bmp;*.png files: Extracted images.

PageContents.xml: This file contains the drawing information, such as transformation matrices, font sizes, graphics states, color spaces, single-character positions, paths, and fills.

Metadata.xml: Extracted XMP metadata.

cnt*.txt files: The contents of the original PDF pages.

TextFile.txt: Plain text file.

TextFileWithPosition.txt: Text contents with positions.
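As an illustration of the file list above, the following Python sketch sorts the contents of an output folder into the categories described. The category labels and the function names (classify, summarize) are made up for this example, and the sample file names used in the loop are hypothetical; only the extensions and fixed file names come from the list above.

```python
from collections import Counter
from pathlib import Path

FONT_EXTS = {".cff", ".ttf", ".afm"}
IMAGE_EXTS = {".ppm", ".pbm", ".jpg", ".tif", ".bmp", ".png"}
NAMED_FILES = {
    "pdfforms.fdf": "form fields",
    "pagecontents.xml": "drawing information",
    "metadata.xml": "XMP metadata",
    "textfile.txt": "plain text",
    "textfilewithposition.txt": "text with positions",
}


def classify(filename):
    """Map one extracted file name to a category from the list above."""
    name = Path(filename).name.lower()
    if name in NAMED_FILES:
        return NAMED_FILES[name]
    suffix = Path(name).suffix
    if suffix in FONT_EXTS:
        return "font"
    if suffix in IMAGE_EXTS:
        return "image"
    if name.startswith("cnt") and suffix == ".txt":
        return "page contents"
    return "unknown"


def summarize(folder):
    """Count the files in an output folder by category."""
    return Counter(classify(p.name) for p in Path(folder).iterdir() if p.is_file())


# Hypothetical file names, used only to show the mapping.
for sample in ["PageContents.xml", "font1.ttf", "image1.jpg", "cnt1.txt"]:
    print(sample, "->", classify(sample))
```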

Sample XML Code for PageContents.xml file

Text Node:
<text font="Times-Bold" matrix="12.96 0 0 12.96">
    <g c="U" x="285.000" y="603.6" />
    <g c="S" x="294.357" y="603.6" />
    <g c=" " x="301.563" y="603.6" />
    <g c="M" x="304.803" y="603.6" />
    <g c="a" x="317.037" y="603.6" />
    <g c="r" x="323.517" y="603.6" />
    <g c="k" x="329.271" y="603.6" />
    <g c="e" x="336.477" y="603.6" />
    <g c="t" x="342.231" y="603.6" />
    <g c=" " x="346.547" y="603.6" />
    <g c="S" x="349.787" y="603.6" />
    <g c="h" x="356.993" y="603.6" />
    <g c="a" x="364.199" y="603.6" />
    <g c="r" x="370.679" y="603.6" />
    <g c="e" x="376.433" y="603.6" />
</text>
Fill Path Node:
<path fill="evenodd">
    <moveto x="457" y="393.96" />
    <lineto x="492" y="393.96" />
    <lineto x="492" y="366.84" />
    <lineto x="457" y="366.84" />
    <closepath />
</path>
Stroke Path Node:
<path fill="stroke" cap="1" join="0" width="0.84" miter="10">
    <moveto x="287.28" y="443.4" />
    <lineto x="293.04" y="443.4" />
    <lineto x="293.04" y="437.6" />
    <lineto x="287.28" y="437.6" />
    <closepath />
</path>
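
The nodes above can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on shortened copies of the text and fill path samples: it joins the "c" attributes of the glyph elements into a string and computes the bounding box of a path. It works on fragments only, because the root element and overall layout of a complete PageContents.xml are not shown in this guide. Only moveto and lineto appear in the samples, so other path elements (curves, for instance) are ignored here; that is an assumption of the sketch, not a statement about the file format.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the text node sample.
TEXT_NODE = """<text font="Times-Bold" matrix="12.96 0 0 12.96">
    <g c="U" x="285.000" y="603.6" />
    <g c="S" x="294.357" y="603.6" />
    <g c=" " x="301.563" y="603.6" />
    <g c="M" x="304.803" y="603.6" />
</text>"""

# Copy of the fill path node sample.
PATH_NODE = """<path fill="evenodd">
    <moveto x="457" y="393.96" />
    <lineto x="492" y="393.96" />
    <lineto x="492" y="366.84" />
    <lineto x="457" y="366.84" />
    <closepath />
</path>"""


def text_from_node(text_elem):
    """Join the characters of all <g> children in document order."""
    return "".join(g.get("c", "") for g in text_elem.findall("g"))


def glyph_positions(text_elem):
    """Return (character, x, y) for every <g> child."""
    return [(g.get("c"), float(g.get("x")), float(g.get("y")))
            for g in text_elem.findall("g")]


def path_bbox(path_elem):
    """Bounding box (min_x, min_y, max_x, max_y) of the moveto/lineto points."""
    points = [(float(p.get("x")), float(p.get("y")))
              for p in path_elem if p.tag in ("moveto", "lineto")]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


text_elem = ET.fromstring(TEXT_NODE)
path_elem = ET.fromstring(PATH_NODE)
print(repr(text_from_node(text_elem)))
print(path_bbox(path_elem))
```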

Sample Code for TextFileWithPosition.txt file

//Text Positions for each Word
word: x=157.06..188.76 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Home'
word: x=197.88..257.12 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'PDF-Tools'
word: x=266.21..287.18 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Doc'
word: x=288.38..323.97 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'ument'
word: x=333.65..379.00 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Support'
word: x=65.66..182.76  y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Advanced'
word: x=190.02..237.43 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'PDF'
word: x=245.12..307.43 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Tools'
word: x=314.31..432.03 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Command'
word: x=439.29..488.87 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Line'
word: x=496.13..550.23 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'User'
word: x=557.23..643.64 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Manual'
word: x=8.87..62.31    y=86.81..100.76 base=98.39 fontSize=11.52 rot=0 link=00000000 'Version:'
word: x=65.64..94.23   y=86.81..100.76 base=98.39 fontSize=11.52 rot=0 link=00000000 'v2.0'
word: x=8.87..79.14    y=117.82..137.13 base=133.85 fontSize=15.95 rot=0 link=00000000 'Content'
word: x=79.86..133.67  y=155.91..169.86 base=167.49 fontSize=11.52 rot=0 link=00000000 'Overview'
word: x=79.86..131.13  y=172.74..186.69 base=184.31 fontSize=11.52 rot=0 link=00000000 'Features'

//Text Positions for each Line
line: x=157.06..379.00 y=18.60..32.55   base=30.17  'Home PDF-Tools Doc ument Support'
line: x= 65.66..643.64 y=46.95..72.70   base=68.32  'Advanced PDF Tools Command Line User Manual'
line: x=  8.87..94.23  y=86.81..100.76  base=98.39  'Version: v2.0'
line: x=  8.87..79.14  y=117.82..137.13 base=133.85 'Content'
line: x= 79.86..133.67 y=155.91..169.86 base=167.49 'Overview'
line: x= 79.86..131.13 y=172.74..186.69 base=184.31 'Features'
line: x= 79.86..203.05 y=189.57..203.52 base=201.15 'Command Line Usage'
line: x=115.36..263.33 y=223.23..237.18 base=234.81 'Input and output PDF file'
line: x=115.36..264.56 y=240.07..254.01 base=251.64 'Show PDF file information'
line: x=115.36..253.03 y=256.90..270.84 base=268.47 'Set PDF file information'
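
The "word:" and "line:" records above have a regular shape, so they can be parsed with a regular expression. The following Python sketch is written against the sample lines shown here only; the pattern, the field names in the resulting dictionary, and the function name parse_record are assumptions of this example, and records in real output that differ from the samples (negative rotations, for example) may need adjustments.

```python
import re

NUM = r"-?\d+(?:\.\d+)?"

# "word:" records carry fontSize, rot and link; "line:" records do not.
RECORD = re.compile(
    r"^(?P<kind>word|line):\s*"
    rf"x=\s*(?P<x0>{NUM})\.\.(?P<x1>{NUM})\s+"
    rf"y=\s*(?P<y0>{NUM})\.\.(?P<y1>{NUM})\s+"
    rf"base=\s*(?P<base>{NUM})\s+"
    rf"(?:fontSize=(?P<font_size>{NUM})\s+rot=(?P<rot>\d+)\s+link=(?P<link>\w+)\s+)?"
    r"'(?P<text>.*)'$"
)


def parse_record(line):
    """Parse one record; return a dict, or None for comments and blank lines."""
    match = RECORD.match(line.strip())
    if match is None:
        return None
    rec = match.groupdict()
    for key in ("x0", "x1", "y0", "y1", "base", "font_size"):
        if rec[key] is not None:
            rec[key] = float(rec[key])
    if rec["rot"] is not None:
        rec["rot"] = int(rec["rot"])
    return rec


sample = [
    "//Text Positions for each Word",
    "word: x=157.06..188.76 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Home'",
    "line: x=  8.87..94.23  y=86.81..100.76  base=98.39  'Version: v2.0'",
]
for raw in sample:
    print(parse_record(raw))
```

Keeping the coordinates as floats makes it straightforward to, for instance, group word records that share the same "base" value into lines.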

Command options

VeryPDF PDF Extract Tool Command Line supports the following options:

Usage: pdfextract.exe [options] <PDF-file>
-f <int> : first page to print
-l <int> : last page to print
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
-outfolder <string> : Set a folder to store extracted files
-h : print usage information
-help : print usage information
--help : print usage information
-? : print usage information
-$ <string> : input your license key
Example:
pdfextract.exe D:\in.pdf
pdfextract.exe -outfolder D:\out\ D:\in.pdf
pdfextract.exe -opw 123 -upw 456 -outfolder D:\out\ D:\in.pdf
pdfextract.exe -outfolder D:\out\ D:\in.pdf > out.log
pdfextract.exe -outfolder D:\out\ D:\in.pdf out.log
pdfextract.exe D:\in.pdf out.log
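
When the tool is driven from a script, the options listed above can be assembled programmatically. The Python sketch below builds an argument list from keyword arguments and, optionally, redirects the console output to a log file in the same way as "> out.log". The function and parameter names (build_args, run_with_log, first_page, and so on) are invented for this example; only the option flags themselves are taken from the usage text above.

```python
import subprocess


def build_args(pdf, outfolder=None, first_page=None, last_page=None,
               owner_password=None, user_password=None, license_key=None,
               exe="pdfextract.exe"):
    """Assemble [exe, options..., pdf] using the flags from the usage text."""
    args = [exe]
    if first_page is not None:
        args += ["-f", str(first_page)]
    if last_page is not None:
        args += ["-l", str(last_page)]
    if owner_password is not None:
        args += ["-opw", owner_password]
    if user_password is not None:
        args += ["-upw", user_password]
    if outfolder is not None:
        args += ["-outfolder", outfolder]
    if license_key is not None:
        args += ["-$", license_key]
    args.append(str(pdf))  # the PDF file comes after the options
    return args


def run_with_log(args, log_path="out.log"):
    """Run the tool and write its console output to log_path."""
    with open(log_path, "w", encoding="utf-8") as log:
        return subprocess.run(args, stdout=log).returncode


# Dry run: show the argument list for the password example above.
print(build_args("D:\\in.pdf", outfolder="D:\\out\\",
                 owner_password="123", user_password="456"))
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces. run_with_log is not called here because it requires the executable to be present.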
