PDFlib TET PDF IFilter 5 Features

Accepted PDF Input

TET PDF IFilter supports all relevant flavors of PDF input:

  • All PDF versions up to Acrobat DC, including ISO 32000-1 and 32000-2 (PDF 2.0)
  • Protected PDFs which do not require a password for opening the document
  • Damaged PDF documents are repaired

Internationalization

In addition to Western text, TET PDF IFilter fully supports Chinese, Japanese, and Korean (CJK) text. All CJK encodings are recognized, and both horizontal and vertical writing modes are supported. TET PDF IFilter automatically detects the locale ID (language and region identifier) of the text, which improves the results of Microsoft’s word breaking and stemming algorithms. This is especially important for East Asian text.

Right-to-left languages such as Hebrew and Arabic are also supported. Contextual character forms are normalized and the text is delivered in logical order.
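
To make the idea of normalizing contextual character forms concrete, here is a minimal sketch using Python's standard `unicodedata` module. It is not TET PDF IFilter's implementation; it only shows that Unicode compatibility normalization (NFKC) maps Arabic presentation forms back to their base letters. The function name `normalize_contextual_forms` is hypothetical, and the sketch does not cover reordering text into logical order.

```python
import unicodedata


def normalize_contextual_forms(text: str) -> str:
    """Map presentation forms (isolated, initial, medial, final glyph
    variants) to their base letters via NFKC normalization.

    Illustration only: this is an assumption about how such a step can be
    expressed, not the product's actual algorithm.
    """
    return unicodedata.normalize("NFKC", text)


# Four presentation forms of the Arabic letter BEH, as a PDF font might encode them:
# U+FE8F isolated, U+FE90 final, U+FE91 initial, U+FE92 medial.
glyph_forms = "\uFE8F\uFE90\uFE91\uFE92"

# After normalization each one is the plain letter BEH (U+0628),
# so a search for the base letter matches all variants.
base_letters = normalize_contextual_forms(glyph_forms)
```

A search index built on the normalized text can match a query typed with ordinary Arabic letters, regardless of which glyph variant the PDF used.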

PDF Is More Than Just a Bunch of Pages

TET PDF IFilter treats PDF documents as containers that may hold much more information than page content alone. It indexes all relevant items in PDF documents:

  • Page contents
  • Text in bookmarks, annotations (comments) and form fields
  • Metadata (see below)
  • Embedded PDFs and PDF packages (portfolios), which are processed recursively so that the text in all embedded PDF documents can be searched

XMP Document Metadata and Document Info

The advanced metadata implementation in TET PDF IFilter supports the Windows property system for metadata. It indexes XMP metadata as well as standard or custom document info entries. Metadata indexing can be configured on several levels:

  • Document info entries, Dublin Core fields and other common XMP properties are mapped to Windows shell properties, e.g. Title, Subject, Author.
  • TET PDF IFilter adds useful PDF-specific pseudo-properties, e.g. page size, PDF/A conformance level, font names.
  • All relevant predefined XMP properties can be indexed.
  • User-defined XMP or PDF-based properties can be searched, e.g. company-specific classification properties, digital signatures or ZUGFeRD/Factur-X conformance.

TET PDF IFilter optionally integrates metadata in the full text index. As a result, even full text search engines without metadata support (e.g. SQL Server) can search for metadata.
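
The mapping and the optional full-text integration described above can be pictured with a small sketch. Everything here is an assumption made for illustration: the table `DOCINFO_TO_SHELL` and the functions `map_to_shell_properties` and `merge_into_full_text` are hypothetical and do not reflect TET PDF IFilter's actual configuration or output format. Only the Windows canonical property names (`System.Title` and so on) are real identifiers from the Windows property system.

```python
# Hypothetical mapping from PDF document info keys to Windows shell
# property names. The product's real mapping is configurable and may differ.
DOCINFO_TO_SHELL = {
    "Title": "System.Title",
    "Author": "System.Author",
    "Subject": "System.Subject",
    "Keywords": "System.Keywords",
}


def map_to_shell_properties(docinfo: dict) -> dict:
    """Keep only the entries that have a known shell property and rename them."""
    return {
        DOCINFO_TO_SHELL[key]: value
        for key, value in docinfo.items()
        if key in DOCINFO_TO_SHELL
    }


def merge_into_full_text(page_text: str, properties: dict) -> str:
    """Append 'name: value' lines to the page text so that an engine
    without metadata support can still find the values by full text search."""
    lines = [f"{name}: {value}" for name, value in properties.items()]
    return "\n".join([page_text, *lines])


props = map_to_shell_properties(
    {"Title": "Annual Report", "Author": "A. Example", "Producer": "SomeTool"}
)
indexed_text = merge_into_full_text("Body text of the document.", props)
```

In this sketch the unmapped `Producer` entry is simply dropped; a real configuration could instead expose it as a custom property.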

Unicode Postprocessing

TET PDF IFilter supports various Unicode postprocessing steps which can be used to improve the extracted text:

  • Foldings preserve, remove or replace characters, e.g. remove punctuation or characters from irrelevant scripts.
  • Decompositions replace a character with an equivalent sequence of one or more other characters, e.g. replace narrow, wide or vertical Japanese characters or Latin superscript variants with their respective standard counterparts.
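
The two kinds of postprocessing listed above can be approximated with Python's standard `unicodedata` module. This is a sketch of the general Unicode techniques, not the product's implementation or option syntax; the function names `remove_punctuation` and `to_standard_forms` are hypothetical.

```python
import unicodedata


def remove_punctuation(text: str) -> str:
    """Folding example: drop every character whose Unicode general
    category is a punctuation class (categories starting with 'P')."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


def to_standard_forms(text: str) -> str:
    """Decomposition example: NFKC replaces compatibility variants
    (fullwidth, halfwidth, superscript, ...) with standard counterparts."""
    return unicodedata.normalize("NFKC", text)


folded = remove_punctuation("Hello, world!")        # punctuation removed
wide = to_standard_forms("\uFF21\uFF22\uFF23")      # fullwidth Latin A, B, C
narrow = to_standard_forms("\uFF76")                # halfwidth katakana KA
superscript = to_standard_forms("x\u00B2")          # x followed by superscript two
```

Applying such steps before indexing means a query for `ABC` or `x2` can match text that the PDF stored in a wide or superscript variant.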