
The following is a brief overview of Amazon Textract.

Service

The Textract service is divided into two APIs:

Detect Document Text API: The Detect Document Text API uses optical character recognition (OCR) technology to extract text from a provided document.

Optical Character Recognition (OCR): Detects printed text and numbers in a scan or rendering of a document. The API offers synchronous and asynchronous operations, and results are returned in JSON format. Synchronous operations suit single-page images where the result is needed immediately; asynchronous operations suit multi-page documents.


image-20190205-011424.png
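
Since the results come back as JSON, a short sketch of reading them may help. The example below assumes the response contains a list of "Blocks" whose "BlockType" is PAGE, LINE or WORD, each with "Text" and "Confidence" fields. The sample payload is hand-written for illustration and is not real service output, and the helper name extract_lines is hypothetical. In practice the response would come from a call such as boto3's detect_document_text.

```python
import json

# Hand-written sample shaped like a Detect Document Text response.
# This is illustrative data, not real service output.
SAMPLE_RESPONSE = json.loads("""
{
  "Blocks": [
    {"BlockType": "PAGE", "Id": "p1"},
    {"BlockType": "LINE", "Id": "l1", "Text": "Invoice 2019-02", "Confidence": 99.1},
    {"BlockType": "WORD", "Id": "w1", "Text": "Invoice", "Confidence": 99.3},
    {"BlockType": "WORD", "Id": "w2", "Text": "2019-02", "Confidence": 98.9},
    {"BlockType": "LINE", "Id": "l2", "Text": "Total due", "Confidence": 62.0}
  ]
}
""")


def extract_lines(response, min_confidence=0.0):
    """Return the text of every LINE block at or above min_confidence."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


print(extract_lines(SAMPLE_RESPONSE))
print(extract_lines(SAMPLE_RESPONSE, min_confidence=90.0))
```

Filtering on confidence is optional; it is shown here only because low-confidence lines are often the ones that need manual review.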

Analyze Document API: The Analyze Document API extracts data from tables and key-value pairs from forms.

Key-Value Pair Extraction: Detects key-value pairs in document images automatically, retaining the inherent context of the document. Both synchronous and asynchronous operations are available, and the results of the analysis are returned in JSON format.

image-20190205-011554.png
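
To make the key-value output more concrete, here is a minimal sketch of pairing keys with values. It assumes the Analyze Document response represents form fields as KEY_VALUE_SET blocks, where a KEY block points to its VALUE block through a "VALUE" relationship and both point to their WORD blocks through "CHILD" relationships. The sample blocks are hand-written and the function names are hypothetical.

```python
# Hand-written blocks shaped like an Analyze Document (forms) response.
# This is illustrative data, not real service output.
SAMPLE_BLOCKS = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Full"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Name:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Jane"},
    {"Id": "w4", "BlockType": "WORD", "Text": "Doe"},
]


def related_ids(block, relation_type):
    """Collect the ids a block references through one relationship type."""
    ids = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == relation_type:
            ids.extend(relationship["Ids"])
    return ids


def block_text(block, by_id):
    """Join the text of the WORD children of a block."""
    children = [by_id[child_id] for child_id in related_ids(block, "CHILD")]
    return " ".join(c["Text"] for c in children if c["BlockType"] == "WORD")


def extract_key_values(blocks):
    """Map each KEY block's text to the text of its linked VALUE block(s)."""
    by_id = {block["Id"]: block for block in blocks}
    pairs = {}
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if is_key:
            values = [block_text(by_id[v], by_id)
                      for v in related_ids(block, "VALUE")]
            pairs[block_text(block, by_id)] = " ".join(values)
    return pairs


print(extract_key_values(SAMPLE_BLOCKS))
```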

Table Extraction: Preserves the composition of data stored in tables during extraction, so the extracted data can be loaded automatically into a database using a pre-defined schema.

image-20190205-011622.png
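
The table workflow can be sketched as follows. The example assumes tables arrive as a TABLE block whose "CHILD" relationship lists CELL blocks, each carrying 1-based "RowIndex" and "ColumnIndex" fields and its own WORD children. The sample blocks, the helper names and the two-column "items" schema are all hypothetical, and SQLite stands in for whatever database is actually used.

```python
import sqlite3

# Hand-written blocks shaped like an Analyze Document (tables) response.
# This is illustrative data, not real service output.
SAMPLE_BLOCKS = [
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
    {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "c2", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"Id": "c3", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "c4", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w4"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Item"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Qty"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Paper"},
    {"Id": "w4", "BlockType": "WORD", "Text": "3"},
]


def child_ids(block):
    """Return the ids listed under the block's CHILD relationships."""
    return [i for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD" for i in rel["Ids"]]


def table_rows(blocks):
    """Rebuild each TABLE block as a list of rows of cell text."""
    by_id = {block["Id"]: block for block in blocks}
    tables = []
    for block in blocks:
        if block["BlockType"] != "TABLE":
            continue
        cells = {}
        for cell in (by_id[i] for i in child_ids(block)):
            if cell["BlockType"] == "CELL":
                words = [by_id[w]["Text"] for w in child_ids(cell)]
                cells[(cell["RowIndex"], cell["ColumnIndex"])] = " ".join(words)
        if not cells:
            tables.append([])
            continue
        n_rows = max(row for row, _ in cells)
        n_cols = max(col for _, col in cells)
        tables.append([[cells.get((r, c), "") for c in range(1, n_cols + 1)]
                       for r in range(1, n_rows + 1)])
    return tables


rows = table_rows(SAMPLE_BLOCKS)[0]

# Load the data rows (header skipped) into a pre-defined, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item TEXT, qty INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(item, int(qty)) for item, qty in rows[1:]])
stored = conn.execute("SELECT item, qty FROM items").fetchall()
print(rows, stored)
```

Keeping the row and column indexes is what preserves the table layout; the cell text on its own would not be enough to map values onto schema columns.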

Pricing

There are no minimum fees and no upfront commitments. Amazon Textract charges per page processed, and the rate depends on whether only text is extracted or text together with tables and/or form data.

As AWS customers, we have access to the APIs in a free tier for 12 months, with the following restrictions:

  • 1,000 pages per month using the Detect Document Text API

  • 100 pages per month using the Analyze Document API
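
As a rough illustration of how the free tier affects a monthly bill, the sketch below subtracts the allowances listed above from the number of pages processed. The function names are hypothetical, and the per-page price in the example is a made-up placeholder, not an actual Textract rate; real rates should be taken from the pricing page.

```python
from decimal import Decimal

# Monthly free-tier allowances as listed above.
FREE_PAGES_PER_MONTH = {
    "detect_document_text": 1000,
    "analyze_document": 100,
}


def billable_pages(api, pages_processed, in_free_tier=True):
    """Pages charged this month after subtracting the free allowance."""
    allowance = FREE_PAGES_PER_MONTH[api] if in_free_tier else 0
    return max(0, pages_processed - allowance)


def estimate_cost(api, pages_processed, price_per_page, in_free_tier=True):
    """Estimated monthly charge; price_per_page is supplied by the caller."""
    return billable_pages(api, pages_processed, in_free_tier) * price_per_page


# Placeholder price for illustration only, not an actual Textract rate.
EXAMPLE_PRICE = Decimal("0.01")

print(billable_pages("detect_document_text", 1500))
print(estimate_cost("detect_document_text", 1500, EXAMPLE_PRICE))
```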

The full price list is available at the following link:

https://aws.amazon.com/textract/pricing/
