Now – Process PDFs, Word Documents, and Images with Amazon Comprehend for IDP

Today we are announcing a new Amazon Comprehend feature for intelligent document processing (IDP). This feature allows you to classify and extract entities from PDF documents, Microsoft Word files, and images directly from Amazon Comprehend without you needing to extract the text first. Many customers need to process documents that have a semi-structured format, like images of receipts that were scanned or tax statements in PDF format. Until today, those customers first needed to preprocess those documents using optical character recognition (OCR) tools to extract the text. Then they could use Amazon Comprehend to classify and extract entities from those preprocessed files. Now with Amazon Comprehend for IDP, customers can process their semi-structured documents, such as PDFs, docx, PNG, JPG,…