As enterprises race toward digital transformation, a number of business functions remain paper-dependent. And while this paper trail is largely maintained for sensitive processes (such as insurance claims or purchase/sale orders), organizations still need to transfer information from paper documents to a digital format. This creates a major bottleneck, with teams either maintaining, scanning, and integrating the information manually, or relying on old-school optical character recognition (OCR) technology.
OCR solutions convert the text in an image into machine-encoded text. OCR is commonly used to scan passports at the airport, or to scan vehicle license plates on toll roads. For many companies, OCR is the holy grail of document processing, but it has some major downsides.
The problem with OCR
In most cases, real-world documents within an enterprise have no fixed format or layout. The data inside them could be structured, unstructured, or semi-structured, and may include handwriting, checkboxes, logos, signatures, or other irregular elements. In addition, the document itself could be rotated, skewed, or of low resolution.
This situation calls for unique classification and processing approaches, which can be a tricky affair for OCR solutions that are only adept at extracting text from documents with a fixed template. For example, if a vendor sends a batch of documents stamped “approved” and “not approved,” and a team needs to extract information only from approved documents, an OCR system would fail. It views documents only through a single lens, with no flexibility. It would be unable to locate, read, or classify the approved stamp, forcing human operators to intervene from time to time, and defeating the whole purpose of bringing in technology.
Even if we throw robotic process automation (RPA) into the mix, there is no major benefit. The repetitive process of scanning and digitizing will get automated, but we would still have to define and code rules to account for all the variables in the document’s layout (format, style, etc.). This is a cumbersome task, as configuring every possible document scenario is extremely challenging, and humans would be required to intervene every time something unexpected comes up during processing.
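To make the brittleness of hand-coded rules concrete, here is a minimal sketch in Python. The rule, the function name `extract_total`, and both sample documents are hypothetical and invented for illustration; they do not come from any particular RPA or OCR product.

```python
import re

# Hypothetical rule written for one vendor's invoice layout:
# the total always appears as "Total: $<amount>" on its own line.
TOTAL_RULE = re.compile(r"^Total:\s*\$(\d+(?:\.\d{2})?)$", re.MULTILINE)


def extract_total(ocr_text):
    """Return the invoice total as a float, or None if the rule does not match."""
    match = TOTAL_RULE.search(ocr_text)
    return float(match.group(1)) if match else None


# Layout the rule was written for.
vendor_a = "Invoice 1001\nTotal: $250.00\n"
# Same information, different wording: the rule misses it.
vendor_b = "Invoice 1002\nAmount due (USD) 250.00\n"

print(extract_total(vendor_a))  # 250.0
print(extract_total(vendor_b))  # None
```

The second document carries the same amount, but the rule returns nothing because the wording changed. Covering it means writing and maintaining another rule, and so on for every new layout.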
Enter intelligent document processing
The need to quickly and accurately digitize documents of different types and formats, from invoices to know-your-customer forms to patient records, can be addressed with intelligent document processing (IDP). This solution combines the power of OCR with AI to automate document processing end-to-end, from understanding what the document is about and what information it contains, to extracting what is needed and converting it into searchable and reusable structured formats. IDP adds the element of flexibility that is missing from OCR.
At its core, IDP uses machine learning (ML) and natural language processing (NLP) to handle structured, unstructured, and semi-structured document data. ML algorithms trained on varied samples identify documents and sort them into categories, based on the parameters associated with each document type. Meanwhile, NLP boosts the OCR engine’s performance, digging deep into the text, interpreting it, and retrieving meaningful business information at very high speeds. There is also a human in the loop who reviews occasional disputed cases, which enables the system to self-learn for improved long-term accuracy, with no need for continuous coding.
During document processing, output from the IDP engine can trigger internal workflows to drive analytics, ML, and other initiatives on the extracted information, or make it available as-is to business stakeholders for decision-making. For example, AI consultancy Provectus implemented its IDP solution (leveraging Amazon Textract’s OCR and proprietary NLP) with Amazon Comprehend Medical and other AWS services to deliver a comprehensive document and data processing platform to a healthcare customer serving seven million patients annually. With this offering, the healthcare provider was able not only to capture, structure, and integrate clinical data (information generated during clinical visits) in various forms and formats, but also to run analytics on it, creating accurate dashboards for BI teams.
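The classify-then-review loop can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular IDP product works: the keyword profiles stand in for a trained ML model, and the names `PROFILES`, `REVIEW_THRESHOLD`, `classify`, and `route` are all hypothetical.

```python
from collections import Counter

# Hypothetical keyword profiles; a real system would learn these from labeled samples.
PROFILES = {
    "invoice": {"invoice", "total", "due", "payment"},
    "kyc_form": {"identity", "passport", "address", "customer"},
    "patient_record": {"patient", "diagnosis", "medication", "visit"},
}

REVIEW_THRESHOLD = 0.6  # assumed cut-off below which a person checks the result


def classify(text):
    """Return (label, confidence); confidence is the winning label's share of keyword hits."""
    words = Counter(text.lower().split())
    hits = {label: sum(words[w] for w in keywords) for label, keywords in PROFILES.items()}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0  # no known keywords at all
    label = max(hits, key=hits.get)
    return label, hits[label] / total


def route(text):
    """Send low-confidence documents to a human reviewer, the rest straight through."""
    label, confidence = classify(text)
    if confidence < REVIEW_THRESHOLD:
        return ("human_review", label, confidence)
    return ("auto", label, confidence)


print(route("invoice total due on payment"))
print(route("patient passport address visit"))
```

The first document matches only invoice keywords and is processed automatically. The second is split evenly between two categories, so it goes to a reviewer. In a real system the reviewer's decision could be saved as a new labeled sample for retraining, which is the self-learning step described above.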
Notably, many IDP solutions also address the problem of poor paper quality by automatically preparing the document in advance. They identify and remove any noise (like watermarks) in the document, adjust its brightness and contrast, and rotate or de-skew the image.
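As a rough illustration of two of these preparation steps, the sketch below stretches contrast and corrects a quarter-turn rotation on a tiny grayscale image stored as nested lists. It is a toy under stated assumptions: the function names are hypothetical, production systems would typically use an image-processing library and estimate arbitrary skew angles, and noise removal is not shown.

```python
def stretch_contrast(image):
    """Linearly rescale pixel values so the darkest becomes 0 and the brightest 255."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [row[:] for row in image]  # uniform image: nothing to stretch
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


def rotate_clockwise(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# A washed-out 2x2 scan: values crowd into the 100-200 range.
scan = [[100, 120],
        [160, 200]]

print(stretch_contrast(scan))              # [[0, 51], [153, 255]]
print(rotate_clockwise([[1, 2], [3, 4]]))  # [[3, 1], [4, 2]]
```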
Accuracy and outcomes
IDP systems can extract text with an accuracy of 80-95%, removing the need for frequent human intervention and improving cost and time-to-insights. According to Infrrd.ai, one industrial manufacturer that previously processed 200-page RFPs (requests for proposals) manually was able to increase its proposal win rate by 40% after switching to IDP. The company also saw a significant boost in revenue following the transition.
A number of enterprises already use intelligent document processing to streamline business processes faster and at lower cost. As the volume of information increases and the need to perform analytics on it grows, more companies will turn to AI-driven solutions such as IDP. In fact, according to IDC, AI-augmented systems will touch 1.4 ZB of information by 2025, affecting many businesses and organizations around the world.
The global IDP market alone, estimated at $700-750 million in 2020, is expected to grow at a rate of 55-65% through 2022, according to Everest Group. This trend is being driven by the significant cost benefits of the technology, combined with improved operational efficiency and productivity.
Shubham Sharma is a journalist covering data and analytics.