Navigating Hype, Myths and Realities of Cognitive RPA

Excitement continues to grow around the capabilities of applying automation to various business processes, particularly using robotic process automation (RPA). The enthusiasm is appropriate because early initiatives to automate rote, low-level tasks have seen very positive results with high levels of automation achieved, which frees up staff to spend time on higher-value, more complex tasks.

Low-variance, rote and simple tasks have been the primary focus for the majority of RPA projects because they are easy to define and the complexity of handling different types of exceptions can be avoided. According to AIIM’s 2018 report, “Enhancing Your RPA Implementation with Intelligent Information,” the top processes across different functional areas are well-defined processes that operate on structured data. The report highlights processes such as inventory management, payroll, order management and records processing, all of which benefit from standardized data and straightforward tasks. The result is close to 100% automation.

Automating Key Activities

As most organizations become more adept at process automation using these tools, attention starts to turn to processes that involve key activities within an organization. Processes involving customers need to be faster and more convenient. Processes involving the delivery of products and services need to be better controlled and accelerated. It is not just about automating tasks to lower costs. The same AIIM report found that organizations see RPA technologies as a way to reduce errors within processes while also improving data quality and customer service.

However, unlike the simpler processes already benefiting from automation, complex processes deal with many levels of task variance and often include less-structured data such as that found in emails and documents, especially where external stakeholders such as customers and trading partners are concerned. Unfortunately, many organizations have the same level of expectations for complex, document-oriented processes as they do for simpler tasks.

Doculab’s CEO James Watson has noted several times, based upon real-world experiences that organizations shared at their RPA Summits, that the key to success is to approach complex RPA projects with moderate expectations. The bots involved, he notes, are “more fragile than anyone anticipated.” This suggests that unexpected circumstances within the confines of a process have the effect of “breaking the bot.”

Cognitive RPA: Document Automation

The AIIM study found that “two out of three organizations say that documents create problems for most RPA tools and 70% say unstructured information is the Achilles Heel for many RPA implementations.” This challenge comes at a time when the amount of document-based information is significantly increasing. The challenges are easy to understand. Unlike tasks that involve very standardized data and are simple to encode, tasks involving documents can be complex and highly variable. Take, for example, an accounts receivable process where the task is to review the remittance advice against the payment as well as the invoicing data stored within the accounting system. Comparing structured information is easy to accomplish. Comparing data contained in documents is not as simple, because there are many places for an exception. A structured data process can be easily automated:

Simply perform database look-ups using primary keys and then do a direct match of data in specific columns of a record. This is easy to do once everything is in a relational database. The complexity is getting document-based information into a database in the first place.
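The look-up and direct match described above can be sketched in a few lines of Python using the standard sqlite3 module. This is a minimal illustration only: the invoices table, its column names and the match_remittance function are hypothetical, and amounts are stored as integer cents so that the comparison is exact.

```python
import sqlite3

def match_remittance(conn, invoice_number, paid_amount_cents):
    """Look up an invoice by primary key and compare the amount paid.

    Returns 'matched', 'amount_mismatch' or 'invoice_not_found'.
    """
    row = conn.execute(
        "SELECT amount_cents FROM invoices WHERE invoice_number = ?",
        (invoice_number,),
    ).fetchone()
    if row is None:
        return "invoice_not_found"
    return "matched" if row[0] == paid_amount_cents else "amount_mismatch"

# Hypothetical accounting data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "invoice_number TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("INV-1001", 125000), ("INV-1002", 48050)],
)
conn.commit()

print(match_remittance(conn, "INV-1001", 125000))
print(match_remittance(conn, "INV-1002", 48000))
print(match_remittance(conn, "INV-9999", 100))
```

Every branch here is a simple, well-defined outcome. None of it is possible until the invoice number and amount have been reliably extracted from the remittance document.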

To get to the above simple-to-execute procedure, invoices, remittances, and checks must all undergo a process of analysis in order to reliably locate and extract data. This process occurs regardless of whether the document is “born digital” or is a scanned image of a paper document.

According to the AIIM study, 27% of respondents said that the format variance of documents created the most problems in RPA environments. The second most cited problem is handwritten information at 19%. Yet, 36% of respondents had no opinion or disagreed with the assertion that processes involving many documents present difficulties with using RPA. There is clearly a disconnect between complexity and expectations.

Effectively Managing Complexity

In the simpler cases that today’s RPA solutions solve, the execution workflow supports defining decisions such as “is ‘FIRST NAME’ provided in request” with a YES/NO option branch to allow for handling exceptions. If there is a first name, the process proceeds. If there is no first name, a response can be provided and another workflow invoked. This type of binary option at the task level of YES/NO is common as it defines a clear path.

However, with processes that involve documents, there can be many subtasks, each of which can have an exception. For instance, mortgage loan origination involves receipt of potentially hundreds of documents. Many of these documents are submitted within a single PDF file. The key to the loan origination process is identifying and separating each document. For each document, there are two possible outcomes, identified correctly or identified incorrectly, plus a third if the document is not identified at all. If there are 100 documents involved, there are 100 chances for error in just the document identification portion. Next, there is the process of automatically separating each document into an individual file. This, too, can have an outcome of success or failure. Rather than a single binary outcome of success or failure, there can be literally hundreds of these outcomes all within a single task.
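To make the point concrete, the sketch below tallies the identification outcomes for one packet instead of recording a single pass/fail for the whole task. It is an evaluation aid under stated assumptions: the Outcome names, the document types and the idea of comparing each predicted type against a known correct type from a labeled test packet are all hypothetical.

```python
from collections import Counter
from enum import Enum

class Outcome(Enum):
    IDENTIFIED = "identified correctly"
    MISIDENTIFIED = "identified as the wrong type"
    UNIDENTIFIED = "not identified at all"

def outcome_for(predicted_type, actual_type):
    """Classify one document's result; None means the system gave no answer."""
    if predicted_type is None:
        return Outcome.UNIDENTIFIED
    if predicted_type == actual_type:
        return Outcome.IDENTIFIED
    return Outcome.MISIDENTIFIED

def tally_packet(pairs):
    """pairs: (predicted_type, actual_type) for every document in a packet."""
    return Counter(outcome_for(predicted, actual) for predicted, actual in pairs)

# A tiny hypothetical packet; a real loan file may hold a hundred or more.
packet = [
    ("tax form", "tax form"),
    ("pay stub", "bank statement"),
    (None, "appraisal"),
    ("bank statement", "bank statement"),
]

for outcome, count in tally_packet(packet).items():
    print(f"{outcome.value}: {count}")
```

A single success flag for this packet would read "failed" even though half of its documents were handled correctly.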

If you consider the task of data entry, many organizations might take a similar path of using a single success/failure or complete/incomplete outcome within RPA to automate their processes. If you consider receivables processing, a single remittance can have many different data elements, each of which may or may not be successfully processed. So again, for automation of data entry, a binary outcome cannot meaningfully be assigned at the document level.

The reality is that document automation very rarely achieves 100% accuracy, so humans must review individual results and make corrections where necessary. Is it possible to have an invoice with 10 data fields and have each one extracted accurately? Yes. However, the probability of that occurring is very low, often less than 10%. Practically every invoice can have at least one or two fields that require correction or review.
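A rough calculation shows why a fully correct invoice is so unlikely. The sketch below assumes that errors on different fields are independent and that every field has the same accuracy. Neither assumption comes from a real system, and the function names are illustrative, but the arithmetic shows how quickly per-field errors compound.

```python
def prob_all_fields_correct(field_accuracy, num_fields):
    """Chance that every field on one document is right, assuming independent errors."""
    return field_accuracy ** num_fields

def field_accuracy_needed(target_doc_rate, num_fields):
    """Per-field accuracy required for a target share of fully correct documents."""
    return target_doc_rate ** (1.0 / num_fields)

for accuracy in (0.70, 0.80, 0.90, 0.95, 0.99):
    share = prob_all_fields_correct(accuracy, 10)
    print(f"{accuracy:.0%} per field -> {share:.1%} of 10-field invoices fully correct")

needed = field_accuracy_needed(0.90, 10)
print(f"Per-field accuracy needed for 90% clean invoices: {needed:.2%}")
```

Under these assumptions, 80% per-field accuracy leaves only about 10.7% of 10-field invoices fully correct (0.8 to the tenth power), and anything lower falls below 10%. Getting 90% of invoices through untouched would require roughly 99% accuracy on every field.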

Achieving Unattended Automation

Another reality is that most RPA solutions were built to avoid relying upon exception handling; the focus is on achieving near 100% unattended automation at a task level. When “cognitive” tasks that involve analysis of complex document-oriented data are added to automated processes, achieving task-level or document-level automation at high rates is not possible. So how are organizations to cope with automation of processes that involve documents?

Document Classification

For document classification, it means that the focus must shift to the amount of automation at the document level rather than at the task level. The decisions within an RPA workflow must target the pass/fail of each individual document rather than the pass/fail of the entire task of loan packet identification. Automation at a document level can be 80% or higher, which is a significant amount of automation.
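One way to express a document-level decision is to route each document in a packet on its own result, so that a single doubtful document does not send the whole packet to manual handling. The sketch below is illustrative only: the PageGroup structure, the confidence score and the 0.9 cut-off are assumptions, not features of any particular RPA or capture product.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PageGroup:
    name: str
    predicted_type: Optional[str]  # None when the classifier gave no answer
    confidence: float              # hypothetical score between 0 and 1

def route_packet(documents, min_confidence=0.9):
    """Split one packet into documents that pass and documents needing review."""
    passed, review = [], []
    for doc in documents:
        ok = doc.predicted_type is not None and doc.confidence >= min_confidence
        (passed if ok else review).append(doc)
    return passed, review

packet = [
    PageGroup("pages 1-3", "tax form", 0.98),
    PageGroup("pages 4-9", "bank statement", 0.95),
    PageGroup("pages 10-11", "pay stub", 0.62),
    PageGroup("pages 12-12", None, 0.0),
    PageGroup("pages 13-20", "appraisal", 0.97),
]

passed, review = route_packet(packet)
automation_rate = len(passed) / len(packet)
print(f"{len(passed)} of {len(packet)} documents automated ({automation_rate:.0%})")
print("Needs review:", [doc.name for doc in review])
```

With a single pass/fail decision for the packet, this example would count as zero automation. With per-document decisions, three of the five documents proceed without a person touching them.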

Data Extraction

For data extraction, organizations must focus on the amount of data that can be automated within a given workflow rather than the number of documents that can be automated. This means a focus on the individual data elements for each document. Even when automation rates in the upper 90% range cannot be achieved at the document level, a data field-level approach makes it possible to automate 90% or more of the data entry involved. Again, this is a significant reduction in labor, leaving only a small percentage requiring manual review.
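The difference between the two measurements is easy to see on a small sample. In the sketch below, each remittance is represented as a mapping from field name to whether that field was extracted correctly. The field names and results are invented for illustration.

```python
def automation_rates(documents):
    """Return (document_level_rate, field_level_rate).

    documents: list of dicts mapping field name -> True if the extracted
    value needed no correction.
    """
    total_fields = sum(len(doc) for doc in documents)
    correct_fields = sum(sum(doc.values()) for doc in documents)
    clean_documents = sum(all(doc.values()) for doc in documents)
    return clean_documents / len(documents), correct_fields / total_fields

# Four hypothetical remittances with five fields each.
remittances = [
    {"payer": True, "invoice_no": True, "amount": True, "date": True, "discount": True},
    {"payer": True, "invoice_no": False, "amount": True, "date": True, "discount": True},
    {"payer": True, "invoice_no": True, "amount": True, "date": False, "discount": True},
    {"payer": False, "invoice_no": True, "amount": True, "date": True, "discount": True},
]

doc_rate, field_rate = automation_rates(remittances)
print(f"Documents needing no correction: {doc_rate:.0%}")
print(f"Fields needing no correction:    {field_rate:.0%}")
```

Only one remittance in four is fully correct, yet 17 of the 20 fields need no keying at all. Measuring at the field level captures that labor saving, while a document-level measure hides it.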

Quality Data Results with Straight Through Processing

Lastly, the key to achieving any level of true automation of document-oriented tasks within an RPA implementation (without the need for manual intervention) lies in the system’s ability to reliably separate the results into two camps: correct data and incorrect data. Without this ability, RPA processes involving documents require 100% data review.

This 100% data review means looking at every result of document classification or data extraction. While shifting the task from data entry to review alone has benefits, real gains in data accuracy, time reduction and cost efficiency only occur when the software can mimic a human data entry operator.

Document automation is measured on accuracy, so reviewing software feature lists provides very little meaningful information. The only way to understand if a given system can support this ability is to observe output on thousands of examples and calculate the results.

The most efficient way to do this is to ask the vendor to produce output for you on a significant number of samples and provide a report on the percentage of data that can flow straight through at a given accuracy rate such as 99%.
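The calculation behind such a report can be sketched as follows. The code assumes that each result in the sample comes with a confidence score from the system and a flag, obtained by checking against known correct values, that says whether the result was right. Both the scoring and the function name are assumptions for illustration, and the eight-item sample stands in for the thousands of results a real evaluation needs.

```python
def straight_through_rate(samples, target_accuracy=0.99):
    """samples: list of (confidence, is_correct) pairs from a checked test set.

    Returns (threshold, rate): the lowest confidence threshold whose accepted
    results still meet the target accuracy, and the share of all results
    accepted at that threshold. Returns (None, 0.0) if no threshold qualifies.
    """
    ordered = sorted(samples, key=lambda s: s[0], reverse=True)
    best_threshold, best_rate = None, 0.0
    correct = 0
    for i, (confidence, is_correct) in enumerate(ordered, start=1):
        correct += bool(is_correct)
        # Evaluate only at the end of a run of tied scores, so the threshold
        # accepts every result that shares this confidence value.
        last_of_tie = i == len(ordered) or ordered[i][0] != confidence
        if last_of_tie and correct / i >= target_accuracy:
            best_threshold, best_rate = confidence, i / len(ordered)
    return best_threshold, best_rate

sample = [
    (0.99, True), (0.97, True), (0.95, True), (0.90, True),
    (0.80, False), (0.70, True), (0.60, False), (0.50, True),
]

threshold, rate = straight_through_rate(sample, target_accuracy=0.99)
print(f"Accept results scoring {threshold} or higher: {rate:.0%} flow straight through")
```

Everything below the threshold goes to manual review. The useful number to compare across vendors is the straight-through share at the accuracy the business requires, not the raw accuracy of all output.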

High levels of automation for document-oriented processes are definitely achievable. It just takes adoption of a slightly different approach and the effort to ascertain what level of straight-through processing a given system can reliably achieve.


Greg Council, Vice President of Marketing and Product Management

Greg Council is Vice President of Product Management at Parascript, responsible for market vision and product strategy. Greg has over 20 years of experience in solution development and marketing within the information management market. This includes search, content management and data capture for both on-premises and SaaS solutions. To contact Greg and Parascript, please email: