Document Scanning Demystified
Duncan Lord, September 26, 2016
Follow up from article: Digitisation: Making it work for you.
When dealing with document scanning in a high-volume environment, unforeseen difficulties can arise. Quite often overall performance falls below expectations, which leads to stress when business-critical timeframes are not being met or HR budgets start ballooning.
The ultimate quandary is that the moment you are too busy to allocate time and effort to improving the process is the moment you need that improvement the most. Find a way to invest your time and reap the rewards. In the previous blog we covered breaking the overall process down into its core components. In this way you can tackle the questions and experiments without feeling overwhelmed.
Let’s look at the steps involved and ways you can address these:
Now reverse the order. Tackle the end process first and work your way to the beginning.
The three main topics tackled here are file structure, the data needed and the way they are made available for import into the software.
What is the output that you need? In what software are the data and images being used? Which fields do you need and where are the contents coming from? All simple questions, but sometimes the answers are approached from a prescriptive ‘point of view’, which misses the opportunity to ask whether there is an alternative. Document everything, as it will impact all of the steps and you will find yourself referring back to it often.
This is the time to thoroughly investigate how the information is going to be used. Scanning is only useful if you can easily find the information when you need it. For example, depending on your application, it may be far better for it to be split into subcategories by document type. Conversely, splitting may unnecessarily segregate information that is best collated into a single multipage file.
I always like to think of Quality Assurance as not being a single step; it is a series of checking mechanisms in each of the listed components addressed below. When enacted across the whole project, quality assurance at the desired level is achieved.
What type of records are they? What are they being used for? Of all the content being scanned and data being extracted, are there different levels of importance? Only a select few document types have an exhaustive 100% compliance requirement. Work out the business risk and the business impact of an error and then equate this to an accuracy rate for each separate component (e.g. image quality separate from data accuracy). From this you can then decide on the measures needed to have confidence in achieving this number.
Overestimate and you add an unnecessary time burden to the project; underestimate and you suffer unwanted consequences. Examples of some other questions: Are you seeking destruction approval? What are the minimum requirements to comply? Is image quality or data accuracy more important?
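To make an accuracy rate tangible, it can help to translate it into the number of faulty items you should expect at your volume. Here is a minimal Python sketch of that arithmetic. The component names, accuracy targets and page volume are purely illustrative assumptions, not recommended figures.

```python
def expected_errors(volume: int, accuracy: float) -> float:
    """Expected number of faulty items for a given volume and accuracy rate."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return volume * (1.0 - accuracy)


# Hypothetical targets per component (assumed values for illustration only).
targets = {"image_quality": 0.999, "data_accuracy": 0.995}
volume = 100_000  # assumed number of pages in the project

for component, accuracy in targets.items():
    errors = expected_errors(volume, accuracy)
    print(f"{component}: roughly {errors:.0f} faulty items expected in {volume} pages")
```

Seeing the target as a count of faulty items rather than a percentage makes it easier to judge whether the business can live with it.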
Above all, a multi-layered approach to checking in each of the process steps will produce the highest chance for success. Utilise the strengths of both people and electronic automation in combination to achieve that sweet spot between time and accuracy.
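A rough way to see why layering works: if each check catches a share of the errors that reach it, the share that slips through all of them shrinks quickly. The Python sketch below assumes the checks act independently, which is a simplification, and the error and catch rates are made-up numbers for illustration.

```python
from math import prod
from typing import Iterable


def residual_error_rate(initial_error_rate: float, catch_rates: Iterable[float]) -> float:
    """Error rate left after a series of checks.

    Assumes each check independently catches `catch_rate` of the errors
    that reach it (a simplifying assumption, not a measured fact).
    """
    return initial_error_rate * prod(1.0 - rate for rate in catch_rates)


# Hypothetical example: 2% of fields start out wrong, then an automated
# rule check catches 90% and an operator spot check catches 80% of the rest.
remaining = residual_error_rate(0.02, [0.90, 0.80])
print(f"Residual error rate: {remaining:.4%}")
```

Under these assumptions neither check needs to be perfect on its own, which is the point of combining people and automation.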
Once the answers to the output questions above are fully explored, you can progressively work out what data is available on the hard copy, what is available from existing electronic databases or lookups, and which fields may need some manipulation in addition to these sources. The rule of thumb I like to go by is to always use external data feeds to validate and populate information, and to capture off the hardcopy/image only as a last resort. It is well worth the time to interrogate (and sometimes challenge) IM or IT as to the availability of the data you need.
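The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the field names, the `external` dictionary standing in for a database lookup and the `captured` dictionary standing in for values keyed or read off the image are all hypothetical.

```python
def populate_fields(required, external, captured):
    """Fill each required field, preferring the external source.

    Captured values are used only when the external source has nothing.
    Empty strings are treated as missing. Returns the record and a note
    of where each value came from.
    """
    record, sources = {}, {}
    for field in required:
        if external.get(field):
            record[field], sources[field] = external[field], "external"
        elif captured.get(field):
            record[field], sources[field] = captured[field], "captured"
        else:
            record[field], sources[field] = None, "missing"
    return record, sources


# Hypothetical data for illustration.
required = ["customer_id", "customer_name", "signed_date"]
external = {"customer_id": "C-1042", "customer_name": "A. Example"}
captured = {"customer_name": "A Exampel", "signed_date": "26/09/2016"}

record, sources = populate_fields(required, external, captured)
print(record)
print(sources)
```

Keeping track of the source of each value also tells you which fields deserve the closest checking later.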
Use software tools that allow you to custom build business rules and apply them in an automated manner. In this way you can validate, populate and manipulate huge volumes of information in an extremely quick and consistent manner.
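To give a feel for what custom business rules look like, here is a minimal Python sketch. It is not modelled on any particular capture product. The field names, formats and rules are assumptions invented for the example.

```python
import re
from datetime import datetime


def is_valid_date(value: str) -> bool:
    """True if the value parses as a DD/MM/YYYY date."""
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False


# Each rule: (field name, check function, message shown on failure).
# All three rules are hypothetical examples.
RULES = [
    ("invoice_no", lambda v: re.fullmatch(r"INV\d{6}", v) is not None,
     "must look like INV123456"),
    ("invoice_date", is_valid_date, "must be a valid DD/MM/YYYY date"),
    ("amount", lambda v: re.fullmatch(r"\d+\.\d{2}", v) is not None,
     "must be a number with two decimal places"),
]


def validate(record: dict) -> list:
    """Apply every rule to the record and return the failure messages."""
    failures = []
    for field, check, message in RULES:
        if not check(record.get(field, "")):
            failures.append(f"{field} {message}")
    return failures


print(validate({"invoice_no": "INV001234", "invoice_date": "31/02/2016", "amount": "120.50"}))
```

Because the rules live in one list, the same checks are applied identically to every record, however large the batch.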
So you know what data you need and where to get it from. Now you can work out the most effective way to capture it. Be careful: this can be a trap. We all love the idea of automation and have heard all the amazing stories, but it is imperative to be practical here. Time taken maintaining, checking and correcting should be included in your time assessments. The typical choices are manual keying, OCR, OMR and automated lookups. (I’ve left out ICR handwriting recognition because it is viable in so few cases that it is better suited to another conversation.) From my experience, it is highly likely that a combination of all of these will achieve the most efficient and cost-effective end result.
Unfortunately, I see a lot of people missing out by relying exclusively on OCR and not exploring the benefits of combined methods. The trick is to combine them in a controlled environment so that the process speeds up rather than slows down.
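The point about including checking and correcting time in your assessment can be put into numbers. The Python sketch below compares capture methods by effective time per field. Every timing and error rate in it is a made-up assumption, so substitute figures measured on your own documents.

```python
from dataclasses import dataclass


@dataclass
class CaptureMethod:
    name: str
    capture_seconds: float  # time to capture one field
    check_seconds: float    # time to check one field
    error_rate: float       # share of fields that need correcting
    correct_seconds: float  # time to correct one faulty field

    def effective_seconds(self) -> float:
        """Capture time plus checking plus the average correction overhead."""
        return (self.capture_seconds + self.check_seconds
                + self.error_rate * self.correct_seconds)


# Hypothetical figures for illustration only.
methods = [
    CaptureMethod("manual key", 6.0, 1.0, 0.02, 10.0),
    CaptureMethod("OCR", 0.1, 2.0, 0.15, 12.0),
    CaptureMethod("automated lookup", 0.1, 0.5, 0.01, 12.0),
]

for method in sorted(methods, key=CaptureMethod.effective_seconds):
    print(f"{method.name}: {method.effective_seconds():.2f} s per field")
```

Running the comparison field by field, rather than once for the whole job, is what tends to reveal where a combination of methods beats any single one.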
We have our compliance requirements. We know how the images are intended to be used, by whom and in what system(s), and which data capture methods are being employed. Now we can work out the optimum scanner settings. You have the universal ones: resolution, bit depth (colour or B&W), compression type etc. Then you have your content-specific settings based upon the source material you are going to come across, such as brightness/contrast thresholds, which are ascertained through much trial and error. You want to test and re-test any variations in hard copies to arrive at as universal a setting as possible: light originals, dark originals, different paper materials. The sweet spot is image quality that matches your requirements above, without the need for the operator to stop, change settings, then change back again during document scanning. Depending on the scanner and software, you will also have access to a number of dynamic image enhancement tools and settings you can bring to bear.
Finally, decide on the document separation method that best suits your originals and the software you are using. Barcodes already present? Great, use them. Barcodes not present or incomplete? A number of possibilities are available depending on the situation, e.g. adding a pre-printed barcode, using scanner-specific separator sheets, or using thickness or size detection for items like envelopes or manila folders. The idea is to use the method that works and does not increase the effort needed in preparation.
There are a number of best practice procedures a scanner operator can follow that are too wordy to put in this blog. Feel free to ask me and I am happy to share. They cover work practices, testing methods, training tips, monitoring and KPIs to support effective management over time, and more.
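As a rough illustration of how resolution and bit depth drive image size before any compression is applied, here is a small Python sketch. The A4 page dimensions and the settings compared are assumptions chosen for the example, and real files will be smaller once compression is switched on.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int, bit_depth: int) -> float:
    """Raw image size in bytes: pixel count times bits per pixel, divided by 8."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bit_depth / 8


A4_WIDTH_IN, A4_HEIGHT_IN = 8.27, 11.69  # A4 page size in inches

# Hypothetical settings to compare (label, resolution, bits per pixel).
settings = [
    ("B&W, 200 dpi", 200, 1),
    ("B&W, 300 dpi", 300, 1),
    ("Greyscale, 300 dpi", 300, 8),
    ("Colour, 300 dpi", 300, 24),
]

for label, dpi, bits in settings:
    size_mb = uncompressed_bytes(A4_WIDTH_IN, A4_HEIGHT_IN, dpi, bits) / 1_000_000
    print(f"{label}: about {size_mb:.1f} MB per page before compression")
```

The jump from 1-bit to 24-bit is a factor of 24 at the same resolution, which is why colour is worth choosing only where the requirements call for it.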
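Whatever separator you choose, the software's job is the same: cut the stream of scanned pages into documents wherever a separator appears. Here is a minimal Python sketch of that logic. The page representation and the `SEP` barcode value are hypothetical, and real capture software will have its own way of reporting a separator.

```python
def split_on_separators(pages, is_separator):
    """Group a stream of pages into documents, dropping the separator pages."""
    documents, current = [], []
    for page in pages:
        if is_separator(page):
            if current:  # close the document in progress, if any
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:  # the last document has no trailing separator
        documents.append(current)
    return documents


# Hypothetical batch: each page records the barcode value read on it, if any.
batch = [
    {"page": 1, "barcode": "SEP"},
    {"page": 2, "barcode": None},
    {"page": 3, "barcode": None},
    {"page": 4, "barcode": "SEP"},
    {"page": 5, "barcode": None},
]

docs = split_on_separators(batch, lambda p: p["barcode"] == "SEP")
print([[p["page"] for p in doc] for doc in docs])
```

Passing the separator test in as a function means the same grouping logic works whether the separator is a barcode, a patch sheet or a size change.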
In essence, document preparation is all about getting the paper into a format so that it can pass through a scanner unencumbered in the most efficient way. Yes, this involves removing staples, unfolding dog-ears etc. but how this is done often depends on the type and condition of documents being prepared.
The best rule of thumb is to have a clean, uncluttered workspace and easy access to preparation tools (e.g. staple removers, scissors, patch sheets, barcode stickers, thimbles etc.). When handling documents, it is imperative to only ever have them in one of three locations: left = to be done, in front = being processed, right = completed.
Keep it structured, keep it simple!
Don’t be afraid to think outside of the box. After all, a critical role of leaders is questioning why something has ‘always been this way’.
Sure, people like to point to work habits and constraints as reasons why something is done the way it is done, but questioning creates the opportunity to improve a process that will impact the lives of your team and your customers.
As you go through this process, remember to work with your team to develop the best plan and how to make it happen. In addition, it is important to align the right people with the right tasks and to get your team to rally behind the approach. This way you gain support whilst optimising the technology to fit your process.
You will be amazed at how well this will work, and it is extremely gratifying for everyone to see the results right before their eyes. What was once deemed a painful experience can now be something the entire team supports. Productivity improves, customer service improves, and most importantly, your team feels like they are making a difference.
So the next time you look around and feel like there must be a better way, know that the answer is yes and that it is a positive opportunity. Just take a step back and think it through with your team because, together, you can make change a reality.
As the Director of Avantix, Duncan has over 20 years’ experience in digitisation, data governance and document process automation. He is passionate about problem solving and acting as a positive driver in business. He can be reached at firstname.lastname@example.org