…said the Analyst to the EDRMS
Well, of course it does.
When considering a document scanning project, there is a plethora of technical settings that need to be examined. I am constantly asked what impact scan resolution has on file size and quality. For the purposes of this article I will focus exclusively on these two aspects and address other considerations in subsequent articles. (The next article reviews what impact compression has on image quality.)
When scanning files, there is often a trade-off between scanning a document at the highest possible resolution to provide the best visual quality and keeping the file size manageable.
Firstly, let’s start off by looking at how file size impacts usability. To put it in unscientific terms, the faster an image appears on the screen, the better, and in this context smaller files open faster than larger files. Conversely, frustration levels soar if a user has to wait tens of seconds before an image appears (and even longer for larger files). This problem may be exacerbated if a user is accessing a hosted solution rather than accessing files on their own computer.
On the other side of this trade-off is image quality. The rule of thumb is: the higher the resolution, the better the quality. (In reality, there are a number of examples where this is not so, but this too will be addressed in a subsequent article.)
The State Archives recommends that archive scanning be performed at 600 PPI (pixels per inch). Note: this is a recommendation and not a standard. The guidelines go on to suggest that the resolution can be adjusted to ensure the image is “fit for purpose”, that is, set at the appropriate level for the particular document type and circumstances.
The best way to illustrate the effect resolution has on Image Size, when document scanning, is via an example. I have taken a typical single A4 page of content, scanned it at varying resolutions (both Black & White and Colour). The relative sizes of the documents are listed in the table below:
|Resolution (PPI)||Black and White Size (KB)||Colour Size (KB)
|300||1 066||25 380
|400||1 892||45 162
|600||4 257||101 597
This clearly demonstrates that file size grows with roughly the square of the resolution: moving from 300 to 400 PPI increases the size by nearly 80%, and doubling the resolution from 300 to 600 PPI makes the file about four times larger. A typical multi-page PDF document consisting of 30 – 40 double sided pages therefore varies quite significantly in size (even when compression is factored in) when comparing a low resolution scan to a high resolution one. Not only will accessing a large file cause frustration, you may well have the IT department up in arms over significant storage space requirements and network traffic bottlenecks.
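As a rough cross-check of the figures in the table, the sketch below estimates the uncompressed size of a single A4 page at a given resolution. It assumes 1 bit per pixel for Black & White and 24 bits per pixel for Colour; those bit depths, the 1 KB = 1024 bytes convention and the function name are assumptions made for illustration, not settings confirmed for the scans above.

```python
# Hypothetical helper: estimates the raw (uncompressed) size of a scanned page.
A4_MM = (210, 297)      # A4 page width and height in millimetres
MM_PER_INCH = 25.4


def uncompressed_size_kb(ppi, bits_per_pixel, page_mm=A4_MM):
    """Return the estimated uncompressed image size in KB (1 KB = 1024 bytes)."""
    width_px = round(page_mm[0] / MM_PER_INCH * ppi)
    height_px = round(page_mm[1] / MM_PER_INCH * ppi)
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits / 8 / 1024


if __name__ == "__main__":
    for ppi in (300, 400, 600):
        bw = uncompressed_size_kb(ppi, 1)        # assumed 1-bit Black & White
        colour = uncompressed_size_kb(ppi, 24)   # assumed 24-bit Colour
        print(f"{ppi} PPI: B&W ~{bw:,.0f} KB, Colour ~{colour:,.0f} KB")
```

Under these assumptions the estimates land within about 1% of the measured sizes in the table, which is consistent with pixel count (and therefore file size) growing with the square of the resolution.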
The next aspect of the document scanning process to review is the quality of the output. I have taken a screen shot of the same snippet of the document at resolutions of 200, 300 and 600 PPI respectively. (Don’t be concerned about the content. The snippets of each document are merely to demonstrate the relative quality of result.) I have specifically used a document containing a pattern, as changes in quality are best observed where the patterns intersect.
|Resolution (PPI)||B&W Image||Colour Image
From the samples, we can see that a high resolution scan creates a crisper, clearer image. This difference is most noticeable between 200 PPI and the others, but is less obvious to the naked eye between 300 PPI and 600 PPI. On face value, it seems that the slightly higher quality we get with the 600 PPI image may not be sufficient to justify creating files which are roughly four times the size of the 300 PPI image. For colour images the impact is amplified.
Fit For Purpose
Now we know the impact resolution has on file size and image quality, the next action item is to define how “Fit for Purpose” applies to your document scanning project. [Refer to page 6 of the Digitisation Disposal Policy – Queensland State Archives]
The best place to start answering this question is to examine the reason for initiating a document scanning project and the types of records involved. This has the greatest impact on resolution settings. If, for example, you are bulk scanning financial documents (invoices etc.) which only have a retention period of 7 years with no real requirement beyond being a legible representation of the original, then scanning at 200 – 300 PPI Black & White may well be sufficient. This produces a usable image where the content can be read with confidence. If the same images are intended to be OCR’d for data capture, then you would not go below 300 PPI, as doing so would negatively impact the result.
If, however, you are imaging a legal document (say), where the expectation is for the image to be as close a representation of the original as possible with the smallest detail clearly visible, it follows that higher resolution colour may be needed. Even so, you would have to think that 600 PPI would be overkill. An alternative approach would be to step up to 400 PPI if there are compliance concerns regarding 300 PPI.
There is a concept called the “Point of Diminishing Returns”. There comes a point where scanning at a higher resolution makes only a marginal difference to quality. For the example used above, if the document had been scanned at 1200 PPI colour, the increased quality would be minimal but the price paid for file size would be dramatic. Note: A situation where higher resolutions do make a difference is when scaling up the image, such as a photographic negative that is to be enlarged. For this article, I am focussing on 1:1 scale document scanning.
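The two scenarios above can be captured as a small lookup of starting-point settings. This is only a sketch: the function name, the purpose labels and the choice of the lower bound (200 PPI) for financial records are assumptions for illustration, and the output is a starting point for testing rather than a recommendation.

```python
# Hypothetical starting-point presets drawn from the scenarios described above.
PRESETS = {
    "financial": {"ppi": 200, "mode": "bw"},      # legible copy, short retention
    "legal": {"ppi": 300, "mode": "colour"},      # close representation of original
}


def suggest_scan_settings(purpose, needs_ocr=False, compliance_concern=False):
    """Return a starting-point resolution (PPI) and colour mode for a purpose."""
    if purpose not in PRESETS:
        raise ValueError(f"No preset for purpose: {purpose!r}")
    settings = dict(PRESETS[purpose])  # copy so the preset is not modified
    if needs_ocr:
        # OCR for data capture: do not go below 300 PPI
        settings["ppi"] = max(settings["ppi"], 300)
    if compliance_concern:
        # Step up to 400 PPI where 300 PPI raises compliance concerns
        settings["ppi"] = max(settings["ppi"], 400)
    return settings
```

For example, `suggest_scan_settings("financial", needs_ocr=True)` lifts the financial preset from 200 to 300 PPI while keeping Black & White.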
Given that each project requires specific considerations around file size and resolution, it is difficult to make hard and fast recommendations to cover all scenarios. This article rather highlights the factors which need to be considered when document scanning. More often than not, we get asked to perform bulk scanning on documents at 300 PPI (either B&W or Auto-Colour) as this provides a good balance between the size and the quality of output.
Odds are, you have witnessed or know somebody who works in a company that has implemented an overarching software solution which was supposed to be a cure-all for electronic records. But when it came down to specific business processes the end result was less than desirable. Typically in these enterprise solutions the automation of key, end-user activities is lagging.
Nearly every vendor – software & hardware – claims to have the answer to your prayers. After all, business process automation is a hot topic right now and just about every software company and integrator is jumping on the bandwagon.
Despite the hoopla, what I find most confusing is that very few ‘solutions’ actually solve the problem. Information ends up being manually checked or pushed down the line to become somebody else’s problem. In other cases, it may be locked away in some sort of exclusive virtual vault, inaccessible to decision makers or the front line when it matters the most. More often than not, additional IT resources are employed to ‘manage’ the software, negating any resource benefits the organisation anticipated once ‘automation’ was achieved.
In the end, the organisation is locked into an expensive, incomplete investment. Front-line employees create workarounds and, even worse, your customers or suppliers are confused or frustrated by the need to provide the same information multiple times.
So, I ask myself where is the time, effort, and money going once the dust has settled on the initial implementation?
Some companies may act as if the problem does not still persist, while others see the light and decide automation is more than a catchphrase; it is a strategic imperative. In doing so, they assign leaders such as you and they bring on experts to help lead the way.
It is common for companies to experience pain in attempting to make digital disruption through enterprise software solutions work for them. The good news is that there is a way to lessen the impact by taking several steps to address legacy issues that can cause problems.
To achieve automation for business unit functions that continue to lag, it is best to take a targeted approach that is not limited by the constraints of the enterprise software. Instead, huge gains can be made by scoping a solution that complements and integrates with core systems to take your performance to the next level.
1. Identify the processes that have the best chance to produce some quick wins. This is important for everyone involved to break the negativity that can sometimes surround a changing environment. Bring hope to end users and senior management alike that real benefits can be enjoyed across all layers. It can be important to understand that the status quo is very rarely the only possibility. Your recent changes have improved certain aspects of the business, but in so doing, may have made others more painful. The best of both is achievable.
2. Get input from representatives of each of the departments/locations/roles that interact with the documents or data during the process, from the point it first enters the organisation through to its final “completed” state in the enterprise solution. Understand all the challenges that face the people involved in each of these touch points.
3. Form an in-depth understanding of the causes and impacts that incomplete or inaccurate data has on the process. What are the waste or risk implications? Are people avoiding what should be done? If so, why?
4. Identify how, where and when the data needs to be integrated into core systems. Is there a single source of truth? Are there multiple software systems that need to be synchronised so that during every decision-making step along the way people can trust their data?
5. Where does it all go wrong? What are the causes of breakdowns in the process? Things break! The most perfect process comes unstuck when a supplier/customer/contractor/staff member misses a key piece of information, makes a mistake or neglects what was meant to occur.
6. Weigh up the pros and cons of customising the ERP versus a third party integrated software solution. Consider the risk of budget blowouts and the impact to future upgrades.
7. From this grounding, you are now armed with the information to take positive action. What this plan may encompass will depend on the specific situation. It may be any combination of people, process or technology. It may require custom scripting or data modelling to allow for integration in areas that cannot be integrated.
8. Learn to look beyond the clever marketing of your vendors. The trick is being armed with the full knowledge of what the solution needs to address and having a much clearer picture of the outcomes software automation must achieve. In this way, you gain control of the outcome and are not left at the mercy of professional sales people bedazzling you with the exciting story of software robots or other fluff.
As this space can be highly detailed with many variances, I am only just scratching the surface, but rest assured there is a lot more in my head than I’ll ever get down on paper. I’m always available for a chat or a coffee to help in a more useful way.
Our only Competitive Advantage is to learn faster than our competitors
Yes, for those of you who recognise the quote, I borrowed this title from Arie de Geus (Strategist – Shell Oil) and while it may be over 20 years old, the sentiment applies more now than ever. With a plethora of agile tools available to business, the lead time for any competitor to match or better New Product Design has reduced from what was traditionally years to mere weeks. So in reality to gain competitive advantage through innovation, all that sets us apart from any competitor is for us to learn faster and in turn, implement products, processes or solutions which reflect what we have learned.
This being the case, it stands to reason that a starting point for innovation is to first have comprehensive knowledge of your business’ current position before departing on future initiatives. What can be learned from existing methodologies, processes, experiences and trends, and how can these be relied upon as the foundation for change? The decisions made from what you have learned will have a significant impact on success.
The cornerstone of Quality Decision Making within an organisation is the quality of data upon which decisions are built. Data integrity plays such a critical role in the organisation, it follows that Data Governance should hold its place as “first among equals” in the spectrum of business tools available to strategists and decision makers.
Because Data Governance matters, best results are achieved through knowledge based on the broadest possible spectrum of data – not just the bottom line produced by the ERP or other core system. In order to gain the advantage of your entire knowledge within the business it is necessary to bring together data sets from all aspects of paper, digital and activities. Data Governance emanates out of the right tools and a defined set of procedures, specifically focussed on achieving these outcomes:
– Increased confidence in decision making
– Decreased Risk
– Better planning and strategising
– Faster identification of improvement areas
– Better staff effectiveness
In other words; the ability for an organisation to Learn all there is, from end to end, elevates the success of future innovation – whatever form it may take.
Follow up from article: Digitisation: Making it work for you.
When dealing with document scanning in a high volume environment, unforeseen difficulties can arise. Quite often the overall performance falls below expectations, which leads to stress when business critical timeframes are not being met or HR budgets start ballooning.
The ultimate quandary is that the time you are too busy to improve the process is exactly when you need the improvement the most. Find a way to invest your time and reap the rewards. In the previous blog we covered breaking the overall process down into its core components. In this way you can tackle the questions and experiments without feeling overwhelmed.
Let’s look at the steps involved and ways you can address these:
Now reverse the order. Tackle the end process first and work your way to the beginning.
The three main topics tackled here are: file structure, the data needed and the way they are made available for import into the software.
What is the output that you need? In what software are the data and images being used? Which fields do I need and where are the contents coming from? All simple questions, but sometimes the answers can be looked at from a prescriptive ‘point of view’ and miss opportunities for asking if there is an alternative. Document everything, as it will impact all of the steps and you will find yourself referring back to it often.
This is the time to thoroughly investigate how the information is going to be used. Scanning is only useful if you can easily find the information when you need it. For example, depending on your application, it may be infinitely better for it to be split into sub categories by document type. Conversely the opposite may be true, where splitting would unnecessarily segregate the information where it is best collated instead into a multipage file.
I always like to think of Quality Assurance as not being a single step; it is a series of checking mechanisms in each of the listed components addressed below. When enacted across the whole project, quality assurance at the desired level is achieved.
What type of records are they? What are they being used for? Of all the content being scanned and data being extracted, are there different levels of importance? Only a select few document types have an exhaustive 100% compliance requirement. Work out the business risk and the business impact of an error and then equate this to an accuracy rate per separate component. (e.g. image quality separate from data accuracy). From this you can then decide on the measures needed to have confidence in achieving this number.
Overestimate and you add an unnecessary time burden to the project; underestimate and you suffer unwanted consequences. Examples of some other questions: Are you seeking destruction approval? What are the minimum requirements to comply? Is image quality or data accuracy the most important?
Above all, a multi-layered approach to checking in each of the process steps will produce the highest chance for success. Utilise the strengths of both people and electronic automation in combination to achieve that sweet spot between time and accuracy.
Once the answers to the output questions above are fully explored, you can progressively work out what data is available on the hard copy, what is available from existing electronic databases or lookups, and which fields may need some manipulation in addition to these sources. The rule of thumb I like to go by is to always use external data feeds to validate and populate information, and to capture from the hardcopy/image only as a last resort. It is well worth the time to interrogate (and sometimes challenge) IM or IT as to the availability of the data you need.
Use software tools that allow you to custom build business rules and apply them in an automated manner. In this way you can validate, populate and manipulate huge volumes of information in an extremely quick and consistent manner.
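To make the idea of automated business rules concrete, here is a minimal sketch in which each rule is a small function that either populates or validates a field, and anything that fails is routed to an exceptions list for a person to review. The field names, the supplier lookup table and the rules themselves are all hypothetical; a real project would draw them from its own external data feeds and requirements.

```python
# Hypothetical external data feed: supplier names keyed by supplier ID.
SUPPLIER_LOOKUP = {"S001": "Acme Pty Ltd", "S002": "Bravo Freight"}


def populate_supplier_name(record):
    """Populate the supplier name from the lookup instead of keying it."""
    name = SUPPLIER_LOOKUP.get(record.get("supplier_id"))
    if name is None:
        return "unknown supplier_id"
    record["supplier_name"] = name
    return None


def require_positive_amount(record):
    """Validate that the captured amount is a positive number."""
    try:
        amount = float(record.get("amount", ""))
    except (TypeError, ValueError):
        return "amount is not a number"
    return None if amount > 0 else "amount must be positive"


RULES = [populate_supplier_name, require_positive_amount]


def apply_rules(records, rules=RULES):
    """Run every rule over every record; split into clean and exception lists."""
    clean, exceptions = [], []
    for record in records:
        errors = [msg for msg in (rule(record) for rule in rules) if msg is not None]
        if errors:
            exceptions.append((record, errors))
        else:
            clean.append(record)
    return clean, exceptions
```

Because every record passes through the same rules in the same order, the checks are applied consistently, and only the exceptions need human attention.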
So you know what data you need and where to get it from. Now you can work out the most effective way to capture it. Be careful, as this can be a trap. We all love the idea of automation and have heard all the amazing stories, but it is imperative to be practical here. Time taken maintaining, checking and correcting should be included in your time assessments. The typical choices are manual keying, OCR, OMR and automated lookups. (I’ve left out ICR handwriting recognition because it is viable only in such fringe cases that it is better suited to another conversation.) From my experience, it is highly likely that a combination of all of these will achieve the most efficient and cost effective end result.
Unfortunately, I see a lot of people missing out by relying exclusively on OCR and not exploring the benefits of combined methods. The trick is to combine them in a controlled environment in a way that speeds up the process rather than slowing it down.
We have our compliance requirements. We know how the images are intended to be used, by whom and in what system(s) and the data capture methods being employed. Now we can work out the optimum scanner settings. You have the universal ones: Resolution, Bit Depth (Colour or B&W), compression type etc., and then you have your content specific settings based upon the source material you are going to come across: Brightness/Contrast thresholds – which are ascertained through much trial and error. You want to test and re-test any variations in hard copies to attempt as universal a setting as possible. Light originals, dark originals, different paper materials. The sweet spot is image quality that matches your requirements above, without the need for the operator to stop and change, then change back again during document scanning. Depending on the scanner and software, you will also have access to a number of dynamic image enhancement tools and settings you can bring to bear.
Finally, decide on the document separation method that best suits your originals and the software you are using. Barcodes already present? Great, use them. Barcodes not present or incomplete? A number of possibilities are available depending on the situation, e.g. adding a pre-printed barcode, using scanner specific separator sheets, or using thickness or size detection for items like envelopes or manila folders. The idea is to use the method that works and does not increase the level of effort needed in preparation.
There are a number of best practice procedures a scanner operator can follow that are too wordy to put in this blog: work practices, testing methods, training tips, monitoring and KPIs to support effective management over time, and more. Feel free to ask me and I am happy to share.
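Whatever separation method is chosen, the downstream logic is much the same: start a new document each time a separator is detected. The sketch below shows that grouping step only. How a separator is detected (barcode, patch sheet, thickness and so on) depends on the scanner and software, so it is passed in as a function; the page representation and the barcode value are hypothetical.

```python
def split_into_documents(pages, is_separator):
    """Group a scanned batch into documents, dropping the separator pages."""
    documents, current = [], []
    for page in pages:
        if is_separator(page):
            # A separator closes the document in progress (if any)
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)  # close the final document in the batch
    return documents


# Hypothetical example: pages are dicts and separator sheets carry a known barcode.
SEPARATOR_BARCODE = "DOC-SEP"


def has_separator_barcode(page):
    return page.get("barcode") == SEPARATOR_BARCODE
```

In this sketch the separator sheets are discarded. Where an existing barcode sits on the first page of each document, the same loop would keep that page as the start of the new document instead.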
In essence, document preparation is all about getting the paper into a format so that it can pass through a scanner unencumbered in the most efficient way. Yes, this involves removing staples, unfolding dog-ears etc. but how this is done often depends on the type and condition of documents being prepared.
The best rule of thumb is to have a clean, uncluttered workspace and easy access to preparation tools (e.g. staple removers, scissors, patch sheets, barcode stickers, thimbles etc.). When handling documents, it is imperative to only ever have them in one of 3 locations: left = to be done, in front = being processed, right = completed.
Keep it structured, keep it simple!
Don’t be afraid to think outside of the box. After all, a critical role of leaders is questioning why something has ‘always been this way’.
Sure people like to point to work habits and constraints as reasons why something is done the way it is done, but questioning creates the opportunity to improve a process that will impact the lives of your team and your customers.
As you go through this process, remember to work with your team to develop the best plan and how to make it happen. In addition, it is important to align the right people with the right tasks and to get your team to rally behind the approach. This way you gain support whilst optimising the technology to fit your process.
You will be amazed at how well this will work, and it is extremely gratifying for everyone to see the results right before their eyes. What was once deemed a painful experience can now be something the entire team supports. Productivity improves, customer service improves, and most importantly, your team feels like they are making a difference.
So the next time you look around and feel like there must be a better way, know the answer is yes and that is a positive opportunity. Just take a step back and think it through with your team as together, you can make change a reality.
Do you feel frustrated with your scanning project not performing up to expectations?
You are not alone.
Quite often the theory behind scanning is sound, but in practice it does not get fully realised. Stuff gets in the way.
I constantly find myself having the same frustrations. Be it the pressure to perform at the level needed as a scanning services provider, or being confronted by some truly mind-boggling situations.
No matter the genesis of the problems there is one constant. Once you find yourself questioning how ‘slow or cumbersome’ the process has become – it already is. Now is the time to begin looking at alternatives. This is particularly so when dealing with large volumes of information – as it will be well worth the effort.
Regardless of whether the change is a radical shift in process or a minor tweak, the result should free up time for everyone involved.
Where to start?
Ensuring the best for your digitisation project can be a lot like raising a child. There are a thousand and one parenting books that tell you what to do and what not to do, but very little on what to do right now in your situation, today: your first step to righting an undesirable behaviour, followed by the second step and so on. Books are not enough. Action and practical immersion are what is called for. Experiment a little, see the results and adjust accordingly.
In order to take that first step, break up the process into manageable pieces, ones that you can focus on without feeling overwhelmed by complexity.
In terms of digitisation, the most logical activities include:
Take a step back. Don’t be distracted by the way it currently is. Above all make a start.
The best way to proceed is to identify the following and figure out where the biggest return for effort will be achieved. Consult the team and find out:
- Where are the bottlenecks? Where does the intended process break down and need intervention?
- What is the frequency in which this occurs?
- Where is manual effort needed? Can this be automated in any way?
- Are there areas of unnecessary duplication or repetition that can be streamlined without compromising the outcome?
From the answers to these questions you can develop the priority system to then delve deeper. You want to make sure your efforts are going to be worth the time investment before commencing. It would be nice if everything could be handled with a quick fix, but in reality to have an impact, the details are a necessity.
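One simple way to turn the answers to the questions above into a priority order is to estimate the time each bottleneck costs per week and rank on that. The sketch below assumes two inputs per bottleneck, how often it occurs and how long each occurrence takes to resolve; the class, field names and scoring are illustrative assumptions, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    name: str
    occurrences_per_week: int       # how frequently the process breaks down here
    minutes_lost_each_time: float   # manual effort needed per occurrence

    @property
    def minutes_lost_per_week(self):
        return self.occurrences_per_week * self.minutes_lost_each_time


def rank_bottlenecks(items):
    """Order bottlenecks by estimated weekly time lost, largest first."""
    return sorted(items, key=lambda b: b.minutes_lost_per_week, reverse=True)
```

A frequent but quick interruption can outrank a rare but slow one, which is why both frequency and effort are worth gathering from the team before deciding where to delve deeper.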
In the next blog I will expand on these activities and provide some tips on questions to ask and methods to use.
Have you ever found yourself saying, ‘there has to be a better way’? We hear you.
When it comes to processing data through your organisation, are you fed up with the way things are done? Have you been forced to follow the status quo because it is just the way things are? Powerless and exhausted, you know there is a better way, but why doesn’t anyone do something about it?
One question we frequently ask is ‘Why would you continue to do things the same way if there’s a better option?’ After all, times change, technologies change, and the job at hand is not getting any easier.
You often hear talk of increasing efficiencies and reducing costs, but this usually masks the real intention – more work with fewer staff. As a result, the tasks pile up with no end in sight.
Fear not, as you are not alone. In fact, we have run into office after office of filing cabinets hiding the disorganisation. Effectively out of sight, out of mind.
However, the impact is real. Lost files, compromised confidential data, the loss of relevant information, and even worse, time-consuming processes and a general sense of frustration.
“WHEN YOU COME TO THE END OF YOUR ROPE, TIE A KNOT AND HANG ON.”
– FRANKLIN D. ROOSEVELT
We don’t simply shift from manual or paper-based systems to electronic systems. We help you to modernise processes by creating a better working environment, more efficient data systems, more effective tracking systems, and increased availability of the information you need to get the job done. It does not need to be like this. Automated solutions can transform the way business is conducted and this is where we come in. Clients rely on Avantix to work alongside them to understand their current problems, long-term goals and the kind of solutions they need to implement to take their business to the next level using automation.
If you are dissatisfied with your existing processes, you have a choice; either continue to do things the old way and get the same results or make your life easier by transforming the way you work.
Our team works with clients of all sizes and in all industries throughout Australia. We understand creating an automated system is meaningless if it does not benefit everyone – today and tomorrow. Moreover, a clearly defined ROI in under 12 months is essential.
Many of the people we have worked with initially thought their cause was hopeless and would fall neatly into the “too-hard basket”. This is a trap! As soon as our clients saw how seamlessly their goals could be met, the hesitation quickly dissipated.
Will you continue to just think there is a better way or are you willing to take action? Once you and those around you become champions for change, anything is possible. The first step is yours. Contact us to find out how others who were in a similar situation to you, have achieved extraordinary results.
Avantix empowers businesses to realise the benefits of digital information. Our core services reduce the time it takes to locate relevant business information, supporting better decision making. Our intelligent information processing and digital solutions transform disparate silos of information into searchable intellectual property that is securely accessible. Contact us at email@example.com for more information.
It has been an amazing time going from a small Queensland business to what is now a company with a global network that makes us greater than the sum of our parts – Avantix. I want to thank everyone who has been on this journey with us. Some of you for years and some newly onboard.
Change in our industry is being driven by our customers’ desire for innovation and process improvement. The need is for information to be readily accessible and fully integrated into core systems. I am proud of the evolution our industry and our organisation have experienced over the last 15 years. Many of our clients are making significant investments in Business Intelligence tools, ERP systems, ECM software and Practice Management packages. Consequently, we have been able to help maximise the return for our customers on these initiatives.
Over the years our clients have asked for increasingly sophisticated solutions to their problems. What started out as scanning onto CDs and removable hard drives became the need to search for and view images using either structured fields in a database or whole of document content searches. Next was investigating ways that images and data can be used to bring further controls, efficiencies and targeted reporting to operations.
This led to cloud based solutions for document types like PODs, Accounts Payable Invoices, Sales Orders, Membership Forms, HR and more. This is where the fun began: automation solutions. Finally we were able to bring further joy to our clients by developing automated workflows, validation and exception handling using business rules that greatly reduce the stress and pain around high volume processing. The peace of mind and time savings gained by our customers are born out of knowing that the previously messy data created from a wide variety of sources (both paper and digital) is automatically cleansed and vetted. This is all done behind the scenes in a way of their choosing and without disruption to their core duties, before the data is ingested into their systems.
Suffice to say, this process taught us a great deal. Being thrown into the world of time and business critical applications that hundreds of people were counting on, in order to perform their duties, really opened us up to what is needed to perform. This feeling of responsibility has helped shape us into what we are today.
The Avantix vision is to bring that collective knowledge to greater effect at solving your information processing problems and challenges. If you are not aware of our enhanced software and service offerings, we are more than happy to share over a coffee and a chat with you.