When not doing my real job, I pursue the Great Purge of 2008. I appear to be behind schedule, so it's like most big IT projects. In my defense, proper purging is predicated on planning, precision, premeditated patterns, and pre-medicated posturing, especially when it applies to paper. So here's a little story:
Once upon a time, there was this woman who started a construction business with her husband in the late 1940s. Even today the little construction-company-that-could is still in business. She kept meticulous records. Every receipt, invoice, check stub, statement, IOU, and tissue she filed away. As time passed, boxes multiplied like bunnies. One day, the volume of paperwork was so incomprehensibly insurmountable, her family intervened and non-hilarity ensued. By "one day" I mean last year. By "woman" I mean my grandmother. I can only shake my fist at the sky and yell "why do you mock me, bloodline?!" for I too have been stockpiling my own paperwork. The end.
Here I sit scrawling a shockingly long article about my own hoard of paper. A couple of notes before I begin: first, not a single company has sponsored this, so all opinions are my own. Second, the criteria I use for what is acceptable for my life may not work for everybody, so before you shred those Confederate land deeds you found in your attic, make sure you know what you're doing. I'm just trying to share helpful advice.
Starting At the End
As the title suggests, I formed a plan around future-proofing. Future-proofing means making sure that whatever a process produces stays easily accessible in the future. For example, books printed on high-quality paper will last longer than books printed on toilet paper. Feel free to experiment. Admittedly, this is an over-simplified definition, but I'm doing home and home-office documents, not files from Area 51.
Naturally when we talk about paperless, we mean going digital as in paper to computer. If you assumed something else, please let me know how you are reading this blog; I'm intrigued.
Before I jump-start my purge, I have needs. And I have requirements too:
- I need to comply with tax retention laws.
- I want to store documents in such a way that I can open them in the future without going through hell.
- I want to be able to search my documents.
- I will need redundancy to protect all my hard work.
Simple, right? Here is how I addressed each of these requirements.
Retention Laws
You can do a Google search and find tons of opinions on retention. My advice is to just call your accountant to ensure you comply with both federal and state requirements. Most experts say you need seven years of personal tax information and ten years of business tax information, but definitely ask your accountant. Just because it's on a random website doesn't mean it applies to your situation or, for that matter, that it's true. Got it?
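If you want to pre-sort boxes before that phone call, the rule of thumb above boils down to simple date arithmetic. Here is a minimal Python sketch. The seven- and ten-year figures are just the numbers quoted above, and the names `RETENTION_YEARS` and `can_shred` are made up for illustration. It also assumes the clock starts at the tax year itself, which may not match how your accountant counts.

```python
from datetime import date

# Assumed retention periods in years, copied from the rule of thumb above.
# These are placeholders: whatever your accountant says wins.
RETENTION_YEARS = {"personal": 7, "business": 10}


def earliest_year_to_keep(kind: str, today: date) -> int:
    """Return the oldest tax year still inside the assumed retention window."""
    return today.year - RETENTION_YEARS[kind]


def can_shred(doc_year: int, kind: str, today: date) -> bool:
    """True when a document's tax year is older than the assumed window."""
    return doc_year < earliest_year_to_keep(kind, today)
```

For example, `can_shred(2000, "personal", date(2008, 6, 1))` says yes, while the same year for `"business"` says not yet. Treat a "yes" as "ask about this box", not "fire up the shredder".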
Document Formats
I know I'm going to scan documents to be saved digitally. I learned my lesson about document formats while working in IT many years ago. Remember Wang word processors? Exactly. For a while there was a niche specialty of doing Wang-to-anything-but-Wang file conversions. In addition, I personally have a bunch of documents originally formatted in WordStar that can no longer be read with the original formatting.
This is the first gotcha when it comes to a paperless conversion: not all scanning software saves documents into a future-proof file format or even into a file. It's true. A lot of the document management software that ships with document scanners or is sold independently saves files into a proprietary database. This is obviously not future-proof, and it's pretty annoying to be permanently chained to proprietary software.
Enter PDF/A. Go to the PDF/A website and read about how it's an archiving standard. Your best bet for storing files in a form you can still open in the future is the PDF/A format. Because it's a recognized ISO standard, and large companies and governments are actually using it, your chances of accessing or converting your files in the future are much greater than if you go with bad software.
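How do you tell whether your scanner software really wrote PDF/A? A file that claims PDF/A conformance says so in its embedded XMP metadata, using a `pdfaid:part` entry. The Python sketch below just looks for that marker in the raw bytes. The function name `claimed_pdfa_part` is my own invention, and this is only a quick sniff test: it tells you what the file claims, not whether the claim is true, and it can miss metadata written in an encoding other than plain UTF-8. A proper validator is the real answer.

```python
import re
from typing import Optional

# XMP can record the PDF/A part as an attribute (pdfaid:part="1")
# or as an element (<pdfaid:part>1</pdfaid:part>). Match either form.
_PDFA_PART = re.compile(rb'''pdfaid:part(?:\s*=\s*["']|>)\s*(\d)''')


def claimed_pdfa_part(data: bytes) -> Optional[int]:
    """Return the PDF/A part number a file claims, or None if no claim is found.

    This only searches the raw bytes for the identification marker.
    It does not validate the file against the standard.
    """
    if not data.startswith(b"%PDF-"):
        return None  # not a PDF at all
    match = _PDFA_PART.search(data)
    return int(match.group(1)) if match else None
```

You would call it with something like `claimed_pdfa_part(open("scan.pdf", "rb").read())`, where `scan.pdf` is a stand-in for one of your own files. A result of `None` is a hint that the software is saving ordinary PDF, or something stranger.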
Searching Documents
PDF/A files can carry real text, and that text is searchable. However, when you scan documents, each page is saved as an image, which by itself can't be searched. So your document management software should ideally allow you to store metadata about each file as you scan. For example, you would type keywords like "bank statement" and "prison record" to be stored within the files. In addition, good document management software will also do a quick OCR pass on your document and save the recognized text within the PDF file.
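To make the keyword-plus-OCR idea concrete, here is a toy Python sketch of what the search side might look like. It keeps keywords and OCR text in memory next to each file name. The `ScannedDoc` class and `search` function are hypothetical, not taken from any real document management product, which would store this inside the PDF or in its own index.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ScannedDoc:
    """One scanned file plus the searchable bits attached to it."""
    filename: str
    keywords: Set[str] = field(default_factory=set)  # typed in at scan time
    ocr_text: str = ""                               # text recovered by OCR


def search(docs: List[ScannedDoc], query: str) -> List[str]:
    """Return filenames whose keywords or OCR text contain the query.

    Matching is a case-insensitive substring test, nothing fancier.
    """
    q = query.lower()
    return [
        d.filename
        for d in docs
        if any(q in k.lower() for k in d.keywords) or q in d.ocr_text.lower()
    ]
```

The point of keeping both fields is that OCR misreads things. A keyword you typed yourself still finds the file when the OCR text turned "statement" into "staternent".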
Redundant Storage
My final requirement is to have enough file storage. Because I know I'm going to scan a bazillion documents and then save them, I need a lot of hard drive space and an equal amount of backup space. This takes planning because I'm not made of money. Things would be better if some publisher would just accept my open book proposal, but wishes aren't ponies or something like that.
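A backup only counts if it matches the original. One simple way to check, sketched below in Python, is to compare SHA-256 checksums of every file in the primary folder against the backup copy. The function names are mine, and the folder layout (a backup that mirrors the primary tree file for file) is an assumption. Real backup software may store things differently.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): checksum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def missing_or_different(primary: Path, backup: Path) -> List[str]:
    """List files in primary that are absent from, or differ in, backup."""
    src = snapshot(primary)
    dst = snapshot(backup)
    return sorted(name for name, digest in src.items() if dst.get(name) != digest)
```

An empty list from `missing_or_different` means every scanned file has an identical twin in the backup. Anything it does list deserves a look before the paper originals meet the shredder.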
The Plan
So here is my plan for getting this all done:
- Obtain a document scanner: to be addressed in part 2 of this series.
- Buy beer: this helps lubricate the robotics that will be actually feeding paper into the scanner.
- Obtain document management software: to be addressed in part 3 of this series.
- Obtain file storage and backup: to be addressed in part 4 of this series.
- Actually do it: to be addressed in part 5 of this series.
Who knows, I might get all of this done.




Hi Mk,
I really enjoyed your article. I think you could provide some great insight into document management for the Office Live community on Facebook. Please, I hope you'll share your expertise and experience with the community here: http://www.facebook.com/officelive.
Best of luck on parts 2-5!
KIM
Microsoft Office Live Outreach
Kim: Thanks. I will post my articles as I crank them out. It's a big project.