Converting books to digital copies

Not being able to read printed books causes quite a few problems when they are not available digitally.  Publishers are required to provide an accessible copy of a book to the visually impaired; however, this rarely happens in an acceptable timeframe.  I have been waiting for one university book for 2 months now, so if this were term time I would have been two thirds of the way through the term and falling way behind.  So how do I solve this problem?

I will give a brief outline here, then in subsequent posts break each step down in further detail.

It all starts with buying the book you want to scan and going to a friendly print shop, where I get the spine of the book removed.  Some print shops can get a little funny about this, as they appear to care about the publisher's copyright.  All you need to do is point out that copyright law allows a visually impaired person to do whatever they need to a book in order to make it accessible to them.  The RNIB even have a section on their website dedicated to this.

Once the spine is removed, I scan the book using a Canon DR2010M document-fed scanner.  I use this scanner because it lets me stack around 70 pages at a time, so I can scan a large book quickly.

Once I have all the pages scanned, I do a little post-processing.  This is because there will be small misalignments and a few pages will be skewed.  It is also a great chance to run a few extra tools that make the text stand out by increasing its contrast, which makes it far easier for the OCR software to convert.
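To give a feel for what "increasing the contrast" means, here is a minimal Python sketch of a linear contrast stretch on one row of grayscale pixel values.  It is only an illustration, not the tool used in this workflow: the function name `stretch_contrast` and the sample values are made up for the example, and a real tool works on whole images and usually does something more sophisticated.

```python
def stretch_contrast(pixels, low=0, high=255):
    """Linearly rescale grayscale values so the darkest pixel maps to
    `low` and the lightest maps to `high` (hypothetical helper)."""
    darkest, lightest = min(pixels), max(pixels)
    if darkest == lightest:
        # A completely flat row has no contrast to stretch.
        return list(pixels)
    span = lightest - darkest
    return [round(low + (p - darkest) * (high - low) / span) for p in pixels]


# A washed-out scan: grey text (90) on a light grey page (180).
row = [90, 120, 150, 180]
print(stretch_contrast(row))  # [0, 85, 170, 255]
```

After stretching, the text is pure black and the page pure white, which is the kind of separation OCR software finds easier to work with.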

Using ABBYY FineReader Express I convert the books to multiple formats: PDF, RTF and HTML.  Each format has its own use.  The PDF keeps the high resolution information and is easy to search.  The RTF is great for converting to other things such as ePub and audiobooks.  The HTML is so portable that it will work on anything, and it has great screen reader support.

The RTF is my intermediate format.  Using Calibre I am able to convert the RTF book to an ePub, which works in iBooks and enables VoiceOver for the book too, so it really is great for the visually impaired.  The RTF is also the in-between format for converting to audio.  This particular step is OS X specific, as I use the “convert to spoken track” feature of OS X to convert each chapter of a book to audio.  I can then convert this into an audiobook through iTunes, and it will work either in iTunes itself or through Audible.
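For the tech savvy, both conversions in this step can also be driven from a script.  The sketch below is an alternative to clicking through the applications, not the method described above: it builds command lines for Calibre's `ebook-convert` tool and for the OS X `say` command, which I am swapping in for the “convert to spoken track” feature.  It assumes each chapter has already been saved as a plain text file, since `say` reads text rather than RTF.  The file names and helper function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def epub_command(rtf_path):
    """Command line for Calibre's ebook-convert: RTF in, ePub out."""
    rtf = Path(rtf_path)
    return ["ebook-convert", str(rtf), str(rtf.with_suffix(".epub"))]


def speech_command(text_path):
    """Command line for the OS X `say` tool: read a text file aloud
    into an AIFF audio file instead of the speakers."""
    txt = Path(text_path)
    return ["say", "-f", str(txt), "-o", str(txt.with_suffix(".aiff"))]


def run_if_available(command):
    """Run the command only if its program is installed on this machine."""
    if shutil.which(command[0]) is None:
        print("skipping, not installed:", command[0])
        return False
    subprocess.run(command, check=True)
    return True


if __name__ == "__main__":
    # Only print the commands here; pass them to run_if_available()
    # once the input files actually exist.
    print(epub_command("book.rtf"))
    print(speech_command("chapter01.txt"))
```

Keeping the command building separate from the running makes it easy to check what will be executed before letting it loose on a whole book.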

This whole process can be very time consuming, but it doesn’t need much human interaction.  It is simply a matter of configuring a bunch of applications to do their thing, and the result is a paperback or hardback book converted into multiple accessible formats.

As noted earlier, I will break down each step into a little HOWTO, but for the tech savvy this should be enough to get you on your way!
