If you’ve been lurking around Maphmatically Yours for any length of time, you are likely aware of my commitment to digitizing my scholarly library.  The most substantive posts on the subject focused on the workflow that turns paper books into perfect .pdf files, while another focused on how such newly minted .pdf files fit into my academic work.  The sad reality, though, is that I’ve always struggled to make peace with my trusty Epson GT-1500.  The most vexing problem is that the GT-1500 lacks duplex capability, meaning that each side of each page must be fed into the scanner in the appropriate orientation.  Therefore, getting a book scanned necessitates several rounds of paper-shuffling:

1) Divide the book into 40-sheet bundles (signatures)

2) Run the first signature scanning pages 1,3,5, etc.

3) Reorganize the first signature so that pages 2,4,6, etc. are “up”

4) Run the first signature again scanning pages 2,4,6, etc.

5) Flip the pages once more so that the odd pages are “up” and the book can be rebound in its original order.

6) Repeat steps 2 through 5 with the second signature and so on.
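
For the curious, the bookkeeping behind steps 2 through 4 can be sketched in a few lines of Python.  This is only an illustration, not the tool I actually use: the function names and file names are made up, and it assumes the second pass comes out in the same sheet order as the first.

```python
def interleave_passes(odd_files, even_files):
    """Merge the odd-page pass and the even-page pass into reading order.

    Assumes (hypothetically) that both lists are in sheet order, so the
    first even scan is the back of the first odd scan.
    """
    if len(odd_files) != len(even_files):
        # A mismatch usually means the feeder skipped or doubled a sheet.
        raise ValueError(
            f"{len(odd_files)} odd scans but {len(even_files)} even scans"
        )
    merged = []
    for odd, even in zip(odd_files, even_files):
        merged.extend([odd, even])
    return merged


def renaming_plan(odd_files, even_files, start_page=1, ext=".tif"):
    """Pair each scan with a zero-padded page-number name (no files are touched)."""
    merged = interleave_passes(odd_files, even_files)
    return [
        (name, f"{start_page + offset:04d}{ext}")
        for offset, name in enumerate(merged)
    ]


# Hypothetical file names for a two-sheet signature.
odds = ["odd_001.tif", "odd_002.tif"]
evens = ["even_001.tif", "even_002.tif"]
for old, new in renaming_plan(odds, evens):
    print(old, "->", new)
```

The length check is the useful part: if the two passes disagree on the number of scans, something went wrong in the feeder and the signature needs another look before anything is renamed.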

The simple answer is to purchase a duplexing scanner: either one that does mechanically the “flip” that I’ve been doing manually, or one with two scanning bars, one on each side of the page, so that the front (odd page numbers) and back (even page numbers) are scanned simultaneously.  Mechanical duplexing scanners like the GT-2500, the “grown-up” version of my trusty Epson GT-1500, are expensive and slower than most one-pass duplex scanners with two scanning bars.  So, when at last I was forced to admit my need for a faster scanner, I begrudgingly ponied up for the one-pass duplex Epson S50, after my wife helped me to realize that I could no longer afford to spend hours and hours manually flipping pages and babysitting the GT-1500.

So, here is my mini-review of the S50 from the perspective of a person who scans books–and is nit-picky and impatient and stubborn and… well you get the idea.

Speed. The S50 is roughly twice as fast as my old stand-by, the GT-1500.  Scanning at 600 dpi, the minimum I trust for accurate optical character recognition (OCR), and saving TIF files for post-processing, the S50 happily consumes paper at a rate of seven to nine sides per minute.  The smaller the paper, the more sheets the unit can burn through.  For comparison’s sake, the fastest I ever got the GT-1500 to scan was three to four sides per minute, but that doesn’t count all the time spent manually flipping the paper between scan runs, which would likely halve that number.  Advantage S50!
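
To put those rates in perspective, here is a back-of-the-envelope calculation in Python.  The 300-side book is a made-up example, and the 0.5 “handling factor” is just my guess above that page-flipping halves the GT-1500’s effective rate; neither is a measurement.

```python
def scan_minutes(sides, sides_per_minute, handling_factor=1.0):
    """Estimate minutes of scanning for a given number of page sides.

    handling_factor scales the raw rate: 1.0 means no manual handling,
    0.5 means handling cuts the effective rate in half (an assumption).
    """
    return sides / (sides_per_minute * handling_factor)


BOOK_SIDES = 300  # hypothetical book length, in page sides

# S50: seven to nine sides per minute, no flipping required.
s50_slow = scan_minutes(BOOK_SIDES, 7)
s50_fast = scan_minutes(BOOK_SIDES, 9)

# GT-1500: three to four sides per minute, halved by manual flipping.
gt_slow = scan_minutes(BOOK_SIDES, 3, handling_factor=0.5)
gt_fast = scan_minutes(BOOK_SIDES, 4, handling_factor=0.5)

print(f"S50:     {s50_fast:.0f} to {s50_slow:.0f} minutes")
print(f"GT-1500: {gt_fast:.0f} to {gt_slow:.0f} minutes")
```

Under those assumptions the gap is wider than the raw rates suggest, since the GT-1500’s minutes also require someone standing at the scanner.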

Paper handling.  Both the GT-1500 and the S50 struggle with thicker, less-milled paper like that used to print budget paper-backs, which tends to feed multiple pages at once unless carefully prepared by rolling the pages this way and that so they don’t appear to the scanner’s pick-up assembly as one extra-thick piece of paper.  Books with nice, smooth milled paper tend to fare much better, but glossy stock can also misfeed.  In general the sort of paper used in good quality academic paper-backs and hardcovers feeds quite well in both the GT-1500 and the S50.  However, fixing a missed sheet is much easier with the S50.  On the GT-1500, the meshing of even and odd sheets gets off when a sheet is missed, and all the pages (either even or odd) scanned after the missed sheet have to be renamed to allow insertion of the offending page.  Because the S50 scans both sides of a sheet at once, inserting a missing sheet is as simple as adding a page (a) and (b) so that the final list of files reads “…28, 29, 29a, 29b, 30…”  The down side is that it is easier to miss the fact that the S50 lost a sheet, because the page numbers remain sequential, whereas the GT-1500 goes out of sequence just at the point where the sheet was missed, pointing out the problem like a flashing neon sign. Tie!
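
The (a) and (b) trick only works if the files sort in page order afterwards, and a plain alphabetical sort won’t do that once page numbers reach three digits.  Here is a minimal Python sketch of a sort key that handles it, plus a sanity check for the silently lost sheet.  It assumes file names are a page number, an optional single lowercase letter, and an extension; the function names are my own invention.

```python
import re


def page_sort_key(filename):
    """Sort key for names like '29.tif' or '29a.tif' (assumed naming scheme)."""
    stem = filename.rsplit(".", 1)[0]
    match = re.fullmatch(r"(\d+)([a-z]?)", stem)
    if match is None:
        raise ValueError(f"unexpected scan name: {filename!r}")
    number, suffix = match.groups()
    # Compare numerically first, then by the inserted-page letter.
    return (int(number), suffix)


def missing_sides(file_count, sheet_count):
    """How many sides are unaccounted for, given two sides per sheet."""
    return 2 * sheet_count - file_count


files = ["30.tif", "29b.tif", "100.tif", "28.tif", "29a.tif", "29.tif"]
print(sorted(files, key=page_sort_key))

# A 40-sheet signature should yield 80 files; 78 means one sheet was lost.
print(missing_sides(78, 40))
```

Counting files against sheets is the cheap substitute for the GT-1500’s “flashing neon sign”: it won’t say which sheet went missing, only that one did.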

Software.  Neither of the Epson software packages is particularly stunning.  Both the S50 and the GT-1500 interfaces look very “Windows 95” with windows that can’t be resized, options that disappear and reappear when the scanner is in certain modes, and a generally ugly and cluttered aesthetic.  The GT-1500’s “Epson Scan” software is actually considerably more powerful when in “expert” mode, with the ability to adjust the offset of the scan from side to side and more options for controlling contrast and brightness, including curves, black and white levels, and even a live histogram.  In contrast, the S50 is clearly aimed at the office user more concerned with mowing down a stack of photocopied TPS reports than creating perfect digital versions of academic texts: it sports a painfully simplified set of contrast controls and no offset controls whatsoever.  Even the paper size selection is limited to office standard sizes until the user plugs in a custom paper size for each new project.  Where the GT-1500 has three levels of image sharpening and five levels of de-screening, the S50 has a single check box each for “unsharp mask” and “descreen.”  It is clear that Epson intends the S50 to be operated by folks not savvy in the ways of image enhancement.  Advantage GT-1500!

(NOTE: One unexpected benefit of the Epson Scan software is that the two software sets nest with each other, so that when I click the Epson Scan icon from my desktop I’m asked if I’d prefer to use the S50 or the GT-1500.)

Scan Quality.  The learning curve for the creation of perfect scans on the GT-1500 was months and months long, in part because I demand nothing less than perfection and in part because the manual is less than super-detailed.  In contrast, learning the simplified-for-simpletons S50 was an exercise in wondering “why did they take that out?” and hunting for the necessary work-around.  In the end both scanners are capable of producing ultra-high contrast, near-perfect TIFs for later post-processing by ABBYY FineReader OCR.  However, neither is quite as perfect as a flat-bed scanner.  With the GT-1500, sheets are pulled from the horizontal hopper past the scan bar, allowing individual sheets to twist or pull and creating a scan that has a bit of “wow and flutter” from top to bottom.  The S50 pulls sheets from its larger vertical hopper past the twin scan bars, creating the possibility that individual sheets aren’t perfectly flat when they pass the scan bars and leaving slight “hills and dales” from right to left.  Overall, these scan artifacts aren’t terribly objectionable, but they constitute the price one must pay to avoid individually setting the book on the glass for each and every scan.  However, there is a much larger fly in the ointment: on the S50 the two scan bars are directly across from one another, meaning that if the paper is narrower or shorter than the scan size, you will get black bars at the right and left or bottom of each scan that have to be removed in post-processing.  In contrast, the white plastic strip that sits opposite the single scan bar of the GT-1500 ensures that the scanner reads clear white from left to right and top to bottom regardless of any disparity between scan size and paper size.  Epson could easily address this by off-setting the two scan heads of the S50, but again the target market for that machine isn’t concerned with anything beyond capturing readable text from disposable documents.  Advantage GT-1500!

So, all in all the S50 is not without its short-comings for my book-scanning purposes.  Less demanding users will likely find these issues negligible in light of the speed advantage that the S50 has over the GT-1500, but those with plenty of time and a discerning eye will prefer the output of the GT-1500.  My wife actually forced me to buy the S50… no, I’m dead serious.  Her concern was that my book-scanning time was compromising my book-reading time, and she was likely correct.  With the S50, I can turn the average book into a collection of TIFs in about an hour (while working on something else), and another hour of post-processing cleans up the covers, erases black stripes, and completes OCR processing.  The same amount of work would likely run at least four hours with the GT-1500, and wouldn’t allow me to work on other projects while the scanner is happily chugging away.  So, is the S50 perfect? No, but at twice the street price of the GT-1500, it does get the job done in half the time, or less if you’re not looking for a perfect digital clone.


  1. Robert Minto says:

    I may have asked you this before, and you probably wrote the answer in an earlier post, but since I’m lazy and since it’s always nice to get a comment: when you scan in a book, do you usually/always get text recognition so that you can search the document later? If so, does text recognition require greater scanning precision than otherwise?

  2. maphman says:

    Hey Robert,

    Yes, I do run every document through OCR processing. What I get out the other end is a scan (a photo) of the page on the top layer and an underlying layer that is the OCR software’s best guess at what the text in the top layer is. Most of the time, probably ninety-five percent of the text layer corresponds to the photo layer, but certain characters (logic symbols, Greek or Hebrew letters, or less commonly used characters like “&,*,%,#”) are misinterpreted in the text layer. So long as you aren’t doing a search for one of these characters or copying and pasting text from the book to another document, you’ll never know that the software made a mistake.

    Yes, OCR runs best when contrast is very high, with completely blown-out white pages and black-as-night text, scanned at 600 dpi or so. While this makes for larger file sizes, it allows for much more accurate OCR processing. I do run the pdf files that I get for classes through OCR as well, but as these images are scanned at 150 dpi or less, they have a much higher error rate when run through ABBYY FineReader.

    My workflow scans 600 dpi images for OCR processing and then takes the .pdf that ABBYY spits out and resamples it at 300 dpi with JPEG compression for the images, trying to get the best of both worlds.

