Google’s Book Search: Disappointing for eReaders

Not so fast, Huck, you're not in this post

Google Book Search recently reached a “groundbreaking agreement” which is great for searching the text of books, not so great for reading those books, and abysmal for reading anything on an ereader.

Basically the idea is that you can search the text of all kinds of books, but the only parts you see are the pages directly around your search hits. Then the idea, I guess, is that you buy the books you want. Unfortunately, there’s no option to buy the ebook versions of those books.

Additionally, the downloadable PDFs of out-of-copyright books are enormous and unwieldy. I downloaded The Adventures of Tom Sawyer first: a behemoth 6 MB PDF was my prize (most ebooks in my library, even PDFs, are around half a megabyte). It’s so large that the fast (for ereaders) 800 MHz processor of my Sony PRS-700 had trouble with even the most basic of operations. Like, for instance, opening it.

The other problem with Book Search’s PDFs is that they have no reflow. If you’ve used a Sony Reader, you know reflow as the feature that lets you change the size of the text and still read the PDF as pages, instead of scrolling up and down an image file and side to side to read each line. It makes lines wrap around, and it makes words know they’re words and act like words. (Reflow also works with images. Here’s a decent page about it, with a few images that demonstrate a reflowing PDF.)
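If it helps to see the idea in code, here’s a toy sketch of what reflow does with plain text. This is not how the Sony Reader or any PDF viewer actually implements it; the `reflow` function and the characters-per-line numbers are made up for illustration, and it uses Python’s standard `textwrap` module to do the wrapping.

```python
import textwrap

def reflow(text: str, chars_per_line: int) -> list[str]:
    """Re-wrap text to a new line width, breaking only between words.

    Hypothetical helper for illustration: a bigger font means fewer
    characters fit on a line, so the same words spread over more lines.
    """
    # Throw away the original line breaks; reflow ignores the old layout.
    unwrapped = " ".join(text.split())
    return textwrap.wrap(
        unwrapped,
        width=chars_per_line,
        break_long_words=False,   # never split a word in the middle
        break_on_hyphens=False,   # keep hyphenated words together
    )

page = ("Tom appeared on the sidewalk with a bucket of whitewash "
        "and a long-handled brush.")

small_font = reflow(page, 40)  # small text: about 40 characters per line
large_font = reflow(page, 20)  # large text: about 20 characters per line

for line in large_font:
    print(line)
```

The words are the same at both sizes; only the line breaks move. A scanned page image can’t do this, because the reader has no words to rearrange, only pixels.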

Without reflow in a PDF, you’re looking at a picture with a text reader, which is the problem with Google’s downloadable books. You have to zoom in to even be able to make out the text, and even then, the text is tough to read. It looks and acts like Google had a few million books scanned and then ran the images through a text identifier, because the images of the words are searchable.

That’s fine, I guess. It’s useful to have one search engine to search inside all books. Isn’t Amazon doing a thing kind of like that? Maybe they should get together. What confuses me, though, is why you’d go to all the trouble of scanning all those books and reading them with text identifiers, and then not convert them into readable files. Why not take that one extra step? Or, if other people like Gutenberg are doing this, why not use their files for the search?

For now, I’ll be doing what I was doing anyway: using ManyBooks.net’s amazing database, which has thousands of out-of-copyright books (most or all from Gutenberg) and makes them downloadable in dozens of formats, including those compatible with all major ereaders.
