Ever since I bought the Kindle a year ago, I’ve been a big fan of e-books, and this autumn I’ve spent quite some time learning to make them. One thing has started to annoy me about many commercial ebooks: far too many of them are littered with all kinds of encoding errors, such as extra line breaks, missing spaces between words, and indents turning into empty lines. It seems like this problem has been getting worse, not better, during the last year, and now I’ve run into the absolute nadir: the Kindle edition of “Snuff” by Terry Pratchett.
To put it mildly, it looks like the whole first page of the book is fucked up. First you see some code, then a rather strange heading called BEGIN READING, which might or might not be from the original book. In the middle of the page the text turns into indented italics, but it reads like the whole page should have been set that way, as a quote from the same fictional book.
Okay, big fucking deal, they messed up the first page. If only that were all. There are stretches in the book where they have managed to mess up every damn page with quite disturbing errors. First of all, every damn drop cap floats over the chapter in a way that looks decidedly silly, but the most annoying thing is the incredible number of spaces missing between words, which seems to be connected to the word “people”. This is really jarring when reading: it pulls you out of the story and leaves you annoyed at the text.
I’m slagging the ebook version of Snuff here because it’s absolutely the worst example of this I’ve seen, but don’t take that to mean the problem is somehow rare. At least half of the ebooks I’ve read have had crap that would never have made it into the paper version, most notably extra line breaks that cut up the chapters this way and that. I actually contacted one publisher about this and pointed out some of the errors, and they were pretty friendly and grateful for it, gifting me a free book for my trouble. I’m not expecting any kind of reply from HarperCollins, though.
I spent some time researching the ebook formats, and it took me more or less one working day to do my first conversion, the Musta antologia by Uusrahvaanomainen spekulatiivinen fiktio. I did it by hand at the code level, and I’m willing to bet that it has fewer typographical errors than 90% of the commercial ebooks out there.
Okay, okay, I do understand that a large publisher has a pretty high volume of books coming through and has to rely on automation pretty heavily, but for fuck’s sake, ebooks should nevertheless get the same level of quality control as the print books. As a computational linguist by education, I can’t see automated detection of extra line breaks being such a big problem. Or catching crap like missing spaces.
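To show what I mean, here is a rough Python sketch of the kind of check I have in mind. It is only an illustration, not anything a publisher actually runs: the function names, the list of "terminal" characters and the tiny vocabulary are all my own assumptions. The first function flags a paragraph break as suspicious when the paragraph ends without closing punctuation and the next one starts with a lowercase letter. The second flags a token that is not a known word but splits cleanly into two known words.

```python
import re

# Characters that can plausibly end a paragraph (an assumption; tune per book).
TERMINAL = ".!?:;\"'”’)…"


def find_suspect_breaks(paragraphs):
    """Return each index i where the break after paragraphs[i] looks accidental.

    Heuristic: paragraph i does not end in terminal punctuation and
    paragraph i + 1 starts with a lowercase letter.
    """
    suspects = []
    for i in range(len(paragraphs) - 1):
        current = paragraphs[i].rstrip()
        following = paragraphs[i + 1].lstrip()
        if not current or not following:
            continue  # skip empty paragraphs
        if current[-1] not in TERMINAL and following[0].islower():
            suspects.append(i)
    return suspects


def find_fused_words(text, vocabulary):
    """Return (token, left, right) for tokens that look like two words run together.

    `vocabulary` is a set of lowercase known words (hypothetical; a real
    check would load a proper word list for the book's language).
    """
    hits = []
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()
        if word in vocabulary:
            continue
        # Try every split point and keep the first one that yields two known words.
        for cut in range(1, len(word)):
            left, right = word[:cut], word[cut:]
            if left in vocabulary and right in vocabulary:
                hits.append((token, left, right))
                break
    return hits


if __name__ == "__main__":
    paragraphs = ["He looked at the", "people in the room.", "Then he left."]
    print(find_suspect_breaks(paragraphs))

    vocabulary = {"he", "saw", "the", "people", "leave"}
    print(find_fused_words("He saw thepeople leave", vocabulary))
```

Something this crude will throw false positives (dialogue, poetry, names that happen to split into two words), so it would only produce a list for a human proofreader to go through. Still, it would catch exactly the sort of thing that makes Snuff so painful to read.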
So, really, all you publishers out there, put some more effort into making the ebook versions of your books. Signed, your customer.
Meanwhile, if someone wants to commission me to make ebook conversions, drop me a line in the comments.