Hooray! EpubCheck will make sure a book meets the IDPF standard
Jeez. Don’t get me started on Web standards—and the battle between, say, Firefox-readable pages and those optimized for Internet Explorer.
How to avoid the same mess in e-books? Now that the IDPF has given us the .epub standard, shouldn’t we be able to make sure a file is the real McCoy, and I don’t just mean the Adobe variety?
Actually we can, more or less. Hats off to Adobe’s Peter Sorotokin, ETI founder and IDPF rep Garth Conboy, XML guy and access expert Markus Gylling, and recent Adobe intern Piotr Kula, the people behind EpubCheck, an open source app.
‘Fairly mature and extremely handy’
“It is not complete (there are still many checks that we can do), but it is already fairly mature and extremely handy,” writes Peter in his Adobe Digital Editions blog.
“If you author epub files, you should consider running this tool on your content regularly. Standard content is much less likely to have problems in today and future eBook readers and any problems with fully-compliant eBooks are much more likely to get serious attention of the developers.
Help ‘em out
“If you are a developer, I would like to invite you both to use EpubChecker code in your development (it is licensed under BSD terms) and to contribute back to the project.”
More details: “EpubCheck is a command-line tool to validate IDPF Epub files. It can detect many types of errors in Epub. OCF container structure, OPF and OPS mark-up, and internal reference consistency are checked.”
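For the curious, here is a rough idea of what the simplest of those checks, the OCF container structure, involves. This is my own minimal Python sketch, not EpubCheck’s code: the function name `check_ocf_container` is made up for illustration, and it covers only three basics (the first ZIP entry is an uncompressed `mimetype` file, that file contains `application/epub+zip`, and `META-INF/container.xml` exists). The real tool checks far more.

```python
import zipfile

# The exact bytes the OCF container expects in the 'mimetype' entry.
EPUB_MIMETYPE = b"application/epub+zip"


def check_ocf_container(path):
    """Return a list of problems found in the container (empty if none).

    Hypothetical helper for illustration; not part of EpubCheck.
    """
    try:
        archive = zipfile.ZipFile(path)
    except zipfile.BadZipFile:
        return ["not a ZIP archive"]

    problems = []
    with archive:
        entries = archive.infolist()
        # The 'mimetype' file must be the very first entry in the archive.
        if not entries or entries[0].filename != "mimetype":
            problems.append("first entry is not 'mimetype'")
        else:
            first = entries[0]
            # It must be stored without compression...
            if first.compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' entry is compressed")
            # ...and hold exactly the EPUB media type.
            if archive.read(first) != EPUB_MIMETYPE:
                problems.append("'mimetype' content is not application/epub+zip")
        # The container must point to the package file via container.xml.
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```

Calling `check_ocf_container("book.epub")` (the file name is just a placeholder) returns an empty list when those three basics are in order, and a list of plain-English complaints otherwise.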
OK, gang, try it out and let us know what you think. What constructive suggestions do you have for Peter and the others? Hey, Hadrien, since Feedbooks has been an .epub pioneer, what do you think? Do your files pass the test entirely? Keep us posted.
Reminder to the obtuse: “The fact that the .epub-compatible Digital Editions can read Adobe’s PDF format does not make PDF part of the IDPF standard.” Let’s save our paranoia for matters worthy of it.
And now the logo issue: If it’s possible to guarantee .epub purity, shouldn’t a logo be next? First, one for non-DRMed files, and then a second version, in a different color, for all IDPF-standard files, provided the core format and DRM are both compliant.
Speaking of which: I wonder what joint efforts might be possible to get interoperable DRM right, if it can be. While I’m anti-DRM, I’m pro-book and believe that if publishers insist on “protection,” there at least needs to be a standard. Interoperable DRM is on the IDPF’s agenda.
(Via MobileRead.)

December 18th, 2007 at 10:15 am
“Digital Editions can read PDF is NOT the IDPF standard.”
I’m having trouble parsing that sentence. I think there may be one or more words left out, but I’m not sure what they are. Could you take a closer look at it and fill me in on what I’m missing?
December 18th, 2007 at 10:21 am
Hi, Cat: Try this: “The fact that the .epub-compatible Digital Editions can read Adobe’s PDF format does not make PDF part of the IDPF standard.” My bad. Fixed. Big thanks for the catch. David
December 18th, 2007 at 7:15 pm
This is a critical tool.
Adobe’s versions of “Alice in Wonderland” and “The Adventures of Sherlock Holmes” are both full of errors.
Even “EpubGuide-hxa7241.epub” has the wrong mimetype and missing attributes in the toc.ncx file.
It will help, at the very least, software developers to better write reader software, and it will be invaluable to publishers and editors.
Could I make an open request to anyone more HTML-savvy than I am: can the jar file have an HTML front end to make it a bit more user-friendly (with a file selector and a text file output)?
December 26th, 2007 at 9:11 am
I think there are incorrectitudes in the checker rather than my EpubGuide-hxa7241.epub in this case. I will look at the checker code and maybe submit fixes…