Universal formats vs universal readers
By Steve Jordan
The e-book market of 2009 has had one overriding concern throughout the industry: Can customers read this book? The issue isn’t one of literacy, availability or accessibility… it is one of format. Specifically, a question of the many, many e-book formats competing for dominance in the industry.
When e-books first appeared, it seemed there was almost a format for every e-book. Individuals created their own idea of the ideal e-book format, and custom-crafted readers to translate those formats. New devices, capable of reading e-books, soon had new e-book reading applications designed for them, and new formats optimized for those new devices. After about twenty years, many formats have fallen by the wayside, while certain formats have become overridingly popular in particular regions, or with particular subjects and genres. But the present result is almost a dozen commonly-used e-book formats, none of which can claim real dominance over the others.
As the world of dedicated e-book readers has developed, hardware makers have generally chosen an e-book format to support early on, and optimized their device for that format. A few of them read multiple formats, but until recently, that was the exception, not the rule. Also, until recently, the most popular readers read only one of the more popular formats, and a few of the lesser-known formats. For instance, Amazon’s Kindle e-book reader reads versions of the Mobipocket PRC or Mobi format, and the Sony Reader Digital Book reads the LRF format… but neither of these popular devices reads the other’s format.
This has led to a schism in the industry, pitting consumers’ desire for a particular device against the availability of e-books in the particular format read by that device. Potential readers are forced to choose one e-book market or the other, and often have to forgo certain books that are only available in the other market. From the consumer’s point of view, this fracturing of the industry has only contributed to the plodding growth of the e-book market.
Many in the industry have decided that the way to solve this problem is to adopt a universal e-book format that everyone will use. Presently, the ePub format created by the IDPF is the odds-on favorite to become the de facto standard format for all users. It is argued that every reader should be able to read ePub files, making all e-books available for every device.
This sounds laudable, but it has one problem: there are already thousands of e-books out there, in different functional formats. It would be a lot of work to go back, collect all of those existing e-books, and convert them to another format, and not every interested party will have the interest, or the resources, to do that. In the real world, we would be left with a vast number of unconverted books that would not be readable on these ePub-optimized devices.
We can look to another, similar industry for inspiration. The home computer industry got off to a slow start, mainly because of a lack of standardization among hardware, operating systems and file formats. But when computers began to standardize with popular operating systems and common programs, the industry finally began to take off and thrive. This is exactly what the e-book industry needs to thrive, as well: Standardization. But as we already have a large legacy of existing e-books, and a hardware industry that is still in flux, the logical solution is to provide standardization in the still-developing hardware.
That’s why a better solution is to include conversion engines for every possible format on every e-book reading device. A device that is capable of reading a dozen formats is far more useful than a device that can only read one or two formats, and it provides continued access to those e-books that will never see conversion. A consumer with such a device can buy e-books from any market, knowing that the device is sure to be able to read them.
Presently e-book device sellers are too concerned with trying to lock customers into specific formats, and in so doing, prevent the homogenization that the industry needs to move forward… this is the wrong way to go. Offering devices that read many formats, reformat them for optimized display, and offer other features such as attractive designs or intuitive, efficient controls, will be more likely to attract customers and make sales. This would also be the best way to encourage real design innovation by manufacturers, beyond today’s simple manipulation of package coloring and button bezels.
There is presently a large infrastructure of professional and amateur programmers capable of designing format conversion algorithms for various e-book formats. Different conversion engines may reformat text differently, and consumers would presumably have a choice of which conversion engine they prefer: The Adobe engine does a better PDF format on device X, but device Y also has the Powell Mobipocket engine, and I like the look of Powell-formatted e-books. Some of these engines might be licensed to specific devices, or perhaps other engines could be loaded by the consumer onto the device of choice.
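As a rough illustration of the idea, here is a minimal Python sketch of a reader that holds several engines per format and lets the consumer pick a favorite. Everything in it is hypothetical: the EngineRegistry class, the engine names, and the notion that an engine is simply a function from file bytes to display text are assumptions made for the example, not a description of any real device’s software.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumption: an "engine" is just a function that turns raw file bytes
# into displayable text. Real rendering engines are far more involved.
Engine = Callable[[bytes], str]


class EngineRegistry:
    """Hypothetical on-device table of conversion engines, keyed by format."""

    def __init__(self) -> None:
        self._engines: Dict[str, Dict[str, Engine]] = {}
        self._preferred: Dict[str, str] = {}

    def register(self, extension: str, name: str, engine: Engine) -> None:
        ext = extension.lower()
        self._engines.setdefault(ext, {})[name] = engine
        # The first engine installed for a format becomes its default.
        self._preferred.setdefault(ext, name)

    def prefer(self, extension: str, name: str) -> None:
        ext = extension.lower()
        if name not in self._engines.get(ext, {}):
            raise KeyError(f"no engine named {name!r} for .{ext} files")
        self._preferred[ext] = name

    def open_book(self, filename: str, data: bytes) -> str:
        # Pick the engine from the file extension, with no consumer effort.
        ext = Path(filename).suffix.lower().lstrip(".")
        if ext not in self._engines:
            raise ValueError(f"no engine installed for .{ext} files")
        return self._engines[ext][self._preferred[ext]](data)


# Two toy engines for the same format, standing in for competing vendors.
registry = EngineRegistry()
registry.register("txt", "plain", lambda data: data.decode("utf-8"))
registry.register("txt", "shouty", lambda data: data.decode("utf-8").upper())

print(registry.open_book("novel.txt", b"Call me Ishmael."))  # Call me Ishmael.
registry.prefer("txt", "shouty")
print(registry.open_book("novel.txt", b"Call me Ishmael."))  # CALL ME ISHMAEL.
```

The point of the sketch is only the shape: the file’s format selects the engine, and the consumer’s preference breaks ties when more than one engine is installed for that format.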
However individual engines were distributed, the end result would be consumers who could choose their reading device for its looks, cost, features, etc., and be sure that they could read any e-book out there, regardless of its format. The fracturing of the industry would be erased in one motion, and the potential for e-books to spread would be more easily realized.

September 3rd, 2009 at 5:04 am
More accurately, supporting Calibre as the current best-of-breed conversion tool seems to be the way to go. That way any publisher can just say “we supply a format that Calibre likes” and any reader can say “we can read a format that Calibre can output”. Pretty much that is all that’s required.
I’ve long since stopped worrying about whether any device I might buy can read the formats I buy in. Instead I only buy formats that I can convert to something open and re-convertible. Mostly because the device I have now is not the device I will have in ten years’ time. So buying a stupid format or renting DRM’d files might be OK today but is just going to bite me later on.
September 3rd, 2009 at 7:44 am
With all due respect, that’s not what I’m talking about, Moz. You’re thinking of a desktop manual conversion tool that the consumer uses, then ports the result into the reader. Calibre, in addition, won’t open or convert everything… it’s best with only a few formats.
I’m talking about a reading device that already has multiple conversion tools on-board, and does the conversion itself, automatically, without consumer effort. A seamless transition from whatever file you have, in up to a dozen or more formats, to words on your screen. That’s what consumers want, and they will respond to the first device that gives them that with overwhelming support (unless, of course, it is hideous and expensive… in which case, they’ll respond to the first device that converts all files, is pretty and/or inexpensive).
September 3rd, 2009 at 9:14 am
Steve, I think what you really want to say is that the devices should have multiple Rendering (not conversion) engines on them. Or do you really want all formats treated the same at the user interaction level?
September 3rd, 2009 at 9:38 am
I don’t see why it matters where the conversion happens, although I see advantages to it being initiated on the reading device. Calibre already runs on the Kindle 2; it is called Savory and currently autoconverts PDF and ePub. However, any web-connected device could flip this and use Calibre on the desktop over the network. Amazon already does this with its centralized conversion service (although it must be initiated by e-mail).
The reason this isn’t a widespread approach today is DRM. If ebooks were DRM free, the ebook format would be much less of an issue because conversion is largely a “solved” problem. Only PDF to reflow conversion is currently hard, and I expect good progress even on this front.
September 3rd, 2009 at 10:37 am
Oh dear, it seems not many people read carefully what Steve wrote. To oversimplify: My mother would not bother to convert files. She wants a device that reads any ebook file: 1) put file on reader, 2) read. Any more complicated than that, and a very large percentage of the public will pronounce a pox on all our houses and walk away. Paper books are fine.
Steve’s point is that with a babel of file formats and renderers, most manufacturers and publishers would greatly benefit from massive cross licensing, rather than attempts to lock in customers. Without that, sales of ebooks and readers will remain a tiny percentage of the book market.
Amazon, Sony, etc. can have the biggest slice of a tiny pie, or a modest slice of a giant one. They need to think about this carefully.
Regards,
Jack Tingle
September 3rd, 2009 at 11:20 am
Got it in one, Jack!
September 3rd, 2009 at 7:16 pm
I can see the attraction of the basic idea of rendering all possible formats but the technical hurdles are so great that I assumed you had seen them.
Sony struggle to display their own format reliably. Every other device I’ve seen has had similar issues (I’ve seen iPhone, iLiad, Kindle, bEBook, Cybook and a few others). If you add all those display bugs together you’ve got rather a lot of issues. Enough to make it unlikely that any one device will deal “well enough” with more than a few formats.
Conversions are always lossy. Every format has features missing from others, some of which can be emulated but some can’t. Dump any book to HTML and see what you get, for example, or in the extreme, text. Until we have a metadata standard no automatic conversion will cover all cases, and it will have to be a format in its own right to cover everything (like OGG, for example). So a “native” version will almost always be superior.
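A small sketch of the “dump it to text” extreme, using only Python’s standard html.parser module. The two sample sentences and the html_to_text helper are made up for the illustration; the point is only that two differently marked-up sources become indistinguishable once the markup is dropped.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and silently discards every tag."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(markup: str) -> str:
    # Hypothetical helper: the crudest possible HTML-to-text conversion.
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


# Same words, different emphasis, so different meaning in the source.
a = "<p>She <em>never</em> said that.</p>"
b = "<p>She never <em>said</em> that.</p>"

print(html_to_text(a))                      # She never said that.
print(html_to_text(a) == html_to_text(b))   # True: the emphasis is gone
```

Nothing in the plain-text output can be used to recover which word was emphasized, which is what “lossy” means here.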
Many formats assume computing resources that are unreasonable for a portable device. Rendering an MS Word DOCX file on my laptop redlines the CPU for a few seconds and eats ~10x the document size in RAM, plus the overhead of the DOCX viewer. My laptop can play MP3 files at ~30x real time and CPU is the limit there. My Sony PRS-505 can play MP3 files at 1x speed, but battery life drops to less than 10 hours. So assuming CPU limiting in both cases and assuming enough RAM (my Sony is ~100x short), the Sony would take a couple of minutes to open a similar DOCX file and doing so would flatten the battery in a few hours. Pre-converting on the laptop would make a lot more sense, if you were willing to allow that. The Sony “library” software does this, for example.
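The back-of-envelope estimate above can be written out as a short calculation. The 30x and 1x MP3 speeds come from the figures quoted; the 4-second laptop render time is an assumption standing in for “a few seconds”, and the whole thing rests on the stated premise that both tasks are CPU-limited and scale the same way.

```python
def estimate_open_seconds(laptop_seconds: float,
                          laptop_speed: float,
                          reader_speed: float) -> float:
    """Scale a laptop task time by the relative CPU speed of the two devices.

    Assumes the task is purely CPU-limited on both machines, so time is
    inversely proportional to speed.
    """
    slowdown = laptop_speed / reader_speed
    return laptop_seconds * slowdown


laptop_mp3_speed = 30.0     # MP3 decoding, multiples of real time (quoted above)
reader_mp3_speed = 1.0      # PRS-505 decodes at roughly real time (quoted above)
laptop_docx_seconds = 4.0   # assumption: "a few seconds" taken as 4 s

seconds = estimate_open_seconds(laptop_docx_seconds,
                                laptop_mp3_speed,
                                reader_mp3_speed)
print(f"Estimated time to open on the reader: {seconds / 60:.0f} minutes")
# Estimated time to open on the reader: 2 minutes
```

Changing the assumed laptop time to anything between 2 and 6 seconds gives one to three minutes on the reader, which is where “a couple of minutes” comes from.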
Now, assuming the Moore’s Law pixies continue to work their magic it’s quite likely that within a decade that last issue will be much less important – I’ll have enough hardware power to run whatever software I need for weeks between charges. But unless we make more progress in software engineering in the next ten years than we have in the last fifty, we’ll still have buggy rendering engines making a mess of eccentric ebook formats.
We will also need dramatic reform of a few laws and a big change in attitude from major players. How do you propose to let “my gran” read a book she bought for her Kindle on her new Sony PRS? Does she just pay another rental fee for the new format, backed by anti-monopoly law banning exclusive deals? And what about licenses for all the various formats that require fees for rendering/decoding engines?
September 3rd, 2009 at 7:54 pm
How much do you think Joe Consumer cares how hard it is to write a program? Not a whit. All he knows is, when he goes to a store, and sees a device that reads multiple formats, and on screen they all look good, he’s going to buy that device.
That puts it in the company’s hands to get the program written, and make it work. If they succeed, the company makes money. If they fail, the device tanks, and everyone’s on the unemployment line. That’s how consumer-oriented business works.
Is creating a conversion engine to read a file of text and images that tough? Not really. Much tougher conversions have been tackled in the past, and this can be done, too. (Sony has problems because they created an overly-complicated format.) There are devices out there right now that read 6 or more formats, which pretty much says it all.
September 4th, 2009 at 4:09 pm
Creating a conversion engine that Just Works is extremely tough, especially when the native format of the target device can’t do everything that the source format can do. Mobipocket, for example, is severely limited in its CSS handling.
And when you get to incompatible implementations of an allegedly identical format such as ePub, what do you do then? How do you decide whether you should be converting ePub automatically to ePub or the other way round?
Historically, on-the-fly conversion has often seemed an attractive temporary solution but it has generally been superseded by one of the contending formats emerging as a clear winner.
But yes, it’s always worth remembering that if you ask consumers to go away and check what formats their device is capable of reading, they will indeed obey you, and go away.