Like Odysseus, I have a very simple goal in mind. Like Odysseus, I seem to be taking forever to get there. My goal is to create an ePub version of my book, Values of Harry Potter. Hopefully in describing my difficulties in doing that I can help point others in the right direction, and perhaps encourage some software developers to help with the transition.
The good news is that I have already made a Kindle version of the book available, as I’ve noted. I also have a great HTML version of the book finished (and this was the basis for the Kindle version), as well as a pdf with fully functional internal links. With direct sales of the ebook, I want to include all three versions (HTML, pdf, and ePub), so that last format is what’s hanging me up. (In general I think all publishers should offer multiple, DRM-free ebook versions, to provide maximum flexibility to customers.)
I like the idea of the ePub format, an open standard maintained by the International Digital Publishing Forum and supported by Adobe’s Digital Editions reader. Because it is open, anybody can use it. At least theoretically, any author or publisher can create an ePub, and anybody can create a reader for the format; several readers now exist. ePub already reads on a variety of devices, including my iPod Touch, and I hear several more compatible readers are entering the market. Unlike an HTML ebook, ePub organizes many files, including text and images, into a single package. Unlike pdf ebooks, ePub reflows text to fit your screen and reading preferences.
The problem is that it is a royal bitch to create a complicated ePub book.
I finalized my book in InDesign, Adobe’s design software. From that finished text, I created a text-only file and hand-coded the HTML version, adding hundreds of internal links. Then I modified this file for Kindle. For the pdf, I went back to InDesign and added all the internal links, which InDesign anchors to specific pages.
I didn’t want to create the ePub straight from InDesign, because I doubted the internal links would work well. (With the HTML version, I anchor particular words and paragraphs rather than pages.) Anyway, even though InDesign supposedly has a built-in ePub converter, this didn’t work for me. It merely told me, repeatedly, that the conversion had failed. Thanks a lot, Adobe. Perhaps with Creative Suite 5, InDesign’s ePub converter will actually, you know, work, and perhaps Dreamweaver will also offer a functional converter for use with HTML.
So I decided to go back to the HTML for the ePub conversion. Dreamweaver automatically converts HTML to XHTML 1.1, so I made that conversion. (I think ePub requires XHTML, but I’m not sure about the details.) Those without the software will have to code by hand. (In the future, I’ll just code straight XHTML to save myself the hassle.)
One page lists a variety of ePub conversion programs. I tried Calibre, which created an ePub with tons of junk characters that eventually crashed my readers.* I also tried eCub, which created a file that immediately crashed my readers.
Jedisaber provides the single best source on ePubs that I’ve yet found. Indeed, creating an ePub from his “sample” file is at least as easy as trying to use one of the software converters. (It has the added bonus of actually working.) After modifying the “sample” files with my content and information, I immediately created an ePub that opened on Adobe’s Digital Editions.
Unfortunately, after making a minor tweak to the file, it no longer opened. After a lot of exasperating trial and error, I figured out that the problem was that the files were not listed in the correct order in the .zip folder (which, renamed, becomes the ePub). Finally I downloaded YemuZip, dropped in the files in the correct order, and created another working ePub.
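For anyone who would rather script the zipping step than drag files around by hand, here is a minimal sketch in Python. It assumes the ordering problem I hit was the usual one: the ePub container format expects the "mimetype" file to be the first entry in the archive and to be stored without compression. The function name build_epub and its parameters are my own inventions for illustration, and the sketch assumes the unzipped book sits in a folder with "mimetype" at its top level.

```python
import os
import zipfile


def build_epub(source_dir: str, epub_path: str) -> None:
    """Zip source_dir into epub_path, writing the mimetype entry first.

    Assumption: source_dir contains a file named "mimetype" at its top
    level, alongside META-INF/ and the content files.
    """
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes in first and is stored, not compressed.
        zf.write(
            os.path.join(source_dir, "mimetype"),
            "mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else follows, compressed, in a stable sorted order.
        for root, dirs, files in os.walk(source_dir):
            dirs.sort()
            for name in sorted(files):
                full_path = os.path.join(root, name)
                arcname = os.path.relpath(full_path, source_dir)
                arcname = arcname.replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                zf.write(full_path, arcname,
                         compress_type=zipfile.ZIP_DEFLATED)
```

Write the output file somewhere outside source_dir, or the half-written ePub will be swept up into itself. Something like build_epub("mybook", "mybook.epub") is the intended use.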
I should say “partly working ePub.” Digital Editions would recognize only a few of my internal links. It took me quite a lot more trial and error to figure out the problem. In HTML, I had used the “a name” tag, such as (a name=”1note”), which Dreamweaver converted to (a name=”1note” id=”1note”). I learned my HTML back in 1998, so I wasn’t aware of the new (apparently nonsensical) regulations. Anyway, I quickly learned, “the id attribute’s value must be an XML name and cannot start with digit or have spaces in it.”
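The digit problem can also be fixed with a short script instead of a search and replace. The following is only a sketch under some assumptions: the markup uses straight double quotes around attribute values, the anchors are written as id="..." or name="...", and the links pointing at them are written as href="#...". The function name prefix_numeric_ids and the default prefix "n" are hypothetical choices of mine.

```python
import re

# id="..." or name="..." whose value begins with a digit.
_ANCHOR = re.compile(r'(?<![\w-])(id|name)="(\d[^"]*)"')
# href="#..." whose fragment begins with a digit (same-file links only).
_LINK = re.compile(r'(?<![\w-])href="#(\d[^"]*)"')


def prefix_numeric_ids(xhtml: str, prefix: str = "n") -> str:
    """Prepend a letter to anchor names that start with a digit.

    The internal links that target those anchors get the same prefix,
    so the links still resolve. Links to other files are left alone.
    """
    xhtml = _ANCHOR.sub(
        lambda m: f'{m.group(1)}="{prefix}{m.group(2)}"', xhtml
    )
    xhtml = _LINK.sub(
        lambda m: f'href="#{prefix}{m.group(1)}"', xhtml
    )
    return xhtml
```

Run on a line such as (a name="1note" id="1note") followed by a link to "#1note", this turns both the anchor and the link into "n1note". It does not check whether the prefixed name collides with an id that already exists, and it does nothing about spaces inside ids, so treat it as a starting point.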
I used “search and replace” in Dreamweaver to change all the offending digits to text. The resulting ePub opens in Digital Editions and functions perfectly. All the internal links work great. Unfortunately, the ePub crashes the BN reader and works improperly in Stanza, which doesn’t display any of the internal links.
My hypothesis is that the “a name” tags are causing problems in those other readers. I learned, “In XHTML, the id attribute has essentially replaced the use of the name attribute. The value of the id must start with an alphabetic letter or an underscore. The rest of the value can contain any alpha/numeric [character].”
Unfortunately, it is not immediately obvious to me how to convert all the “a name” tags to “id.” In order to prevent hard line breaks (with an extra return), I used breaks with space indentations rather than paragraph markers. So for long stretches of text there’s nothing to attach the “id” marker to. (Perhaps the way to do this is obvious to somebody who actually knows all the XHTML codes; if so please leave a comment.) I think it is possible to correct this problem by using the style sheets to read paragraph breaks as soft returns, but I don’t know how to do this offhand. (Plus, I’m not even sure this is what is causing the problem with other ereaders.)
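One possibility that avoids restructuring the paragraphs: in XHTML the "id" attribute is not limited to paragraph markers, so the existing anchor tags can simply keep their "id" and lose the "name." Here is a rough sketch of that conversion, assuming straight double quotes and anchors written as ordinary (a ...) tags. The function name names_to_ids is hypothetical, and whether this makes any difference in BN or Stanza is untested.

```python
import re

_A_TAG = re.compile(r'<a\b([^>]*)>')      # opening <a ...> tags only
_NAME = re.compile(r'\s+name="([^"]*)"')  # the name attribute
_ID = re.compile(r'\s+id="[^"]*"')        # an existing id attribute


def names_to_ids(xhtml: str) -> str:
    """Drop name="..." from anchor tags, keeping or adding a matching id."""
    def fix(match):
        attrs = match.group(1)
        name = _NAME.search(attrs)
        if not name:
            return match.group(0)  # ordinary link, nothing to change
        attrs = _NAME.sub("", attrs, count=1)
        if not _ID.search(attrs):
            # No id yet, so reuse the old name value as the id.
            attrs = f' id="{name.group(1)}"' + attrs
        return f"<a{attrs}>"
    return _A_TAG.sub(fix, xhtml)
```

With this, an anchor carrying both name="note1" and id="note1" comes out with only id="note1", and the line breaks around it stay exactly as they were.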
The upshot is that I still do not have a fully functional ePub. I have one that works great on Digital Editions but poorly on every other reader I’ve tried. I guess my next step is to convert all the “a name” tags to “id,” then try to compensate with the style sheets for the soft returns. Unfortunately, I’m not sure how to do this, and I’m not sure it will even solve the problem.
If you have a simple, straight-text book you want to convert to ePub, using Jedisaber’s directions should be a piece of cake. But God help you if you want to take advantage of the digital format with many internal links, something that becomes even more complicated if you try to link across chapters in separate files. (I just put the main text of my book in a single file.)
So consider this a status report. The ebook version of Values of Harry Potter is coming. Soon. If I can just get past the cyclops.
* Update: After reading Jason Fleming’s comment, I decided to give Calibre another try, using the XHTML with the corrected link tags. I got results very similar to those I got doing it by hand: an ePub book that works beautifully on Digital Editions, crashes the BN reader, and works with Stanza sans links. If you just have a cover image and straight-text book in a single file, this software probably works great. However, it doesn’t (obviously) allow the flexibility of splitting up files; for instance, in my hand version I broke off the title information into a separate file.
Update May 1: After recoding my entire XHTML document to eliminate the “a name” tags (in favor of “id” tags tied to the paragraph markers), I created a new ePub with Calibre that works exactly as before. It works beautifully in Digital Editions, crashes BN, and works in Stanza but without any active links. So that was a complete waste of time. I wonder whether either BN or Stanza is even set up to handle internal links. If anybody happens to know, please comment.
***
Comments
D. Jason Fleming April 30, 2010 at 1:53 PM
The guy who runs Calibre seems to be very dedicated to making the .epub standard as clean and accessible as possible. I’m not sure how specific your problem was, but if you filed a bug report (http://calibre-ebook.com/bugs), it’s a good bet that he’d treat it seriously.
D. Jason Fleming May 4, 2010 at 5:52 PM
I’m sorry for the waste of time. What I intended to say was: “based on what he has said on epub messageboards, the guy who runs Calibre seems…”, and that you should report what happened as a bug, you might get either a) results, or b) an explanation of why it behaves the way it does at the moment. I didn’t mean to make you go through the frustrating process again just on my say-so. Many apologizings.
Ari May 4, 2010 at 6:07 PM
I didn’t try the “id” conversion because of Fleming’s comment; I tried it because I thought it might work. It didn’t. That’s the way it goes sometimes.
Becca May 28, 2010 at 9:19 AM
Any luck yet?
We do conversions to epub and have no trouble with internal linking, except in the Stanza, which just doesn’t seem set up to handle the links or the design elements. Have you used the google epub validator? http://code.google.com/p/epubcheck/
That, or the Threepress validator http://threepress.org/document/epub-validate, should be able to give you a good idea of what’s causing the errors.