I love to read, in particular using the Amazon Kindle application on either my Asus EEE Transformer Pad or my iPad. But being the computer geek that I also am, I figured it was about time to look into how the bulk of these books are produced. As I already had a draft copy of a book titled Sending Emails - The Safe Way, something I wrote back in 2007-2008 while trying to explain the use of digital signatures and encryption to secure emails, what better place to start testing?
The book itself is, like most I've written, prepared in a document preparation system called LaTeX. OK, so the name might bring up some wrong associations, but the fact is that this is quite probably the best environment for writing scientific papers. In addition to having a very nice bibliography system called BibTeX for referencing, it is superior for equations, and hence heavily used in scientific academia. This, together with being built on top of the typesetting system TeX, makes it far superior in typesetting matters such as kerning, hyphenation and ligatures, something other writing tools (in particular MS Word) handle poorly. As this isn't supposed to be a LaTeX post, I'll just stop there and refer the curious to Wikipedia and wikibooks.org for more advantages.
But where greatness ends, trouble arises. As LaTeX is a typesetting system, it is naturally page-oriented, targeting a book or an article format (or one of the other document classes). The Amazon Kindle format, like ePub and other generic formats, is on the other hand flow-oriented. That means that line breaks and hyphenation, which you would normally leave to the typesetter, are handled by the device on the fly. Because the devices have limited resources, and because flow rendering has to happen every time the text is displayed, the instruction set for flow-oriented documents is more limited. (In a page-oriented format it doesn't matter if creating the initial page takes some time, as it is only done once: write-once-read-many.)
The underlying format used for these devices is actually the same as the one used on this website (and any other): HTML (or, for the pedantic reader, the XML-based counterpart XHTML with a subset of CSS). OK, let's stop here for a moment. HTML is something I have used to mark up text for what is, way too quickly, approaching two decades. Its simplified nature also means that the author loses a lot of control over how things are displayed, and has to depend on proper formatting by the various devices on the market.
Accepting that I won't get things as pretty in the Kindle version as in the PDF, I nevertheless continue the conversion process from my .tex source files to the Kindle format. The first step uses LaTeXML. At this stage, however, I realize that a lot of the nice packages I've used while creating my PDF files are incompatible with this tool, so I need to define various dummy commands and post-process the output HTML file, both automatically and manually, before importing it into the eBook Creator. The sad thing is that for an author who mainly expects to publish on digital media, it is far more beneficial to just stay away from the more advanced tools and write the HTML by hand, just as we did back in the mid 1990s when first introduced to the Internet. Truly impressive progress...
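To give an idea of what the automatic part of such post-processing can look like, here is a minimal Python sketch. It is not the script I actually used: the function name clean_html and the two cleanup rules (dropping inline style attributes and removing empty span elements) are assumptions chosen for illustration, and the sample markup is made up rather than real LaTeXML output.

```python
import re


def clean_html(html: str) -> str:
    """Apply two illustrative cleanup rules to converter output (hypothetical rules)."""
    # Drop inline style="..." attributes, which an e-reader may ignore or mishandle.
    html = re.sub(r'\s+style="[^"]*"', '', html)
    # Remove span elements that contain nothing but whitespace.
    html = re.sub(r'<span[^>]*>\s*</span>', '', html)
    return html


# Made-up sample markup, only for demonstration.
sample = '<p style="margin:0">Hello <span class="x"></span>world</p>'
cleaned = clean_html(sample)
print(cleaned)
```

Regular expressions are good enough for a handful of predictable patterns like these; for anything more involved, a proper HTML or XML parser would probably be the safer choice.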
An increasing number of the eBooks being distributed are prepared by laymen, and Amazon's automatic conversion tools are targeted more towards Microsoft Word documents than anything else. But hey, it boosts the number of available products and is cheap for the distributors, so I would probably do the same if I were them. I still consider it a shame.
In any case, while I'm contemplating other (fiction-related) projects for 2012, at least I've gotten around to testing the system a bit and made my first book available both as a Kindle eBook on Amazon and as a printed version (certainly no surprise that I'm most satisfied with the printed version).
Happy New Year! (And feel free to visit my page on amazon.com.)

