By Bill Kasdorf, Vice President, Apex, and General Editor, The Columbia Guide to Digital Publishing
E-books still have a bit of an image problem. Despite their wide use in academic and professional contexts (especially for reference, technical, and scholarly content), they are generally seen as not having gotten much traction. Two of the main reasons have been the messy proliferation of formats and the failure of specific high-profile reading devices to catch on. So even though everything from bestsellers to bodice-rippers and cookbooks to computer books is now available as an e-book, people still think e-books are a pipe dream.
The International Digital Publishing Forum (IDPF), formerly the Open eBook Forum, has just taken a major step to rectify this situation. It has issued a new standard file format called .epub, which promises to dramatically simplify the creation of e-book content and the distribution of that content from publisher to reader. And that’s using "reader" in both its human and electronic senses.
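The new format is the successor to (and largely backward-compatible with) the Open eBook Publication Structure (OEBPS), which, since 1999, has been the format in which tens of thousands of e-books have been distributed. The new .epub format is actually a combination of three standards: the Open Publication Structure (OPS) 2.0, which defines the content markup; the Open Packaging Format (OPF) 2.0, which defines the metadata, reading order, and fallbacks; and the Open Container Format (OCF) 1.0, which wraps everything into a single file.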
One File Format, Many Ways to View It
The main purpose of this new standard is to make it possible for reading systems (whether your laptop, your cellphone, or a dedicated e-book reader) to properly render (display) e-book content. Any OPS-compliant reading system must be able to automatically render an .epub file.
That does not mean that reading systems will all render that file the same way. First of all, it’s important to realize that .epub is for reflowable content. Unlike PDF, which is a fixed-page format, .epub is an XML-based standard designed to enable content to adapt to the capabilities of a range of reading devices: those with wide or narrow screens, with high or low resolution, with broad or limited font support, with or without color, and so forth. In fact, it even includes support for SVG (Scalable Vector Graphics), an XML standard that is particularly useful for enabling graphics to adapt to various resolutions and screen sizes.
What .epub provides are standard minimum "vocabularies" or tag sets that any OPS-compliant reading system must be able to deal with. Some systems will be able to render the tagged content in all the glorious detail the publisher envisioned, even to the extent of using embedded fonts. But it’s the responsibility of the publisher to provide fallbacks for elements that not all reading systems can be expected to handle. For example, you can provide a special embedded font for chapter titles, but you might specify bold roman for systems that can’t use embedded fonts. Likewise, a color image might be displayed on some systems as grayscale, but it must be displayed. This ensures that a person using that reading system can see everything in the publication, one way or another.
What a Concept: A Standards-Based Standard!
OPS 2.0 is based on two preferred vocabularies: XHTML and DTBook. The former is the most common way to tag content for online display, and the latter is the most important standard for accessibility (see the recent SSP News article, "The Most Useful DTD You’ve Probably Never Heard Of"). To keep expectations manageable, not all capabilities in the XHTML standard are supported; instead, OPS 2.0 specifies certain XHTML 1.1 modules that must be supported by any OPS-compliant reading system. For content in which more structure is needed, DTBook is the preferred vocabulary. DTBook provides elements such as sidebars, footnotes, annotations, and page numbers, and it also ensures that OPS 2.0 publications can be NIMAS-compliant. (NIMAS is the National Instructional Materials Accessibility Standard.)
OPS 2.0 also makes it possible to use markup not in XHTML or DTBook, through the use of XML Islands. So, for example, you can include an equation encoded in MathML in an OPS-compliant e-book, but you have to provide a fallback (like a graphic) for reading systems that can’t render MathML.
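The fallback idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular reading system works: the manifest dictionary, the item ids, and the `resolve_fallback` function are hypothetical, and the media types are illustrative. It models the manifest-level approach, in which each item may name another item as its fallback, and walks that chain until it reaches something the reading system can render.

```python
# Hypothetical manifest: item id -> (media type, id of fallback item or None).
# The ids and media types here are illustrative assumptions.
MANIFEST = {
    "eq1-mathml": ("application/mathml+xml", "eq1-png"),
    "eq1-png": ("image/png", None),
}


def resolve_fallback(item_id, manifest, supported_types):
    """Follow the fallback chain from item_id and return the id of the
    first item whose media type the reading system supports, or None."""
    seen = set()  # guards against a manifest whose fallbacks form a loop
    while item_id is not None and item_id not in seen:
        seen.add(item_id)
        media_type, fallback_id = manifest[item_id]
        if media_type in supported_types:
            return item_id
        item_id = fallback_id
    return None


# A reading system that only handles PNG images gets the graphic.
print(resolve_fallback("eq1-mathml", MANIFEST, {"image/png"}))  # eq1-png
```

A reading system that does support MathML would get `eq1-mathml` back from the same call, so the publisher ships both forms once and each device picks the best one it can handle.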
And finally, OPS 2.0 includes a stylesheet language based on CSS 2.0. That way, one reading system can render paragraphs flush left with space above for online style, while another can render them with no space between paragraphs and first-line indents, as in print pages.
Putting the Pieces Together
Of course, making sure you can render the content is only part of the problem. There are also the issues of identifying the content and putting the pieces in the right order. That’s where OPF 2.0, the Open Packaging Format, comes in. OPF provides the standards for metadata and navigation that identify what all the components are (markup files, text files, images, navigation files), specifies the reading order, and provides the fallback information, among other things.
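To make the packaging role concrete, here is a minimal Python sketch that reads a small package document and recovers the reading order. The sample document, its file names, and the `reading_order` function are assumptions made up for illustration; the sketch relies only on the general OPF shape of a manifest that lists the components and a spine that references them in order.

```python
import xml.etree.ElementTree as ET

# Namespace used by OPF 2.0 package documents.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# A made-up package document; titles, ids, and file names are hypothetical.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
         unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:identifier id="bookid">urn:uuid:hypothetical-0001</dc:identifier>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="cover" href="cover.png" media-type="image/png"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""


def reading_order(opf_xml):
    """Return the hrefs of the spine items in reading order."""
    root = ET.fromstring(opf_xml)
    # The manifest identifies every component by id.
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    # The spine lists ids in the order they should be read.
    return [
        hrefs[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


print(reading_order(SAMPLE_OPF))  # ['chapter1.xhtml', 'chapter2.xhtml']
```

Note that the manifest lists chapter 2 before chapter 1; the order a reader sees comes from the spine, not from the order in which the files happen to be declared.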
And then, zipping it all up (literally) into one neat package is OCF 1.0, the Open Container Format. This is a zip-based standard that specifies how to encapsulate an OPS publication along with other related files. It must include the OPS file, of course, but it may also include many other kinds of files. So, for example, a publisher might want to enclose a PDF file along with the OPS file. That’s not required, but it’s permitted. Likewise, the standard permits but does not require the use of DRM (digital rights management).
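As a rough sketch of the zipping step, the Python below assembles a container in memory with the standard `zipfile` module. It assumes the OCF conventions that the archive begins with an uncompressed `mimetype` entry and carries a `META-INF/container.xml` that points to the package file; the `build_epub` function and the `OEBPS/content.opf` path are hypothetical choices for illustration, and the result is not a complete, validated publication.

```python
import io
import zipfile

# Points the reading system at the package (OPF) file inside the archive.
# The path OEBPS/content.opf is an assumed, conventional location.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files):
    """Zip the given {archive path: content} mapping into .epub bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        # Everything else (package file, content, optional extras such as a PDF).
        for path, content in files.items():
            archive.writestr(path, content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


epub_bytes = build_epub({"OEBPS/content.opf": "<package/>"})
with zipfile.ZipFile(io.BytesIO(epub_bytes)) as check:
    print(check.namelist())
```

Because the container is an ordinary zip archive, any extra files the publisher chooses to enclose are simply more entries in the `files` mapping.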
The OPS 2.0 and OPF 2.0 standards were formally issued on September 11, 2007. (The OCF 1.0 standard was issued on October 27, 2006.) Even before their formal approval, they had begun to achieve rapid adoption. Such major tools as Adobe CS3 and Adobe Digital Editions were early adopters, as was the Sony Reader. VitalSource, in the educational space, and LibreDigital, the so-called digital warehouse for many trade book publishers, have adopted it as well.
It looks as if .epub may finally be the breakthrough that will take e-books from being a cool curiosity to being commonplace.
Bill Kasdorf is Vice President of Apex Content Solutions, a leading supplier of business services, data conversion, editorial, production, and support services to publishers and other organizations worldwide. A past SSP president, Bill has led seminars and spoken widely for publishing industry organizations such as SSP, the Association of American Publishers, the Association of American University Presses, Seybold Seminars, the Council of Science Editors, the Association of Learned and Professional Society Publishers, the Library of Congress, and the International Association of Scientific, Technical, and Medical Publishers.