We now come to the final file we will need to make our fixed-layout iBook complete... at least in terms of functionality, that is. But once you build this file you can load your ebook onto the iOS device of your choice and see it in action. And that's when the real work begins. For now we'll focus on the final puzzle piece required, the content.opf.
As its name suggests, the content file is a descriptive listing of the ebook's physical contents, that is, the actual component parts included in the archive, not its literary or artistic content. OPF stands for open packaging format, which refers to the specification that defines the structure and semantics of the package. There are four essential elements that make up the content.opf file, each of which we'll take in turn.
First, however, you will notice that a new declaration element has replaced the previous html namespace references:
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="book-id" version="3.0"
This provides a reference to the official opf spec at the International Digital Publishing Forum website. Those are the good folks who have been piecing this thing together over the years and working diligently to keep it up to date (not an easy task at the rate technology is changing).
The second entry in the first line of this declaration is the unique-identifier, which you should recall we made a reference to in the toc.ncx metadata section (if you read that part, that is). The dtb:uid value entered there will show up again in a moment in the identifier entry, which is what this attribute points to. The "book-id" value is sometimes written as "BookId" or some such instead. It's just a linking reference like an "id", so it can be named anything you like, so long as it matches what we enter below.
The version="3.0" element is where we tell the reading system that we're building an EPUB3-compliant file, rather than the older EPUB2. This is what allows us to add the newer navigation file and incorporate a lot of advanced features we will not delve into in this introduction, such as interactive scripting and support for vertical and right-to-left languages. What we will make great use of here is a rich and robust expansion of the metadata attributes that define our file's content. Consequently, this portion of the tutorial has been greatly expanded with a lot of new information that was not included previously (since it had not been developed yet).
The remaining two entries in the package declaration provide the vocabulary we will use to detail those metadata entries. You should make no changes to the above lines of code, with the sole exception of the unique-identifier value should you choose to use another name for your reference value, or if for some reason you prefer to make an ePub 2.0 ebook instead. But there is little reason to do either.
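For reference, here is a rough sketch of what the complete opening tag might look like. The prefix attribute and its two vocabulary URLs (rendition and ibooks) are an assumption here, based on what fixed-layout iBooks files commonly declare, so check them against the template itself rather than trusting this listing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the prefix values below are assumed, not copied from the template -->
<package xmlns="http://www.idpf.org/2007/opf"
         unique-identifier="book-id"
         version="3.0"
         prefix="rendition: http://www.idpf.org/vocab/rendition/#
                 ibooks: http://vocabulary.itunes.apple.com/rdf/ibooks/vocabulary-extensions-1.0/">
  <!-- the metadata, manifest, and spine sections go here -->
</package>
```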
The first section is where the metadata for your ebook content lives - and this time I do mean artistic and literary content. This section can be merely a few short lines giving just the bare essentials of title, language, and identifier (the only Dublin Core elements required), or add a host of other information that can be useful for identifying and cataloging your title among the many millions of others out there. In this, more is always better, as individual systems can ignore non-relevant portions, but cannot make them up if not provided.
NOTE: A chart containing all 15 of the Dublin Core metadata elements is available for reference in the resources section, along with links to several important data value sets, though the most important ones are discussed below.
Before we get to the metadata proper, however, we must declare our reference systems, of which we'll be using two:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
Dublin Core (dc) is the primary set of metadata elements in use in ebooks, but the opf spec now adds a whole new set of very specific information that "refines" the general Dublin Core attributes, and which are useful for fine-tuning our statements, so we declare both here, as they pertain specifically to the metadata section.
NOTE: The previous ePub2 practice of appending an "opf" modifier to the dc element is no longer allowed in ePub3, as it has been superseded by the "meta refines" attribute/value set that we will look at shortly. Any use of the preceding system will result in a failed upload to Apple, and the rejection of your file.
The way the new system works is by stating a standard Dublin Core value, and then optionally appending additional information to it using the new meta refines attribute/values, either to define the core value more specifically, or to add more data to it. In the template there are six examples that refine the title, creator, and identifier entries. Many more are possible, and in fact the list of examples provided by the IDPF is not exhaustive, and is in a state of ongoing evolution. However, the key syntax and elements have been defined, so be sure to use the specs as reference if you want to pursue it further. I have provided the primary ones that most ebooks will need. You can easily add to these using the further values discussed in the next sections of this guide.
As stated, it is not required that you include the refines meta tags at all, but I highly recommend that you do, as they provide important information that will help your book be listed correctly, and therefore, found more readily. Think of every added word that you include as one more keyword that is searchable and that will give you an idea of the power of these entries.
I should mention further, here, that there are only four metadata entries that are actually required for all ePub3 ebooks, and these are the title, identifier, language, and the date the publication was last modified, each of which is marked as <!-- REQUIRED --> in the template for your reference. These will each be discussed below in due course.
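To make the bare minimum concrete, here is a sketch of a metadata section containing only the four required entries. The identifier value is a placeholder, not a real one, and the title and date are simply borrowed from the examples used elsewhere in this section.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- Placeholder identifier: replace with your own ISBN, UUID, or URL -->
  <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  <dc:title>iBooks Fixed-Layout Template</dc:title>
  <dc:language>en-US</dc:language>
  <meta property="dcterms:modified">2015-01-01T00:00:00Z</meta>
</metadata>
```

Everything else discussed below is optional icing on top of these four lines.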
There are many different types of identifiers, but the one most often used for books, of course, is the International Standard Book Number, or ISBN. The following is how the ISBN is added to the metadata section using the new meta refines tag:
<meta refines="#book-id" property="identifier-type" scheme="onix:codelist5">15</meta>
The first part gives the Dublin Core "identifier" attribute, with an id value added as a reference. In this instance that value is also used as our "unique-identifier" in the package declaration at the top of the document, so those two values must match exactly. It doesn't matter what you use, but the values are case-sensitive, so make sure they're the same.
The second line in our identifier entry provides the new meta refines attribute/value set, with the first entry being the id reference so the reading system knows which dc attribute we are refining. Just like in an HTML/CSS pair, the id value assigned is referenced using the "#" symbol followed by the id value name. This is followed by a property that states what the refines entry is, which in this case is what identifier-type we're talking about; that is, how exactly we are refining the core attribute. In this case we want to tell it that we're using an ISBN, rather than some other set of reference data, so we also need to add the scheme element to define it. This particular instance is about as opaque and unhelpful to the human eye as can ever be imagined, since the referenced onix data set not only gives the value as a completely meaningless number, but also numbers the codelist itself in a way that would make it all but impossible to actually find this information manually if you didn't know exactly where to look. Computers, of course, are vastly smarter than humans in this regard, and can search virtually endless databases in a matter of seconds. For us mere mortals, here is where the information actually resides:
Scan down until you get to "List 5" - which is Product Identifier type codes - and you will find the entry for the ISBN-13 as number 15 (not to be confused with the now outdated ISBN-10, which is number 02). That is where we get our value. Why they couldn't just have named it "ISBN" rather than assigning a completely random number is beyond me, but it is not helpful at all in my opinion, and whoever decided this should be fired and kicked off the planet. Fortunately, most metadata references are not nearly so useless as this. While you're gawking at that awful list, take note that there are many other data sets that you can use as your identifier, should you choose not to add an ISBN to your title (which, after all, are neither free nor cheap here in the States). The URN (Uniform Resource Name) at #22 is the next most common identifier used for books, while #01 is the Proprietary entry used for data sets like Amazon's ASIN. In this enormously lengthy set of lists you will find values for every conceivable product imaginable, and then some.
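Putting the two lines of the identifier entry together, the pair might look something like the sketch below. The ISBN shown is a made-up placeholder, and the id assumes you kept "book-id" as the unique-identifier value in the package declaration.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- Hypothetical ISBN-13; the id must match unique-identifier in the package tag -->
  <dc:identifier id="book-id">9780000000002</dc:identifier>
  <!-- 15 is the List 5 code for ISBN-13 -->
  <meta refines="#book-id" property="identifier-type" scheme="onix:codelist5">15</meta>
</metadata>
```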
List 10 is also of interest to us here, as it provides reference values for E-Publication types, which, although rarely used, clearly apply to what we are doing, which is building #029 (not to be confused with #044, which is Apple's proprietary .ibooks format). The reason the List 10 values are not generally used is that, first, they are not required, and second, the type of publication has already been defined in the header as an EPUB version 3.0 document, and will also be quite readily apparent by the .epub extension appended to the file name.
You might also note that the ISBN-13 appears on more than one list, in a number of iterations with specific purposes, such as List 13 where it applies to a series collected in a single volume, or List 16, the use of which is somewhat unclear, although it must have its uses. In both cases the ISBN value is always "15" - just the codelist number would change if you were to use them.
Other identifiers can be used either instead of, or in addition to, an ISBN, and can, in fact, be almost any unique string of data, such as a website URL or UUID. I've included the "book-id" identifier in the sample template, but here is where you would insert the UUID mentioned previously in the toc.ncx lesson (if you read that part). The idea is just to include some string of data which is unique to this specific incarnation of your work, and ideally includes some type of version number. Each time you update the ebook's content, however trivial, you should also update the data string, preferably in a logical and expressive way that will allow users (i.e. collectors of your awesome body of brilliant work) to identify the particular version they are holding [NOTE: be sure to read the update section at the end of this tutorial for information on how to include Apple's new Versioning metadata]. The UUID is particularly useful in that it can be decoded to discover the exact moment and location of creation, but looks like utter gibberish otherwise. If using an ISBN, you will only need to change it if you make a major change that would justifiably be considered a new edition, such as an illustrated edition, or collector's edition with bonus content, for example. Also note that you are required to assign a unique ISBN to different ebook formats, such as to an epub and a mobi file, though not to an epub that is distributed to different retailers.
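As a sketch of how a UUID might sit alongside (or in place of) an ISBN, the entry below uses the List 5 code for a URN mentioned above. The "uuid-id" name and the UUID value itself are placeholders invented for illustration.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- Placeholder UUID: generate your own rather than copying this one -->
  <dc:identifier id="uuid-id">urn:uuid:123e4567-e89b-12d3-a456-426614174000</dc:identifier>
  <!-- 22 is the List 5 code for a URN -->
  <meta refines="#uuid-id" property="identifier-type" scheme="onix:codelist5">22</meta>
</metadata>
```

Whichever identifier you want treated as the primary one is the one whose id should be named in the unique-identifier attribute of the package declaration.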
The next few metadata sections give the publication's title and credit its producers. You are required to include a title, for obvious reasons, but not an author, as works can be anonymous. You also do not need to add a publisher, since not all works are officially published. You may merely want to make an ebook to share with friends and family, or use for promotion (such as a sample chapter), or offer free like my iBooks template. While I list a publisher, it isn't really necessary in this case, except to give you an example of how the data value should be entered:
<dc:publisher id="en_publisher" xml:lang="en-US">Fantasy Castle Books</dc:publisher>
Here the id isn't really necessary either, since it's never referenced anywhere, nor is the xml:lang tag, although both are used in this instance to define the publisher's purview and language as English. You could just put <dc:publisher>Publisher Name</dc:publisher> and be just fine. This isn't one that needs a lot of clarification. But were you to publish in a number of different territories and languages, under different names, with specific rights applied to each region, this is one place you could define those values for this publisher.
Further, if you wanted to publish the same document in different territories with different primary languages (but didn't want or have the means to issue a separately translated edition), or were publishing a dual-language edition, you would include the publisher element in English (as here), and also on another line in the other language, such as
<dc:publisher id="es_publisher" xml:lang="es-ES">Fantasía Castillo Libros</dc:publisher>
for a Spanish audience, or English/Spanish edition.
Incidentally, the xml:lang tag can be applied only to 9 of the 15 DC elements, these being: contributor, coverage, creator, description, publisher, relation, rights, subject and title.
While the title now has much greater capability to define a complex title, such as those with a subtitle, or in a series, it is also much more complicated to deal with. Fortunately we have already seen how some of it works under the identifier entry, so most of this will now be familiar:
<dc:title id="en_title" xml:lang="en-US">iBooks Fixed-Layout Template</dc:title> <!-- REQUIRED -->
<meta refines="#en_title" property="title-type">main</meta>
<dc:title id="edition">V.1.25</dc:title>
<meta refines="#edition" property="title-type">edition</meta>
<dc:title id="full_title">iBooks Fixed-Layout Template V.1.25</dc:title>
<meta refines="#full_title" property="title-type">expanded</meta>
As noted, the first line here is the only one required. And again, if not using a meta refines entry, the id can be omitted. But here we're actually applying three successive refinements to the first line, which contains the primary title. It is defined as the primary title not only by being listed first (which is important), but also by appending the "main" title-type property to it in the meta refines reference associated with its id value (en_title).
The second set of dc:title metadata is identified as an edition, and given a version number that is primarily of relevance to the creator of the publication, but which can also be used to identify successive versions of an updated file for the retailer (if not using Apple's own Versioning metadata, for example). Since we're building an iBooks file here and will be using Versioning (which we'll get to momentarily), this is only useful for internal purposes. In this case I used it to define successive versions of the file as I was making and/or updating it. That way I know which one I'm working on if I need to back up a step or two, or make changes to the newest file at a later date.
You can use any value that you want to for the name here, such as "1st Edition" or "Collector's Edition" or what have you. It doesn't need to have a number. Moreover, you can use a separate metadata entry for the "volume" number, or a whole host of other options. The list is not exhaustive, and technically you could define the title-type as anything you like, as there is no "officially" recognized set at present. The title set of meta refines attributes and values is, in fact, rather nebulous at this point, with a very unclear method of applying certain elements, such as subtitles, which is yet another value you could add as the title-type, in which case the named entry on the first line would be the subtitle. Since this template doesn't have one I've used an edition number instead. Another one would be a series name, such as The Lord of the Rings, which might have a volume number of "Part I" for the primary title of The Fellowship of the Ring. As you can see, the titles can now get quite lengthy and complex, which is both an asset and a burden. But at any rate it's there for you to make use of as you see fit.
Finally, the last set here gives the full expanded title, which combines the first two into one long name. In some cases you might want to put this full title as the first primary instance, and then just define its separate parts with refines values underneath. Conversely, you might not want that to be the title that shows up on the product page, as I would not here with the version number appended, since that's just intended for my own use, or for others who might want to know if they have the latest version, but don't necessarily need it in the title. For the Tolkien example above, the expanded title would be The Lord of the Rings, Part I: The Fellowship of the Ring, which would be perfectly admissible as the primary title, but is probably best left as separate titles. This is still all fairly new, and its use and application in the broader world of e-commerce and digital distribution has yet to be worked out.
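To illustrate, the Tolkien example might be entered along the lines of the sketch below. The id names are arbitrary, and the use of "collection" as the title-type for the series name is an assumption based on the values shown in the IDPF's own examples, so treat it as one plausible arrangement rather than the only correct one.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- The primary title is listed first and marked as "main" -->
  <dc:title id="main_title">The Fellowship of the Ring</dc:title>
  <meta refines="#main_title" property="title-type">main</meta>

  <!-- The series name, assuming "collection" as its title-type -->
  <dc:title id="series_title">The Lord of the Rings</dc:title>
  <meta refines="#series_title" property="title-type">collection</meta>

  <!-- The full title combining the parts above -->
  <dc:title id="full_title">The Lord of the Rings, Part I: The Fellowship of the Ring</dc:title>
  <meta refines="#full_title" property="title-type">expanded</meta>
</metadata>
```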
For further examples and the official specification see the Title Element section on the IDPF page.
<creator> / <contributor>
Like the title, the dc:creator and dc:contributor elements have a wide range of modifiers, including such functions as author, translator, editor, etc., and there can be multiples of each. None of these are required, but you will more than likely want to add at least the author of your work, and probably the illustrator for a children's book or graphic novel. These are given in the now more or less familiar format:
<dc:creator id="creator_aut">John Doe</dc:creator>
<meta refines="#creator_aut" property="role" scheme="marc:relators">aut</meta>
<meta refines="#creator_aut" property="file-as">Doe, John</meta>
Unlike the others, however, we now have an instance of multiple refinements to a single element. This shows how they can be stacked to produce a much more explicit data set than we would get by just providing the name as <dc:creator>John Doe</dc:creator>.
Again, the id value can be anything you choose, but I tend to like them to be as logical as possible. The value "creator_aut" tells me what I need to know at a glance, so that I readily know what bit of data I am modifying. This can become quite complicated when you have a hundred lines of entries listing all the many contributors to your project, particularly when they each now get more than one line.
For the creator/contributor entries we will use the MARC Code List for Relators, which is referenced as the scheme here, with a property that lists it as a role. These roles are given as three character sets, such as "aut" here for author, which is fairly obvious, though not all of them are. The other entry I have given in the template, for example, is my role as "mrk" - which some of you might guess correctly stands for "markup," which is what the ebook coder's function is. Most are far more obvious than this, however, such as "ill" for illustrator, or "trl" for translator, although "clr" for colorist is a bit opaque, as it not only represents the person who applies the color, but also the inker who applies the line art. The list is fairly old, apparently. Other odd ones are "aus" for screenwriter (author of screenplay), and "win" for the writer of an introduction. Use "dsr" (designer) for any graphic layout people, and "edt" for editors. If you happen to add an audio track for the Read Aloud feature you can credit the narrator using "nrt." Any contributors you are unsure of can be listed as "oth" for other.
There is also a relator code for the publisher ("pbl"), so you could put that entity here instead of as it's given above, although the first is probably better, unless perhaps the publisher is more involved in the actual creation process than usual. The distinction here, by the way, between creator and contributor is not clearly defined, but might best be thought of in terms of those who produce the actual content (the creators) and those who help to shape it afterwards (the contributors). Incidentally, you would simply replace one word with the other in the lines of metadata, as the two are otherwise used in exactly the same way.
Be sure to visit the referenced website above for complete listings of your options.
Finally, I should mention here the file-as tag, which allows you to specify how you want your author's name to be listed in catalogs, which is generally last name first. If you leave this out your book will be listed under the first name in iBooks, which is not only confusing to find in your library, but cries out "amateur." I don't know how many times I've had to fix this in Calibre for "professionally" produced ebooks I purchased, even from the major trade publishers (though they have gotten much better over the years I will admit).
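Here is a sketch of how several people might be credited at once, each with a role and a file-as entry. The names are invented for illustration, and the choice of which person counts as a creator and which as a contributor is only one reasonable reading of the distinction described above.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- Author (hypothetical name) -->
  <dc:creator id="creator_aut">John Doe</dc:creator>
  <meta refines="#creator_aut" property="role" scheme="marc:relators">aut</meta>
  <meta refines="#creator_aut" property="file-as">Doe, John</meta>

  <!-- Illustrator (hypothetical name) -->
  <dc:creator id="creator_ill">Jane Roe</dc:creator>
  <meta refines="#creator_ill" property="role" scheme="marc:relators">ill</meta>
  <meta refines="#creator_ill" property="file-as">Roe, Jane</meta>

  <!-- Ebook coder, credited as a contributor (hypothetical name) -->
  <dc:contributor id="contributor_mrk">Sam Smith</dc:contributor>
  <meta refines="#contributor_mrk" property="role" scheme="marc:relators">mrk</meta>
  <meta refines="#contributor_mrk" property="file-as">Smith, Sam</meta>
</metadata>
```

Note that every id is unique and each refines value points back to exactly one of them, which is what keeps a long list of credits from getting tangled.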
There are several events you can use for your date entry, including date of creation, copyright, and publication, of which you can include one, all, or none. You must, however, include the date the file was last modified:
<meta property="dcterms:modified">2015-01-01T00:00:00Z</meta> <!-- REQUIRED -->
You must also use the format given for the date/time stamp, which corresponds to the year-month-day format of ####-##-## (e.g. 2012-01-24), plus the time format of T##:##:##Z (e.g. T12:01:01Z). You don't actually have to include all of this information (such as the minutes and seconds) - you can instead add zeroes if you want - but the format must be given exactly as specified or the file will be rejected. What you must do in any event, is change the time stamp when and if you ever update the file and resubmit it, as a new edition with the same date and time as a preceding one will also be rejected.
Incidentally, the modified date is required not just by Apple but by the EPUB3 specification itself, as are the other three. So even if you were making an epub for upload to Kobo or Google you would still have to add it. And on that note I should also mention that for the most part this file will work as a general epub document on other retailers, since they have removed the com.apple file in favor of the EPUB3 metadata. The few bits of iBooks specific data left would simply be ignored, and aren't required or relevant to other devices in any case (i.e. the iBooks "faux" binding).
If you want to add additional dates for other important points, such as those listed above, just use the basic Dublin Core format:
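Here you don't have to add the time stamp, since this isn't a required value anyway. Be aware, though, that EPUB3 allows only one dc:date entry, which is reserved for the publication date, and that is the only one likely to show up anywhere. Any other date events would just be for internal reference, and can be recorded in meta entries using the dcterms date properties (dcterms:created, for example).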
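As a sketch, a publication date in the basic Dublin Core format might look like the first entry below. Both date values are made up, and the second entry, which records a creation date as a dcterms property in a meta element, is my assumption of how you might note a further date for your own reference.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- Publication date (hypothetical value), no time stamp needed -->
  <dc:date>2015-01-01</dc:date>
  <!-- Assumed way of recording an additional date for internal reference -->
  <meta property="dcterms:created">2014-11-15</meta>
</metadata>
```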
A statement of your rights is allowed, though not required, and just typically states All Rights Reserved (or creative commons, public domain, or whatnot). You can be as specific or general as you want here.
<dc:rights id="en_rights" xml:lang="en-US">Copyright 2015 - All Rights Reserved</dc:rights>
While the id="en_rights" and xml:lang="en-US" tags might at first glance seem to mean the rights apply only to English territories, in fact they only specify that the copyright statement itself is written in English. The appended tags have no legal bearing whatsoever, and only the statement itself is legally binding (and even then only insomuch as you can assert those rights in a court of law within your jurisdiction). I have included these to show where you might add specific entries for different territories in which you plan to distribute your work and wish to declare specific rights to each, for example if you're doing translations into different languages or reserving/selling specific rights in different countries. In most cases, if you're getting this involved you'll want to consult a literary agent or legal representative who knows publishing law.
Regarding the © symbol, while elsewhere in the HTML you are generally better off using the &copy; named reference rather than the copyright symbol itself, which may not render correctly on older reading systems, it should be noted that within the xml framework of the opf document you cannot use the ampersand within a value string, as it will cause the file to malfunction and will be rejected. For HTML, however, you could also use the Decimal value &#169; or the Hexadecimal &#xA9; entry, as well as &copy;, all of which denote the copyright symbol. Just make sure you include the semi-colon at the end or it will render as characters rather than the symbol.
The third and final element you are required to include is a language tag, and this should employ the standard RFC5646 Unicode language identifiers (although try this older list instead, as it's much easier to read: RFC 3066). The system employs either a base two-letter code, or that plus a secondary string after a dash. So, for example, English can be either plain en, or en-US for standard United States dialects, or en-GB for Great Britain strains, or any of a host of others.
<dc:language id="pub_language">en-US</dc:language> <!-- REQUIRED -->
Generally just the base language is all that is needed, unless you are writing in a more specific idiom. You can also add as many language entries as your publication warrants, but there must be at least one.
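For a dual-language edition like the English/Spanish one imagined in the publisher example earlier, the language entries might simply be stacked as in this sketch. The choice of plain base codes here is just for illustration.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- One entry per language used in the publication; at least one is required -->
  <dc:language>en</dc:language>
  <dc:language>es</dc:language>
</metadata>
```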
There are no hard and fast rules regarding the type element, and no comprehensive list of values. The IDPF provides a few examples on its somewhat tentative EPUB Registry of Publication Types, where such ebook types as dictionaries, guides, and indexes are offered, but there are no entries whatsoever for works of fiction, as if they weren't a type of publication at all. In essence, the type element has no real value, and as far as I know it never shows up anywhere. You can enter category data such as whether the work is fiction, non-fiction, poetry, etc., but so far as I can tell there is no reason to include it. If you do add one, be aware that you can add only one, since a publication can only be of one type.
Subject, on the other hand, is where you would add Library of Congress Classifications, BISAC Subject Headings, or other genre/topic data, such as those used by Amazon to categorize their products into lists with sales rankings. One reason you're adding all this extra information is precisely so that retailers can add it to their product pages. Adding it here facilitates the quick and accurate transfer of metadata concerning your work, and this is your chance to make sure it's right. Moreover, you can generally only add a limited amount of data to the retailer's product pages manually, but you can put as much metadata in the OPF as you can manage to think of, and much of it will be pulled by the automated file processing apparatus of the e-retailing system.
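A couple of subject entries might look like the sketch below. The values are illustrative only (one written in the style of a BISAC heading, one a plain keyword), so look up the exact wording in whichever classification list you are actually using.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.idpf.org/2007/opf">
  <!-- Illustrative values: verify headings against the current lists -->
  <dc:subject>FICTION / Fantasy / Epic</dc:subject>
  <dc:subject>fantasy</dc:subject>
</metadata>
```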
One other tag you might include is format (iBooks in this case, but ePub or Kindle or whatever elsewhere), using the aforementioned codelist references from the ONIX code lists (List 10, for example). This might seem redundant since you've got the ebook right here in front of you, but not everyone reading this data will, and it's one more way this specific iteration of your work can be identified. For example, a library may be looking at a metadata listing in search of a particular format to include in their catalog, and other general ebook retailers will want to identify the format for their customers before selling it to them via whatever systems are put in place down the line as ebooks become more common.
This is where you would place the back jacket blurb or other descriptive content that tells the reader what the book is about. Anything you might desire a potential customer to know could go here, including reviews, extracts, or a general description such as what you would read on any book page. Give it some thought as it will show up all over the Internet on every ebook retailer, and once it's there it's there for good.
<meta name="cover" content="coverimage" />
The remaining items in the metadata section all begin with the meta attribute, and with the exception of the line above are dealt with in the final part of this tutorial, the update section dealing with the main EPUB3 changes in iBooks files. However, the line above has always been, and still remains, essential, and must be entered exactly as given, with the single caveat that the content value can be whatever you want, so long as it matches the id value for the cover image in the manifest below, which we will look at next.
Just like a shipping invoice, the manifest lists all the items included in the package. Every file in the OEBPS folder must be listed here, with the exception of the content.opf itself. In addition, any files in the META-INF folder are excluded, since in order to get to the manifest the system will already have employed those files.
<item id="coverimage" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"></item>
Each item gets its own line entry, starting with an item id that includes a descriptive name of your choice, which in the case of the line above is the coverimage referenced in the metadata "cover" entry we just looked at. This is followed by a href link which references the file's location so the reading system knows where to find it. Bear in mind that the reference is relative to where the OPF file resides, so any files in a subfolder must have that included in the link.
Next there is a media-type that tells the system what kind of file the item is, and must be correct for the item to function (the file extension isn't enough in itself, apparently). I have provided the main media types you might use, although for images you can also have image/png files as well as jpegs, but not gifs or tifs, which are not supported. Note that all html files are listed as application/xhtml+xml, regardless of what extension you use for the actual file itself (.xhtml, .html, .xml, etc.), but EPUB3 files should be built with xhtml to be compliant. You can also use OpenType and SVG fonts in addition to TrueType.
These three items can come in any order, but they all must be included for every entry in the manifest. Lastly, here we have an additional properties attribute that is unique to only this entry and the navigation document. Only these two files need this added bit of data, and in the case of the cover image, the value must be "cover-image" and for the nav doc it must be "nav" exactly as given. The id value names can be anything you like, but the rest must conform to the expected format or the file will not work.
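To show how the pieces fit together, here is a sketch of a small manifest. Apart from the cover image line, every file name and id below is hypothetical, so substitute the ones actually used in your own OEBPS folder.

```xml
<!-- Sits inside the package element; file names and ids are examples only -->
<manifest>
  <!-- The two entries that carry a properties attribute -->
  <item id="coverimage" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"/>
  <item id="nav" href="toc.xhtml" media-type="application/xhtml+xml" properties="nav"/>

  <!-- NCX table of contents -->
  <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>

  <!-- Content pages -->
  <item id="page01" href="page01.xhtml" media-type="application/xhtml+xml"/>
  <item id="page02" href="page02.xhtml" media-type="application/xhtml+xml"/>

  <!-- Supporting files in subfolders, with the folder included in the href -->
  <item id="stylesheet" href="css/stylesheet.css" media-type="text/css"/>
  <item id="image01" href="images/page01.png" media-type="image/png"/>
</manifest>
```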
I have divided the various types of files into sections with headers for ease of reference, although you do not have to do this. It's just a way to keep your items organized and easy to find, which you will find is necessary if you hope to maintain any level of sanity.
The spine is a linear listing of the ebook contents in the order they will be presented, just as the pages in a print book are attached in specific order to the spine (hence, the name). Here you enter each html page you create using the item id you specified in the manifest above, as such:
where the idref is equal to the item id in the manifest. Only the html pages themselves need be entered, and not their component parts (i.e. css, images, etc.). Just list them all in the order you want the reader to see them and call it good.
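A spine for a two-page book might look like the sketch below. It assumes manifest items with the hypothetical ids "page01" and "page02", and the toc attribute assumes your toc.ncx was given the id "ncx" in the manifest; adjust both to match your own entries.

```xml
<!-- Sits inside the package element, after the manifest -->
<spine toc="ncx">
  <!-- Each idref must equal an item id from the manifest, listed in reading order -->
  <itemref idref="page01"/>
  <itemref idref="page02"/>
</spine>
```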
NOTE: The previously-included Guide section of the OPF is no longer used, having first been superseded by the NCX, and now by the navigation document. Even in ePub2 it had no real functionality, as I discussed in the section on the Guide that used to be here. However, as there was no reason to keep it, I have removed that entire section, both from the template and this tutorial. That's one less thing you need to deal with!
TEST YOUR EBOOK
You now have enough content and supporting files to load your ebook into iBooks via iTunes and give it a test run. Just plug your device in, click on the books icon for the device in iTunes, and drag your ebook file there. It will show up on the Books tab in your iBooks library. If anything goes haywire you'll generally get an error message on the relevant page giving you a line and column reference to the offending element.
Of course, if it doesn't load at all you'll have to backtrack and work out your error. I've tried to make this as easy as possible by providing a working template into which you can simply add your own content, but all sorts of things can (and usually do) go wrong. Just take it one step at a time, make one change and test it to make sure it works, and then move on. Don't try to build your entire ebook all at once and then wonder why it won't work, because you'll never be able to find out. Just get the core files built with one page and get that to work. Using the template makes it easy, because all you have to do is change what's in it to what you want it to be. It already works, so just don't break it!
Of course, all you have so far is a book of pictures, which is fine if you're making a photo album, but you'll likely want to add some text even to that, so in the next installment we'll discuss embedding fonts and adding active text to your growing book.