Say I want to provide transcriptions of a comic to my site visitors, or maybe I'm making a comic with SVG. How best should I mark things up to make the meaning clear to assistive technology, search engines, and anyone looking at the source?

XML-style

Let's pretend it's the early 2000s, and XML is obviously the future. Once Microsoft ships XHTML support in IE7, there'll be no reason not to define a DTD for our own comic XML schema. We'd probably have something like this:

<comic>
  <about>
    <title>Krazy Kat</title>
    <creator>George Herriman</creator>
    <published timestamp="1927-10-13">Thursday, October 13th, 1927</published>
  </about>
  <panel>
    <art>
      <image url="/panel-1.png"/>
      <description>Krazy Kat is walking along a desert road. The moon hangs low among rock formations.</description>
    </art>
    <dialogue>
      <speaker>Krazy Kat</speaker>:
        <line xml:lang="en-x-krazy-kat">Oh, wot an oiful chul!</line>
    </dialogue>
  </panel>
  <panel>
    <!-- you get the idea -->
  </panel>
</comic>

Of course, an actual comics XML vocabulary does exist, and it looks more like this:

<cbml:panel ana="#action-to-action"
            characters="#cap #anon_man" n="5" xml:id="eg_000">
    <cbml:caption>Cap acts quickly to tranquilize the gun-happy pedestrian…</cbml:caption>
    <cbml:balloon type="speech" who="#cap" xml:id="eg_007">A little <emph rendition="#b">sleep</emph> will do wonders for you!</cbml:balloon>
    <sound>SPLAT!</sound>
    <cbml:balloon type="speech" who="#anon_man">Ugh!</cbml:balloon>
</cbml:panel>

Pretty horrible. The full vocabulary is even scarier. But, really, given enough time and attention, all markup languages become XML. Even with friendly ol' HTML, once we're done applying styling/scripting hooks, interweaving images with text, and the usual bits 'n bobs, we're going to have the equivalent of CBML, except with much vaguer tag names.

Markup languages are tough. We're taking a picture, writing out its thousand words, cross-referencing those words with other pictures-as-words, and then breaking them all down into machine-readable representations of concepts. There's no way it wouldn't result in a horrible mess.

Anyway, let's start with some friendly ol' HTML.

HTML5 document semantics

First, an <article>, because that makes sense, doesn't it?

<article class="comic">
</article>

There are a couple attributes we can add if the comic's language differs from the surrounding web page. Imagine we have an English comic on an Arabic site, for instance:

<article class="كوميدي" dir="ltr" lang="en">
</article>

But to keep things simple (and so I don't have to keep looking up class names in Arabic), we'll assume the web page and the comic are the same language.

Next, we'll probably need a <header>:

<article class="comic">
  <header>
    <h1>Comic Title</h1>
    <address>by <a href="/about-me" rel="author">Author</a></address>
    <time datetime="2010-01-15">January 15th, 2010</time>
  </header>
</article>

And a <footer>, containing your usual watermarked info:

<article class="comic">
  <header>
    <h1>Comic Title</h1>
    <address>by <a href="/about-me" rel="author">Author</a></address>
    <time datetime="2010-01-15">January 15th, 2010</time>
  </header>

  <footer>
    <small><a href="http://copyright.gov/title17/" rel="license">© 2010</a></small>
  </footer>
</article>

That's the easy parts done. Now then—how do you mark up a panel?

My first guess is <section>, but the spec is clear on <section>s being something you'd see in a table of contents. Panels are way too granular for that; they're almost the paragraph of comics. And it's even valid to use <p> that way:

<p class="panel">
  <svg role="group">
    <image xlink:href="/panel-1.png">
      <title>Krazy Kat walks down a desert road.</title>
      <desc>The moon hangs low among rock formations.</desc>
    </image>

    <text class="speech-balloon">
      <title>Krazy Kat says:</title>
      <tspan xml:lang="en-x-krazy-kat">Oh, wot an oiful chul!</tspan>
    </text>
  </svg>
</p>

The solution is to realise that a paragraph, in HTML terms, is not a logical concept, but a structural one.

And this isn't far off from their formal definition, where paragraphs are containers organizing related ideas. This is one of those situations where separation of style and content breaks down; you can change the meaning of a passage with a deftly-placed paragraph break, but what's considered a "real" paragraph and what isn't is... kind of an academic argument.

And things only get muddier from there: what Japanese considers a "paragraph" is quite different from English. If paragraphs were integral to the meaning of something, this sort of disagreement shouldn't be possible, yeah?

Which leads nicely into what, I think, the point of HTML semantics should be: you can rationalize anything, as long as the user experience is good. I would much rather write semantically "incorrect" markup if it made a screen-reader describe things more pleasantly. This provides a nice concrete measuring stick, because one could argue rarefied abstract semantics until they're blue in the spec.

Which leads me to my next question: if I present multiple comic "pages" in a single document, should I mark those up as <section>s? It may be "stylistic" because that's just how the panels fit into the allotted space, but they do have a "header" of their page number, or whatever, which is useful to refer to. Though, I can't imagine doing this for a Table of Contents:

  1. Page 1
  2. Page 2
  3. Page 3
  4. Page 4

So probably scrap that, too. <section>s would be best for longer-form narrative comics with actual chapters. But hm, pages, numbers, order...

<ol class="panels">
  <li class="panel"></li>
  <li class="panel"></li>
  <!-- etc. -->
</ol>

I can dig it. Screen readers typically announce lists as having X items and provide a way to navigate them, so this seems pretty ideal. And even if we were to do things wholly in SVG, ARIA roles give us the same list semantics:

<svg role="list">
  <g class="panel" role="listitem"/>
  <g class="panel" role="listitem"/>
  <!-- etc. -->
</svg>

It's also possible to group sets of list items like this, unlike with <ol> and <ul>:

<svg role="list">
  <g class="panel-set" role="group">
    <g class="panel" role="listitem"/>
    <g class="panel" role="listitem"/>
  </g>
</svg>
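
Back in plain HTML, a filled-in panel might look roughly like the sketch below. It reuses the Krazy Kat panel from earlier; folding the whole description into alt is just my assumption of one workable approach, not the only option.

```html
<ol class="panels">
  <li class="panel">
    <!-- File name and description reuse the Krazy Kat example above -->
    <img src="/panel-1.png"
         alt="Krazy Kat walks down a desert road. The moon hangs low among rock formations.">
    <!-- the panel's transcript would go here -->
  </li>
  <!-- etc. -->
</ol>
```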

Anyway. We've got the structure figured out, but what about the actual content? How do you mark up a transcript of what's going on in the comic?

Marking up conversations

This is a classic issue. HTML5 used to have a <dialog> tag for this, but it was repurposed for modal windows. Now the spec says:

Instead, authors are encouraged to mark up conversations using p elements and punctuation. Authors who need to mark the speaker for styling purposes are encouraged to use span or b. Paragraphs with their text wrapped in the i element can be used for marking up stage directions.
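
Taken literally, that advice comes out something like this sketch. The "speaker" class is a placeholder of mine, and the lines are borrowed from the Krazy Kat panel above:

```html
<!-- Stage direction: a paragraph with its text wrapped in <i> -->
<p><i>Krazy Kat walks down a desert road.</i></p>

<!-- Dialogue: <b> marks the speaker, punctuation does the rest -->
<p>
  <b class="speaker">Krazy Kat:</b>
  <span lang="en-x-krazy-kat">Oh, wot an oiful chul!</span>
</p>
```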

Which, eh, okay. What else have we got? There's a lot of arguing about this. Kyle Weems of CSSquirrel wrote up longdescs for his comics, so we can look at those.

He uses <ul>s for the individual lines of dialogue, which should be <ol>s, but minor disagreement there. He does do the actual lines quite interestingly:

  <li><cite>Luke Wroblewski:</cite> <q>Look at the Lumia 900 with the Windows Phone OS! It's brilliant, and looks nothing like iOS! There's still room for innovation without imitation!</q></li>

Maybe it's not perfect according to a strict interpretation, but this does a good job working with the limited tools we have.
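
Here is how I'd sketch the same pattern with an <ol>, borrowing two lines from the Frantic Stein strip marked up further down. The "transcript" class is a hypothetical hook, not something from Kyle's markup:

```html
<ol class="transcript">
  <li><cite>Stein's boss:</cite> <q>I want you to foil any such attempt!</q></li>
  <li><cite>Frantic Stein:</cite> <q>So I board the steamer to Europe! Sounds interesting!</q></li>
</ol>
```

Browsers add quotation marks around <q> by default, so the transcript text itself doesn't need them.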

Unfortunately, if I were to use SVG to mark up a comic (and I'd argue I should, since it's text and graphics working together), even HTML's wimpy vocabulary is unavailable. It's possible to fake things with <tspan>, such as:

  <tspan class="strong">What!?</tspan>

  .strong { font-weight: bold; }

...Which isn't that different from HTML, because believe it or not, most screen readers don't announce <strong> or <em> any differently by default. We sure could convey that emphasis if we rolled our own read-aloud with the Web Speech API, but we'd have to do that work either way.
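
A rolled-our-own read-aloud could start out like the rough sketch below. The ids and the pitch and volume numbers are placeholders I made up; the only real API calls are SpeechSynthesisUtterance and speechSynthesis.speak().

```html
<style>
  .strong { font-weight: bold; }
</style>

<svg viewBox="0 0 240 40">
  <text id="line" x="10" y="25">Frantic shouts: <tspan class="strong">Darwyn!</tspan></text>
</svg>
<button type="button" id="read-aloud">Read aloud</button>

<script>
  // Sketch only: the ids, and the pitch and volume numbers, are placeholders.
  document.getElementById("read-aloud").addEventListener("click", () => {
    if (!("speechSynthesis" in window)) return; // browser lacks Web Speech support

    for (const node of document.getElementById("line").childNodes) {
      const words = node.textContent.trim();
      if (!words) continue;

      // Text nodes have no classList, so check the node type first.
      const emphatic = node.nodeType === Node.ELEMENT_NODE &&
                       node.classList.contains("strong");

      const utterance = new SpeechSynthesisUtterance(words);
      utterance.volume = emphatic ? 1 : 0.7; // allowed range: 0 to 1
      utterance.pitch = emphatic ? 1.5 : 1;  // allowed range: 0 to 2
      window.speechSynthesis.speak(utterance); // queued and spoken in order
    }
  });
</script>
```

Each chunk of text becomes its own utterance, so the emphasized word can be spoken louder and higher than the text around it.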

I think most speech balloons would end up like this:

<g class="speech-balloon">
  <title>Character shouts:</title>
  <path d="..." role="presentation" class="balloon-shape"/>
  <text>
    Who's there?
  </text>
</g>

A <g> with a <title> or <desc> in it is implicitly assigned the group role, which works here, I'm pretty sure. The speaking character is conveyed with the balloon tail pointing at them, so it's a good idea to let visually-impaired users have a text fallback with the <title>.

Unfortunately, <title> is often rendered as hover text in browsers, so it's the alt vs. title debacle all over again. Pity, really. To avoid this, alternatives can be produced with aria-label or aria-labelledby.
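
As a sketch, the aria-label version of the balloon above might look like this. I'm assuming the role needs spelling out once there's no <title> to trigger it, and the viewBox and text coordinates are made-up placeholders; screen reader support for this is something to test rather than trust.

```html
<svg viewBox="0 0 300 100">
  <!-- aria-label replaces <title>, so there is no hover tooltip -->
  <g class="speech-balloon" role="group" aria-label="Character shouts:">
    <path d="..." role="presentation" class="balloon-shape"/>
    <text x="20" y="50">Who's there?</text>
  </g>
</svg>
```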

Okay that was a lot of rambling

Let's see what we've got by marking up this strip from Frantic Stein, because it's public domain:

[Image: an old public domain comic strip with way too many words to fit into an alt text]

<article class="comic" lang="en-us">
  <header>
    <h1>Frantic Stein</h1>
    <p>By George Mercer</p>
  </header>
  <svg role="list">
    <g class="panel" role="listitem">
      <text class="caption">F.B.I. Headquarters...</text>

      <image xlink:href="panel-1.png">
        <title>Frantic Stein's boss addresses him within his office.</title>
        <desc>The boss sits behind his desk, with his policeman's cap on the desktop. He's striking a very odd pose.</desc>
      </image>

      <g class="speech-balloon">
        <title>Stein's boss growls:</title>
        <text>We've received a tip to the effect that there will be an attempt to harm the steamer <tspan class="u">Atlantis</tspan> on its good-will voyage to Europe! I want you to foil any such attempt!</text>
      </g>
    </g>

    <g class="panel" role="listitem">
      <image xlink:href="panel-2.png">
        <title>Frantic realizes the gravity of the situation.</title>
        <desc>We close in on Frantic's face. He looks very serious.</desc>
      </image>

      <g class="speech-balloon">
        <title>Stein says:</title>
        <text>So I board the steamer to Europe! Sounds interesting!</text>
      </g>

      <text class="caption">Next day Frantic Stein is aboard The Atlantis as it sets out to sea--and an indeterminable fate!</text>
    </g>

    <g class="panel" role="listitem">
      <image xlink:href="panel-3.png">
        <title>The Atlantis, a large steamer ship, cruises the open sea.</title>
        <desc>The weather is clear, with the sea wavy but calm. The Atlantis cuts swiftly through the waves. It has 3 smokestacks and 2 radio masts.</desc>
      </image>

      <g class="speech-balloon">
        <title>Stein's voice rings out from the ship:</title>
        <text>Saw only one suspicious character... but he turned out to be the captain of the ship!</text>
      </g>
    </g>

    <g class="panel" role="listitem">
      <image xlink:href="panel-4.png">
        <title>From their backs, we see Frantic and an unknown woman look out over the ocean.</title>
        <desc>They are gazing over the ship's railing. Frantic is smoking, wearing a fedora and trenchcoat, and is gripping the railing with one hand. The woman clasps her hands behind her back, and is wearing a dress, fitted jacket, and sun hat with a long, flowing ribbon.</desc>
      </image>

      <g class="speech-balloon">
        <title>The woman says:</title>
        <text>Surprise!</text>
      </g>

      <g class="speech-balloon">
        <title>Frantic shouts:</title>
        <text><tspan class="strong">Darwyn!</tspan></text>
      </g>

      <text class="caption">Continued...</text>
    </g>
  </svg>
</article>

Of course, things will get much uglier when we add positioning attributes and styling hooks, but I think this is pretty good!
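
To give a taste of that ugliness, here is a rough sketch of the fourth panel with some positioning added. Every coordinate and size below is a made-up placeholder, not measured from the actual strip:

```html
<svg viewBox="0 0 400 300" role="list">
  <!-- All numbers here are illustrative placeholders -->
  <g class="panel" role="listitem" transform="translate(10 10)">
    <image xlink:href="panel-4.png" width="380" height="280">
      <title>From their backs, we see Frantic and an unknown woman look out over the ocean.</title>
    </image>

    <g class="speech-balloon" transform="translate(40 30)">
      <title>The woman says:</title>
      <text x="12" y="24">Surprise!</text>
    </g>
  </g>
</svg>
```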

Conclusion

That was a lot of words just to imitate XML. And with Web Components, we'll come full circle, but we'll call it <x-comic> instead.

