First, Some Facts
Once upon a time, there was a lovely language called HTML, which was so simple that writing websites with it was very easy. So, everyone did, and the Web transformed from a linked collection of physics papers to what we know and love today.
Most pages didn’t conform to the simple rules of the language (because their authors were rightly concerned more with the message than the medium), so every browser had to be forgiving with bad code and do its best to work out what its author wanted to display.
In 1999, the W3C decided to discontinue work on HTML and move the world toward XHTML. This was all good, until a few people noticed that the work to upgrade the language to XHTML2 had very little to do with the real Web. Being XML, the spec required a browser to stop rendering if it encountered an error. And because the W3C was writing a new language that was better than simple old HTML, it deprecated elements such as <img>
and <a>
.
A group of developers at Opera and Mozilla disagreed with this approach and presented a paper to the W3C in 2004 arguing that, “We consider Web Applications to be an important area that has not been adequately served by existing technologies… There is a rising threat of single-vendor solutions addressing this problem before jointly-developed specifications.”
The paper suggested seven design principles:
- Backwards compatibility, and a clear migration path.
- Well-defined error handling, like CSS (i.e. ignore unknown stuff and move on), compared to XML’s “draconian” error handling.
- Users should not be exposed to authoring errors.
- Practical use: every feature that goes into the Web-applications specifications must be justified by a practical use case. The reverse is not necessarily true: every use case does not necessarily warrant a new feature.
- Scripting is here to stay (but should be avoided where more convenient declarative mark-up can be used).
- Avoid device-specific profiling.
- Make the process open. (The Web has benefited from being developed in the open. Mailing lists, archives and draft specifications should continuously be visible to the public.)
The paper was rejected by the W3C, and so Opera and Mozilla, later joined by Apple, continued a mailing list called Web Hypertext Application Technology Working Group (WHATWG), working on their proof-of-concept specification. The spec extended HTML4 forms, until it grew into a spec called Web Applications 1.0, under the continued editorship of Ian Hickson, who left Opera for Google.
In 2006, the W3C realized its mistake and decided to resurrect HTML, asking WHATWG for its spec to use as the basis of what is now called HTML5.
Those are the historical facts. Now, let’s look at some hysterical myths.
The Myths
“I Can’t Use HTML5 Until 2012 (or 2022)”
This is a misconception based on the projected date that HTML5 will reach the stage in the W3C process known as Candidate Recommendation (REC). The WHATWG wiki says this:
For a spec to become a REC today, it requires two 100% complete and fully interoperable implementations, which is proven by each successfully passing literally thousands of test cases (20,000 tests for the whole spec would probably be a conservative estimate). When you consider how long it takes to write that many test cases and how long it takes to implement each feature, you’ll begin to understand why the time frame seems so long.
So, by definition, the spec won’t be finished until you can use all of it, and in two browsers.
Of course, what really matters is the bits of HTML5 that are already supported in the browsers. Any list will be out of date within about a week because the browser makers are innovating so quickly. Also, much of the new functionality can be replicated with JavaScript in browsers that don’t yet have support. The <canvas>
property is in all modern browsers and will be in Internet Explorer 9, but it can be faked in old versions of IE with the excanvas library. The <video>
and <audio>
properties can be faked with Flash in old browsers.
HTML5 is designed to degrade gracefully, so with clever JavaScript and some thought, all content should be available on older browsers.
“My Browser Supports HTML5, but Yours Doesn’t”
There’s a myth that HTML5 is some monolithic, indivisible thing. It’s not. It’s a collection of features, as we’ve seen above. So, in the short term, you cannot say that a browser supports everything in the spec. And when some browser or other does, it won’t matter because we’ll all be much too excited about the next iteration of HTML by then.
What a terrible mess, you’re thinking? But consider that CSS 2.1 is not yet a finished spec, and yet we all use it each and every day. We use CSS3, happily adding border-radius
, which will soon be supported everywhere, while other aspects of CSS3 aren’t supported anywhere at all.
Be wary of browser “scoring” websites. They often test for things that have nothing to do with HTML5, such as CSS, SVG and even Web fonts. What matters is what you need to do, what’s supported by the browsers your client’s audience will be using and how much you can fake with JavaScript.
HTML5 Legalizes Tag Soup
HTML5 is a lot more forgiving in its syntax than XHTML: you can write tags in uppercase, lowercase or a mixture of the two. You don’t need to self-close tags such as img
, so the following are both legal:
<img src="nice.jpg" />
<img src="nice.jpg">
You don’t need to wrap attributes in quotation marks, so the following are both legal:
<img src="nice.jpg">
<img src=nice.jpg>
You can use uppercase or lowercase (or mix them), so all of these are legal:
<IMG SRC=nice.jpg>
<img src=nice.jpg>
<iMg SrC=nice.jpg>
This isn’t any different from HTML4, but it probably comes as quite a shock if you’re used to XHTML. In reality, if you were serving your pages as a combination of text and HTML, rather than XML (and you probably were, because Internet Explorer 8 and below couldn’t render true XHTML), then it never mattered anyway: the browser never cared about trailing slashes, quoted attributes or case—only the validator did.
So, while the syntax appears to be looser, the actual parsing rules are much tighter. The difference is that there is no more tag soup; the specification describes exactly what to do with invalid mark-up so that all conforming browsers produce the same DOM. If you’ve ever written JavaScript that has to walk the DOM, then you’re aware of the horrors that inconsistent DOMs can bring.
This error correction is no reason to churn out invalid code, though. The DOM that HTML5 creates for you might not be the DOM you want, so ensuring that your HTML5 validates is still essential. With all this new stuff, overlooking a small syntax error that stops your script from working or that makes your CSS unstylish is easy, which is why we have HTML5 validators.
Far from legitimizing tag soup, HTML5 consigns it to history. Souper.
“I Need to Convert My XHTML Website to HTML5”
Is HTML5′s tolerance of looser syntax the death knell for XHTML? After all, the working group to develop XHTML 2 was disbanded, right?
True, the XHTML 2 group was disbanded at the end of 2009; it was working on an unimplemented spec that competed with HTML5, so having two groups was a waste of W3C resources. But XHTML 1 was a finished spec that is widely supported in all browsers and that will continue to work in browsers for as long as needed. Your XHTML websites are therefore safe.
HTML5 Kills XML
Not at all. If you need to use XML rather than HTML, you can use XHTML5, which includes all the wonders of HTML5 but which must be in well-formed XHTML syntax (i.e. quoted attributes, trailing slashes to close some elements, lowercase elements and the like.)
Actually, you can’t use all the wonders of HTML5 in XHTML5: <noscript>
won’t work. But you’re not still using that, are you?
HTML5 Will Kill Flash and Plug-Ins
The <canvas>
tag allows scripted images and animations that react to the keyboard and that therefore can compete with some simpler uses of Adobe Flash. HTML5 has native capability for playing video and audio.
Just as when CSS Web fonts weren’t widely supported and Flash was used in sIFR to fill the gaps, Flash also saves the day by making HTML5 video backwards-compatible. Because HTML5 is designed to be “fake-able” in older browsers, the mark-up between the video tags is ignored by browsers that understand HTML5 and is rendered by older browsers. Therefore, embedding fall-back video with Flash is possible using the old-school <object>
or <embed>
tags, as pioneered by Kroc Camen is his article “Video for Everybody!” (see the screenshot below).
But not all of Flash’s use cases are usurped by HTML5. There is no way to do digital rights management in HTML5; browsers such as Opera, Firefox and Chrome allow visitors to save video to their machines with a click of the context menu. If you need to prevent video from being saved, you’ll need to use plug-ins. Capturing input from a user’s microphone or camera is currently only possible with Flash (although a <device>
element is being specified for “post-5″ HTML), so if you’re keen to write a Chatroulette killer, HTML5 isn’t for you.
HTML5 Is Bad for Accessibility
A lot of discussion is going on about the accessibility of HTML5. This is good and to be welcomed: with so many changes to the basic language of the Web, ensuring that the Web is accessible to people who cannot see or use a mouse is vital. Also vital is building in the solution, rather than bolting it on as an afterthought: after all, many (most?) authors don’t even add alternate text to images, so out-of-the-box accessibility is much more likely to succeed than relying on people to add it.
This is why it’s great that HTML5 adds native controls for things like sliders (<input type=range>
, currently supported in Opera and Webkit browsers) and date pickers (<input type=date>
, Opera only)—see Bruce’s HTML5 forms demo)—because previously we had to fake these with JavaScript and images and then add keyboard support and WAI-ARIA roles and attributes.
The <canvas>
tag is a different story. It is an Apple invention that was reverse-engineered by other browser makers and then retrospectively specified as part of HTML5, so there is no built-in accessibility. If you’re just using it for eye-candy, that’s fine; think of it as an image, but without any possibility of alternate text (some additions to the spec have been suggested, but nothing certain yet). So, ensure that any information you deliver via <canvas>
supplements more accessible information elsewhere.
Text between <canvas>
tags simply becomes pixels, just like text in images, and so is invisible to assistive technology and screen readers. Consider using the W3C graphics technology Scalable Vector Graphics (SVG) instead, especially for things such as dynamic graphs and animating text. SVG is supported in all the major browsers, including IE9 (but not IE8 or below, although the SVGweb library can fake SVG with Flash in older browsers).
The situation with <video>
and <audio>
is promising. Although not fully specified (and so not yet implemented in any browsers), a new <track>
element has been included in the HTML5 spec that allows timed transcripts (or karaoke lyrics or captions for the deaf or subtitles for foreign-language media) to be associated with multimedia. It can be faked in JavaScript. Alternatively (and better for search engines), you could include transcripts directly on the page below the video and use JavaScript to overlay captions, synchronized with the video.
“An HTML5 Guru Will Hold My Hand as I Do It the First Time”
If only this were true. However, the charming Paul Irish and lovely Divya Manian will be as good as there for you, with their HTML5 Boilerplate, which is a set of files you can use as templates for your projects. Boilerplate brings in the JavaScript you need to style the new elements in IE; pulls in jQuery from the Google Content Distribution Network (CDN), but with fall-back links to your server in case the CDN server is down.
It adds mark-up that is adaptable to iOS, Android and Opera Mobile; and adds a CSS skeleton with a comprehensive reset style sheet. There’s even an .htaccess file that serves your HTML5 video with the right MIME types. You won’t need all of it, and you’re encouraged to delete the stuff that’s unnecessary to your project to avoid bloat.
Further Resources
HTML5 is a massive topic. Here are a few hand-picked links. Disclosure: the authors have their fingers in some of these pies.