
Hi, I’m Henrik Joreteg

Mobile web consultant, developer, and speaker


Your XHTML efforts are (probably) wasted


So, last night I read this blog post about starting to use the HTML 5 doctype. That got me curious, because according to the post it’s perfectly fine to use the HTML 5 doctype, which is gloriously simplified compared to other doctypes:

HTML 5 looks like this:

<!DOCTYPE html>

(X)HTML 1.0 Strict looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Clearly, the HTML 5 version is actually memorizable (not a word… but should be), whereas every other doctype I’ve ever seen nearly brought tears to my eyes. So I’ve done a fair amount of digging and reading and learned something very interesting:

Tons and tons of web developers write XHTML-compliant sites but still serve them with the text/html MIME type, which means the browser treats them like plain ol’ HTML anyway!

Here’s the crux of the matter as described by Maciej Stachowiak on the WebKit blog:

So what really determines if a document is HTML or XHTML? The one and only thing that controls whether a document is HTML or XHTML is the MIME type. If the document is served with a text/html MIME type, it is treated as HTML. If it is served as application/xhtml+xml or text/xml, it gets treated as XHTML. In particular, none of the following things will cause your document to be treated as XHTML:

  • Using an XHTML doctype declaration
  • Putting an XML declaration at the top
  • Using XHTML-specific syntax like self-closing tags
  • Validating it as XHTML

In fact, the vast majority of supposedly XHTML documents on the internet are served as text/html. Which means they are not XHTML at all, but actually invalid HTML that’s getting by on the error handling of HTML parsers. All those “Valid XHTML 1.0!” links on the web are really saying “Invalid HTML 4.01!”.
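To make the rule in that quote concrete, here is a minimal Python sketch. The function name treated_as is made up for illustration, and it only covers the three MIME types the quote mentions; real browsers recognize more types than this.

```python
def treated_as(content_type: str) -> str:
    """Classify a document by its Content-Type header value only.

    Hypothetical helper: it mirrors the quoted rule, not any browser's code.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "HTML"
    if mime in ("application/xhtml+xml", "text/xml"):
        return "XHTML"
    # Anything else is outside the scope of the quoted rule.
    return "other"
```

Notice that the doctype, the XML declaration, and the markup itself never appear as inputs. Only the MIME type does.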

So who cares?!? What should we do? Well, arguably, as long as it renders properly, it doesn’t really matter, does it?!? But if you care, here’s what I suggest:

Either start serving your (X)HTML with the correct MIME type, application/xhtml+xml, and deal with the fact that IE6 can’t handle that (not that I care too much about IE6).
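If you go this route but still want something to show up in browsers that can’t handle application/xhtml+xml, one common approach is to look at the request’s Accept header and fall back to text/html. The sketch below is only an illustration of that idea: choose_content_type is a hypothetical helper, not part of any framework, and it assumes a browser that can parse XHTML lists the type explicitly in Accept.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a Content-Type for an XHTML page from the Accept header.

    Hypothetical helper; returns text/html unless the client explicitly
    accepts application/xhtml+xml with a non-zero q-value.
    """
    for item in accept_header.split(","):
        parts = [p.strip().lower() for p in item.split(";")]
        if parts[0] != "application/xhtml+xml":
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not accepted"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"
```

Keep in mind that any browser that gets the text/html fallback is back to treating the page as plain HTML, which is exactly the situation described above.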

…OR…

Change your doctype to match what you’re actually serving up, which means switching your doctype back to HTML 4.01 Strict:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

…OR…

Just do what Mozilla’s Web Developer FAQ recommends and what Google, Apple (in some cases), others, and I have started doing: just use the HTML 5 doctype! It puts browsers into standards mode all the way back to IE6, and you never really have to worry about copying funky doctypes again.
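If you have a pile of existing pages to switch over, the swap is easy to script. Here is a rough Python sketch; use_html5_doctype is a made-up name, and it assumes the first <!DOCTYPE ...> in the file is the real one. It does not touch an XML declaration if there is one, so remove that separately.

```python
import re

# Matches any doctype declaration, long or short, in any letter case.
_DOCTYPE = re.compile(r"<!DOCTYPE[^>]*>", re.IGNORECASE)

def use_html5_doctype(markup: str) -> str:
    """Replace the first doctype with the HTML 5 one, or add it if missing."""
    html5 = "<!DOCTYPE html>"
    if _DOCTYPE.search(markup):
        return _DOCTYPE.sub(html5, markup, count=1)
    # No doctype at all: put the HTML 5 doctype on its own first line.
    return html5 + "\n" + markup
```

Nothing else in the page has to change for this to work, which is pretty much the whole appeal.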

Then you can tell your XHTML junkie friends that they’re on to something about accessibility and clean code, but unless they’re serving it as XML, they’re not actually gaining much of anything from their precious (X)HTML Strict doctypes.

I’m not claiming to be the end-all expert, but I do my best to take well-qualified advice. I consider the sources listed below and the example sites mentioned above to be very qualified. I hope this post helps somebody who was as confused as I was.

Here are all the relevant posts and documents that I referenced (all are worth reading):

Other potentially interesting items: