A Blog Not Limited

to web design, standards & semantics

The Beauty of Semantic Markup, Part 3: Headings

Nov 07, 2010

I always find myself drawn to fundamental concepts, because they can be deceptively simple. Headings are like that. You know, <h1>-<h6>.

They seem simple until you take time to think … think about structure, semantics, accessibility, search engines and, now, HTML5's sectioning model.

And I have, indeed, been thinking about headings lately, especially as I dive into HTML5 and (re?)consider the approaches I've taken in the past.

So this series now shifts focus to <h1>, <h2>, <h3>, <h4>, <h5> and <h6>.

Headings for Outlines

The semantic purpose of headings is to indicate a content outline; a structure:

A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.

W3C

You can even see this heading-based outline using the W3C's validator service, if you have "Show Outline" selected (note: does not work with HTML5 doctype):

W3C Validator outline option

For example, here's the heading outline for one of my recent blog posts:

Heading outline for A Blog Not Limited post

Looking at this three-year-old markup now, I wouldn't take the exact same approach today, but the gist is there. My blog name is the first heading, with all the other headings "nested" hierarchically after.

Of course, not all sites are going to have a heading hierarchy, such as one with a columnar layout, where the most important heading (<h1>) appears after, for example, an <h2>:

    <div class="aside">
       <h2>Quick Links</h2>
    </div>
    <div class="section">
       <h1>Site Name</h1>
    </div>
    <div class="aside">
       <h2>Search</h2>
    </div>

But even in this example, the headings are still used to convey a content structure.

Best Practices & Debates

As far as I can glean, the "best practice" for indicating content structure is simply to use <h1> for the most important information, <h2> for less important information and so on. Also, it is probably best not to skip any heading levels, such as going from <h1> to <h3>. But that's it.
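
Here's a minimal sketch of that advice. The page content is invented for illustration, and the indentation is only there to make the levels easier to see; it has no effect on the outline:

```html
<!-- Hypothetical content: each level steps down by one, with no level skipped -->
<h1>Recipes</h1>
   <h2>Breakfast</h2>
      <h3>Pancakes</h3>
      <h3>Omelets</h3>
   <h2>Dinner</h2>
      <h3>Chili</h3>
```

Jumping straight from the <h1> to an <h3> here would leave a gap in the outline where a second-level heading is expected.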

For quite a long time, many folks believed there should be only one <h1> on a page, despite the fact that this is not part of the specification. I happened to be one of those people and, if I recall correctly, my reasoning for this "logic" was based on an assumption about search engine penalties.

Google now refutes this misperception, but does advise the judicious use of <h1>. Which leaves the argument that the reason for only one <h1> is that there can only be one "most important" heading on a page.

Traditionally, I've also agreed with this thinking. But, as you'll see later in this article, HTML5 has me thinking very differently about <h1>s. HTML5 aside, though, I'm still inclined towards the one <h1> approach … which brings up yet another debate (don't you love our little industry?).

This debate assumes a single <h1>, but questions what content should be inside that <h1>. Site name? Company name or logo? Page title?

As you can see from this blog (as well as pretty much every other project I've marked up) I've been on the side of the site name, which is often the company name. I've never used <h1> for a logo, and I can't say I even understand that approach. <h1> is for text. A logo is not text. There's no argument there for me.

But I'm now appreciating the logic that <title> is, semantically, the appropriate element for the site name, while <h1> may be more useful for the page heading.
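
A rough sketch of that split, using this post as the example. The order of page title and site name inside <title>, and the separator between them, are assumptions; they're just one common convention:

```html
<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8">
   <!-- The site name lives in <title>, alongside the page title -->
   <title>The Beauty of Semantic Markup, Part 3: Headings | A Blog Not Limited</title>
</head>
<body>
   <!-- The single <h1> carries the page heading, not the site name -->
   <h1>The Beauty of Semantic Markup, Part 3: Headings</h1>
   <h2>Headings for Outlines</h2>
   <h2>Best Practices &amp; Debates</h2>
</body>
</html>
```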

Headings for Navigation

A wonderful result of using headings to indicate content structure is that it aids navigation. Users can scan headings on a web browser to more quickly find the information most important to them. Even non-browser users can take advantage of headings for this purpose, as many assistive technologies leverage the outline to navigate.

The JAWS screen reader, for example, lets users navigate the page by jumping from heading to heading:

This demonstration alone confirms for me that using <h1> for page headings is probably the best way to go. I imagine it gets old fast hearing the site name repeated because it is contained by an <h1>. But that's just my own personal decision (though a good one, I suspect).

In terms of "best practices" to support accessible navigation, as long as you are focusing on content structure, you are probably good. Regarding multiple <h1>s, there is no definitive answer about how it affects accessibility. Anecdotally, it could cause some screen reader users to miss key content.

Regarding heading hierarchy, WCAG 2.0 accepts both nested (where <h3> follows <h2> which follows <h1>) and non-hierarchical headings. (And, in case you were wondering, Google doesn't mind non-hierarchical headings either.)

Headings for SEO?

Speaking of Google (how'd you like that segue?) … headings have historically been heralded as helping SEO. In fact, in the above image of my blog outline, you'll see that I strayed from the semantic, outline-focused approach with the use of <h2> for my blog's "tagline." This is because at some point in time (years ago) I heard that search engines favored headings with keywords, so I felt the semantic "bending" was worth it.

What now seems more accurate is that search engines use headings the same way that people do: to discern important content and understand content hierarchy. Both Google and Yahoo! advise authors to write headings with this approach.

The question that matters to me, though, is do search engines give greater weight to heading content? No idea. There are thousands (perhaps millions) of articles that say headings carry greater weight, but I could find nothing definitive from the major search engines.

So, what's my verdict? Today, I don't think I would use a heading for a site tagline just to achieve SEO. I suspect that the search engines have such sophisticated algorithms that a single heading to expose a few keywords isn't going to help me in the rankings. And if it hinders accessible navigation by "confusing" the content outline, then it just isn't worth it to me.
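
As a rough sketch of that verdict, the tagline could be marked up as a plain paragraph rather than a heading. The class names below are hypothetical, and keeping the blog name in the <h1> is just one of the options debated earlier:

```html
<div class="branding">
   <h1>A Blog Not Limited</h1>
   <!-- A paragraph, not an <h2>, so the tagline stays out of the heading outline -->
   <p class="tagline">to web design, standards &amp; semantics</p>
</div>
```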

Um, Isn't This Old News?

Maybe. This might be old news to you, and awesome if it is. That means you already take a thoughtful approach to markup, and we would probably be best of friends.

But it wasn't all old news to me. I never took time to consider the appropriate use of <h1>. I had outdated assumptions about SEO and headings. And, while I knew about screen reader navigation, I never took the time to actually watch someone use a screen reader on a site without headings (you really must watch that video above).

Then I started messing around with HTML5, and an entirely new world of possibility opened, forcing me to make sure I understood how and why to use headings. Hence, this post.

HTML5 Sections & Outlines

So back to this new world. HTML5 is pretty cool, especially if you are a POSH lover like me. It gives markup authors a broader semantic arsenal to work with (if you haven't yet, pick up a copy of Jeremy Keith's HTML5 for Web Designers to get up to speed).

New Semantic Elements

One of the many things HTML5 brings to the table is a new outline algorithm. This is based on the new semantic, structural elements:

  • <section> is used for content that can be grouped thematically. A <section> can have a <header>, as well as a <footer>. The point is that all content contained by <section> is related.
  • <header> typically contains the headline or grouping of headlines for a page and/or <section>s, although it can also contain other supplemental information like logos and navigational aids.
  • <footer> is used for content about a page and/or <section>s, such as who wrote it, links to related information and copyrights.
  • <nav> is used to contain major navigation links for a page. While it isn’t a requirement, <nav> will often be contained by <header>, which, by definition, contains navigational information.
  • <article> is used for content that is self-contained and could be consumed independent of the page as a whole, such as a blog entry. <article> is similar to <section> in that both contain related content. The best rule of thumb for deciding which element is appropriate for your content is to consider whether the content could be syndicated. If you could provide an Atom or RSS feed for the content, <article> is most likely the way to go.
  • <aside> indicates the portion of a page that is tangentially related to the content around it, but also separate from that content, such as a sidebar or pull-quotes. A good method for deciding whether <aside> is appropriate is to determine if your content is essential to understanding the main content of the page. If you can remove it without affecting understanding, then <aside> is the element to use.
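
To make the list above concrete, here's a minimal sketch of a page skeleton that uses all six elements. The link targets, heading text and footer content are invented for illustration; it's one plausible arrangement, not the only valid one:

```html
<body>
   <header>
      <h1>A Blog Not Limited</h1>
      <!-- Major site navigation, contained by the page header -->
      <nav>
         <ul>
            <li><a href="/">Home</a></li>
            <li><a href="/archive/">Archive</a></li>
         </ul>
      </nav>
   </header>
   <!-- Thematically grouped content -->
   <section>
      <h1>Recent Posts</h1>
      <!-- Self-contained content that could be syndicated on its own -->
      <article>
         <header>
            <h1>Blog Post Title</h1>
         </header>
         <p>Post content ...</p>
         <footer>
            <p>Written by the post's author</p>
         </footer>
      </article>
   </section>
   <!-- Tangential content that can be removed without affecting understanding -->
   <aside>
      <h1>Popular Posts</h1>
   </aside>
   <footer>
      <p>Copyright information for the page</p>
   </footer>
</body>
```
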

More In-Depth Outlines

These new elements provide authors a way to explicitly group content. The sectioning elements among them (<section>, <article>, <nav> and <aside>) each have their own self-contained outline, so you don't have to follow a page-focused heading hierarchy. Instead, you can start with <h1> within each sectioning element, and the algorithm uses the nesting of those elements to determine the outline level of each <h1>.

Wait! What?

Yeah, that's what I said when I first learned this. The best way to grok this, I think, is with an example:

    <header>
       <h1>Blog Archive</h1>
    </header>
    <section>
       <h1>Posts by Month</h1>
       <article>
          <h1>Blog Post Title</h1>
       </article>
       <article>
          <h1>Another Blog Post Title</h1>
       </article>
    </section>
    <aside>
       <h1>Popular Posts</h1>
    </aside>

The HTML5 outline algorithm, then, gives us:

  • Blog Archive
    • Posts by Month
      • Blog Post Title
      • Another Blog Post Title
    • Popular Posts

If this multiple <h1> approach were used with previous versions of HTML, the outline would be inaccurate:

  • Blog Archive
  • Posts by Month
  • Blog Post Title
  • Another Blog Post Title
  • Popular Posts

<hgroup>

HTML5 also introduces a new element, <hgroup>, which can be used to suppress headings from the content outline. A specific situation in which this would be useful is on this very blog, where I'm using an <h2> for my tagline. As I mentioned, I now think this idea wasn't the best because it could adversely affect accessible navigation.

But with <hgroup>, only the highest-ranked heading in the group (here, the <h1>) appears in the content outline; the rest are ignored:

    <header>
       <hgroup>
          <h1>A Blog Not Limited</h1>
          <h2>to web design, standards &amp; semantics</h2>
       </hgroup>
    </header>

Benefits …

While it remains to be seen whether this will benefit my clients and projects, there is some sound reasoning behind this new approach to sectioning content and outlines.

First, with self-contained outlines, you can have an infinite number of heading levels. You are no longer limited to 6 (<h1>-<h6>). For a deep site, I could see this being useful. Not so much with shallower levels of information.

Second, self-contained content with independent heading hierarchies enables portable content. Consider a blog post that often appears on the home page, as well as its own page, and can even be syndicated or shared with other sites. Before, you would often have to modify the heading markup for the blog post depending on where it appeared. Now with the HTML5 outline algorithm, you can independently define the markup for the blog post without regard to where it may appear on a site (or another site).
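
A small sketch of that portability, assuming a hypothetical home page that lists recent posts. The heading text and placeholder content are invented; the point is that the <article> markup is identical in both places:

```html
<!-- On the home page: the post sits inside a section of recent posts -->
<section>
   <h1>Recent Posts</h1>
   <article>
      <h1>Blog Post Title</h1>
      <p>Post content ...</p>
   </article>
</section>

<!-- On the post's own page: the same <article>, unchanged -->
<article>
   <h1>Blog Post Title</h1>
   <p>Post content ...</p>
</article>
```

Under the HTML5 outline algorithm, the post's <h1> lands one level below "Recent Posts" in the first case without any change to the post's own markup.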

… But Be Aware

Perhaps, over time, these admittedly practical benefits will make it worth taking full advantage of headings in HTML5. For now, though, there are a few issues. Browsers don't yet support this new outline algorithm. If you want to see the outline of an HTML5 page, you have to use an external tool.

It's not surprising, then, to know that assistive technologies aren't supporting this algorithm either. Which, for me, is my biggest concern. If I start with <h1>s in each container of related content, that is going to cause major problems with heading-based navigation in text browsers and screen readers.

Middle Ground?

Fortunately, HTML5 is backwards-compatible and flexible, so it isn't an either/or proposition. You can still approach headings from a page-level hierarchy and use the new HTML5 elements to group content. The spec even says authors can start sections with either <h1> elements or headings reflective of the section's nesting level. So that's what I plan on doing, at least until assistive technologies catch up.
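
As a sketch of that middle ground, here is the earlier blog archive example again, with the sectioning elements kept but each heading level chosen to reflect its nesting:

```html
<header>
   <h1>Blog Archive</h1>
</header>
<section>
   <!-- One level below the page heading -->
   <h2>Posts by Month</h2>
   <article>
      <!-- One level below the section heading -->
      <h3>Blog Post Title</h3>
   </article>
   <article>
      <h3>Another Blog Post Title</h3>
   </article>
</section>
<aside>
   <h2>Popular Posts</h2>
</aside>
```

Read by the HTML5 outline algorithm or by a tool that only looks at heading levels, this markup should produce the same nested outline, which is what keeps heading-based navigation intact in the meantime.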

Luke Dorny opines:

11/08/2010

Very cool writeup, Emily. I appreciate your review and reasoning articles about things we may have missed in the fractured history that is “learning on the web”. Very complete and great examples. Thank you.

Emily responds:

11/10/2010

@Luke - Thanks for reading! Nothing helps me learn and figure things out better than putting a post like this together. I’m always happy when it helps someone other than myself :)

Chris opines:

11/16/2010

Hi Emily - great series.  Always looking forward to the next installment.

Question, though.  Are you aware of anywhere that people who aren’t as well-versed in semantics as you (e.g. me) can submit markup for inspection/validation to see how ‘good’ it is?  I guess the perfect solution would be something like the W3C Validator for semantics ... wishful thinking, maybe.

On my site I make every effort I can to write semantically correct markup but because so much of that is based on opinion it’s hard to tell how good it really is.

Emily responds:

12/11/2010

@Chris - First, apologies for taking almost a month to respond. No excuses, just the reality that I suck at everything except laziness ;)

Now, for your question ... I don’t know of any automated tool that will do that for you. I’m guessing because, as you mentioned, semantics can often be subjective.

There is the Semantic Checker Firefox add-on. It’s alright … I’ve only used it a few times and didn’t find too much value myself. But it doesn’t really tell you if semantic elements are being used correctly, only whether they are used at all.

brad opines:

01/06/2011

Hey Emily,

I submitted a comment to this post a while ago, like a month or two, and actually forgot about it until now. It never got posted. I think it mentioned something about it looking like spam, possibly because it was long and had a few links in it.

Anyhow, I’m wondering if it’s stuck in the backend spam filter and if you could “release” and comment on it. Like I said, it was a long comment and I know your exceptional talent for being lazy, so I won’t be holding my breath.

Emily responds:

01/06/2011

@Brad - Unfortunately, I don’t see your comment in the database. It’s probably due to the fact that I clear out any “spam” comments every few days ... it’s likely gone. Sorry :(

If you recall your question, post it. Or send me an email and I’ll be happy to reply there.

brad opines:

01/06/2011

Thanks Emily,

I wish I could remember it and hadn’t deleted the text file I wrote it with. The basic thoughts were about using an [img] instead of plain text within the main [H1] that most sites have in the banner.

This was something I never really thought too much about until I read this article. The comments are intriguing also.

Emily responds:

01/14/2011

@Brad - For the past few years I’ve been following the logic of using an inline image for logos for the same reasons the article you shared stated: logos are content, not style.

In years prior, though, I’d done the H1 image replacement thing. It made sense at the time, when I was trying to get the H1 SEO exposure. Now, it makes less sense as I approach my headings differently and see the value of inline logo images.

Live and learn, and I’m sure when someone else has an even better idea, I’ll adjust yet again :)

The Coolest Person I Know

Emily Lewis

Yeah, that would be me.

I'm a freelance web designer of the standardista variety, which means I get excited about things like valid POSH, microformats and accessibility. I ply my trade from my one-person design studio in Albuquerque, New Mexico 87106 USA.

A Blog Not Limited is my personal blog where I pontificate about web design, web standards, semantics and whatever else strikes my fancy. Head on over to Emily Lewis Design if you'd like to see my work or, even better, hire me.
