So, I had planned to focus the second installment of this series on markup for images with captions. The topic was a request from my friend Ian and his birthday was coming up. However, his birthday has passed, and I'm just now writing. Plus, I've been thinking a lot lately about something more fundamental: bold and italicized text.
This may seem a trivial thing to be consuming my "markup mind," but after Tantek Çelik's HTML5 presentation, it's been bugging me. And what specifically has been bugging me is the recommendation of <b> and <i> in HTML5.
Shut the Front Door!
Yep, it is true. <b> and <i> are back and, apparently, more "useful." And when I first learned this, I was instantly put off. I come from the "separate content from presentation" school that dropped these two elements in favor of the "more semantic" <strong> and <em>.
At the time when folks were thinking about the structural/semantic markup approach, <b> and <i> were strictly presentational. The HTML 4 spec declared these two as style elements that simply rendered text in bold and italics, respectively. Further, screen readers didn't differentiate them in any special manner, adding to the logic that they were only useful for visual differentiation.
Conversely, in HTML 4, <strong> and <em> offered meaning, as well as the default visual rendering. Content marked up with <em>, for example, semantically indicated emphasis (and defaulted to italics in visual browsers), while the use of <strong> indicated strong emphasis (and defaulted to bold).
Re-Definitions in HTML5
The W3C's HTML5 recommendation, however, brings us some redefinitions of these elements:
- The <b> element now represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as keywords in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is emboldened.
- The <i> element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized. Usage varies widely by language.
- The <strong> element now represents importance rather than strong emphasis.
<em>, meanwhile, isn't featured on the list of changed elements in HTML5, although the working draft does seem to define it slightly differently than the previous spec: as "emphatic stress" rather than just emphasis.
Building an Argument
Upon first consideration, I was totally cool with the modified definitions of <strong> and <em> (although, admittedly, slightly confused as to what "emphatic stress" meant), but I still felt <b> and <i> were presentational. I mean, the W3C even uses "stylistically" and "typographic presentation" in its definitions for those elements.
But, thanks to this new series, I got to doing some research. And when I started, I was aiming to build an argument against <b> and <i>. First, I wanted to find out how screen readers treated these guys.
Screen Readers
As it turns out (and as I expected), the two most popular screen readers don't, by default, read content contained by these elements any differently than other content.
What I didn't expect to discover, though, is that they also don't treat <strong> and <em> in any special way.
There goes the main "they aren't accessible" argument I was hoping for. None of the tags seems to offer any special accessibility to screen reader users.
Search Engine Optimization
So then I began a hunt for an SEO argument. Somewhere in the dusty annals of my mind, I recalled reading that Google paid particular attention to content contained by <strong>.
Turns out, I was wrong yet again. In fact, at one point, Google gave greater weight to content marked up with <b>, not <strong>.
As of today, though, the search engine gives equal weight to <strong> and <b>, as well as <em> and <i>.
Crap. I thought I had all this ammunition against <b> and <i>, when what I really had were outdated and incorrect notions.
Forget the Argument, Focus on Semantics
I've said it before, but apparently I need to listen to my own advice about being too wedded to a particular semantic point of view … especially when operating with wrong assumptions. Time to focus on the entire point of this series: semantics.
So, let's take a closer look at <b> and <i> in HTML5.
Presentation via CSS
In addition to the definitions I shared above, the HTML5 draft also specifies that CSS should control the presentation of <b> and <i>; that neither will, necessarily, appear in bold or italics by default.
Of course, this ultimately comes down to what the browser makers do, but this is a good clarification that these elements are no longer exclusively presentational in nature.
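To make that concrete, here is a minimal sketch of an author stylesheet taking over the presentation of both elements. The declarations are illustrative assumptions, not anything the draft prescribes; browsers still tend to render <b> bold and <i> italic until a stylesheet says otherwise.

```html
<!-- Illustrative sketch: the stylesheet, not the element name, decides the look. -->
<style>
  /* Hypothetical rules: drop the default bold and italic, then offset the text another way. */
  b {
    font-weight: normal;
    font-variant: small-caps;
  }
  i {
    font-style: normal;
    color: #555;
  }
</style>

<p>We offer two flagship products: <b>Moxie</b> and <b>Mojo</b>.</p>
<p>A <i>patent foramen ovale</i> is a congenital defect.</p>
```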
Further, the draft encourages authors to use the class attribute to define why a <b> or <i> element is used in order to allow for unique styling of different implementations.
Consider the new definition of assigning <b> to keywords and product names to offset those terms without adding importance. By extending <b> with class="keyword" or class="product" (or some other equally semantic values), you have your CSS hooks to give each a unique presentation and you are also adding meaning to your markup (kinda like how microformats work).
Same is true for applying <i> to taxonomy terms, idioms, phrases in another language and the like. Specifying the "why I'm using <i>" via class offers potential for both styling and semantics.
Common Typographic Conventions
Even with these caveats, though, I can't help but still think about the presentational nature of the definitions in HTML5. As I mentioned, "stylistically offset" just screams presentation to me.
But then I started thinking about how bold and italicized text is commonly used in print. Yes, they do offer visual indicators, but more often (at least in my experience), text offset with italics or bold does convey meaning, especially when considered in context.
Latin words, inner dialog or thoughts, titles of songs … I often see this type of content italicized in print. And, in context, I recognize the additional meaning the italics provide the content.
Media Independence
In HTML5, <b> and <i> are explicitly media independent. Essentially, because each element is no longer tied to bold or italics (visual presentation), the new semantic meaning they offer is available to non-visual browsers.
Again, it is up to those browser and screen reader makers to take advantage of that meaning, but media independence further supports the new semantic direction of these elements.
Warming Up
With all this additional information, I'm warming up a bit to using <b> and <i> again. But I'd be lying if I said I was completely comfortable.
<b> and <i> have historic ties to the notions of bold and italic. I mean, that's what "b" and "i" represent.
Why a new element wasn't introduced that is independent of this presentational history bugs me a bit. But, then again, using what people are already familiar with isn't always a bad thing.
Still, I worry that people will use these elements for presentational purposes. Or that folks won't apply the recommended class values to differentiate instances of these elements.
I can't help but think that this is just a big can of worms that will get messy if markup authors don't understand and apply the spec properly. And let's not even talk about the "challenges" that could result from what browser makers will end up doing or not doing.
Practical Usage
Aside from my concerns, I do want to give some thought to how I would actually use <b> and <i>, now that they are semantic. And, of course, what roles <strong> and <em> will play in my markup.
<strong>
Even with the realigned definition of <strong> in HTML5, I plan to use it as I always have, because I never really thought of it as "strong emphasis." I always used it as it is now defined: indicating importance.
For my projects, the types of content I commonly mark up with <strong> include:
- Alerts
- Warnings
- Reminders
- Important content (duh)
For example:
<p><strong>Registration is required</strong> for this event.</p>
or
<p>The presentation begins at <strong>6:30 pm</strong>, so be sure to show up a few minutes early to avoid interrupting our speaker.</p>
or
<p><strong>Password provided for this username is incorrect.</strong> Please try again or you may request your password be emailed to you.</p>
I don't think there is a hard-and-fast rule about applying <strong>. To me, it is more about content. What is important? Is it the time a presentation starts, or is it the reminder to arrive early?
And this is what I dig about semantic markup. Focusing on content.
<em>
Like <strong>, I pretty much plan to use <em> the same as I always have. Even with the new (slightly unclear) definition of "emphatic stress," <em> still means, to me, stressed content. As in content that I would verbalize in a stressed tone to indicate emphasis.
And because I write the way I talk (with lots of stressed words), I use this element often in my content:
<p>Talking about microformats in less than 30 minutes (plus leaving time for questions) was <em>quite</em> a challenge.</p>
or
<p>You can use the <code>cite</code> attribute with <code>&lt;q&gt;</code> to indicate the source of a quote, <em>if</em> it's online.</p>
It is really a matter of knowing the content well enough to know what terms and/or phrases should be emphasized in this fashion.
<b>
To be honest, based on the HTML5 definition of <b>, I'm not sure how often I'll actually use it. The draft suggests it can be used with product names and keywords, but I, personally, don't see a need to differentiate this type of content in any way.
Of course, a client might feel differently. Perhaps a client might want all of the product names on their site to appear "stylistically offset." So, in that situation, I would use it and take advantage of the recommended application of class to indicate the purpose of the element:
<p>For data management, we offer two flagship products: <b class="product">Moxie</b> and <b class="product">Mojo</b>.</p>
And if that same client also wanted to highlight keywords associated with their products, I might:
<p><b class="product">Moxie</b> offers users the ability to <b class="keyword">cleanse</b>, <b class="keyword">extract</b> and <b class="keyword">transform</b> data.</p>
Meanwhile, in my CSS, I would style .product and .keyword in some fashion, likely giving each a unique look.
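A minimal sketch of what that CSS could look like, embedded in a <style> element so the example stands alone. The class names come from the markup above; the specific declarations are illustrative assumptions and nothing the draft prescribes.

```html
<style>
  /* Hypothetical styling: any declarations would do, as long as each class gets its own look. */
  b.product {
    font-weight: bold;
    font-variant: small-caps;
  }
  b.keyword {
    font-weight: normal;
    background-color: #ffc;
  }
</style>

<p><b class="product">Moxie</b> offers users the ability to <b class="keyword">cleanse</b>, <b class="keyword">extract</b> and <b class="keyword">transform</b> data.</p>
```

Scoping the selectors to b.product and b.keyword (rather than the bare class names) keeps the rules from leaking onto other elements that might reuse those classes.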
Also, HTML5 does specify that <b> can be used simply to indicate text that needs unique styling, such as those typographic conventions of drop caps and paragraph leads:
<p><b class="dropCap">I</b>t was a cold and rainy night.</p>
Although, I'm not sure I would favor this approach over using :first-letter in my CSS (like I already do on this blog). But I guess I could see it for styling a paragraph lead uniquely:
<p><b class="lead">It was a cold and rainy night,</b> despite what the weatherman had announced on the evening news. Bob was annoyed his stargazing plans were in danger from the looming storm.</p>
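For comparison, the :first-letter route mentioned above needs no extra element around the first letter. This is only a sketch: the class name "opening" and the sizing values are hypothetical, chosen purely for illustration.

```html
<style>
  /* Hypothetical class and values; :first-letter targets the first letter of a block-level element. */
  p.opening:first-letter {
    float: left;
    font-size: 3em;
    line-height: 1;
    margin-right: 0.1em;
  }
</style>

<p class="opening">It was a cold and rainy night.</p>
```

The trade-off is that the pseudo-element can only reach the first letter, so a multi-word paragraph lead still needs an element of its own in the markup.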
Still, even after considering those scenarios, I'm frankly not convinced <b> is going to be a regular element in my arsenal.
<i>
As for <i>, I can see using it much more often than <b>, particularly for technical, legal or medical terms, as well as foreign language phrases:
<p>A <i class="medicalTerm">patent foramen ovale</i> is a congenital defect between the two upper chambers of the heart.</p>
or
<p>I try to live my life according to the axiom, <i class="foreignLanguage">illegitimi non carborundum</i>.</p>
Foreign Languages
Since I used a Latin phrase in the last example, now might be a good time to address use of the lang attribute. HTML5 Doctor provides an excellent article on the same topics I'm covering here.
In their examples of using <i> for foreign language phrases, they apply the lang attribute to indicate which foreign language is being referenced. For example:
<p>Mix baking soda and vinegar together, and <i lang="fr">voilà</i>, you get a cool chemical reaction.</p>
However, another article on the topic, Using <b> and <i> elements, warns against this approach:
… the language attribute only describes the language of the text, not the meaning. It is possible that you will want to style text in a different language differently according to the context in which it is used, either now or in the future.
As for me, I think that if I do use <i> for foreign phrases, I'll likely skip the lang attribute and rely on class for any special styling.
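A minimal sketch of that class-based approach, reusing the "foreignLanguage" class from the earlier example. The style rule is a hypothetical placeholder. The second paragraph shows that the two attributes do not conflict, should language information be wanted as well.

```html
<style>
  /* Hypothetical rule keyed to the class, so styling follows the purpose rather than the language. */
  i.foreignLanguage {
    font-style: italic;
  }
</style>

<!-- Class only: the styling hook describes why the text is offset. -->
<p>Mix baking soda and vinegar together, and <i class="foreignLanguage">voilà</i>, you get a cool chemical reaction.</p>

<!-- Class plus lang: same styling hook, with the language of the text also declared. -->
<p>Mix baking soda and vinegar together, and <i class="foreignLanguage" lang="fr">voilà</i>, you get a cool chemical reaction.</p>
```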
Exercise Discretion
While I'm admittedly still a bit on the fence about actually using <b> and <i> regularly, you may feel differently and want to start marking up right away. If that is the case, please use these elements intelligently and correctly.
Don't just apply <b> because you need a bold effect and you are feeling lazy. Don't use <i> for a publication title, when <cite> may be the appropriate element (see part 1 of this series for more on <cite>).
Even the HTML5 draft recommends discretion:
… authors are encouraged to consider whether other elements might be more applicable than the i element, for instance the em element for marking up stress emphasis, or the dfn element to mark up the defining instance of a term.
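As a rough sketch of that discretion in practice, the same italic-looking text can call for different elements depending on what it means. The sentences and the "shipName" class below are made up for illustration; the element choices follow the definitions quoted in this post.

```html
<!-- Stress emphasis: <em>, not <i>. -->
<p>Talking about microformats in less than 30 minutes was <em>quite</em> a challenge.</p>

<!-- Defining instance of a term: <dfn>, not <i>. -->
<p>A <dfn>patent foramen ovale</dfn> is a congenital defect between the two upper chambers of the heart.</p>

<!-- Title of a work: <cite>, not <i>. -->
<p>I just finished reading <cite>Moby-Dick</cite>.</p>

<!-- Text merely offset from the normal prose, such as a ship name: <i>, with a class (hypothetical name) saying why. -->
<p>The crew of the <i class="shipName">Pequod</i> sailed from Nantucket.</p>
```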
Go Forth & Experiment
After gathering all this information, I had hoped to have a firm conclusion about <b> and <i>. Alas, I don't. So, what I shall do is try different approaches and see how they work for me, my sites and my clients.
I have some clients who I know won't take the time to add the extra markup for something like <b class="dropCap">, while some clients may embrace that extra level of control. And I have some CMS implementations that currently aren't configured in a way that will easily allow the addition of <i class="foreignLanguage">.
And then there's still that little voice in my head that can't seem to fully accept <b> and <i> as semantic elements.
Only time and practice will tell how big a role <b> and <i> will play for me. Until then, I'm eager to hear your thoughts!