HTML Semantic Mark-up


Historically, designers and developers have been forced to use HTML to control both the content and the presentation of web pages. In order to achieve increasingly complex designs in inconsistent browsers, the limited set of tags provided by HTML was used in more and more convoluted ways, resulting in documents that looked good, but which were vastly inefficient, inflexible and inaccessible.

As a result, pages became simultaneously difficult to maintain and more prone to cross-browser incompatibility: the worst of both worlds. Vitally important tags like <p>, <blockquote> and <table> were misused because of their default presentational characteristics, or replaced by other less meaningful tags. Purely presentational tags like <font> and <b> were used to make things look right, with no attention paid to the editorial meaning of the content. Often, even after many hours spent breaking the rules still further in an attempt to solve problems, pages didn’t make sense to non-visual browsers. Under Title III of the 1990 Americans with Disabilities Act (ADA), unreasonably inaccessible web pages are now subject to legal action by the Department of Justice (DOJ).

With the introduction of CSS (Cascading Style Sheets) and improvements in browser standards, it has become possible to achieve design in a way that preserves the meaning of the content. By separating content from presentation, and by using the proper tools for the job, pages have become smaller, more flexible, easier to maintain, easier to index in search engines, and accessible by all. Finally, the content written by writers can be presented in the way designers intend and in a way that developers can build and maintain.

1. Introduction

1.1 Semantic mark-up is HTML that describes the content, rather than the manner in which the content is presented. It allows the meaning to be delivered to users regardless of the browser they use, so that content can be provided to the widest possible audience.

1.2 As an example, <em>this text</em> uses HTML as it was intended by the W3C, while <i>this text</i> does not. The appearance in visual browsers (as opposed to screenreaders, etc.) will be the same, but only by using semantic HTML is the meaning of the content preserved across all types of browser. Only by using semantic HTML can the emphasis of the words be maintained, whether emphasis is presented visually as italics (as it is by default) or in another way.

1.3 AHA/ASA endorses the principle of separating content from presentation in web pages, using HTML as a semantic mark-up language.

2. Scope of This Standard

2.1 This standard is an ‘ideal’ ambition, which cannot be fully realised at this time, due to the legacy of non-semantic content currently on www.heart.org and www.strokeassociation.org.

2.2 This standard must be applied, as far as is possible, to all new templates (e.g. Content Management System templates) and to pages produced outside of existing templating systems and content management systems (e.g. third-party cross-sell, specialized JavaScript, etc.).

3. Principles of Application

3.1 You MUST NOT use semantic tags outside the purposes defined below, e.g. if you use the blockquote tag you MUST ONLY use it as defined below and not for any other purpose, such as to set a particular presentation style.

3.2 If you are capturing semantic meaning in a document, you MUST use the appropriate semantic tag, e.g. an <abbr> tag MUST be used in preference to <span class="abbr">.
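
The difference can be illustrated with a minimal sketch. The sentence and the title text are placeholders chosen for illustration only.

```html
<!-- Preferred: the element itself says "this is an abbreviation" -->
<p>Pages are written in <abbr title="HyperText Markup Language">HTML</abbr>.</p>

<!-- Avoid: the class name carries meaning only for a stylesheet -->
<p>Pages are written in <span class="abbr">HTML</span>.</p>
```

In the second form a screenreader or search engine has no way of knowing that the text is an abbreviation, because a class name is not part of the HTML vocabulary.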

4. Headings

4.1 All pages MUST use heading elements.

4.2 Heading elements MUST convey the structure of the document, rather than the editorial emphasis of its content (e.g. the most important story on the page).

4.3 Heading elements MUST be ordered hierarchically: an H2 element SHOULD be preceded by an H1 element somewhere on the same page, an H3 element by an H2 element, an H4 element by an H3 element, and so on. Intermediate levels MUST NOT be omitted (e.g. going from H1 directly to H3).

NOTE: this is to maintain well-structured pages, which are essential for delivering a good user experience to those using assistive technologies such as screenreaders.
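
As a rough illustration of a hierarchy that follows these rules, consider the sketch below. The heading text and paragraphs are placeholders, not content from any real page.

```html
<h1>Page Subject</h1>
<p>Introductory text about the subject of the page.</p>

<h2>First Topic</h2>
<p>Text about the first topic.</p>

<h3>Detail of the First Topic</h3>
<p>Text about the detail.</p>

<!-- Returning to a higher level is fine; skipping a level downwards is not -->
<h2>Second Topic</h2>
<p>Text about the second topic.</p>
```

Each level is introduced by the level above it, and no step down the hierarchy skips a level.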

4.4 Headings SHOULD be followed by further content, e.g.

<h3>Title</h3> <p>Text text</p>

4.5 Headings SHOULD NOT be treated as ‘standalone’ content.

4.6 Headings of the same level MUST NOT appear consecutively without content between them, e.g.

<h3>Title</h3>
<h3>Title</h3>

4.7 Headings of different levels MAY follow one another directly (without content between them) to specify hierarchy. For example:

<h3>Section</h3> 
<h4>Sub section</h4>

4.8 There MUST be one, and only one, H1 per page. UPDATE: The HTML 5.3 specification allows multiple H1 tags, but only where each additional H1 is introduced by a new SECTION or ARTICLE on the page.

4.9 The first H1 MUST state the subject of that page.

5. Lists

5.1 There are three valid list types: ordered lists <ol>; unordered lists <ul>; definition lists <dl>.

5.2 All lists SHOULD be preceded by a heading – <h*>description</h*> – that describes the content of the list. Example: <h2>Other top stories</h2> before a list of other top stories.

5.3 <ul> unordered list: MUST ONLY be used where the order of the list is not editorially significant.

5.4 <ol> ordered list: MUST ONLY be used where the order of the list items is editorially significant (even if the numbers are hidden with CSS).

5.5 <ul> and <ol> type lists MUST have at least one <li> item.

5.6 <dl> lists MUST contain at least one <dt> with a corresponding <dd>.

5.7 <dd> MUST have at least one corresponding <dt>.

5.8 <dl> MAY have multiple terms for a given definition, as well as multiple definitions for a given term.

5.9 <dl> definition list: MUST ONLY be used to describe terms and their definitions.
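
The definition list rules in 5.6 to 5.9 can be sketched as follows. The heading, terms and definitions are invented for illustration.

```html
<h2>Glossary</h2>
<dl>
  <!-- Two terms sharing one definition -->
  <dt>Mark-up</dt>
  <dt>Markup</dt>
  <dd>Tags added to text to describe its structure and meaning.</dd>

  <!-- One term with two definitions -->
  <dt>Tag</dt>
  <dd>The opening or closing marker of an HTML element.</dd>
  <dd>A keyword attached to content to categorise it.</dd>
</dl>
```

The heading before the list follows 5.2, and every <dd> has at least one <dt> before it, as 5.7 requires.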

5.10 In content, you SHOULD use a heading (<h*>) structure rather than nested lists.

5.11 In navigation you MAY use nested lists.

5.12 Nested lists MUST NOT be more than 3 levels deep. The limit is demonstrated as follows:


 <ul>
  <li>item</li>
  <li>item
   <ul>
    <li>item</li>
    <li>item
     <ul>
      <li>item</li>
      <li>item</li>
     </ul>
    </li>
   </ul>
  </li>
  <li>item</li>
  <li>item</li>
 </ul>
				

6. Table Mark-up

6.1 Tables SHOULD ONLY be used for conveying tabular data, not for presentational (layout) purposes.

6.2 Tables MUST be used for tabular data. Tabular data is data that has relationships in two or more dimensions.

6.3 If you are displaying tabular data you MUST use a “summary” attribute to describe the editorial intent of the data.

6.4 If you wish to apply a caption to your table (e.g. the source or copyright of the data), you SHOULD use a caption tag to do this.

6.5 If you wish to supply a title to your data you MUST use a heading tag, so as to enable navigation to the table within the page by screenreaders.

6.6 In a data table you MUST make use of <thead> and <tbody>.

6.7 If your table has a footer this MUST be encapsulated in a <tfoot> tag.

6.8 If you have table headings you MUST use <th> tags for these.
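
A data table that follows clauses 6.3 to 6.8 might look like the sketch below. The figures, labels and summary text are placeholders. The scope attributes are an addition that this standard does not require. The summary attribute is included because 6.3 requires it, although it is obsolete in HTML5. The <tfoot> is placed after <tbody>, which is the order the current HTML specification expects; HTML 4 required it before <tbody>.

```html
<h3>Quarterly page views</h3>
<table summary="Page views per site section for each quarter, with totals">
  <caption>Source: placeholder analytics data</caption>
  <thead>
    <tr>
      <th scope="col">Section</th>
      <th scope="col">Q1</th>
      <th scope="col">Q2</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">News</th>
      <td>100</td>
      <td>120</td>
    </tr>
    <tr>
      <th scope="row">Recipes</th>
      <td>200</td>
      <td>220</td>
    </tr>
  </tbody>
  <tfoot>
    <tr>
      <th scope="row">Total</th>
      <td>300</td>
      <td>340</td>
    </tr>
  </tfoot>
</table>
```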

7. Other Semantic Tags

7.1 These presentational tags MUST NOT be used.

  • <b> – bold contents
  • <i> – italic contents
  • <big> – increased font size
  • <blink> – alternating for- and background colours
  • <marquee> – for scrolling text
  • <s> – strikes through text
  • <small> – decreases font size
  • <strike> – strikes through text
  • <tt> – teletypewriter style
  • <u> – underlines contents
  • <center> – centers a section of text
  • <nobr> – creates a region of non-breaking text
  • <font> – changes the size, style and color of text

7.2 These semantic tags MUST be used where the content matches their description:

  • <p> – defines a paragraph of text.
  • <em> – indicates emphasis.
  • <strong> – indicates stronger emphasis.

7.3 These semantic tags SHOULD be used where the content matches their description:

  • <blockquote> – defines a block quotation.
  • <q> – defines a short quotation (inline).
  • <cite> – contains a citation or a reference to other sources.
  • <abbr> – indicates an abbreviated form (e.g. HTML).
  • <dfn> – defines instances of special terms or phrases.
  • <code> – designates a fragment of computer code.
  • <samp> – designates sample output from programs, scripts, etc.
  • <kbd> – indicates text that is typed on a keyboard.
  • <var> – indicates an instance of a variable or program argument.
  • <ins> – defines inserted document content.
  • <del> – defines deleted document content.
  • <address> – defines an address.
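
A few of these tags are shown in context in the sketch below. All of the text, including the title of the cited work, is placeholder content.

```html
<p>The style guide states:</p>
<blockquote>
  <p>Use <code>&lt;em&gt;</code> for emphasis.</p>
</blockquote>
<p>Source: <cite>Placeholder Style Guide</cite></p>

<p>Press <kbd>Ctrl</kbd> + <kbd>S</kbd> to save; the editor replies <samp>File saved</samp>.</p>

<p>The meeting is on <del>Monday</del> <ins>Tuesday</ins>.</p>
```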

7.4 These notational tags MUST be used where the content matches their description:

  • <sub> – subscripted text
  • <sup> – superscripted text

7.5 These tags MAY be used.

  • <br /> – inserts a line break into the text flow
  • <pre> – preformats text, although alternatives to using this tag are recommended

7.6 You MUST ONLY use <br /> tags to create single line breaks. For more than one line break an alternative should be found (e.g. paragraphs).

7.7 You SHOULD NOT use <br /> tags in general. Recognised exceptions are poetry, address areas and where the line break may be argued as part of the meaning rather than the presentation.
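
One reading of 7.6 and 7.7 is sketched below; the address details are invented.

```html
<!-- Acceptable: the line breaks are part of the address itself -->
<address>
  Example Organisation<br />
  123 Example Street<br />
  Example City
</address>

<!-- Avoid: consecutive breaks used as spacing -->
<p>First paragraph.<br /><br />Second paragraph.</p>

<!-- Prefer: separate paragraphs, with any spacing defined in the CSS -->
<p>First paragraph.</p>
<p>Second paragraph.</p>
```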

8. Tag Attributes

8.1 You MAY use height and width attributes on images and embedded media.

8.2 You SHOULD NOT use height and width attributes on any other tags. Height and width SHOULD be defined by the CSS.

8.3 You SHOULD NOT use border attributes on tags. Borders SHOULD be defined by the CSS.

8.4 You SHOULD NOT use align, valign or clear attributes.

8.5 You SHOULD NOT use style attributes, except where using syndicated content or internal syndicating systems.

8.6 For alt and title attributes see the Textual Equivalents Standard.
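
The attribute rules in this section can be sketched as follows. The class name notice and the file name logo.png are hypothetical. The style rule is shown in a <style> element only to keep the example self-contained; in practice it would normally live in the site stylesheet.

```html
<!-- Avoid: presentation expressed through attributes and inline styles -->
<p align="center" style="border: 1px solid #ccc;">Notice text</p>

<!-- Prefer: a class, with the presentation defined in the CSS -->
<style>
  .notice { text-align: center; border: 1px solid #ccc; }
</style>
<p class="notice">Notice text</p>

<!-- Allowed by 8.1: dimensions on images and embedded media -->
<img src="logo.png" alt="Organisation logo" width="120" height="60" />
```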

9. Microformats

9.1 You MAY use microformats on your site where there are agreed, not draft, specifications (refer to the Microformats community wiki site for details) with the exception of those that use the title attribute of HTML’s abbr element.

9.1.1 Some microformats use the abbr element to conceal machine-readable data; for example, date-times and geographical coordinates. Screen-reader users who expand abbreviations will hear the full date-time or coordinate; for example, 2008-05-15T19:30:00+01:00 instead of 19:30.
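
The pattern in question looks roughly like the sketch below. The dtstart class name comes from the hCalendar microformat; the surrounding event mark-up is omitted.

```html
<!-- The visible text is 19:30, but the title holds the machine-readable value -->
<abbr class="dtstart" title="2008-05-15T19:30:00+01:00">19:30</abbr>
```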

9.1.2 If you want to use microformats in the abbr element you MUST first discuss this with your product lead.

9.1.3 If you want to use draft microformats you MUST first discuss this with your product lead.

9.2 If you do use microformats, you MUST ensure that the title attribute contains human-readable data.

10. Forms

Be careful not to confuse semantics with structure. HTML elements exist primarily for expressing structure; semantics is about giving that structure meaning. While there is some semantic meaning in some HTML mark-up, it is very generic. The guidance in this section is therefore divided along those lines:

Semantics

Why is it important for you to express semantic meaning through your form? Is the markup supposed to be consumed by a client other than a standard browser? Is it a special use-case?

If you need to infuse semantic meaning to the elements of your form do so by decorating your structural markup with appropriate classes and ids – you won’t likely get any semantic meaning from the HTML elements in your form regardless of which element type you use to group/separate your inputs.

Structure

10.1 If you’re just looking to provide visual separation of inputs and want to use the least possible markup, then use either <br /> after, or <p> around, your input tags.

10.2 If you want to structurally group your inputs with their labels then use <fieldset>, <ul>, or <ol> – all of these tags can achieve this objective equally well.

10.3 For compound elements (where text is used to label a form element), the <label> tag MUST be used to explicitly associate the relevant text label with its form element.

10.4 This MUST be done using a ‘for’ attribute on the label and a pairing ‘id’ attribute on the element.
e.g. <label for="apple">apple</label><input id="apple" />.

10.5 If there is no text that labels the form element then the element MUST have a title attribute.

10.6 A label-input pair (compound element) SHOULD be contained in a block level element (e.g. <fieldset>, <p> or <li> tag).

10.7 A label-input pair SHOULD NOT be contained in a <dl>, as this provides no additional structural information.
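
Clauses 10.3 to 10.6 can be combined in a sketch such as the one below. The action URLs, field names and label text are all assumptions made for illustration.

```html
<form action="/subscribe" method="post">
  <fieldset>
    <legend>Newsletter sign-up</legend>
    <!-- Each label-input pair sits in a block level element (10.6) -->
    <p>
      <label for="first-name">First name</label>
      <input type="text" id="first-name" name="first-name" />
    </p>
    <p>
      <label for="email-address">E-mail address</label>
      <input type="text" id="email-address" name="email-address" />
    </p>
    <p><input type="submit" value="Subscribe" /></p>
  </fieldset>
</form>

<form action="/search" method="get">
  <p>
    <!-- No text labels this field, so it carries a title attribute (10.5) -->
    <input type="text" name="q" title="Search this site" />
    <input type="submit" value="Search" />
  </p>
</form>
```

The for value on each label matches the id of its input exactly, which is what creates the explicit association required by 10.4.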

11. HTML5 Semantic Data Types

HTML5 introduces no fewer than a baker’s dozen (13) new input types for forms. These new input types have several benefits: using them means less development time, a better mobile/responsive experience and an improved overall user experience.

11.1 color
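
No example is given for this type. A sketch that follows the same pattern as the other types in this section might be:

```html
<label for="color">Select a color:</label> <input type="color" id="color" name="color">
```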

11.2.1 date

<label for="date">Select a date:</label> <input type="date" id="date" name="date">

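11.2.2 datetime

NOTE: the datetime type has since been removed from the HTML specification, and current browsers treat it as a plain text field. Consider datetime-local (11.2.3) instead.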

<label for="datetime">Date and Time:</label> <input type="datetime" id="datetime" name="datetime">

11.2.3 datetime-local

<label for="datetime-local">Date and Time:</label> <input type="datetime-local" id="datetime-local" name="datetime-local">

11.2.4 month

<label for="month">Month:</label> <input type="month" id="month" name="month">

11.2.5 week

<label for="week">Select a week:</label> <input type="week" id="week" name="week">

11.2.6 time

<label for="time">Select a time:</label> <input type="time" id="time" name="time">

11.3 email

<label for="email">E-mail:</label> <input type="email" id="email" name="email"> <input type="submit">

11.4 number

<label for="number"> Quantity (between 1 and 10):</label> <input type="number" name="number" id="number" min="1" max="10">

11.5 range

<label for="range">Range:</label> <input type="range" name="range" id="range" min="1" max="10">

The number and range types accept the following attributes:

  • max – specifies the maximum value allowed
  • min – specifies the minimum value allowed
  • step – specifies the legal number intervals
  • value – specifies the default value
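
All four attributes are used together in the sketch below. The field name volume and the values are arbitrary: the control moves in steps of 10 between 0 and 100 and starts at 50.

```html
<label for="volume">Volume:</label>
<input type="range" id="volume" name="volume" min="0" max="100" step="10" value="50">
```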

11.6 search

<label for="search">Search:</label> <input id="search" type="search" name="search">

11.7 tel

<label for="tel">Telephone:</label> <input type="tel" id="tel" name="tel">

11.8 url

<label for="url">URL:</label> <input type="url" id="url" name="url">

12. Accessible Icons

Modern versions of assistive technologies will announce CSS generated content, as well as specific Unicode characters. To avoid unintended and confusing output in screen readers (particularly when icons are used purely for decoration), we hide them with the aria-hidden="true" attribute.

If you’re using an icon to convey meaning (rather than only as a decorative element), ensure that this meaning is also conveyed to assistive technologies – for instance, include additional content, visually hidden with the .sr-only class.

If you’re creating controls with no other text (such as a <button> that only contains an icon), you should always provide alternative content to identify the purpose of the control, so that it will make sense to users of assistive technologies. In this case, you could add an aria-label attribute on the control itself.

Examples:

<span class="fa fa-info-circle" aria-hidden="true"></span>

An icon used in an alert to convey that it’s an error message, with additional .sr-only text to convey this hint to users of assistive technologies.

<div class="alert alert-danger" role="alert">
  <span class="fa fa-info-circle" aria-hidden="true"></span>
  <span class="sr-only">Error:</span>
  Enter a valid email address
</div>
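
For a control that contains only an icon, a minimal sketch might look like this. The btn class and the fa-times icon class are assumptions based on the Bootstrap and Font Awesome naming used in the examples above.

```html
<!-- The icon is hidden from assistive technologies; aria-label names the control -->
<button type="button" class="btn" aria-label="Close">
  <span class="fa fa-times" aria-hidden="true"></span>
</button>
```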

