Ferrysoft

How to use DOCTYPE as an aid to developing well-formed HTML

Summary

Many web pages don't conform to well-formed HyperText Markup Language (HTML). Most web browsers are quite forgiving of this. However, if the HTML isn't well-formed then the chances increase that the page will appear differently than intended. The use of an explicit DOCTYPE declaration in a web page allows the page to be validated as well-formed for a range of HTML levels. This document describes an approach to migrating web pages up the HTML hierarchy by using the appropriate DOCTYPE.

Introduction to Document Type Definitions

A Document Type Definition (DTD) is a document that defines the structure of a markup language: which elements and attributes exist and how they may be combined. Because HTML is a markup language, the World Wide Web Consortium (W3C) publishes DTDs for the HTML versions discussed here.

More detail is available on this at the W3C HTML Home Page.

If a web author wants to indicate that a web page conforms to a particular DTD, they include a DOCTYPE declaration at the top of the web page. A DOCTYPE declaration consists of two parts: an identifier naming the HTML version that the page uses, and a path, also called a Uniform Resource Identifier (URI), to the DTD that defines that version.

The following example shows the DOCTYPE declaration for the HTML 4.01 Transitional specification, with the version appearing on the first line and the DTD URI appearing on the second line.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
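Read piece by piece, the declaration above can be annotated roughly as follows. This is only an explanatory sketch: the comment is placed after the declaration so that the DOCTYPE stays the first thing in the file.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<!--
  HTML      name of the root element of the document
  PUBLIC    the DTD is identified by a public identifier

  "-//W3C//DTD HTML 4.01 Transitional//EN" is the public identifier:
      -      the owner is not an ISO-registered organisation
      W3C    the organisation that maintains the DTD
      DTD HTML 4.01 Transitional
             the kind of document and its label (the HTML version)
      EN     the language the DTD is written in, not the page language

  "http://www.w3.org/TR/html4/loose.dtd" is the URI of the DTD
-->
```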

If a web page doesn't contain a valid DOCTYPE declaration then the web browser has to guess how to interpret the page. It will typically fall back to the rendering rules it uses for older, non-standard HTML, and the web page may not appear as intended.

If you have web pages without a DOCTYPE declaration then a good place to start is HTML 4.01.

HTML 4.01 has three DTDs:

  • HTML 4.01 Strict: includes all elements and attributes that have not been deprecated and do not appear in frameset documents.
  • HTML 4.01 Transitional: includes everything in the Strict DTD plus deprecated elements and attributes.
  • HTML 4.01 Frameset: includes everything in the Transitional DTD plus elements and attributes for creating frames. It is normally better to avoid using frames in web pages if possible because of their intrinsic limitations. If frames are avoided then there is no need to use this DTD.

The following sections outline how you would migrate your web pages from having no DOCTYPE declaration through HTML 4.01 Transitional, HTML 4.01 Strict and finally to XHTML 1.0 Strict.

HTML 4.01 Transitional

If a web page has no DOCTYPE declaration then a good first step is to migrate to HTML 4.01 Transitional. The following example shows the DOCTYPE for HTML 4.01 Transitional.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
    .
    .
    .
</html>

When this DOCTYPE declaration has been added to the web page, it can be validated via the W3C Markup Validation Service. This service will indicate whether the web page conforms to HTML 4.01 Transitional or not.

It isn't possible to predict the number or type of errors that will have to be corrected to achieve HTML 4.01 Transitional compliance. However, they should all be corrected, using the diagnostic information provided by the validation service, before proceeding to stricter DTDs.
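As an illustration only, the page below shows a few problems that the validation service commonly reports for pages new to HTML 4.01 Transitional, each already corrected, with the original markup noted in a comment. The file names (menu.js, logo.gif) and the link target are hypothetical.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <!-- A title element is required; the original page had none. -->
    <title>Product list</title>
    <!-- script requires a type attribute.
         Was: <script language="JavaScript" src="menu.js"> -->
    <script type="text/javascript" src="menu.js"></script>
</head>
<body>
    <!-- img requires an alt attribute. Was: <img src="logo.gif"> -->
    <p><img src="logo.gif" alt="Company logo"></p>

    <!-- Elements must be closed in the reverse order they were opened.
         Was: <b><i>New</b></i> products -->
    <p><b><i>New</i></b> products</p>

    <!-- An ampersand inside an attribute value is written as &amp;.
         Was: href="list?cat=1&page=2" -->
    <p><a href="list?cat=1&amp;page=2">Next page</a></p>
</body>
</html>
```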

Probably the most significant impact the introduction of this DOCTYPE will have is on the way the web page is rendered by a web browser. Typically a web browser will have two modes for rendering web pages:

  • Quirks mode.
  • Standards mode.

In particular, Internet Explorer 6 switches from rendering in Quirks mode to rendering in Standards mode when this DOCTYPE is introduced.

Quirks mode attempts to handle non-standard behaviour from older versions of HTML and older browsers so that existing content on the web isn't broken.

Standards mode causes the browser to respect the behaviour described in the HTML standards.

Because the browser switches to Standards mode, the HTML in the web page will probably need some adjustment before it appears as the web author intended.
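One well-known difference between the two modes in Internet Explorer 6 is how the width of a box is calculated. The page below is a minimal sketch for observing it; the class name box is made up for this example. Viewing the page with and without the DOCTYPE declaration should show the box changing size.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <title>Box model check</title>
    <style type="text/css">
        /* Hypothetical class name, used only for this illustration. */
        .box {
            width: 200px;
            padding: 20px;
            border: 5px solid black;
        }
    </style>
</head>
<body>
    <!-- Standards mode: width applies to the content only, so the box is
         200 + (2 x 20) + (2 x 5) = 250px wide overall.
         Internet Explorer 6 Quirks mode (DOCTYPE removed): padding and
         border are counted inside the width, so the box is 200px wide. -->
    <div class="box">Measure my width</div>
</body>
</html>
```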

HTML 4.01 Strict

If a web page already conforms to HTML 4.01 Transitional then a good next step is to migrate to HTML 4.01 Strict. The following example shows the DOCTYPE for HTML 4.01 Strict.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
    .
    .
    .
</html>

HTML 4.01 Strict does not allow deprecated elements and attributes, most of which provide presentational functionality that is now available via Cascading Style Sheets (CSS). It is therefore likely that you'll need to introduce an external CSS file to handle the style information previously embedded in the web page. An external CSS file can be linked to a web page as in the example below.

<head>
    <link rel="stylesheet" type="text/css" href="example.css">
</head>

In a similar way to validating the HTML markup, the CSS file can be validated using the W3C CSS Validation Service.
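To make the move from presentational markup to CSS more concrete, here is a small before-and-after sketch. The class name notice, the page content and the CSS rules are assumptions made for this example; the rules are shown in a comment and would live in the external example.css file.

```html
<!-- Before: accepted by HTML 4.01 Transitional, rejected by Strict.
     bgcolor, center, font and target are not in the Strict DTD.

<body bgcolor="#ffffff">
    <center><font color="red" size="4">Sale ends Friday</font></center>
    <a href="terms.html" target="_blank">Terms</a>
</body>
-->

<!-- After: presentation moved to CSS, and inline content wrapped in
     block-level elements, which Strict requires inside body. -->
<body>
    <p class="notice">Sale ends Friday</p>
    <p><a href="terms.html">Terms</a></p>
</body>

<!-- Approximately equivalent rules for example.css:
     body    { background-color: #ffffff; }
     .notice { color: red; font-size: large; text-align: center; }
-->
```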

XHTML 1.0 Strict

Extensible HyperText Markup Language (XHTML) reformulates HTML in Extensible Markup Language (XML). XHTML 1.0 was published by the W3C as the successor to HTML 4.01.

If a web page already conforms to HTML 4.01 Strict then a good next step is to migrate to XHTML 1.0 Strict. The following example shows the DOCTYPE for XHTML 1.0 Strict.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
    .
    .
    .
</html>

Notice that the root element (html) of the document must contain an xmlns declaration for the XHTML namespace, as shown above.

It isn't possible to predict the number or type of errors that will have to be corrected to achieve XHTML 1.0 Strict compliance. However, all the errors flagged by the W3C Markup Validation Service for the web page and the W3C CSS Validation Service for the CSS file should be corrected to complete the migration to XHTML 1.0 Strict.
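Many of the errors reported at this stage tend to come from XML's stricter syntax rather than from the choice of elements. The following minimal page is a sketch of the usual adjustments; the file names and form field are hypothetical.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
    <!-- Empty elements are closed with " />". -->
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>XHTML 1.0 Strict example</title>
    <link rel="stylesheet" type="text/css" href="example.css" />
</head>
<body>
    <!-- Element and attribute names are lower case, and every element
         that is opened is also closed. -->
    <p>First line<br />second line</p>

    <!-- Attribute values are always quoted. -->
    <p><img src="logo.png" alt="Company logo" /></p>

    <form action="search" method="get">
        <!-- Attributes cannot be minimised: checked="checked"
             replaces the bare HTML form "checked". -->
        <p><input type="checkbox" name="exact" checked="checked" /> Exact match</p>
    </form>
</body>
</html>
```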

Conclusion

By using the appropriate DOCTYPE declarations and the W3C validation services, you can improve the quality of your web pages step by step, producing well-formed HTML that is more likely to display correctly in the widest range of browsers.

About the author

Mike Green is the founder of Ferrysoft, a software development company specialising in Help Desk software technology.