XHTML

Use XML to develop Web content

developerWorks

Level: Intermediate

Contributors: W3C

06 Feb 2007
Updated 25 Apr 2007

XHTML is a Web presentation language based on HTML but recast in well-formed XML. It's designed to continue the trend in HTML 4.01 of encouraging the separation of content from presentation. Discover the many changes that XHTML 2.0 will offer, including features that will improve the ability of authors to express content structure and meaning.

XHTML 1.0 [W3C Recommendation] is mostly HTML 4 recast as well-formed XML. HTML is a Standard Generalized Markup Language (SGML) application, and when XML was developed as a simplification and specialization of SGML for the Web, HTML (itself the lingua franca of the Web) became the chief candidate for adaptation to XML. The result is a variation named XHTML. The goal of the XHTML work is a version of HTML that is simpler to parse (because of XML's stricter syntax), that is easily processed using off-the-shelf XML tools, and that better separates content from presentation. XHTML is one of the oldest XML applications and has a huge number of contributing interests, resulting in many parts and versions.
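As a rough illustration of processing XHTML with off-the-shelf XML tools, the following sketch uses Python's standard xml.etree.ElementTree module. The sample markup and the page_title helper are made up for this example. The same generic XML parser that reads the well-formed document rejects the HTML-style variant in which <br> is left unclosed.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace, as declared on the root element of the sample below.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Well-formed XHTML: every element is closed and elements nest properly.
WELL_FORMED = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>First line<br/>second line</p>
  </body>
</html>"""

# HTML-style markup: <br> is left open, which XML does not allow.
TAG_SOUP = WELL_FORMED.replace("<br/>", "<br>")


def page_title(markup):
    """Return the <title> text, or None if the markup is not well-formed XML."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    # ElementTree writes qualified names as "{namespace-uri}local-name".
    return root.findtext(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")


print(page_title(WELL_FORMED))
print(page_title(TAG_SOUP))
```

The first call prints the title text. The second prints None, because the parser stops at the mismatched closing tag instead of guessing at the author's intent the way HTML browsers do.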

XHTML 1.0 defines three Document Type Definitions (DTDs), all sharing a single XML namespace, to correspond to the three HTML 4 DTDs -- Strict, Transitional, and Frameset. XHTML Modularization [W3C Recommendation] provides a framework for breaking down XHTML into separate modules defined as distinct DTDs. For example, all element and attribute types used for defining lists would be in one module, and element types geared toward presentation would be in another module. In this way, you can develop and refine XHTML by adding, subtracting, and updating generally independent modules. The first step along these lines was XHTML Basic [W3C Recommendation], which defines the minimum set of XHTML modules required for any language that counts as XHTML. XHTML Basic by itself can be used as the content language for Web clients such as mobile phones, personal digital assistants (PDAs), pagers, and set-top boxes. XHTML 1.1 [W3C Recommendation] is basically the XHTML 1.0 Strict DTD broken down using the module framework.
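A document states which of these DTDs it follows in its DOCTYPE declaration. The sketch below maps the public identifier in that declaration to a readable name. The xhtml_flavor helper and the sample document are hypothetical, and the pattern is a simplification that handles only the common double-quoted form of the declaration.

```python
import re

# Public identifiers used by the W3C Recommendations discussed above.
KNOWN_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "XHTML 1.0 Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "XHTML 1.0 Frameset",
    "-//W3C//DTD XHTML Basic 1.0//EN": "XHTML Basic 1.0",
    "-//W3C//DTD XHTML 1.1//EN": "XHTML 1.1",
}

# XML is case-sensitive, so the pattern matches the keywords exactly as written.
# Simplification: only double-quoted public identifiers are recognized.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"')

# Hypothetical sample document declaring the Strict DTD.
SAMPLE = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>'''


def xhtml_flavor(markup):
    """Return a readable name for the declared DTD, or None if unrecognized."""
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return None
    return KNOWN_DTDS.get(match.group(1))


print(xhtml_flavor(SAMPLE))
```

A validating parser would go further and check the document against the DTD itself. This sketch only reads the declaration.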

XHTML 2.0 [in development] is a rewrite of XHTML without regard for backward compatibility with HTML. The idea is essentially to start from scratch in developing a content language for the Web, learning from the past without being enslaved to it. Examples of big changes include:

  • Eliminating <br/>, <img/>, and other elements considered excessively presentation-oriented
  • Eliminating HTML-style forms in favor of XForms
  • Eliminating HTML-style linking elements in favor of HLink
  • Replacing many JavaScript™-driven dynamic tasks with XML Events
  • Replacing HTML-style frames with XFrames

More importantly, XHTML 2.0 makes many additions that improve the ability of authors to express content structure and meaning. Breaking backward compatibility has been controversial. Some commentators say that maintaining the (X)HTML name and bumping the revision number will lead to confusion. Others say that the changes are much needed and that XHTML is still an Extensible Hypertext Markup Language, so the name remains appropriate.

XHTML is often used with other embedded formats, such as Mathematical Markup Language (MathML), Resource Description Framework (RDF), Scalable Vector Graphics (SVG), Synchronized Multimedia Integration Language (SMIL), and Voice Extensible Markup Language (VoiceXML). Such combination documents are called multi-modal or non-monolithic. The World Wide Web Consortium (W3C), International Organization for Standardization (ISO), and other organizations are putting a good deal of effort into encouraging strong support for such documents.
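XML namespaces are what allow these formats to coexist in one document. The following sketch, which uses a made-up sample document and a hypothetical count_by_vocabulary helper, parses an XHTML page with embedded SVG and MathML and counts the elements belonging to each vocabulary.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URIs of the vocabularies used in the sample document.
VOCABULARIES = {
    "http://www.w3.org/1999/xhtml": "XHTML",
    "http://www.w3.org/2000/svg": "SVG",
    "http://www.w3.org/1998/Math/MathML": "MathML",
}

# Hypothetical XHTML page that embeds an SVG drawing and a MathML formula.
COMPOUND = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Circle area</title></head>
  <body>
    <p>A circle and its area:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">
      <circle cx="20" cy="20" r="15"/>
    </svg>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>A</mi><mo>=</mo><mi>&#960;</mi><msup><mi>r</mi><mn>2</mn></msup>
    </math>
  </body>
</html>"""


def count_by_vocabulary(markup):
    """Count the elements in each known namespace of a well-formed document."""
    counts = Counter()
    for element in ET.fromstring(markup).iter():
        # ElementTree writes qualified names as "{namespace-uri}local-name".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].partition("}")[0]
        else:
            namespace = ""
        counts[VOCABULARIES.get(namespace, "other")] += 1
    return dict(counts)


print(count_by_vocabulary(COMPOUND))
```

Because each element carries its namespace, a processor can hand the SVG subtree to a graphics renderer and the MathML subtree to a formula renderer without any ambiguity about which vocabulary an element name belongs to.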

