Technology Stocks : Max90's LINK STORAGE to stock quotes

To: LTK007 who wrote (196)
From: LTK007
Date: 12/5/1999 1:01:00 PM
 
Lesson 124: XML and XSL

by Alan Zeichick

The Extensible Markup Language (XML) is a World Wide Web Consortium
(W3C) standard, approved in February 1998, for describing the content of
Web pages. The Extensible Stylesheet Language (XSL) is a draft W3C
standard, released in August 1998, for describing how to present XML pages
within a Web browser. But before I talk about what XML does, consider
what the ubiquitous Hypertext Markup Language (HTML) doesn't do. (In
this Tutorial, I'm assuming that you have some knowledge of HTML and can
read simple HTML code.)

HTML BASICS

Say you wish to display the contents of
this Tutorial on a Web page. You might
put the word "Tutorial" in headline style 2,
the lesson number (Lesson 124) and title
(XML and XSL) in headline style 1, the
author's name in headline style 3, then the
text in normal body style. In simple form,
and with apologies to Network
Magazine's long-suffering Webmaster, the
code would look like:
<html>
<h2>Tutorial</h2>
<h1>Lesson 124: XML and XSL</h1>
<h3>by Alan Zeichick</h3>
The Extensible Markup Language
(XML) is a World Wide Web
Consortium (W3C) standard ...
</html>

So far, so good, except there's no logical
rhyme or reason behind the coding. If you
want to search a site (or a long document)
for tutorials, for certain lesson numbers, or
for everything by a particular author,
you'd have to know that the site usually
uses h2 for the article type, and h3 for the
author name. Not particularly efficient, and rather ad hoc. You'd probably
prefer to design the code to read as:

<html>
<article_type>Tutorial </article_type>
<lesson_number>124 </lesson_number>
<article_title>XML and XSL </article_title>
<author>by Alan Zeichick</author>
<article_text>The Extensible Markup Language (XML) is ...
</article_text>
</html>

That's the principle behind XML, which lets Web site developers use
meaningful tags based on the content of a Web page. Furthermore, XML
allows site creators to define new tags as needed, rather than rely on a fixed
set of generic HTML tags blessed by the W3C, or "embraced and extended"
by a particular browser manufacturer.

Before you get carried away, please note that the second code fragment
above is nonsensical: It's not HTML, and it's not really XML (but it's close).
So, next I will look at what makes up an XML document, and then rewrite
the example in genuine XML.

XML DOCUMENTS

XML isn't a page-description language; it's a structured data-definition
language. It describes the content of a Web page using tags. (XML can
actually describe any arbitrary data, but for this Tutorial assume XML code is
being written for the Web.) An XML document must start with an XML
prolog, which begins with a declaration that the document is written in XML;
it may optionally include a Document Type Definition (DTD) that describes
the elements, attributes, and other components of the document. (DTDs will
be discussed in more detail later.) The prolog is followed by a single tag that
encapsulates the entire document. That tag can have any name; root and
document are common choices.

Further, in XML, unlike HTML, all tags must be closed, and although tags
may be nested, they may not overlap.
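
To see those two rules enforced, here is a minimal sketch in Python. Python
and its standard xml.etree.ElementTree parser are my choice for illustration,
not something this Tutorial depends on, and is_well_formed is a made-up
helper name. The sketch feeds the parser one properly nested fragment, one
with an unclosed tag, and one with overlapping tags.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: every tag is closed, and inner tags close before outer ones.
nested = "<document><author>Alan Zeichick</author></document>"
# The <author> tag is never closed.
unclosed = "<document><author>Alan Zeichick</document>"
# <b> and <i> overlap: <b> closes before the <i> opened inside it.
overlapping = "<document><b><i>text</b></i></document>"

for label, sample in [("nested", nested),
                      ("unclosed", unclosed),
                      ("overlapping", overlapping)]:
    print(label, is_well_formed(sample))
```

Only the first fragment is accepted; the parser rejects the other two before
any application code sees the data.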

So, here is the example rewritten in XML (save it as tutorial.xml). The XML
prolog, which begins with <?xml, contains two attributes; one describes the
version of XML used, and the other states that the document is complete in
this one file. An XML document that adheres to these rules (and a few others
regarding reserved characters) is said to be well-formed and should be
interpretable by any XSL processor.
<?xml version="1.0" standalone="yes"?>
<document>
<article_type>Tutorial </article_type>
<lesson_number>124</lesson_number>
<article_title>XML and XSL </article_title>
<author>Alan Zeichick</author>
<article_text>The Extensible Markup Language (XML) is ...
</article_text>
</document>
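
Because the tags now name the content, a program can ask for a field
directly instead of guessing which headline style holds it. The following is a
minimal sketch, assuming Python and its standard xml.etree.ElementTree
module; the helper name field is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample document from this Tutorial, embedded so the sketch is self-contained.
TUTORIAL_XML = """<?xml version="1.0" standalone="yes"?>
<document>
<article_type>Tutorial </article_type>
<lesson_number>124</lesson_number>
<article_title>XML and XSL </article_title>
<author>Alan Zeichick</author>
<article_text>The Extensible Markup Language (XML) is ...
</article_text>
</document>
"""


def field(root, tag):
    """Return the trimmed text of the first child named `tag`, or None if absent."""
    node = root.find(tag)
    if node is None or node.text is None:
        return None
    return node.text.strip()


root = ET.fromstring(TUTORIAL_XML)
print(root.tag)                      # the single tag that encloses the document
print(field(root, "author"))         # look up the author by tag name
print(field(root, "lesson_number"))  # look up the lesson number by tag name
```

The same lookup against the HTML version would require knowing that the site
happens to put authors in h3.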

XSL: DISPLAYING XML

XML, as stated earlier, is a way to describe the meaning of a document.
Unlike HTML, it does not describe how to display a document in a Web
browser. No browser can understand arbitrary tags like <author> and
<lesson_number>. That's why you need the Extensible Stylesheet Language
(XSL), which describes the intended physical appearance of an XML
document. You will also need software, called an XSL processor, to read the
XML document, apply the XSL style sheet to its tags, and produce standard
HTML as output. I will talk about where to find XSL processors, but first I
will create a sample XSL style sheet.

An XSL style sheet is a text file whose content sits between <xsl> and </xsl>
tags. Between those tags is a series of rules describing the XML tags within
an XML document and telling how to format them in HTML. Remember I
wanted the article_type above to be displayed in HTML headline style 2? The
XSL rule for that XML tag would be:
<rule>
<target-element type="article_type"/>
<h2><children/></h2>
</rule>

The special element <children/> tells the XSL processor to apply the XSL
rule to the contents of the tagged item. Thus, the XML text
<article_type>Tutorial</article_type> will be processed into
<h2>Tutorial</h2>. Note that any HTML tag can be included as part of an
XSL rule. Also, XML tags can be nested; in that case, XSL rules are applied
recursively as needed. This capability allows Web site designers to exercise
very fine control over the appearance of pages, far more than is illustrated
by this simple example.

Here is the complete XSL document tutorial.xsl for the tutorial.xml file.
The final rule, which doesn't explicitly mention a target-element type, is a
catch-all that applies a default formatting to all tags not specifically defined.
<xsl>
<rule>
<target-element type="article_type"/>
<h2><children/></h2>
</rule>
<rule>
<target-element type="lesson_number"/>
<h1>Lesson <children/></h1>
</rule>
<rule>
<target-element type="article_title"/>
<h1><children/></h1>
</rule>
<rule>
<target-element type="author"/>
<h3>by <children/></h3>
</rule>
<rule>
<target-element type="article_text"/>
<p><children/></p>
</rule>
<rule>
<target-element/>
<p><children/></p>
</rule>
</xsl>
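
To make the rule mechanics concrete, here is a minimal Python sketch of how
a processor might apply rules of this shape. It is my own illustration, not
how MSXSL or any other XSL processor is implemented; the function names
(load_rules, apply_rules, expand, process_children, transform) are all
hypothetical. The style sheet is embedded with the tag name written as
lesson_number and a space after "Lesson". Wrapping the result in <div> is an
assumption made to mirror the HTML output shown later in this Tutorial, and
attributes on template tags are ignored.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XSL = """<xsl>
<rule><target-element type="article_type"/><h2><children/></h2></rule>
<rule><target-element type="lesson_number"/><h1>Lesson <children/></h1></rule>
<rule><target-element type="article_title"/><h1><children/></h1></rule>
<rule><target-element type="author"/><h3>by <children/></h3></rule>
<rule><target-element type="article_text"/><p><children/></p></rule>
<rule><target-element/><p><children/></p></rule>
</xsl>"""

TUTORIAL_XML = """<?xml version="1.0" standalone="yes"?>
<document>
<article_type>Tutorial </article_type>
<lesson_number>124</lesson_number>
<article_title>XML and XSL </article_title>
<author>Alan Zeichick</author>
<article_text>The Extensible Markup Language (XML) is ...
</article_text>
</document>
"""


def load_rules(xsl_text):
    """Map each target-element type to its rule; the key None is the catch-all."""
    rules = {}
    for rule in ET.fromstring(xsl_text).findall("rule"):
        rules[rule.find("target-element").get("type")] = rule
    return rules


def process_children(src, rules):
    """Emit a source element's text plus its recursively processed child tags."""
    parts = [escape(src.text or "")]
    for child in src:
        parts.append(apply_rules(child, rules))
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def expand(node, src, rules):
    """Copy one template tag to the output, replacing <children/> with content."""
    if node.tag == "children":
        return process_children(src, rules)
    parts = [escape(node.text or "")]
    for child in node:
        parts.append(expand(child, src, rules))
        parts.append(escape(child.tail or ""))
    # Assumption: attributes on template tags are dropped in this sketch.
    return f"<{node.tag}>{''.join(parts)}</{node.tag}>"


def apply_rules(src, rules):
    """Pick the rule for a source tag (or the catch-all) and expand its template."""
    rule = rules.get(src.tag, rules.get(None))
    if rule is None:
        return process_children(src, rules)
    return "".join(expand(t, src, rules) for t in rule if t.tag != "target-element")


def transform(xml_text, xsl_text):
    """Apply the style sheet to the document and return an HTML string."""
    rules = load_rules(xsl_text)
    root = ET.fromstring(xml_text)
    # Assumption: the enclosing tag becomes a <div> instead of matching a rule.
    return "<div>" + process_children(root, rules) + "</div>"


html = transform(TUTORIAL_XML, XSL)
print(html)
```

Whitespace from the source document is passed through untouched, so the
printed HTML keeps the trailing spaces and line breaks of tutorial.xml.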

GENERATING HTML

So, you'd like to see the HTML code generated by the XSL style sheet?
Well, it's not as easy as loading it in a browser, as no generally available
Web browser currently contains XSL processing capabilities. Microsoft
includes some rudimentary XML parsing capability in Internet Explorer 4, and
Netscape has demonstrated some XML functionality in Mozilla 5 (the core of
its next-generation browser), but for now you'll need external software to
apply the XSL style sheets to an XML document. You can find links to a
number of freeware XSL processors at www.w3.org/XML/.

The simplest processor to play with is Microsoft's MSXSL.EXE,
downloadable from www.microsoft.com/xml/xsl/msxsl.asp/. This
command-line utility can apply XSL style sheets to an XML document, and
produce an HTML output file. It's simple enough to run: from a DOS prompt,
run MSXSL -i xmlfilename -s xslfilename -o htmlfilename.

If there is an error in the XML or XSL code, the MSXSL processor will let
you know roughly where the problem occurred, and it may even guess at
what went wrong (such as an argument mismatch: In my sample XSL file
above, I had initially terminated the <h1> tag with </h2>). But if all
goes according to plan, you'll end up with the HTML file:
<div><h2>Tutorial</h2>
<h1>Lesson 124</h1>
<h1>XML and XSL</h1>
<h3>by Alan Zeichick</h3>
<p>The Extensible Markup Language (XML) is ...</p></div>

If this example represented the best that XML could do, the technology
would have died a swift death; after all, many of those capabilities can be
handled using straightforward HTML with Cascading Style Sheets (CSS).
The real payoff will come from using XML's more advanced features, such as
data validation using the DTDs and the Extensible Linking Language.

DOCUMENT TYPE DEFINITIONS

Earlier, I discussed the prolog section of an XML document. The prolog must
contain the <?xml statement, but it may optionally include either a DTD or a
link to another file containing a DTD to apply to the XML file. A DTD is used
to validate a well-formed (that is, syntactically correct) XML document; all tags
used in the body of the XML document must be defined in the DTD.

The DTD for this Tutorial, for example, would need to define the article_type
field. It could simply define it as an arbitrary string of characters, but that
wouldn't be of much benefit. A better strategy might be to predefine all of the
possible article types, such as "Tutorial," "Feature," or "NT Techniques."
A DTD can further specify that the article_type field must occur once, and only
once, within an XML document, but that the author field can occur more than
once, so articles with multiple authors can be supported. It could further
specify that a new field, author_email, is valid only if the "@" symbol is
included within it exactly once, or that the author_phone field can contain only
digits. You get the idea.
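
The kinds of checks described above can be sketched as follows. This is a
rough illustration in Python under stated assumptions: Python's standard
ElementTree parser does not validate against a DTD, so the constraints are
hand-written in application code as a stand-in for a validating parser. The
function name check_article, the author names, and the example.com address
are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Predefined article types, taken from the examples in the text above.
ALLOWED_TYPES = {"Tutorial", "Feature", "NT Techniques"}


def check_article(xml_text):
    """Return a list of rule violations; an empty list means the document passed."""
    root = ET.fromstring(xml_text)
    problems = []

    # article_type must occur once, and only once, with a predefined value.
    types = root.findall("article_type")
    if len(types) != 1:
        problems.append("article_type must occur exactly once")
    elif (types[0].text or "").strip() not in ALLOWED_TYPES:
        problems.append("article_type is not one of the predefined values")

    # author may repeat freely, so no occurrence check is applied to it.

    # author_email must contain the @ symbol exactly once.
    for email in root.findall("author_email"):
        if (email.text or "").count("@") != 1:
            problems.append("author_email must contain exactly one @")

    # author_phone may contain only digits.
    for phone in root.findall("author_phone"):
        if not (phone.text or "").strip().isdigit():
            problems.append("author_phone must contain only digits")

    return problems


good = ("<document><article_type>Tutorial</article_type>"
        "<author>First Author</author><author>Second Author</author>"
        "<author_email>first@example.com</author_email></document>")
bad = ("<document><article_type>Essay</article_type>"
       "<article_type>Tutorial</article_type>"
       "<author_email>first@@example.com</author_email>"
       "<author_phone>555-1234</author_phone></document>")

print(check_article(good))
print(check_article(bad))
```

The first document passes with two authors; the second is reported for its
repeated article_type, its malformed e-mail address, and its non-numeric
phone number.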

An XML file can be validated only with an XML validating parser. (For a
current list of validating parsers, nearly all of which are Java classes a
developer can include in custom applications, rather than ready-to-run
programs, see www.w3.org/XML/.) In general, if you're using only XML to
build pages for displaying within a browser, you need not worry about DTDs.
Those rigid definitions are required, however, to use XML as a
domain-specific data-definitional language: to pass e-commerce data
between servers, for example, or to standardize a way to describe chemical
data, astronomical readings, or consumer credit reports. In those cases,
having a strict definition of an XML document's allowable fields, and each
field's allowable values and format, will make it easy to implement Web pages
that enable the automated transfer of data between applications or
organizations. One group doing just that is the XML/EDI Group
(www.xmledi.com), which submitted an e-commerce-oriented DTD to the
W3C in August 1998.

XML is a work in progress. The only production use of XML technology that
I've found so far is in Microsoft's Channel Definition Format (CDF), which
pushes Web content to users running Internet Explorer 4. However,
now is a good time to begin exploring the technology, as next-generation
browsers will have some level of support for parsing or displaying
well-formed XML documents.

Alan Zeichick, former editor-in-chief of Network Magazine, is now
Director of Camden Associates, a technology analysis and consulting
group based in San Bruno, CA. He can be reached at