For general information about XML, read my article “introduction into XML”.
TOC: Quick Links
- Hyperlinks in documents
- 1. XPath, XPointer and XLink
- 2. Persistent document names
- 3. Summary
Hyperlinks in documents
With HTML it is only possible to generate unidirectional links, whereas with XML and the “XML Linking Language” you can also create multi-directional links.
Information about standards and recent changes can be found on the W3C homepage.
In the following I discuss some aspects of version 1.0 of the “XML Linking Language (XLink)”, which has been a W3C Recommendation since June 27, 2001.
1. XPath, XPointer and XLink
Broadly speaking, XPath provides a mechanism for identifying a particular node or set of nodes in the document’s structure tree as a starting point for a link anchor. XPointer extends what XPath can achieve by allowing “selections” that transcend node boundaries. For example, a document might be marked up with every sentence separately enclosed between <SENTENCE> and </SENTENCE> tags, and yet a user wants to create a link “hot spot” that spans the end of one sentence and the beginning of the next. Finally, XLink specifies the syntax and semantics for embedded and external links, including options for an “extended out-of-line” link, which offers the prospect of multiway links stored in separate linkbases.
1.1. XPath – Structure Tree and nodes
The primary purpose of the XPath language is to address parts of an XML document. XPath provides a common syntax and semantics for functionality shared between XSL Transformations (XSLT) and XPointer. To this end, XPath models an XML document as a tree of nodes. There are different types of nodes, such as element, attribute and text nodes. Some types of nodes also have an expanded name, which is a pair consisting of a local part and a (possibly null) namespace URI.
[Source: XPath Tutorial on W3Schools]
XPath 2.0 also provides a function called “collection()”, which makes it possible to process multiple documents with an XSLT engine.
[Source of this map: iX 8/2005, p. 60: XML-Verarbeitung (German article)]
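The node-tree model described in this section can be explored with a few lines of Python. The following is a minimal sketch, not a reference implementation: the sample document, its namespace URI and the helper name expanded_name are assumptions made for illustration, and Python's xml.etree.ElementTree supports only a subset of XPath 1.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names and namespace URI are illustrative.
DOC = """\
<catalog xmlns:bk="http://example.org/books">
  <bk:book id="b1"><bk:title>XML Basics</bk:title></bk:book>
  <bk:book id="b2"><bk:title>Linking with XLink</bk:title></bk:book>
</catalog>"""


def expanded_name(tag):
    """Split ElementTree's '{uri}local' tag into (local part, namespace URI or None)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return local, uri
    return tag, None


root = ET.fromstring(DOC)
ns = {"bk": "http://example.org/books"}

# Location path: every <bk:title> child of a <bk:book> child of the root.
titles = [t.text for t in root.findall("bk:book/bk:title", ns)]

# Predicate on an attribute node: the book whose id attribute equals "b2".
second = root.find("bk:book[@id='b2']", ns)

print(expanded_name(root.tag))    # element without a namespace
print(expanded_name(second.tag))  # element in the bk namespace
print(titles)
print(second.attrib["id"])
```

The root element has no namespace, so its expanded name carries a null namespace URI, while the book elements carry the URI bound to the bk prefix.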
1.2. XPointer – Node boundaries
XPointer is based on the XML Path Language (XPath) and allows hyperlinks to point to specific parts of an XML document. It allows a hierarchical document structure to be examined and its internal parts to be chosen based on various properties, such as element types, attribute values, character content, and relative position. For example, XPointer can be used to point to the third item in a list with a unique id of “milk”.
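Pointing to the third item in the list with the id “milk” could be sketched as follows. This is an illustration under stated assumptions: the sample document and the file name shopping.xml are made up, the fragment uses the xpointer() scheme form, and because Python's standard library has no XPointer support and no id() function, the helper resolve_third_item evaluates a hand-written equivalent path that treats the attribute named id as the unique identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names and contents are illustrative.
DOC = """\
<shopping>
  <list id="bread"><item>rye</item><item>wheat</item></list>
  <list id="milk">
    <item>whole</item><item>skimmed</item><item>oat</item><item>soy</item>
  </list>
</shopping>"""

# A URI reference with an XPointer fragment, as it might appear in a link.
# The file name "shopping.xml" is made up for this sketch.
HREF = "shopping.xml#xpointer(id('milk')/item[3])"
document, _, fragment = HREF.partition("#")


def resolve_third_item(root, list_id):
    """Hand-written equivalent of id('<list_id>')/item[3].

    Assumes the attribute named "id" plays the role of the unique ID.
    """
    return root.find(f".//*[@id='{list_id}']/item[3]")


root = ET.fromstring(DOC)
target = resolve_third_item(root, "milk")
print(document, fragment)
print(target.text)
```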
The Web of yesterday    The Web of tomorrow
--------------------    -------------------
                        XLink, XPointer
HTML                    XHTML ...
 |                       |
SGML                    XML
As you might already know, the normal way of linking objects within XML is to use XLink. With it, you can mark elements and tell them to act as hyperlinks. The targets of those hyperlinks are resources addressed by a URI. One interesting thing (among others) about XLink is that it does not define link elements of its own; instead, it provides attributes that turn an existing element into a link.
1.3.1 XLink – Simple links
The following example shows a hyperlink created in HTML and a simple XLink in XML, both linking to the same URI:
HTML link:
<a href="http://www.w3.org">HOME OF W3C</a>

Simple XLink:
<myLink xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="simple"
        xlink:href="http://www.w3.org">
  HOME OF W3C
</myLink>
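To show how an application might recognise such a link, here is a minimal Python sketch. It assumes nothing beyond the myLink example above; the function name read_simple_link is hypothetical. The point is that the element name is irrelevant: only the attributes in the XLink namespace mark the element as a link.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

SIMPLE = """\
<myLink xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="simple"
        xlink:href="http://www.w3.org">
  HOME OF W3C
</myLink>"""


def read_simple_link(element):
    """Return (href, label text) if the element is a simple XLink, else None."""
    if element.get(f"{{{XLINK_NS}}}type") != "simple":
        return None
    return element.get(f"{{{XLINK_NS}}}href"), (element.text or "").strip()


link = read_simple_link(ET.fromstring(SIMPLE))
print(link)

# A plain HTML-style anchor carries no XLink attributes, so it is not recognised.
not_a_link = read_simple_link(ET.fromstring('<a href="http://www.w3.org">HOME OF W3C</a>'))
print(not_a_link)
```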
1.3.2 XLink – Extended links
Definition: An extended link is a link that associates an arbitrary number of resources. The participating resources may be any combination of remote and local. [XLink Version 1.0, Paragraph 5.1 Extended Links]
XLink defines a number of attributes, such as
xlink:actuate to specify when the linked resource should be shown or read (for example onLoad or onRequest),
xlink:href to specify the URI of the resource to link to,
xlink:show to specify where the link should be opened (new, replace, embed, etc.) and
xlink:type to specify the type of link (simple, extended, etc.).
In general, an extended XLink assigns functions to its child elements, defines roles, specifies more than one link target, and so on. You use the type value “locator” to specify a remote resource, the type “arc” to define traversal rules between participating resources, the type “title” to provide human-readable labels, and the type “resource” to describe local resources. The following listing is an example of how such an extended XLink could look in a document:
<multilink xmlns:xlink="http://www.w3.org/1999/xlink"
           xlink:type="extended">
  <src xlink:type="resource" xlink:label="src">
    Download
  </src>
  <dst1 xlink:type="locator"
        xlink:href="http://www.xyz.de/dl1.rar"
        xlink:label="download1" />
  <dst2 xlink:type="locator"
        xlink:href="http://www.abc.de/dl2.rar"
        xlink:label="download2" />
</multilink>
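How an application might read such an extended link can be sketched in Python. The sample reuses the listing from the text with the xlink namespace declared and one arc element added. The element name go and the helper names participants and traversals are assumptions made for illustration. The sketch also assumes that every label is unique, which XLink itself does not require.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"

# The listing from the text, with the namespace declared and one arc
# element ("go" is a made-up name) so that a traversal rule can be shown.
EXTENDED = """\
<multilink xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <src xlink:type="resource" xlink:label="src">Download</src>
  <dst1 xlink:type="locator" xlink:href="http://www.xyz.de/dl1.rar" xlink:label="download1"/>
  <dst2 xlink:type="locator" xlink:href="http://www.abc.de/dl2.rar" xlink:label="download2"/>
  <go xlink:type="arc" xlink:from="src" xlink:to="download1"/>
</multilink>"""


def participants(link):
    """Map each xlink:label to its local text (resource) or remote href (locator)."""
    found = {}
    for child in link:
        kind = child.get(XLINK + "type")
        label = child.get(XLINK + "label")
        if kind == "resource":
            found[label] = (child.text or "").strip()
        elif kind == "locator":
            found[label] = child.get(XLINK + "href")
    return found


def traversals(link):
    """List (from, to) pairs declared by the arc-type children."""
    by_label = participants(link)
    return [
        (by_label[arc.get(XLINK + "from")], by_label[arc.get(XLINK + "to")])
        for arc in link
        if arc.get(XLINK + "type") == "arc"
    ]


link = ET.fromstring(EXTENDED)
print(participants(link))
print(traversals(link))
```

Keeping participants and traversal rules in separate structures mirrors the split between locator/resource elements and arc elements in the markup.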
Unfortunately, browser support for XLink is limited: Mozilla v0.98 and higher and Netscape v6.02 and higher support it, while IE v6.0 offers only partial support. Earlier versions of the common browsers had no XLink support at all.
2. Persistent document names
Although the introduction of a DOM together with XPath, XPointer and XLink will support external linkbases for documents, the burden of maintaining links in HTML and PDF remains. Links are embedded within inadequately structured documents, and they point at where a target document is stored rather than at what it is. In other words, documents need to be “called by name” rather than “called by location”, and these persistent names should be used as a matter of course in abstract hyperlink specifications.
This resource naming problem has been recognised for some time and is being addressed by initiatives such as the Uniform Resource Name (URN) and CNRI Handle proposals. This is not to say that the adoption of name resolution protocols would banish for ever the horror of broken links and “404 not found” messages, but there is every prospect that such a service could automatically track the latest incarnation of a named target document or, at the very least, some metadata that describes it. The problems of scale in having the WWW locate documents by name should not be underestimated, but initiatives such as the Digital Object Identifier (DOI), a form of URN being proposed and tested by a consortium of international publishers, show very clearly the e-commerce and e-publishing benefits to be gained from an agreed persistent naming scheme.
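The idea of calling a document by name rather than by location can be illustrated with a toy resolver in Python. This is only a sketch of the indirection involved, not of how URN, Handle or DOI resolution is actually implemented. The class NameResolver, the name urn:example:report-2001 and both URLs are hypothetical.

```python
class NameResolver:
    """Toy in-memory resolver: persistent name -> current location."""

    def __init__(self):
        self._locations = {}

    def register(self, name, location):
        """Record (or update) where the named document currently lives."""
        self._locations[name] = location

    def resolve(self, name):
        """Return the current location, or None if the name is unknown."""
        return self._locations.get(name)


resolver = NameResolver()
resolver.register("urn:example:report-2001", "http://old.example.org/report.pdf")

# A link stores only the persistent name, never the location.
link_target = "urn:example:report-2001"
before_move = resolver.resolve(link_target)

# The document moves; only the resolver entry changes, the link stays intact.
resolver.register("urn:example:report-2001", "http://new.example.org/docs/report.pdf")
after_move = resolver.resolve(link_target)

print(before_move)
print(after_move)
```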
Even if link markup is embedded within the material to which it refers (and there is a strong case that links which an author wants to be part and parcel of a document should be treated in this special way), there should always be a clear distinction between the data structures for link anchors and link storage and those for document presentation, very much along the lines first envisaged in the Dexter Hypertext Reference Model [Dexter model for hypertext systems]. In this respect, having a widely adopted standard for document structuring is of enormous help, and it seems that we can expect ever-increasing pressure to structure documents in XML-approved ways, starting, perhaps, with the adoption of XHTML [XHTML 1.0 Recommendation], a cleaned-up version of HTML designed to be XML conformant. The adoption of the XML metasyntax ensures that a single parser can cope with all sorts of “added value” associated with a document, ranging from links specified via XLink to metadata specified via RDF. Document standards such as Adobe’s PDF, which does not currently use XML notation for its new range of embedded structure tags, may soon be forced to move in the XML direction, if only to ensure that such documents can inter-work reasonably seamlessly with a new generation of XML-compliant Web documents.
3. Summary
XLink creates links in XML documents and relies on XPointer, with its different addressing schemes, to point into them. XPointer, which is in turn based on XPath, lets you select arbitrary parts of an XML document.