XML transformation language

An XML transformation language is a programming language designed specifically to transform an input XML document into an output document that meets a specific goal.

There are two special cases of transformation:

  • XML to XML: the output document is an XML document.
  • XML to Data: the output document is a byte stream.

XML to XML

Because an XML to XML transformation outputs an XML document, its output can serve as the input of another transformation. Chains of XML to XML transformations form XML pipelines.
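
The chaining idea can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is only an illustration under stated assumptions: the stage functions `drop_drafts` and `rename_items`, the helper `pipeline`, and the `<list>`/`<item>` vocabulary are hypothetical names invented for the example, not part of any XML pipeline language.

```python
import copy
import xml.etree.ElementTree as ET
from functools import reduce


def drop_drafts(root):
    """Stage 1 (hypothetical): copy the tree, omitting children marked status="draft"."""
    out = ET.Element(root.tag, dict(root.attrib))
    out.extend([copy.deepcopy(child) for child in root
                if child.get("status") != "draft"])
    return out


def rename_items(root):
    """Stage 2 (hypothetical): copy the tree and rename every <item> to <entry>."""
    out = copy.deepcopy(root)
    for element in list(out.iter("item")):
        element.tag = "entry"
    return out


def pipeline(*stages):
    """Compose XML to XML stages: each stage's output tree feeds the next stage."""
    def run(root):
        return reduce(lambda doc, stage: stage(doc), stages, root)
    return run


source = ET.fromstring(
    '<list><item status="draft">a</item><item status="final">b</item></list>'
)
transform = pipeline(drop_drafts, rename_items)
result_xml = ET.tostring(transform(source), encoding="unicode")
print(result_xml)  # <list><entry status="final">b</entry></list>
```

Because every stage takes an element tree and returns a new one, stages can be reordered or reused without knowing about each other, which is the property that lets XML to XML transformations be chained.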

XML to Data

XML (Extensible Markup Language) to Data transformation covers some important cases. The most notable one is XML to HTML (HyperText Markup Language): because an HTML document is not an XML document, the output has to be treated as data rather than as XML.
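
A minimal Python sketch of the difference, assuming a hypothetical `<note>`/`<line>` input vocabulary: the same result tree is serialized once as XML and once as HTML using the `method` argument of `xml.etree.ElementTree.tostring`. The HTML form writes the empty `br` element without a closing tag, so the resulting byte stream is not well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document: a note made of text lines.
source = ET.fromstring("<note><line>first</line><line>second</line></note>")

# Build the result tree: one <p>, with <br> elements between the lines.
lines = source.findall("line")
paragraph = ET.Element("p")
paragraph.text = lines[0].text
for line in lines[1:]:
    line_break = ET.SubElement(paragraph, "br")
    line_break.tail = line.text

# Serialize the same tree to bytes in two ways.
as_xml = ET.tostring(paragraph, encoding="utf-8", method="xml")
as_html = ET.tostring(paragraph, encoding="utf-8", method="html")

print(as_xml)   # b'<p>first<br />second</p>'
print(as_html)  # b'<p>first<br>second</p>'
```

Feeding the HTML bytes back into an XML parser fails, since the `br` start tag is never closed; this is why an XML to HTML step usually comes last in a chain of transformations.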

SGML origins

The earliest transformation languages predate the advent of XML as an SGML profile, and thus accept input in arbitrary SGML rather than specifically XML. These include the SGML-to-SGML link process definition (LPD) format defined as part of the SGML standard itself; in SGML (but not XML), the LPD file can be referenced from the document itself by a LINKTYPE declaration, similarly to the DOCTYPE declaration used for a DTD.[1] Other such transformation languages, addressing some of the deficiencies of LPDs, include Document Style Semantics and Specification Language (DSSSL) and OmniMark.[2] Newer transformation languages tend to target XML specifically, and thus only accept XML, not arbitrary SGML.

Existing languages

  • XSLT: XSLT is the best known XML transformation language. The XSLT 1.0 W3C recommendation was published in 1999 together with XPath 1.0, and it has been widely implemented since then. XSLT 2.0 became a W3C recommendation in January 2007, and implementations of the specification such as Saxon 8 are available.
  • XQuery: XQuery is a full functional language, despite having "query" in its name. It is a de facto standard used by Microsoft, Oracle, DB2, MarkLogic, etc., is the foundation for the XRX web programming model, and has a W3C recommendation for version 1.0. Unlike XSLT, XQuery is not written in XML itself, so its syntax is much lighter. The language is based on XPath 2.0. Like XSLT programs, XQuery programs cannot have side effects, and XQuery provides almost the same capabilities as XSLT (for instance, declaring variables and functions, iterating over sequences, and using W3C schema types), even though the syntax of the two languages is quite different. XQuery is logic-driven, using FOR, WHERE and function composition (e.g. fn:concat("<html>", generate-body(), "</html>")). In contrast, XSLT is data-driven (a push processing model): certain conditions in the input document trigger the execution of templates, rather than the code executing in the order in which it is written.
  • XProc: XProc is an XML Pipeline language. The XProc 1.0 W3C Recommendation was published in May 2010.
  • XML document transform: XML document transform is a Microsoft standard for performing simple transforms on XML documents. It is used primarily for creating IIS Web.config files (Config Transforms); other implementations allow it to be used for generic config files at build time (Slow Cheetah) or from the command line (CTT).
  • STX: STX (Streaming Transformations for XML) is inspired by XSLT but has been designed to allow a one-pass transformation process that never prevents streaming. Implementations are available in Java (Joost) and Perl (XML::STX).
  • XML Script: XML Script is an imperative scripting language inspired by Perl that uses the XML syntax. XML Script supports XPath and its proprietary DSLPath for selecting nodes from the input tree.
  • FXT: FXT is a functional XML transformation tool, implemented in Standard ML.
  • XDuce: XDuce is a typed language whose syntax is lightweight compared to that of XSLT. It is written in ML.
  • CDuce: CDuce extends XDuce to a general-purpose functional programming language.
  • XACT: XACT is a Java-based system for programming XML transformations. Notable features include XML templates as immutable values and a static analysis to ensure type safety using XML Schema types.
  • XFun: X-Fun is a functional language for defining transformations between XML data trees, while providing shredding instructions. X-Fun can be understood as an extension of Frisch's XStream language with output shredding, while pattern matching is replaced by tree navigation with XPath expressions.
  • XStream: XStream is a simple functional transformation language for XML documents based on CAML. XML transformations written in XStream are evaluated in streaming mode: when possible, parts of the output are computed and produced while the input document is still being parsed. Some transformations can thus be applied to huge XML documents which would not even fit in memory. The XStream compiler is distributed under the terms of the CeCILL free software license.
  • Xtatic: Xtatic applies methods from XDuce to C#.
  • HaXml: HaXml is a library and collection of tools to write XML transformations in Haskell. It is described in a paper published in 1999 and in an IBM developerWorks article. See also the more recent HXML and Haskell XML Toolbox (HXT), which is based on the ideas of HaXml and HXML but takes a more general approach to XML processing.
  • XMLambda: XMLambda (XMλ) is described in a 1999 paper by Erik Meijer and Mark Shields. No implementation is available.
  • FleXML: FleXML is an XML processing language first implemented by Kristofer Rose. Its approach is to add actions to an XML DTD specifying processing instructions for any subset of the DTD's rules.
  • Scala: Scala is a general-purpose functional and object-oriented language with specific support for XML transformation in the form of XML pattern matching, literals, and expressions, along with standard XML libraries.[3]
  • LINQ to XML: LINQ to XML is a .NET 3.5 syntax and programming API available in C#, VB and some other .NET languages. LINQ is primarily designed as a query language, but it also supports XML transforms.
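
The contrast between the logic-driven (pull) and data-driven (push) styles described in the XQuery entry can be imitated in plain Python. The sketch below is an analogy only, not XQuery or XSLT: the `<library>` document, the `template` decorator, `apply_templates`, and `default_template` are hypothetical names made up for this example, and the default rule is only loosely modeled on the idea of a built-in template that recurses into child elements.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library>"
    "<book year='1999'>XSLT</book>"
    "<cd>Music</cd>"
    "<book year='2007'>XQuery</book>"
    "</library>"
)


# Pull style: the program decides what to fetch, in the order the code is written.
def pull(root):
    return [book.text for book in root.findall("book")
            if int(book.get("year")) > 2000]


# Push style: templates are registered per element name, and the shape of the
# input document decides which template runs and in which order.
TEMPLATES = {}


def template(tag):
    def register(handler):
        TEMPLATES[tag] = handler
        return handler
    return register


def default_template(node):
    # Fallback rule: no output for the node itself, just process its children.
    results = []
    for child in node:
        results.extend(apply_templates(child))
    return results


def apply_templates(node):
    handler = TEMPLATES.get(node.tag, default_template)
    return handler(node)


@template("book")
def book_template(node):
    return ["book:" + node.text]


@template("cd")
def cd_template(node):
    return ["cd:" + node.text]


pulled = pull(doc)
pushed = apply_templates(doc)
print(pulled)  # ['XQuery']
print(pushed)  # ['book:XSLT', 'cd:Music', 'book:XQuery']
```

In the pull version, adding a new element type to the input changes nothing unless the query is edited. In the push version, the output order follows the document, and supporting a new element type only requires registering another template.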

References

  1. Goldfarb, Charles F. (1990). "Clause 12, Markup Declarations: Link Process Definition". The SGML Handbook. Oxford: Clarendon Press. pp. 433–449. ISBN 0-19-853737-9.
  2. Kimber, W. Eliot. "Why I Want the SGML LINK Feature". CoverPages.org.
  3. Fancellu, Dino; Narmontas, William (June 2014). "XML Processing in Scala". XML London 2014: 63–75. doi:10.14337/XMLLondon14.Narmontas01. ISBN 978-0-9926471-1-7.