Transclusion

In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by reference via hypertext. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user.^[1] The result of transclusion is a single integrated document made of parts assembled dynamically from separate sources, possibly stored on different computers in disparate places.

Transclusion facilitates modular design (using the "single source of truth" model, whether in data, code, or content): a resource is stored once and distributed for reuse in multiple documents. Updates or corrections to a resource are then reflected in any referencing documents.

In systems where transclusion is not available, and in some situations where it is available but not desirable, substitution is often the complementary option, whereby a static copy of the "single source of truth" is integrated into the relevant document. Wikipedia uses both techniques in creating its content (see Wikipedia:Transclusion and Wikipedia:Substitution for more information). Substituted static copies introduce a different set of considerations for version control than transclusion does, but they are sometimes necessary.

Ted Nelson coined the term for his 1980 nonlinear book Literary Machines, but the idea of a master copy and its occurrences had been applied 17 years earlier, in Sketchpad. Transclusion is now a common technique among textbook writers when a single topic needs to be discussed in multiple chapters. An advantage of this approach in textbooks is that it reduces data redundancy and keeps the book to a manageable size.

Technical considerations

Context neutrality

Transclusion works better when transcluded sections of text are self-contained, so that the meaning and validity of the text are independent of context. For example, formulations like "as explained in the previous section" are problematic, because the transcluded section may appear in a different context, causing confusion. What constitutes "context-neutral" text varies, but it often includes things like company information or boilerplate. To help overcome such context sensitivity, systems capable of transclusion are often also capable of suppressing particular elements within the transcluded content. For example, Wikipedia can use tags such as "noinclude", "onlyinclude", and "includeonly" for this purpose. Typical elements that require such exceptions are document titles, footnotes, and cross-references; with these tags they can be suppressed automatically upon transclusion, without manual reworking for each instance.
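The effect of such suppression tags can be illustrated with a small sketch. It is a simplified model and not MediaWiki's actual parser: the function names render_transcluded and render_in_place are hypothetical, only "noinclude" and "includeonly" are handled, nesting is ignored, and the sample text is invented for illustration.

```python
import re

# Simplified model of two suppression tags; this is NOT MediaWiki's real parser.
NOINCLUDE = re.compile(r"<noinclude>.*?</noinclude>", re.DOTALL)
INCLUDEONLY = re.compile(r"<includeonly>.*?</includeonly>", re.DOTALL)
TAGS = re.compile(r"</?(?:noinclude|includeonly)>")


def render_transcluded(source: str) -> str:
    """Text as seen inside another document: noinclude parts are dropped."""
    return TAGS.sub("", NOINCLUDE.sub("", source))


def render_in_place(source: str) -> str:
    """Text as seen on its own page: includeonly parts are dropped."""
    return TAGS.sub("", INCLUDEONLY.sub("", source))


# Invented sample: a title that should not travel with the transcluded text,
# and a pointer that only makes sense where the text is transcluded.
source = (
    "<noinclude>Company profile\n</noinclude>"
    "Example Corp. was founded in 1999."
    "<includeonly> (See the company page.)</includeonly>"
)

print(render_transcluded(source))
print(render_in_place(source))
```

In this model the title line stays on the source page but is left out wherever the text is transcluded, while the parenthetical pointer appears only in the transcluding documents.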

Parameterization

Under some circumstances, and in some technical contexts, transcluded sections of text may not require strict adherence to the "context neutrality" principle, because the transcluded sections are capable of parameterization. Parameterization implies the ability to modify certain portions or subsections of a transcluded text depending on exogenous variables that can be changed independently. This is customarily done by supplying a transcluded text with one or more substitution placeholders. These placeholders are then replaced with the corresponding variable values prior to rendering the final transcluded output in context.
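A minimal sketch of parameterization, assuming a fragment with two placeholders filled in by Python's standard string.Template class. The fragment text, the transclude helper, and the addresses are hypothetical and serve only to show one source being rendered differently in two contexts.

```python
from string import Template

# Hypothetical reusable fragment with two substitution placeholders.
fragment = Template("For support, contact $team at $email.")


def transclude(template: Template, **params: str) -> str:
    """Fill the placeholders before the fragment is placed in its context."""
    return template.substitute(**params)


manual = "Installation guide. " + transclude(
    fragment, team="the install desk", email="install@example.org"
)
faq = "Billing FAQ. " + transclude(
    fragment, team="the billing office", email="billing@example.org"
)

print(manual)
print(faq)
```

Because substitute raises an error when a placeholder has no value, a missing parameter is caught at rendering time instead of leaking a raw placeholder into the output.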

Origins

The concept of reusing file content began with computer programming languages: COBOL in 1960,^[2] followed by BCPL, PL/I, C,^[3] and, by 1978, even FORTRAN. An include directive allows common source code to be reused while avoiding the pitfalls of copy-and-paste programming and hard coding of constants. The mechanism introduced a problem of its own: when several include directives, directly or indirectly, bring in the same content, the source code is repeated in the final result, which can cause an error. Include guards help solve this by omitting the duplicate content after its first inclusion.^[4]
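The duplication problem and the effect of a guard can be shown with a toy model. This is not the C preprocessor: the in-memory FILES table, the unquoted "#include name" syntax, and the expand function are assumptions made for illustration, and the guard here works per file, which resembles "#pragma once" more closely than an "#ifndef" wrapper.

```python
# Hypothetical in-memory "files"; a line starting with "#include " pulls in another file.
FILES = {
    "types.h": ["typedef int record_id;"],
    "user.h": ["#include types.h", "record_id user_id;"],
    "order.h": ["#include types.h", "record_id order_id;"],
    "main.c": ["#include user.h", "#include order.h", "int main(void) { return 0; }"],
}


def expand(name, guard=True, seen=None):
    """Recursively expand includes; with guard=True each file is emitted only once."""
    seen = set() if seen is None else seen
    if guard:
        if name in seen:
            return []  # already included: omit the duplicate content
        seen.add(name)
    out = []
    for line in FILES[name]:
        if line.startswith("#include "):
            out.extend(expand(line.split(maxsplit=1)[1], guard, seen))
        else:
            out.append(line)
    return out


unguarded = expand("main.c", guard=False)
guarded = expand("main.c", guard=True)

# Count how often the shared typedef ends up in the expanded source.
print(unguarded.count("typedef int record_id;"))
print(guarded.count("typedef int record_id;"))
```

Without the guard the shared definition reaches the result once per include path; with it, the definition appears a single time.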

The idea of a single, reusable source for information led to concepts such as "don't repeat yourself" and the abstraction principle. A further use was found in making programs more portable: portable source code uses an include directive to specify a standard library, which contains system-specific source code that varies with each computer environment.^[5]

History and implementation by Project Xanadu

Ted Nelson, who originated the words hypertext and hypermedia, also coined the term transclusion in his 1980 book Literary Machines. Part of his proposal was the idea that micropayments could be automatically exacted from the reader for all the text, no matter how many snippets of content are taken from various places.

However, according to Nelson, the concept of transclusion had already formed part of his 1965 description of hypertext.^[6] Nelson defines transclusion as "...the same content knowably in more than one place," setting it apart from more special cases, such as the inclusion of content from a different location (which he calls transdelivery) or an explicit quotation that remains connected to its origins (which he calls transquotation).

Some hypertext systems, including Ted Nelson's own Project Xanadu, support transclusion.^[7]

Nelson has delivered a demonstration of Web transclusion, the Little Transquoter (programmed to Nelson's specification by Andrew Pam in 2004–2005).^[8] It creates a new format built on portion addresses from Web pages; when dereferenced, each portion on the resulting page remains click-connected to its original context.

Implementation on the Web

HTTP, as a transmission protocol, has rudimentary support for transclusion via byte serving: specifying a byte range in an HTTP request message.
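As a rough illustration of byte serving, the sketch below models how a server might answer a request that carries a Range header of the form "bytes=first-last". The serve_range function and the sample document are hypothetical; only a single range is handled, and a malformed range is rejected or ignored in a way that is one permissible choice, not the only one.

```python
import re


def serve_range(body: bytes, range_header: str) -> tuple[int, bytes]:
    """Return (status, payload) for a single 'bytes=' range over body."""
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or match.group(1) == match.group(2) == "":
        return 200, body  # unparseable range: fall back to the whole resource
    first, last = match.groups()
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the resource.
        n = int(last)
        if n == 0:
            return 416, b""
        start, end = max(len(body) - n, 0), len(body) - 1
    else:
        start = int(first)
        end = min(int(last), len(body) - 1) if last else len(body) - 1
    if start >= len(body) or start > end:
        return 416, b""  # Range Not Satisfiable
    return 206, body[start : end + 1]  # Partial Content; the last position is inclusive


document = b"Hello, transclusion!"
print(serve_range(document, "bytes=7-18"))
print(serve_range(document, "bytes=-1"))
```

A referencing document could use such a request to pull only the needed portion of a resource instead of the whole file.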

Transclusion can occur either before (server-side) or after (client-side) transmission. For example:

An HTML document may be pre-composed by the server before delivery to the client using Server-Side Includes or another server-side application.
XML Entities or HTML Objects may be parsed by the client, which then requests the corresponding resources separately from the main document.
A web browser may cache elements using its own algorithms, which can operate without explicit directives in the document's markup.
AngularJS employs transclusion for nested directive operation.^[9]

Publishers of web content may object to the transclusion of material from their own web sites into other web sites, or they may require an agreement to do so. Critics of the practice may refer to various forms of inline linking as bandwidth theft or leeching.

Other publishers may seek specifically to have their materials transcluded into other web sites, as in the form of web advertising, or as widgets like a hit counter or web bug.

Mashups make use of transclusion to assemble resources or data into a new application, as by placing geo-tagged photos on an interactive map, or by displaying business metrics in an interactive dashboard.

Client-side HTML

HTML defines elements for client-side transclusion of images, scripts, stylesheets, other documents, and other types of media. HTML has relied heavily on client-side transclusion from the earliest days of the Web (so web pages could be displayed more quickly before multimedia elements finished loading), rather than embedding the raw data for such objects inline into a web page's markup.

Through techniques such as Ajax, scripts associated with an HTML document can instruct a web browser to modify the document in-place, as opposed to the earlier technique of having to pull an entirely new version of the page from the web server. Such scripts may transclude elements or documents from a server after the web browser has rendered the page, in response to user input or changing conditions, for example.

Future versions of HTML may support deeper transclusion of portions of documents using XML technologies such as entities, XPointer document referencing, and XSLT manipulations.

Proxy servers may employ transclusion to reduce redundant transmissions of commonly requested resources.

AngularJS, a front-end framework developed and maintained by Google, has a directive called ng-transclude that marks the insertion point for the transcluded DOM of the nearest parent directive that uses transclusion.

Server-side transclusion

Transclusion can be accomplished on the server side, for example through Server-Side Includes and markup entity references resolved by the server software. It is a feature of substitution templates.
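A minimal sketch of server-side composition, assuming pages that carry include directives in the comment-style form used by Server-Side Includes. The PAGES table, the compose function, and the depth limit are hypothetical stand-ins for files and server software.

```python
import re

# Hypothetical document store standing in for files on a web server.
PAGES = {
    "/footer.html": "<footer>Contact: webmaster@example.org</footer>",
    "/index.html": '<h1>Home</h1>\n<!--#include virtual="/footer.html" -->',
    "/about.html": '<h1>About</h1>\n<!--#include virtual="/footer.html" -->',
}

INCLUDE = re.compile(r'<!--#include virtual="([^"]+)" -->')


def compose(path: str, depth: int = 0) -> str:
    """Resolve include directives before the page is sent to the client."""
    if depth > 10:
        raise RecursionError("include nesting too deep (possible cycle)")
    return INCLUDE.sub(lambda m: compose(m.group(1), depth + 1), PAGES[path])


print(compose("/index.html"))

# Editing the single shared footer changes every page that includes it.
PAGES["/footer.html"] = "<footer>Contact: help@example.org</footer>"
print(compose("/about.html"))
```

The client receives one already-assembled document and never sees the directive, which is the main difference from client-side transclusion.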

Transclusion of source code

Transclusion of source code into software design or reference materials lets source code be presented within the document, but not interpreted as part of the document, preserving the semantic consistency of the inserted code in relation to its source codebase.

Transclusion in content management

In content management for single-source publishing, content management systems increasingly provide for transclusion and substitution. Component content management systems, especially, aim to apply the modular design principle as fully as possible. MediaWiki provides transclusion and substitution and is a good off-the-shelf option for many smaller organizations (such as smaller nonprofits and SMEs) that may not have the budget for commercial options; for details, see Component content management system.

Implementation in software development

A common feature in programming languages is the ability of one source code file to transclude, in whole or part, another source code file. The part transcluded is interpreted as if it were part of the transcluding file. Some of the methods are:

Include: Some programs will explicitly INCLUDE another file. The included file can consist of executable code, declarations, compiler instructions, and/or branching to later parts of the document, depending on compile-time variables.
Macro: Assembly languages, and some high-level programming languages, will typically provide for macros, special named instructions used to make definitions, generate executable code, provide looping and other decisions, and modify the document produced according to parameters supplied to the macro when the file is rendered.
Copy: The COBOL programming language has the COPY statement, in which a copied file is inserted into the copying document, replacing the COPY statement. Code and declarations in the copied file can be modified by a REPLACING argument as part of the COPY statement.
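The copy-with-replacement idea from the last item can be modelled in a few lines. This is an illustrative sketch, not COBOL: the COPYBOOKS table, the copy function, and the ":PREFIX:" marker are hypothetical, and plain substring replacement is looser than the text-matching rules a real COBOL compiler applies.

```python
# Hypothetical copybook library; the ":PREFIX:" marker is an illustrative placeholder.
COPYBOOKS = {
    "ADDRESS": [
        "05 :PREFIX:-STREET PIC X(30).",
        "05 :PREFIX:-CITY   PIC X(20).",
    ],
}


def copy(name, replacing=None):
    """Return the copybook's lines with each 'old' text replaced by 'new'."""
    lines = []
    for line in COPYBOOKS[name]:
        for old, new in (replacing or {}).items():
            line = line.replace(old, new)
        lines.append(line)
    return lines


# The same declarations are reused twice, adapted to each record by replacement.
program = ["01 CUSTOMER-RECORD."] + copy("ADDRESS", {":PREFIX:": "CUST"})
program += ["01 SUPPLIER-RECORD."] + copy("ADDRESS", {":PREFIX:": "SUPP"})

print("\n".join(program))
```

The copied text is stored once, yet each copying program receives a variant adapted to its own naming, which combines inclusion with parameterization.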

References

[1] Glushko, Robert J., ed. (2013). The Discipline of Organizing. Cambridge, Massachusetts: MIT Press. p. 231. ISBN 9780262518505.
[2] Initial Specifications for a COMMON BUSINESS ORIENTED LANGUAGE (COBOL) for Programming Electronic Digital Computers (PDF). Washington: Department of Defense. April 1960. pp. V-27. INCLUDE: Function: To save the programmer effort by automatically incorporating library subroutines into the source program.
[3] Ritchie, Dennis M. (1993-03-01). "The development of the C language". ACM SIGPLAN Notices. 28 (3): 201–208. doi:10.1145/155360.155580. Archived from the original on 27 February 2020. Many other changes occurred around 1972-3, but the most important was the introduction of the preprocessor, partly at the urging of Alan Snyder [Snyder 74], but also in recognition of the utility of the the[sic] file-inclusion mechanisms available in BCPL and PL/I. Its original version was exceedingly simple, and provided only included files and simple string replacements: #include and #define of parameterless macros. Soon thereafter, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation. The preprocessor was originally considered an optional adjunct to the language itself.
[4] Stallman, Richard M.; Weinberg, Zachary. "Header Files" (PDF). The C Preprocessor: For gcc version 6.3.0 (GCC). pp. 10–11. Alternatives to Wrapper #ifndef : CPP supports two more ways of indicating that a header file should be read only once. Neither one is as portable as a wrapper '#ifndef' and we recommend you do not use them in new programs, with the caveat that '#import' is standard practice in Objective-C. [...] Another way to prevent a header file from being included more than once is with the '#pragma once' directive. If '#pragma once' is seen when scanning a header file, that file will never be read again, no matter what.
[5] Johnson, S. C.; Ritchie, D. M. (July–August 1978). "UNIX time-sharing system: Portability of C programs and the UNIX system". The Bell System Technical Journal. 57 (6): 2021–2048. doi:10.1002/j.1538-7305.1978.tb02141.x. ISSN 0005-8580. S2CID 17510065. Retrieved 27 February 2020. Even before the advent of the Interdata machine, it was realized, as mentioned above, that many programs depended to an undesirable degree not only on UNIX I/O conventions but on details of particularly favorable buffering strategies for the PDP-11. A package of routines, called the "portable I/O library," was written by M. E. Lesk and implemented on the Honeywell and IBM machines as well as the PDP-11 in a generally successful effort to overcome the deficiencies of earlier packages.
[6] Nelson, Theodor H. (1965). "A File Structure for the Complex, the Changing and the Indeterminate". Proceedings of the ACM 20th National Conference. pp. 84–100.
[7] Kolbitsch, Josef; Maurer, Hermann (January 27, 2017). "Transclusions in an HTML-Based Environment" (PDF). Archived from the original (PDF) on July 1, 2017. Retrieved January 27, 2017.
[8] The Little Transquoter. Xanadu.com.au.
[9] "AngularJS". docs.angularjs.org. Retrieved 2016-08-11.

External links

Ted Nelson: Transclusion: Fixing Electronic Literature—on Google Tech Talks, 29 January 2007.
