'''XML''', or e'''X'''tensible '''M'''arkup '''L'''anguage, is a [data format].
** See Also **
[Natively accessing XML]:
[Interfacing with XML]: a developers' discussion about merging the various XML packages
[Pull down menus in XML]:
[Simple XML report writer]: parse a report in XML, display it in a [Tk] [canvas], and produce [postscript] from it
[Regular Expressions Are Not A Good Idea for Parsing XML, HTML, or e-mail Addresses]:
[XML pretty-printing]: tDOM has a built-in pretty-printing serialization option. Those interested in a comparable function for TclDOM are welcome to try, use, or improve [http://phaseit.net/claird/comp.lang.tcl/dom_pretty_print.html%|%dom_pretty_print].
[XML Tree Walking]:
[XML tutorials]:
[XML-list]: a survey of [list]-based representations of XML documents
[XML/tDOM encoding issues with the http package]:
[XML_Wrapper]: create [Tk] forms on the fly from an XML file
[XSD schema validate an XML document]:
** Resources **
[http://www.xml.org/%|%xml.org]:
** Reading **
[http://web.archive.org/web/20110606145728/http://www.ibm.com/developerworks/webservices/library/ws-xtcl/index.html%|%Programming XML and Web services in TCL, Part 1 : An initial primer], [Cameron Laird], 2001-04-01: surveys the state of the art as of spring 2001, mainly from a [Zveno]-biased perspective. One deficiency of that article is its neglect of [Jochen Loewer]'s [tDOM] work.
[http://web.archive.org/web/20110727035158/http://ldn.linuxfoundation.org/column/untaught-xml-schema%|%Untaught XML Schema], [Cameron Laird], 2009-06-05:
[http://web.archive.org/web/20110616090536/http://www.zdnet.com/news/microsoft-patents-xml-word-processing-documents/329645%|%Microsoft patents XML word processing documents], Rupert Goodwins, 2009-08-07: Amusingly enough, in August 2009 Microsoft was also ordered to stop selling its Word product '''because''' of its XML functionality: [http://www.computerworld.com/s/article/9136539/Injunction_on_Microsoft_Word_unlikely_to_halt_sales?source=CTWNLE_nlt_dailyam_2009-08-12%|%Injunction on Microsoft Word unlikely to halt sales], Nancy Gohring, 2009-08-11.
** Examples **
[Simple XML report writer]:
[XML Graph to canvas]:
** Description **
XML is a simplified form of [SGML], but stricter (more regular) in some
aspects:
* Singleton elements must end with `/>`
* Attribute values must be quoted
Example (an illustrative fragment):
======none
<catalog><item id="42" name="widget"/></catalog>
======
Tcl's excellent [Unicode] abilities make it a good language for processing XML.
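As a small illustration of that point, here is a minimal sketch (the proc name `xmlBytes` and the sample text are made up for this example). It builds an XML string containing non-ASCII characters and converts it to UTF-8 bytes, assuming the text needs no escaping.

```tcl
# Hypothetical helper: wrap text in a <note> element and return UTF-8 bytes.
# Assumes $text contains no characters that need escaping (<, >, &).
proc xmlBytes {text} {
    set xml "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<note>$text</note>"
    return [encoding convertto utf-8 $xml]
}

# Sample text with Latin-1 and CJK characters
set text "caf\u00e9 \u4e2d\u6587"
set bytes [xmlBytes $text]

# Tcl strings are sequences of Unicode characters; the byte count only
# appears once the string is converted to a concrete encoding.
puts "characters: [string length $text]"
puts "utf-8 bytes: [string length [encoding convertto utf-8 $text]]"

# Decoding the bytes gives back the original characters.
set decoded [encoding convertfrom utf-8 $bytes]
puts "round trip ok: [expr {[string first $text $decoded] >= 0}]"
```

The conversion is only needed at the I/O boundary; inside the script the text can be handled as ordinary Tcl strings.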
** [Parsing] **
[tDOM] and [TclXML]/[TclDOM] are the two main Tcl extensions for parsing XML.
Both provide [SAX]-style parsing for stream-oriented processing and a [DOM] for
document-oriented processing.
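To make the difference between the two styles concrete, here is a minimal sketch using tDOM (it assumes the `tdom` package is installed; the sample document and the names `onStart`, `names` and `seen` are made up for illustration).

```tcl
package require tdom

# Made-up sample document
set xml {<inventory><item name="bolt" qty="40"/><item name="nut" qty="75"/></inventory>}

# DOM style: build the whole tree in memory, then query it.
set doc  [dom parse $xml]
set root [$doc documentElement]
set names {}
foreach node [$root selectNodes {item}] {
    lappend names [$node getAttribute name]
}
$doc delete   ;# free the tree explicitly when done
puts "DOM saw: $names"

# SAX style: the expat parser calls back for each start tag as it streams.
set seen {}
proc onStart {tag attrs} {
    # attrs arrives as a flat name/value list
    if {$tag eq "item"} {
        lappend ::seen [dict get $attrs name]
    }
}
set parser [expat -elementstartcommand onStart]
$parser parse $xml
$parser free
puts "SAX saw: $seen"
```

The DOM variant is usually more convenient for small documents; the SAX variant avoids holding the whole document in memory.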
See Also:
[A little XML parser]:
[Parsing XML]:
[snitDom]:
[tclhttpd XML server]: a simple wrapper around Sleepycat's dbxml library version 2 to implement a remote XML database server
[TAX: A Tiny API for XML]: inspired by [Stephen Uhler's HTML parser in 10 lines]
[tkxmllint]: a [GUI] frontend to xmllint
[XML shallow parsing with regular expressions]:
[xmlp]:
[YAXMLP an XML parser]: re-entrant, and designed to not use [regexp] or [string map]
** Validation **
One way to specify the valid tag structure of a class of documents is a
Document Type Definition ([DTD]), a mechanism inherited from [SGML].
Alternatives include XML Schema and RELAX NG, among others.
See Also:
[A little XML Schema validator]:
** Generating **
See Also:
[Formatting ls information in XML]: an example of a manual approach to generating XML
[Minimalist XML Generation]:
[Howto export Microsoft Outlook contacts to XML using tcom and tDom]:
[Migrating MS Access to other databases using XML]:
----
In a mailing list conversation [[reference?], [Steve Ball] succinctly advised,
"When creating XML, I generally use [TclDOM]. Create a [DOM] tree in memory,
and then use 'dom::DOMImplementation serialize $doc' to generate the XML. The
TclDOM package will make sure that the generated XML is well-formed.
Alternatively, XML is just text so there's no reason why you can't just create
the string directly. Eg:
======
puts "<document>$content</document>"
======
The problem with this is that (a) you have to worry about the
XML syntax nitty-gritty and (b) the content variable may contain
special characters which you have to deal with.
There are also some generation packages available, like the '[html]'
package in [tcllib] (this will be added to TclXML RSN, when my
workload permits)."
[DKF]: If you're going for the cheap-hack method of XML generation mentioned
above, you'll want this:
======
proc asXML {content {tag document}} {
    # Map the five characters with predefined XML entities
    set XML_MAP {
        < &lt;
        > &gt;
        & &amp;
        \" &quot;
        ' &apos;
    }
    return <$tag>[string map $XML_MAP $content]</$tag>
}
======
Naturally, the ''XML_MAP'' variable is factorisable...
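One way to do that factoring, as a sketch only (the names `::XML_MAP`, `xmlEscape` and `asXMLFactored` are made up here), is to keep the map in a namespace-level variable and split escaping from wrapping:

```tcl
# Illustrative names, not part of any package.
# The five predefined XML entities; string map does not rescan its own
# output, so the "&" produced by one replacement is never escaped again.
set ::XML_MAP {
    & &amp;
    < &lt;
    > &gt;
    \" &quot;
    ' &apos;
}

# Escape text for use as element content or as an attribute value.
proc xmlEscape {text} {
    return [string map $::XML_MAP $text]
}

# Wrap escaped content in a single element.
proc asXMLFactored {content {tag document}} {
    return "<$tag>[xmlEscape $content]</$tag>"
}

puts [asXMLFactored {Fish & "chips" < 5}]
# prints: <document>Fish &amp; &quot;chips&quot; &lt; 5</document>
```

This still does nothing about characters that are not legal in XML at all, such as most control characters.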
[MHo]: Why not use '''html::quoteFormValue''' for this purpose?
To generate XML (or HTML) the pure Tcl way, have a look at the [xmlgen / htmlgen%|%xmlgen] module of [TclXML].
[DKF]: That's when you're moving away from cheap hacks. And HTML has a lot more
entities than XML, though most are optional.
If you want to get particular about entity-encoding '''arbitrary text''', this
works for me:
======
variable entityMap [list & &amp\; < &lt\; > &gt\; \" &quot\;\
\u0000
======
** Browsing / Editing **
See Also:
[a little XML browser]:
[XML DOM Tk Text Browser Editor]:
[FanXE]:
** Related Technologies **
A whole host of technologies is related to XML:
[XPath]: for selecting nodes from a document
[XML Query%|%XQuery]:
[XSL]/[XSLT]: for transforming documents
[DTD] and XMLSchema: for specifying document schemas
[tDOM] and [TclXML] both provide good support for at least XPath and XSLT.
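As a rough sketch of what XPath support looks like in practice, the following uses tDOM's `selectNodes` (assuming the `tdom` package is installed; the sample document and variable names are made up).

```tcl
package require tdom

# Made-up sample document
set doc [dom parse {<library><book year="1994"><title>A</title></book><book year="2010"><title>B</title></book></library>}]
set root [$doc documentElement]

# A node-set expression yields a list of node commands.
set titles {}
foreach node [$root selectNodes {/library/book[@year > 2000]/title}] {
    lappend titles [$node text]
}

# A scalar expression such as count() yields a plain value.
set total [$root selectNodes {count(/library/book)}]

puts "recent titles: $titles"
puts "number of books: $total"
$doc delete
```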
** XML Formats **
XML by itself is just a partially-standardized syntax for data. It's used as
the basis for a variety of different applications, such as:
[CML]: Chemical Markup Language
[excel xml]: for [Excel]
(X)[HTML]: for web pages
[RDF] and [OWL]: for general relational/logical data models
[DocBook]: for technical documentation, along with other office document formats (e.g. Microsoft's office XML format, [excel xml], OpenDoc, etc.)
[MathML]:
[OpenMath]:
[SOAP] and [XML-RPC]: for remote procedure calls/web-services.
[VML]: Vector Markup Language, for vector graphics
[XLink]: a common notation for links in XML to other resources
Various configuration file formats (especially in the [Java] world):
** Alternatives **
Alternatives to using XML for data files include:
[Tcl] itself:
[TDL]:
[JSON]:
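To illustrate the first alternative, here is a minimal sketch of data kept as a plain Tcl dict rather than XML (the keys and values are made up; in practice the text would be read from a file):

```tcl
# Hypothetical data that might otherwise be stored as XML.
set data {
    server {host example.org port 8080}
    users  {alice bob}
}

# Nested values are reached with a key path; no parser is needed.
set port  [dict get $data server port]
set users [dict get $data users]
puts "port=$port users=[llength $users]"
```

Reading such data with `dict` or `list` commands, rather than evaluating it with `source`, keeps untrusted files from executing code.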
<> Data Serialization Format | XML