std.xml

Warning: This module is considered out-dated and not up to Phobos' current standards. It will remain until we have a suitable replacement, but be aware that it will not remain long term.

Classes and functions for creating and parsing XML

The basic architecture of this module is that there are standalone functions, classes for constructing an XML document from scratch (Tag, Element and Document), and also classes for parsing a pre-existing XML file (ElementParser and DocumentParser). The parsing classes may be used to build a Document, but that is not their primary purpose. The handling capabilities of DocumentParser and ElementParser are sufficiently customizable that you can make them do pretty much whatever you want.

Example This example creates a DOM (Document Object Model) tree from an XML file.

import std.xml;
import std.stdio;
import std.string;
import std.file;

// books.xml is used in various samples throughout the Microsoft XML Core
// Services (MSXML) SDK.
//
// See http://msdn2.microsoft.com/en-us/library/ms762271(VS.85).aspx

void main()
{
    string s = cast(string) std.file.read("books.xml");

    // Check for well-formedness
    check(s);

    // Make a DOM tree
    auto doc = new Document(s);

    // Plain-print it
    writeln(doc);
}

Example This example does much the same thing, except that the file is deconstructed and reconstructed by hand. This is more work, but the techniques involved offer vastly more power.

import std.xml;
import std.stdio;
import std.string;
import std.file;

struct Book
{
    string id;
    string author;
    string title;
    string genre;
    string price;
    string pubDate;
    string description;
}

void main()
{
    string s = cast(string) std.file.read("books.xml");

    // Check for well-formedness
    check(s);

    // Take it apart
    Book[] books;

    auto xml = new DocumentParser(s);
    xml.onStartTag["book"] = (ElementParser xml)
    {
        Book book;
        book.id = xml.tag.attr["id"];

        xml.onEndTag["author"]       = (in Element e) { book.author      = e.text(); };
        xml.onEndTag["title"]        = (in Element e) { book.title       = e.text(); };
        xml.onEndTag["genre"]        = (in Element e) { book.genre       = e.text(); };
        xml.onEndTag["price"]        = (in Element e) { book.price       = e.text(); };
        xml.onEndTag["publish-date"] = (in Element e) { book.pubDate     = e.text(); };
        xml.onEndTag["description"]  = (in Element e) { book.description = e.text(); };

        xml.parse();

        books ~= book;
    };
    xml.parse();

    // Put it back together again;
    auto doc = new Document(new Tag("catalog"));
    foreach (book;books)
    {
        auto element = new Element("book");
        element.tag.attr["id"] = book.id;

        element ~= new Element("author",      book.author);
        element ~= new Element("title",       book.title);
        element ~= new Element("genre",       book.genre);
        element ~= new Element("price",       book.price);
        element ~= new Element("publish-date",book.pubDate);
        element ~= new Element("description", book.description);

        doc ~= element;
    }

    // Pretty-print it
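    // writeln is used rather than writefln so that any '%' in the data
    // is not interpreted as a format specifier
    writeln(join(doc.pretty(3),"\n"));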
}

License:

Boost License 1.0.

Authors:

Janice Caron

Source std/xml.d

pure nothrow @nogc @safe bool isChar(dchar c);

Returns true if the character is a character according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isSpace(dchar c);

Returns true if the character is whitespace according to the XML standard

Only the following characters are considered whitespace in XML: space, tab, carriage return and linefeed

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isDigit(dchar c);

Returns true if the character is a digit according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isLetter(dchar c);

Returns true if the character is a letter according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isIdeographic(dchar c);

Returns true if the character is an ideographic character according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isBaseChar(dchar c);

Returns true if the character is a base character according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isCombiningChar(dchar c);

Returns true if the character is a combining character according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

pure nothrow @nogc @safe bool isExtender(dchar c);

Returns true if the character is an extender according to the XML standard

Standards:

XML 1.0

Parameters:

dchar c the character to be tested

S encode(S)(S s);

Encodes a string by replacing all characters which need to be escaped with appropriate predefined XML entities.

encode() escapes certain characters (ampersand, quote, apostrophe, less-than and greater-than), and similarly, decode() unescapes them. These functions are provided for convenience only. You do not need to use them when using the std.xml classes, because then all the encoding and decoding will be done for you automatically.

If the string is not modified, the original will be returned.

Standards:

XML 1.0

Parameters:

S s The string to be encoded

Returns:

The encoded string

Example

writefln(encode("a > b")); // writes "a &gt; b"
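
As a further illustration, the following minimal sketch round-trips a string through encode() and decode(). It assumes a toolchain whose Phobos still ships std.xml; the helper name roundTrip is hypothetical and not part of the module.

```d
import std.xml;

// Hypothetical helper: escape a string, then unescape it again.
string roundTrip(string raw)
{
    return decode(encode(raw));
}

void main()
{
    // Characters that need escaping are replaced by predefined entities.
    assert(encode("cat & dog") == "cat &amp; dog");
    assert(encode("a < b") == "a &lt; b");

    // A string with nothing to escape is returned as the original.
    string plain = "hello";
    assert(encode(plain) is plain);

    // Encoding followed by decoding yields the original text.
    assert(roundTrip(`"don't" <stop> & go`) == `"don't" <stop> & go`);
}
```

When building documents with the std.xml classes this step is unnecessary, since they encode and decode automatically.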

enum DecodeMode: int;

Mode to use for decoding.

NONE Do not decode

LOOSE Decode, but ignore errors

STRICT Decode, and throw exception on error

pure @safe string decode(string s, DecodeMode mode = DecodeMode.LOOSE);

Decodes a string by unescaping all predefined XML entities.

This function decodes the entities &amp;, &quot;, &apos;, &lt; and &gt;, as well as decimal and hexadecimal entities such as &#x20AC;

If the string does not contain an ampersand, the original will be returned.

Note that the "mode" parameter can be one of DecodeMode.NONE (do not decode), DecodeMode.LOOSE (decode, but ignore errors), or DecodeMode.STRICT (decode, and throw a DecodeException in the event of an error).

Standards:

XML 1.0

Parameters:

string s The string to be decoded
DecodeMode mode (optional) Mode to use for decoding. (Defaults to LOOSE).

Throws:

DecodeException if mode == DecodeMode.STRICT and decode fails

Returns:

The decoded string

Example

writefln(decode("a &gt; b")); // writes "a > b"
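
The three decoding modes can be compared side by side. The sketch below is illustrative only and assumes a toolchain whose Phobos still ships std.xml; isStrictlyDecodable is a hypothetical helper, not part of the module.

```d
import std.xml;

// Hypothetical helper: true if s can be decoded in STRICT mode.
bool isStrictlyDecodable(string s)
{
    try
    {
        cast(void) decode(s, DecodeMode.STRICT);
        return true;
    }
    catch (DecodeException e)
    {
        return false;
    }
}

void main()
{
    // A bare ampersand is not a valid entity reference.
    string malformed = "cat & dog";

    // NONE performs no decoding at all.
    assert(decode("a &gt; b", DecodeMode.NONE) == "a &gt; b");

    // LOOSE decodes what it can and leaves malformed input alone.
    assert(decode(malformed, DecodeMode.LOOSE) == malformed);

    // STRICT throws a DecodeException on malformed input.
    assert(!isStrictlyDecodable(malformed));
    assert(isStrictlyDecodable("cat &amp; dog"));

    // Decimal and hexadecimal character references are decoded too.
    assert(decode("&#42;", DecodeMode.STRICT) == "*");
    assert(decode("&#x2A;", DecodeMode.STRICT) == "*");
}
```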

class Document: std.xml.Element;

Class representing an XML document.

Standards:

XML 1.0

string prolog;

Contains all text which occurs before the root element. Defaults to <?xml version="1.0"?>

string epilog;

Contains all text which occurs after the root element. Defaults to the empty string

this(string s);

Constructs a Document by parsing XML text.

This function creates a complete DOM (Document Object Model) tree.

The input to this function MUST be valid XML. This is enforced by DocumentParser's in contract.

Parameters:

string s the complete XML text.

this(const(Tag) tag);

Constructs a Document from a Tag.

Parameters:

const(Tag) tag the start tag of the document.

const bool opEquals(scope const Object o);

Compares two Documents for equality

Example

Document d1,d2;
if (d1 == d2) { }

const scope int opCmp(scope const Object o);

Compares two Documents

You should rarely need to call this function. It exists so that Documents can be used as associative array keys.

Example

Document d1,d2;
if (d1 < d2) { }

const scope @trusted size_t toHash();

Returns the hash of a Document

You should rarely need to call this function. It exists so that Documents can be used as associative array keys.

const scope @safe string toString();

Returns the string representation of a Document. (That is, the complete XML of a document).

class Element: std.xml.Item;

Class representing an XML element.

Standards:

XML 1.0

Tag tag;

The start tag of the element

Item[] items;

The element's items

Text[] texts;

The element's text items

CData[] cdatas;

The element's CData items

Comment[] comments;

The element's comments

ProcessingInstruction[] pis;

The element's processing instructions

Element[] elements;

The element's child elements

pure @safe this(string name, string interior = null);

Constructs an Element given a name and a string to be used as a Text interior.

Parameters:

string name the name of the element.
string interior (optional) the string interior.

Example

auto element = new Element("title","Serenity");
    // constructs the element <title>Serenity</title>

pure @safe this(const(Tag) tag_);

Constructs an Element from a Tag.

Parameters:

const(Tag) tag_ the start or empty tag of the element.

pure @safe void opOpAssign(string op)(Text item) if (op == "~");

Append a text item to the interior of this element

Parameters:

Text item the item you wish to append.

Example

Element element;
element ~= new Text("hello");

pure @safe void opOpAssign(string op)(CData item) if (op == "~");

Append a CData item to the interior of this element

Parameters:

CData item the item you wish to append.

Example

Element element;
element ~= new CData("hello");

pure @safe void opOpAssign(string op)(Comment item) if (op == "~");

Append a comment to the interior of this element

Parameters:

Comment item the item you wish to append.

Example

Element element;
element ~= new Comment("hello");

pure @safe void opOpAssign(string op)(ProcessingInstruction item) if (op == "~");

Append a processing instruction to the interior of this element

Parameters:

ProcessingInstruction item the item you wish to append.

Example

Element element;
element ~= new ProcessingInstruction("hello");

pure @safe void opOpAssign(string op)(Element item) if (op == "~");

Append a complete element to the interior of this element

Parameters:

Element item the item you wish to append.

Example

Element element;
Element other = new Element("br");
element ~= other;
   // appends element representing <br />

const bool opEquals(scope const Object o);

Compares two Elements for equality

Example

Element e1,e2;
if (e1 == e2) { }

const @safe int opCmp(scope const Object o);

Compares two Elements

You should rarely need to call this function. It exists so that Elements can be used as associative array keys.

Example

Element e1,e2;
if (e1 < e2) { }

const scope @safe size_t toHash();

Returns the hash of an Element

You should rarely need to call this function. It exists so that Elements can be used as associative array keys.

const string text(DecodeMode mode = DecodeMode.LOOSE);

Returns the decoded interior of an element.

The element is assumed to contain text only. So, for example, given XML such as "<title>Good &amp; Bad</title>", this function will return "Good & Bad".

Parameters:

DecodeMode mode (optional) Mode to use for decoding. (Defaults to LOOSE).

Throws:

DecodeException if decode fails

const scope string[] pretty(uint indent = 2);

Returns an indented string representation of this item

Parameters:

uint indent (optional) number of spaces by which to indent this element. Defaults to 2.
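
To make the indentation behaviour concrete, here is a small sketch that builds a nested element and pretty-prints it. It assumes a toolchain whose Phobos still ships std.xml; prettyCatalog and the sample element names are hypothetical.

```d
import std.xml;
import std.array : join;
import std.stdio : writeln;

// Hypothetical helper: builds <catalog><book id="b1"><title>...</title></book></catalog>
// and returns one string per output line.
string[] prettyCatalog(uint indent)
{
    auto catalog = new Element("catalog");

    auto book = new Element("book");
    book.tag.attr["id"] = "b1";
    book ~= new Element("title", "Serenity");

    catalog ~= book;
    return catalog.pretty(indent);
}

void main()
{
    // Each nesting level is shifted right by `indent` spaces; an element
    // that contains only text stays on a single line.
    writeln(prettyCatalog(3).join("\n"));
}
```

pretty() returns an array of lines rather than a single string, so the caller chooses how to join them.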

const scope @safe string toString();

Returns the string representation of an Element

Example

auto element = new Element("br");
writefln(element.toString()); // writes "<br />"

enum TagType: int;

Tag types.

START Used for start tags

END Used for end tags

EMPTY Used for empty tags

class Tag;

Class representing an XML tag.

Standards:

XML 1.0

The class invariant guarantees

that type is a valid enum TagType value
that name consists of valid characters
that each attribute name consists of valid characters

TagType type;

Type of tag

string name;

Tag name

string[string] attr;

Associative array of attributes

pure @safe this(string name, TagType type = TagType.START);

Constructs an instance of Tag with a specified name and type

The constructor does not initialize the attributes. To initialize the attributes, you access the attr member variable.

Parameters:

string name the Tag's name
TagType type (optional) the Tag's type. If omitted, defaults to TagType.START.

Example

auto tag = new Tag("img",TagType.EMPTY);
tag.attr["src"] = "http://example.com/example.jpg";
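
A slightly fuller sketch of the same idea follows: the attributes are filled in through attr after construction, and the result is checked with the type predicates and toString(). It assumes a toolchain whose Phobos still ships std.xml; makeImageTag is a hypothetical helper.

```d
import std.xml;

// Hypothetical helper: an empty <img /> tag with a single src attribute.
Tag makeImageTag(string src)
{
    auto tag = new Tag("img", TagType.EMPTY);
    tag.attr["src"] = src; // attributes are set after construction
    return tag;
}

void main()
{
    auto tag = makeImageTag("logo.png");

    // Exactly one of the three type predicates holds.
    assert(tag.isEmpty && !tag.isStart && !tag.isEnd);

    // Empty tags are rendered in self-closing form.
    assert(tag.toString() == `<img src="logo.png" />`);
}
```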

const bool opEquals(scope Object o);

Compares two Tags for equality

You should rarely need to call this function. It exists so that Tags can be used as associative array keys.

Example

Tag tag1,tag2;
if (tag1 == tag2) { }

const int opCmp(Object o);

Compares two Tags

Example

Tag tag1,tag2;
if (tag1 < tag2) { }

const size_t toHash();

Returns the hash of a Tag

You should rarely need to call this function. It exists so that Tags can be used as associative array keys.

const @safe string toString();

Returns the string representation of a Tag

Example

auto tag = new Tag("book",TagType.START);
writefln(tag.toString()); // writes "<book>"

const pure nothrow @nogc @property @safe bool isStart();

Returns true if the Tag is a start tag

Example

if (tag.isStart) { }

const pure nothrow @nogc @property @safe bool isEnd();

Returns true if the Tag is an end tag

Example

if (tag.isEnd) { }

const pure nothrow @nogc @property @safe bool isEmpty();

Returns true if the Tag is an empty tag

Example

if (tag.isEmpty) { }

class Comment: std.xml.Item;

Class representing a comment

pure @safe this(string content);

Construct a comment

Parameters:

string content the body of the comment

Throws:

CommentException if the comment body is illegal (contains "--" or exactly equals "-")

Example

auto item = new Comment("This is a comment");
   // constructs <!--This is a comment-->

const bool opEquals(scope const Object o);

Compares two comments for equality

Example

Comment item1,item2;
if (item1 == item2) { }

const scope int opCmp(scope const Object o);

Compares two comments

You should rarely need to call this function. It exists so that Comments can be used as associative array keys.

Example

Comment item1,item2;
if (item1 < item2) { }

const nothrow scope size_t toHash();

Returns the hash of a Comment

You should rarely need to call this function. It exists so that Comments can be used as associative array keys.

const pure nothrow scope @safe string toString();

Returns a string representation of this comment

const pure nothrow @nogc @property scope @safe bool isEmptyXML();

Returns false always

class CData: std.xml.Item;

Class representing a Character Data section

pure @safe this(string content);

Construct a character data section

Parameters:

string content the body of the character data segment

Throws:

CDataException if the segment body is illegal (contains "]]>")

Example

auto item = new CData("<b>hello</b>");
   // constructs <![CDATA[<b>hello</b>]]>

const bool opEquals(scope const Object o);

Compares two CDatas for equality

Example

CData item1,item2;
if (item1 == item2) { }

const scope int opCmp(scope const Object o);

Compares two CDatas

You should rarely need to call this function. It exists so that CDatas can be used as associative array keys.

Example

CData item1,item2;
if (item1 < item2) { }

const nothrow scope size_t toHash();

Returns the hash of a CData

You should rarely need to call this function. It exists so that CDatas can be used as associative array keys.

const pure nothrow scope @safe string toString();

Returns a string representation of this CData section

const pure nothrow @nogc @property scope @safe bool isEmptyXML();

Returns false always

class Text: std.xml.Item;

Class representing a text (aka Parsed Character Data) section

pure @safe this(string content);

Construct a text (aka PCData) section

Parameters:

string content the text. This function encodes the text before insertion, so it is safe to insert any text

Example

auto item = new Text("a < b");
   // constructs a &lt; b

const bool opEquals(scope const Object o);

Compares two text sections for equality

Example

Text item1,item2;
if (item1 == item2) { }

const scope int opCmp(scope const Object o);

Compares two text sections

You should rarely need to call this function. It exists so that Texts can be used as associative array keys.

Example

Text item1,item2;
if (item1 < item2) { }

const nothrow scope size_t toHash();

Returns the hash of a text section

You should rarely need to call this function. It exists so that Texts can be used as associative array keys.

const pure nothrow @nogc scope @safe string toString();

Returns a string representation of this Text section

const pure nothrow @nogc @property scope @safe bool isEmptyXML();

Returns true if the content is the empty string

class XMLInstruction: std.xml.Item;

Class representing an XML Instruction section

pure @safe this(string content);

Construct an XML Instruction section

Parameters:

string content the body of the instruction segment

Throws:

XIException if the segment body is illegal (contains ">")

Example

auto item = new XMLInstruction("ATTLIST");
   // constructs <!ATTLIST>

const bool opEquals(scope const Object o);

Compares two XML instructions for equality

Example

XMLInstruction item1,item2;
if (item1 == item2) { }

const scope int opCmp(scope const Object o);

Compares two XML instructions

You should rarely need to call this function. It exists so that XMLInstructions can be used as associative array keys.

Example

XMLInstruction item1,item2;
if (item1 < item2) { }

const nothrow scope size_t toHash();

Returns the hash of an XMLInstruction

You should rarely need to call this function. It exists so that XMLInstructions can be used as associative array keys.

const pure nothrow scope @safe string toString();

Returns a string representation of this XMLInstruction

const pure nothrow @nogc @property scope @safe bool isEmptyXML();

Returns false always

class ProcessingInstruction: std.xml.Item;

Class representing a Processing Instruction section

pure @safe this(string content);

Construct a Processing Instruction section

Parameters:

string content the body of the instruction segment

Throws:

PIException if the segment body is illegal (contains "?>")

Example

auto item = new ProcessingInstruction("php");
   // constructs <?php?>

const bool opEquals(scope const Object o);

Compares two processing instructions for equality

Example

ProcessingInstruction item1,item2;
if (item1 == item2) { }

const scope int opCmp(scope const Object o);

Compares two processing instructions

You should rarely need to call this function. It exists so that ProcessingInstructions can be used as associative array keys.

Example

ProcessingInstruction item1,item2;
if (item1 < item2) { }

const nothrow scope size_t toHash();

Returns the hash of a ProcessingInstruction

You should rarely need to call this function. It exists so that ProcessingInstructions can be used as associative array keys.

const pure nothrow scope @safe string toString();

Returns a string representation of this ProcessingInstruction

const pure nothrow @nogc @property scope @safe bool isEmptyXML();

Returns false always

abstract class Item;

Abstract base class for XML items

abstract const @safe bool opEquals(scope const Object o);

Compares with another Item of same type for equality

abstract const @safe int opCmp(scope const Object o);

Compares with another Item of same type

abstract const scope @safe size_t toHash();

Returns the hash of this item

abstract const scope @safe string toString();

Returns a string representation of this item

const scope @safe string[] pretty(uint indent);

Returns an indented string representation of this item

Parameters:

uint indent number of spaces by which to indent child elements

abstract const pure nothrow @nogc @property scope @safe bool isEmptyXML();

Returns true if the item represents empty XML text

class DocumentParser: std.xml.ElementParser;

Class for parsing an XML Document.

This is a subclass of ElementParser. Most of the useful functions are documented there.

Standards:

XML 1.0

Bugs:

Currently only supports UTF documents.

If there is an encoding attribute in the prolog, it is ignored.

this(string xmlText_);

Constructs a DocumentParser.

The input to this function MUST be valid XML. This is enforced by the function's in contract.

Parameters:

string xmlText_ the entire XML document as text

class ElementParser;

Class for parsing an XML element.

Standards:

XML 1.0

Note that you cannot construct instances of this class directly. You can construct a DocumentParser (which is a subclass of ElementParser), but otherwise, instances of ElementParser will be created for you by the library, and passed your way via onStartTag handlers.

const pure nothrow @nogc @property @safe const(Tag) tag();

The Tag at the start of the element being parsed. You can read this to determine the tag's name and attributes.

ParserHandler[string] onStartTag;

Register a handler which will be called whenever a start tag is encountered which matches the specified name. You can also pass null as the name, in which case the handler will be called for any unmatched start tag.

Example

// Call this function whenever a <podcast> start tag is encountered
onStartTag["podcast"] = (ElementParser xml)
{
    // Your code here
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

// call myEpisodeStartHandler (defined elsewhere) whenever an <episode>
// start tag is encountered
onStartTag["episode"] = &myEpisodeStartHandler;

// call delegate dg for all other start tags
onStartTag[null] = dg;

This library will supply your function with a new instance of ElementParser, which may be used to parse inside the element whose start tag was just found, or to identify the tag attributes of the element, etc.

Note that your function will be called for both start tags and empty tags. That is, we make no distinction between <br></br> and <br/>.

ElementHandler[string] onEndTag;

Register a handler which will be called whenever an end tag is encountered which matches the specified name. You can also pass null as the name, in which case the handler will be called for any unmatched end tag.

Example

// Call this function whenever a </podcast> end tag is encountered
onEndTag["podcast"] = (in Element e)
{
    // Your code here
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

// call myEpisodeEndHandler (defined elsewhere) whenever an </episode>
// end tag is encountered
onEndTag["episode"] = &myEpisodeEndHandler;

// call delegate dg for all other end tags
onEndTag[null] = dg;

Note that your function will be called for both end tags and empty tags. That is, we make no distinction between <br></br> and <br/>.
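
The two handler tables are typically used together: a start-tag handler receives an ElementParser for the element just opened, registers end-tag handlers on it, and calls parse() to consume that element. The following sketch illustrates this on an inline document. It assumes a toolchain whose Phobos still ships std.xml; collectTitles and the sample XML are hypothetical.

```d
import std.xml;

// Hypothetical helper: returns the text of every <title> inside a <book>.
string[] collectTitles(string xmlText)
{
    check(xmlText); // throws CheckException if the text is not well formed

    string[] titles;
    auto doc = new DocumentParser(xmlText);
    doc.onStartTag["book"] = (ElementParser book)
    {
        // Handlers registered here apply only while parsing this <book>.
        book.onEndTag["title"] = (in Element e) { titles ~= e.text(); };
        book.parse();
    };
    doc.parse();
    return titles;
}

void main()
{
    auto titles = collectTitles(
        `<?xml version="1.0"?><catalog>` ~
        `<book id="b1"><title>Serenity</title></book>` ~
        `<book id="b2"><title>Good &amp; Bad</title></book>` ~
        `</catalog>`);

    // Element.text() returns the decoded interior.
    assert(titles == ["Serenity", "Good & Bad"]);
}
```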

pure nothrow @nogc @property @safe void onText(Handler handler);

Register a handler which will be called whenever text is encountered.

Example

// Call this function whenever text is encountered
onText = (string s)
{
    // Your code here

    // The passed parameter s will have been decoded by the time you see
    // it, and so may contain any character.
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

pure nothrow @nogc @safe void onTextRaw(Handler handler);

Register an alternative handler which will be called whenever text is encountered. This differs from onText in that onText will decode the text, whereas onTextRaw will not. This allows you to make design choices, since onText will be more accurate, but slower, while onTextRaw will be faster, but less accurate. Of course, you can still call decode() within your handler, if you want, but you'd probably want to use onTextRaw only in circumstances where you know that decoding is unnecessary.

Example

// Call this function whenever text is encountered
onTextRaw = (string s)
{
    // Your code here

    // The passed parameter s will NOT have been decoded.
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

pure nothrow @nogc @property @safe void onCData(Handler handler);

Register a handler which will be called whenever a character data section is encountered.

Example

// Call this function whenever a CData section is encountered
onCData = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <![CDATA[
    // nor closing ]]>
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

pure nothrow @nogc @property @safe void onComment(Handler handler);

Register a handler which will be called whenever a comment is encountered.

Example

// Call this function whenever a comment is encountered
onComment = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <!-- nor
    // closing -->
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

pure nothrow @nogc @property @safe void onPI(Handler handler);

Register a handler which will be called whenever a processing instruction is encountered.

Example

// Call this function whenever a processing instruction is encountered
onPI = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <? nor
    // closing ?>
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

pure nothrow @nogc @property @safe void onXI(Handler handler);

Register a handler which will be called whenever an XML instruction is encountered.

Example

// Call this function whenever an XML instruction is encountered
// (Note: XML instructions may only occur preceding the root tag of a
// document).
onXI = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <! nor
    // closing >
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

void parse();

Parse an XML element.

Parsing will continue until the end of the current element. Any items encountered for which a handler has been registered will invoke that handler.

Throws:

various kinds of XMLException

const pure nothrow @nogc @safe string toString();

Returns that part of the element which has already been parsed

pure @safe void check(string s);

Check an entire XML document for well-formedness

Parameters:

string s the document to be checked, passed as a string

Throws:

CheckException if the document is not well formed

CheckException's toString() method will yield the complete hierarchy of parse failure (the XML equivalent of a stack trace), giving the line and column number of every failure at every level.
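
A common pattern is to wrap check() so that untrusted input is validated before it reaches Document or DocumentParser, whose constructors require valid XML. The sketch below assumes a toolchain whose Phobos still ships std.xml; isWellFormed is a hypothetical helper.

```d
import std.xml;

// Hypothetical helper: true if xmlText passes the well-formedness check.
bool isWellFormed(string xmlText)
{
    try
    {
        check(xmlText);
        return true;
    }
    catch (CheckException e)
    {
        // e.toString() would describe the hierarchy of parse failures.
        return false;
    }
}

void main()
{
    assert(isWellFormed(`<?xml version="1.0"?><a><b>text</b></a>`));

    // The end tag </a> does not match the open <b> element.
    assert(!isWellFormed(`<?xml version="1.0"?><a><b>text</a>`));
}
```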

class XMLException: object.Exception;

The base class for exceptions thrown by this module

class CommentException: std.xml.XMLException;

Thrown during Comment constructor

class CDataException: std.xml.XMLException;

Thrown during CData constructor

class XIException: std.xml.XMLException;

Thrown during XMLInstruction constructor

class PIException: std.xml.XMLException;

Thrown during ProcessingInstruction constructor

class TextException: std.xml.XMLException;

Thrown during Text constructor

class DecodeException: std.xml.XMLException;

Thrown during decode()

class InvalidTypeException: std.xml.XMLException;

Thrown if comparing with wrong type

class TagException: std.xml.XMLException;

Thrown when parsing for Tags

class CheckException: std.xml.XMLException;

Thrown during check()

CheckException err;

Parent in hierarchy

string msg;

Name of production rule which failed to parse, or specific error message

size_t line;

Line number at which parse failure occurred

size_t column;

Column number at which parse failure occurred
