SimpleXML

Unit: 2 out of 10

What makes a document an XML document is actually its structure. An XML file is, in fact, a plain text file with an XML structure. This means that it doesn’t really matter how we treat this file, as long as we can get the relevant data out of it. If we have a tag:

<mytag>Hello from XML</mytag>

we need a systematic way to extract the text “Hello from XML” from it. This is exactly what most systems for working with XML do, though not in such a minimalist form. One of them is the SimpleXML class.

SimpleXML is a simple but powerful class that can read, create, modify, and save XML documents. It works by creating a SimpleXML object from a source (a string or a file). This object then remains accessible throughout the code; it represents the root element of the document, and the child elements are available as its properties.

For example:

<myxml>
<element>
my first element
</element>
<element>
my first element
</element>
</myxml>

 

If we store this XML document in a SimpleXML object, its structure will look like this:

Image  2.1. The structure of the XML document contained in the SimpleXML object

The SimpleXML object represents the entire document, or more precisely its root element, myxml. Each level in this object is represented as an array:

 

Image 2.2. Introducing the root element and its subelements

Here’s what this example would look like in code:

// Create the XML document and store it in the variable xmlText
$xmlText = <<<XML
<myxml>
<element>
my first element
</element>
<element>
my first element
</element>
</myxml>
XML;
// Create the SimpleXMLElement object, xml, from the XML document stored in xmlText
$xml = new SimpleXMLElement($xmlText);
// Once the object exists, elements are easy to find through the structure of the object itself,
// which mirrors the structure of the XML document
echo $xml->element[0];

This example uses the heredoc string syntax, so you can simply copy the code into your environment and run it. Quite often, though, the XML document will not be written inside the code itself but will arrive from outside as a string.

From this example, we see that the object is initialized with new SimpleXMLElement(). SimpleXMLElement is the central class of SimpleXML. We passed its constructor a string containing an XML document, which is the only required parameter. Besides that, the constructor also accepts several optional parameters:

new SimpleXMLElement($xmlText, LIBXML_DTDVALID);

The first optional parameter of the constructor is options: a constant that activates an additional option when the object is constructed. These are the constants of the libxml library, which handles the XML. The list of all constants is at the following address:

http://www.php.net/manual/en/libxml.constants.php

If we want several options at once, we combine them with the | (bitwise OR) operator.
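As a minimal sketch of combining options, the snippet below passes two libxml constants to the constructor. The sample document and the choice of LIBXML_NOCDATA and LIBXML_NOBLANKS are assumptions made only for illustration; any constants from the list above can be combined in the same way.

```php
<?php
// Hypothetical sample document: a CDATA section plus whitespace between tags.
$xmlText = "<note>\n  <body><![CDATA[Hello <b>XML</b>]]></body>\n</note>";

// Two libxml options combined with the bitwise OR operator:
// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes,
// LIBXML_NOBLANKS drops whitespace-only nodes between elements.
$xml = new SimpleXMLElement($xmlText, LIBXML_NOCDATA | LIBXML_NOBLANKS);

// Read the content of the <body> element as a plain PHP string.
$body = (string) $xml->body;
echo $body;
```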

The next parameter is data_is_url. When it is true, the first argument is no longer treated as the XML text itself but as the address of the document to load. Naturally, in this case, the source must also be replaced with one from the web:

$xml = new SimpleXMLElement("http://myxmldocument.xml", 0, true);

The second parameter had to be supplied as well, but only so that PHP doesn’t “confuse” the parameters; the value 0 means that no options are set. The third parameter (true) means that the XML document will be retrieved from a URL.

The last two parameters (ns and is_prefix) refer to the document's namespace. The ns parameter names the namespace whose elements we want to reach. If is_prefix (the last parameter) is true, ns is interpreted as a namespace prefix; if it is false, ns is interpreted as a namespace URI. Either way, this works only if the namespace is properly declared in the XML document:

 

$xmlText = <<<XML
<myxml xmlns:myNs="http://mynamespace">
<myNs:element>
my element
</myNs:element>
<myNs:element>
my element 123
</myNs:element>
</myxml>
XML;
$xml = new SimpleXMLElement($xmlText, 0, false, "myNs", true);
echo $xml->element[1];
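Namespaced elements can also be reached after the object has been built, through the children() method, which accepts the same pair of arguments (namespace, then the is_prefix flag). The sketch below reuses the document from this lesson, with the element text placed on one line for readability; the variable names are illustrative.

```php
<?php
$xmlText = <<<XML
<myxml xmlns:myNs="http://mynamespace">
<myNs:element>my element</myNs:element>
<myNs:element>my element 123</myNs:element>
</myxml>
XML;

// Build the object without any namespace arguments.
$xml = new SimpleXMLElement($xmlText);

// Second argument true: the first argument is a namespace prefix.
$byPrefix = $xml->children("myNs", true);
$second = (string) $byPrefix->element[1];

// Second argument omitted (false): the first argument is a namespace URI.
$byUri = $xml->children("http://mynamespace");
$first = (string) $byUri->element[0];

echo $first . "<br>" . $second;
```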

Most often we will provide the XML document from a file.

For example:

$xmlText = file_get_contents("xmlFile.xml");
$xml = new SimpleXMLElement($xmlText);

Instead, we can build the object directly, via the function simplexml_load_file():

$xml = simplexml_load_file("xmlFile.xml");

Behind the scenes, this function does nothing more than build the SimpleXMLElement object in the way explained above.

A variant of this function, simplexml_load_string(), works identically, except that it accepts a string instead of a file:

$xml = simplexml_load_string($xmlText);
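One practical difference between the two approaches shows up when the XML is invalid: simplexml_load_string() returns false, while the SimpleXMLElement constructor throws an exception. The following is a small sketch of both cases; the malformed input and the variable names are made up for the demonstration.

```php
<?php
// Collect libxml parse errors internally instead of printing warnings.
libxml_use_internal_errors(true);

// Hypothetical malformed input: the <element> tag is never closed.
$broken = "<myxml><element>unclosed</myxml>";

// The function form signals failure by returning false.
$viaFunction = simplexml_load_string($broken);
if ($viaFunction === false) {
    echo "simplexml_load_string() could not parse the document<br>";
}

// The constructor form signals failure by throwing an Exception.
$caught = false;
try {
    $viaConstructor = new SimpleXMLElement($broken);
} catch (Exception $e) {
    $caught = true;
    echo "The constructor threw an exception<br>";
}

// Discard the collected parse errors.
libxml_clear_errors();
```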

Once we have the document object, we can read it like a standard array:

for($i=0;$i<sizeof($xml->element);$i++)
    echo $xml->element[$i];

or

foreach($xml as $element)
        echo $element;

Once we get an element, its subelements can be read like an array (as in the previous examples), but we can also store them in a separate object and read them that way. One way to do this is through the children() method. This method simply returns all the subelements of the current element:

$xmlText = <<<XML
<issues>
<book>
<title>Wayne's Way</title>
<author>Bruce Wayne</author>
</book>
<book>
<title>Joker's Way</title>
<author>Joker</author>
</book>
</issues>
XML;
$xml = new SimpleXMLElement($xmlText);
//$chi = $xml->children();
//echo $chi[0]->title;
foreach($xml as $issues)
        {
            $chi = $issues->children();
            echo "Issue: " . $chi[0] . "<br>";
            echo "Author: " . $chi[1] . "<br>";
        }

 

Of course, this example is just a demonstration. It can be done in a much clearer way:

foreach($xml as $book)
    {
        echo "Issue: " . $book->title . "<br>";
        echo "Author: " . $book->author . "<br>";
    }

Once we reach an element, it is also often important to read its attributes. They are available through the same object, using associative array syntax.

If, for example, we wanted to add the id attribute to each book, the addition to the code above would look like this:

....
<book id="01">

</book>
....
echo "ID: " . $book["id"] . "<br>";
....

In the same way, we can add more attributes:


<book id="01" isbn="1234567">

echo "ID: " . $book["id"] . "<br>";
echo "ISBN: " . $book["isbn"] . "<br>";
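Put together as a complete script, the attribute fragments might look like the sketch below. It reuses the books document from this lesson with one book; the loop over attributes() is an addition that lists every attribute without knowing the names in advance.

```php
<?php
$xmlText = <<<XML
<issues>
<book id="01" isbn="1234567">
<title>Wayne's Way</title>
<author>Bruce Wayne</author>
</book>
</issues>
XML;

$xml = new SimpleXMLElement($xmlText);
$book = $xml->book[0];

// A single attribute is read with array syntax on the element.
$id = (string) $book["id"];
echo "ID: " . $id . "<br>";

// attributes() lets us walk over all attributes of the element.
$all = [];
foreach ($book->attributes() as $name => $value) {
    $all[$name] = (string) $value;
    echo $name . ": " . $value . "<br>";
}
```

The (string) cast matters when the value is stored or compared, because the array syntax returns a SimpleXMLElement object rather than a plain string.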

Let’s see how this looks in practice. One common use of XML is publishing RSS feeds.

For example:

http://rss.cnn.com/rss/cnn_world.rss

If you open this address in your browser, you will get the following form of the page (the content will be different, because it is current news):

Image 2.3. CNN’s RSS Feed Presentation

 

Browsers have a built-in mechanism for recognizing RSS feeds and usually display them that way.

If we open the source code of this feed, we will see a scary amount of text. But, in fact, it is a very simple XML structure with specific characteristics. Something like this:

<rss>
<channel>
<item>
<title></title>
<guid></guid>
<link></link>
<description></description>
<pubDate></pubDate>
</item>
<item>
<title></title>
<guid></guid>
<link></link>
<description></description>
<pubDate></pubDate>
</item>
</channel>
</rss>

Clearly, the root element of the document is rss, and its subelement is channel, which actually contains the news (the item elements). Knowing all this, we can read and process the feed in a simple way, i.e. we can create an RSS reader:

$xml = new SimpleXMLElement("http://rss.cnn.com/rss/cnn_world.rss", 0, true);
foreach($xml->channel->item as $news)
    {
        echo $news->title . "<br>";
        echo $news->description  . "<br>";
    }

Of course, I have limited the output data to title and description, but you can enter other data as well.
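To try the reader without a network connection, the same loop can be run against a feed stored in a string. The feed content below is invented sample data that follows the RSS structure shown earlier, and it also prints the link element as an example of outputting other data.

```php
<?php
// Hypothetical feed content standing in for a downloaded RSS document.
$rssText = <<<XML
<rss version="2.0">
<channel>
<item>
<title>First headline</title>
<link>http://example.com/1</link>
<description>First summary</description>
</item>
<item>
<title>Second headline</title>
<link>http://example.com/2</link>
<description>Second summary</description>
</item>
</channel>
</rss>
XML;

$xml = new SimpleXMLElement($rssText);

// Collect the titles while printing each news item.
$titles = [];
foreach ($xml->channel->item as $news) {
    $titles[] = (string) $news->title;
    echo $news->title . " (" . $news->link . ")<br>";
    echo $news->description . "<br>";
}
```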

Modifying the XML document

So far, we have managed to read the XML document, but we have not yet changed its content or its structure.

One way to change a document is to go through all its elements, one by one, and copy them into a new document, applying any changes along the way. If we are building a document from scratch, the process is simpler, because there is no reading step, only writing to the new document.

First, we create the SimpleXML object with the root element:

$xml = new SimpleXMLElement("<issues></issues>");

We can also include the XML declaration right away:

$xml = new SimpleXMLElement("<?xml version=\"1.0\" encoding=\"utf-8\"?><issues></issues>");

Now, we can manage this document by adding sub-elements and attributes to it:

// ADDING A BOOK
// create the book element
$book = $xml->addChild("book");
    // add the attributes to the book element (id and isbn)
    $book->addAttribute("id","01");
    $book->addAttribute("isbn","1234567");
    // add the subelements of the book
    $title=$book->addChild("title","Wayne's Way");
    $author=$book->addChild("author","Bruce Wayne");
// output the XML as a string
echo $xml->asXML();

 

We can repeat this block of code (except for the last line, which outputs the XML as a string) once for each record we have.

This example introduces several new methods of the SimpleXMLElement class:

addChild()

Adds a subelement to the element. It can have a single parameter (the name of the subelement), two parameters (the name of the subelement and its content), or three parameters, where the third parameter represents the namespace.

addAttribute()

Adds an attribute to an existing element. It takes the same kinds of parameters as the addChild() method (name, value, and namespace).

asXML()

This method has two variants. Called without parameters, it returns the XML as a string. Called with a file path as its parameter (asXML("my file.xml")), it writes the XML to that file and returns a boolean (success or failure).
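The three methods can be tried together in a short script like the sketch below, which builds a one-book document, saves it with asXML(), and loads it back to check the result. The file name issues_example.xml and its location in the system temporary directory are assumptions chosen for the demonstration.

```php
<?php
// Start from an empty root element.
$xml = new SimpleXMLElement("<issues></issues>");

// Add one book with an attribute and two subelements.
$book = $xml->addChild("book");
$book->addAttribute("id", "02");
$book->addChild("title", "Joker's Way");
$book->addChild("author", "Joker");

// Hypothetical output path in the system temporary directory.
$path = sys_get_temp_dir() . "/issues_example.xml";

// With a path argument, asXML() writes the file and returns true or false.
$saved = $xml->asXML($path);

// Load the file back to confirm that what was written is readable XML.
$reloaded = simplexml_load_file($path);
echo $reloaded->book[0]->title;

// Remove the temporary file again.
unlink($path);
```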

Alternatively, you can create the XML document manually by writing it into a string, line by line:

$xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><issues>";
$xml .= "<book id=\"01\" isbn=\"123456\">";
$xml .= "<title>Wayne's Way</title>";
$xml .= "<author>Bruce Wayne</author>";
$xml .= "</book>";
$xml .= "</issues>";
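A hand-written string is not checked by anything while it is being assembled, so one reasonable precaution is to parse it afterwards and see whether it loads. The sketch below does that; the $titleValue variable and the use of htmlspecialchars() to escape it are illustrative assumptions, not part of the original listing.

```php
<?php
// A value that contains a character with special meaning in XML.
$titleValue = "Wayne & Sons";

// Build the document by hand, escaping the dynamic value first.
$text = "<?xml version=\"1.0\" encoding=\"utf-8\"?><issues>";
$text .= "<book id=\"01\" isbn=\"123456\">";
$text .= "<title>" . htmlspecialchars($titleValue, ENT_XML1 | ENT_QUOTES) . "</title>";
$text .= "<author>Bruce Wayne</author>";
$text .= "</book>";
$text .= "</issues>";

// Parsing the string is a quick well-formedness check:
// simplexml_load_string() returns false if the XML is broken.
$xml = simplexml_load_string($text);
$isWellFormed = ($xml !== false);

// After parsing, the escaped value is available in its original form.
$title = (string) $xml->book[0]->title;
echo $title;
```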

How do we create a new SimpleXML object?

Exercise

Using the exercise created in the previous lesson and SimpleXML, load the complete XML document into a SimpleXML object, delete one message, add two new messages via SimpleXML with all elements and content, then output the XML to the page.

Note:

The solution to the task can be found in the video materials of this lesson.