The R Primer

Read data from a simple XML file

Problem: You want to import a dataset stored as a simple structure in the XML file format.

Solution: XML (eXtensible Markup Language) was designed to transport and store data, and it has seen widespread use for interchanging data over the Internet.

An XML file consists of a series of elements that form a document tree. The tree starts at the root and branches down to its lowest level. An XML document must contain a root node (or element), which is the parent of all other nodes, and every node can have its own sub-nodes (child elements).
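To make the tree structure concrete, here is a minimal sketch that parses a tiny document from a string and inspects its root and children. The element names (trees, tree, Girth, Height) and the values are made up for illustration and are not taken from the data file used later in this tip.

```r
library(XML)

# Hypothetical two-observation document, written without whitespace
# between elements so that only element nodes appear in the tree
txt <- paste0(
  "<trees>",
  "<tree><Girth>8.3</Girth><Height>70</Height></tree>",
  "<tree><Girth>8.6</Girth><Height>65</Height></tree>",
  "</trees>"
)

doc  <- xmlParse(txt, asText = TRUE)  # parse the string into a document tree
root <- xmlRoot(doc)                  # the root node, parent of all other nodes

root_name  <- xmlName(root)           # name of the root element
n_children <- xmlSize(root)           # number of child elements under the root
child_name <- xmlName(root[[1]])      # name of the first child element

# Text content of every Girth node, wherever it sits in the tree
girths <- xpathSApply(doc, "//Girth", xmlValue)
```

Here the root is trees, it has two tree children, and each child holds one sub-node per measured variable.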

The XML package provides numerous tools for parsing and generating XML in R. Since XML is such a flexible format, the XML package primarily consists of functions that must be combined to parse and extract information from a specific type of XML structure.
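As an illustration of combining functions, the sketch below handles a layout in which the values are stored as attributes rather than as sub-nodes, so the data frame has to be assembled by hand. The document, its attribute names and the object names are hypothetical; only xmlParse, xpathSApply and xmlGetAttr come from the XML package.

```r
library(XML)

# Hypothetical document that stores each observation's values as attributes
txt <- paste0(
  "<trees>",
  "<tree girth='8.3' height='70'/>",
  "<tree girth='8.6' height='65'/>",
  "</trees>"
)

doc <- xmlParse(txt, asText = TRUE)

# Pull one attribute at a time from every <tree> node; the values arrive
# as character strings and are converted to numbers explicitly
girth  <- as.numeric(xpathSApply(doc, "//tree", xmlGetAttr, "girth"))
height <- as.numeric(xpathSApply(doc, "//tree", xmlGetAttr, "height"))

# Assemble the extracted vectors into a data frame
trees <- data.frame(girth = girth, height = height)
```

The same pattern (an XPath expression to select nodes, plus a function applied to each selected node) can be adapted to other structures.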

XML files with a simple structure can be imported and converted to a data frame directly using the xmlToDataFrame function. By simple, we mean a collection of nodes that all have the same sub-nodes, such that each node corresponds to an observation (row) in the data frame and each of its sub-nodes contains a primitive value corresponding to a variable.

> library(XML)
> url <- "http://www.statistics.life.ku.dk/primer/mydata.xml" # Data location
> indata <- xmlToDataFrame(url)
> head(indata)
  Girth Height Volume
1   8.3     70   10.3
2   8.6     65   10.3
3   8.8     63   10.2
4  10.5     72   16.4
5  10.7     81   18.8
6  10.8     83   19.7
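One point worth checking after the import: xmlToDataFrame reads the text content of the nodes, so unless column classes are specified the columns are likely to arrive as character strings (or factors) rather than numbers. The sketch below uses a small inline document with made-up values instead of the file above, and converts every column to numeric afterwards. It assumes that all columns are meant to be numeric.

```r
library(XML)

# Hypothetical inline stand-in for a simple XML data file
txt <- paste0(
  "<trees>",
  "<tree><Girth>8.3</Girth><Height>70</Height><Volume>10.3</Volume></tree>",
  "<tree><Girth>8.6</Girth><Height>65</Height><Volume>10.3</Volume></tree>",
  "</trees>"
)

doc    <- xmlParse(txt, asText = TRUE)
indata <- xmlToDataFrame(doc)

# Convert each column to numeric; as.character() comes first in case
# the columns arrived as factors
indata[] <- lapply(indata, function(x) as.numeric(as.character(x)))

str(indata)  # inspect the resulting column types
```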

See rule 1.2 in The R Primer for more information.
