Indexing and query processing of XML documents

The Extensible Markup Language (XML) is becoming the de facto standard for information representation and exchange over the Internet. Owing to its hierarchical (recursive) and self-describing syntax, XML is flexible enough to express a large variety of information. To retrieve useful information from XML documents, queries expressed in a query language such as XPath are used to select the elements that satisfy given criteria. An XPath expression consists of a sequence of location steps, each comprising an axis, a node test, and optionally one or more predicates.
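
As a small illustration of how location steps select elements, the following sketch evaluates two path expressions with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath in abbreviated syntax. The sample document and its element names (library, book, journal, title) are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = """
<library>
  <book year="2001"><title>XML Basics</title></book>
  <book year="1999"><title>Databases</title></book>
  <journal year="2001"><title>Web Data</title></journal>
</library>
"""

root = ET.fromstring(DOC)

# Two location steps, both on the (abbreviated) child axis:
#   step 1: node test "book" with predicate [@year='2001']
#   step 2: node test "title" with no predicate
titles_2001 = [t.text for t in root.findall("./book[@year='2001']/title")]

# ".//title" selects every title element below the root, at any depth.
all_titles = [t.text for t in root.findall(".//title")]

print(titles_2001)
print(all_titles)
```

The first expression returns only the title of the book published in 2001, because the predicate filters the book step before the title step is applied; the second returns all three titles in document order.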