TGV: Tree Pattern Queries

Tree Patterns

Although canonization reduces XQuery expressions to a canonical form, modeling XQuery remains a difficult exercise because the language provides a large set of functionalities.
Moreover, we also chose to integrate mediation requirements (data localization on sources, heterogeneous source capabilities).

We propose the TGV (Tree Graph View) model, implemented in the XLive mediator for XQuery processing. It is based on Tree Patterns [,].
The TGV model extends the Tree Pattern representation to make it intuitive, to fully support XQuery, and to allow it to be manipulated, optimized and then evaluated.

Each element of the TGV model has been formally defined using Abstract Data Types (see formalization in []) and has a graphical representation.

A TreePattern selects nodes based on their structural characteristics.
TreePatterns are made of Nodes, each defined by a label, and NodeLinks, which represent relations between the Nodes.
These relations include XPath axis characteristics (child, descendant, self, etc.), as well as additional information such as a mandatory/optional status.
A Tree Pattern is illustrated in figure 1.

To represent the full richness of XQuery, we introduce specific Tree Patterns to model each characteristic of XQuery:

  • SourceTreePattern (STP) is a Tree Pattern defined by a targeted document or set of documents and a root path.
  • IntermediateTreePattern (ITP) is a Tree Pattern defined on a previous Tree Pattern; it specializes the domain on a specific Node.
  • ReturnTreePattern (RTP) is a Tree Pattern that defines how the resulting XML document is constructed. Each node corresponds to a tag in the XML result set, or to text content.
  • AggregateTreePattern (ATP) is a Tree Pattern that builds a temporary result set. It represents nested queries and aggregate functions on a set of trees.
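
The structure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the XLive implementation: the class and field names (`Axis`, `Node.add`, `document`, `root_path`, `base`, `function`) and the sample document `books.xml` are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Axis(Enum):
    """A few XPath axes a NodeLink may carry (hypothetical subset)."""
    CHILD = "child"
    DESCENDANT = "descendant"
    SELF = "self"


@dataclass
class Node:
    label: str
    links: List["NodeLink"] = field(default_factory=list)

    def add(self, label: str, axis: Axis = Axis.CHILD,
            mandatory: bool = True) -> "Node":
        """Attach a new Node below this one through a NodeLink and return it."""
        target = Node(label)
        self.links.append(NodeLink(axis, target, mandatory))
        return target


@dataclass
class NodeLink:
    axis: Axis
    target: Node
    mandatory: bool = True  # False means the link is optional


@dataclass
class TreePattern:
    root: Node

    def labels(self) -> List[str]:
        """Return the node labels in depth-first, left-to-right order."""
        out, stack = [], [self.root]
        while stack:
            node = stack.pop()
            out.append(node.label)
            stack.extend(link.target for link in reversed(node.links))
        return out


@dataclass
class SourceTreePattern(TreePattern):
    document: str = ""    # targeted document or set of documents
    root_path: str = ""   # root path inside the document


@dataclass
class IntermediateTreePattern(TreePattern):
    base: Optional[Node] = None  # Node of a previous pattern being specialized


@dataclass
class ReturnTreePattern(TreePattern):
    pass  # nodes stand for result tags or text content


@dataclass
class AggregateTreePattern(TreePattern):
    function: Optional[str] = None  # e.g. "count" (assumed representation)


# Hypothetical example: a source pattern over a book catalog.
stp = SourceTreePattern(Node("catalog"), document="books.xml",
                        root_path="/catalog")
book = stp.root.add("book", Axis.DESCENDANT)
book.add("title")
book.add("price", mandatory=False)  # optional link
print(stp.labels())
```

The four specific patterns are modeled here as subclasses of a common `TreePattern`, which is one possible reading of the description; the formal Abstract Data Type definitions may organize them differently.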

Figure 2 gives examples of the different types of TreePatterns.

Figure 1: A Tree Pattern

Figure 2: Example of several TreePatterns and Hyperlinks

Hyperlinks

Other links, named Hyperlinks (see figures 2 and 3), have been defined to represent additional relations between nodes belonging to different tree patterns:

  • Association Hyperlinks, used to restrict some sets of elements.
    • JoinHyperlinks connect two Nodes with a Constraint that represents a join constraint.
    • ConstraintHyperlinks connect two Constraints with a Boolean connector associated with a return clause, in order to preserve the constraint association (with and/or operators) and the treatment declaration level.
  • Directional Hyperlinks, used to transform a set of trees into a new one.
    • ProjectionHyperlink is a Node-to-Node Directional Hyperlink. It carries a mandatory or optional status. It represents a value projection of the given node into the projected one.
    • SpecializedHyperlink is a Node-to-Tree-Pattern Directional Hyperlink. It carries a mandatory or optional status. It represents the specialization of a Node, by specifying a new TreePattern whose root is the given node.
    • GeneralizedHyperlink is a Tree-Pattern-to-Node Directional Hyperlink. It carries a mandatory or optional status. It represents the result set of a TreePattern generalization, which is projected into the given node.
    • SetHyperlink is a Directional Hyperlink from a set of Tree Patterns to a Node, under a Constraint. The possible Constraint values are Union, Intersect or Difference. It represents a set operation between several TreePatterns on a single Node.
    • ConditionalHyperlink is a Directional Hyperlink from a set of Elements to a Node, under a Constraint. An Element can be a Node or an AggregateTreePattern, and the constraint is a Predicate or a Function. It represents a conditional expression whose result is determined by the constraint status.
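
As a rough illustration of the endpoints each Hyperlink connects, the sketch below models a few of them as plain Python records. It is a simplified, assumed representation: `Node` and `TreePattern` are minimal stand-ins, constraints are plain strings, the field names and sample labels are hypothetical, and ConstraintHyperlink and ConditionalHyperlink are left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


@dataclass
class Node:
    label: str  # minimal stand-in for a Tree Pattern node


@dataclass
class TreePattern:
    name: str
    root: Node  # minimal stand-in for a Tree Pattern


class SetOp(Enum):
    """Constraint values allowed on a SetHyperlink."""
    UNION = "union"
    INTERSECT = "intersect"
    DIFFERENCE = "difference"


@dataclass
class JoinHyperlink:          # Association: Node <-> Node under a join constraint
    left: Node
    right: Node
    constraint: str           # e.g. "=" (assumed string encoding)


@dataclass
class ProjectionHyperlink:    # Directional: Node -> Node
    source: Node
    target: Node
    mandatory: bool = True


@dataclass
class SpecializedHyperlink:   # Directional: Node -> Tree Pattern
    source: Node
    target: TreePattern
    mandatory: bool = True


@dataclass
class GeneralizedHyperlink:   # Directional: Tree Pattern -> Node
    source: TreePattern
    target: Node
    mandatory: bool = True


@dataclass
class SetHyperlink:           # Directional: set of Tree Patterns -> Node
    sources: List[TreePattern]
    target: Node
    operation: SetOp


# Hypothetical usage: join two source nodes, then union two patterns.
book_author = Node("author_id")
author_id = Node("id")
join = JoinHyperlink(book_author, author_id, "=")

books = TreePattern("books", Node("book"))
articles = TreePattern("articles", Node("article"))
union = SetHyperlink([books, articles], Node("publication"), SetOp.UNION)
print(join.constraint, union.operation.value)
```

Keeping each Hyperlink as its own record makes the source and target kinds explicit, which mirrors the Node-to-Node, Node-to-Tree-Pattern and Tree-Pattern-to-Node distinctions in the list.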

Figure 3: Constraint and Hyperlink example

A Tree Graph View is a combination of the elements presented above, which together compose an XQuery query.

Tree patterns for the source document, the return document and the intermediate structures are shown in the figure.
On each tree pattern, labels, value constraints and ancestor-descendant links are represented (a single line for a parent-child link, a double line otherwise), and a variable is bound.
Between trees, hyperlinks representing aggregation, join, projection or condition are represented.
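
To make the combination concrete, here is a small sketch of how a simple query might be laid out as a TGV. The query, the decomposition and every name in the code (`PatternNode`, `Pattern`, `Hyperlink`, `TGV`, the `"stp/title"` addressing scheme) are assumptions made for illustration; this is one plausible reading of the model, not the output of XLive.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical XQuery used for the example:
#   for $b in doc("books.xml")/catalog/book
#   where $b/price > 30
#   return <expensive>{ $b/title }</expensive>


@dataclass
class PatternNode:
    label: str
    variable: Optional[str] = None    # bound variable, e.g. "$b"
    constraint: Optional[str] = None  # value constraint, e.g. "> 30"
    children: List["PatternNode"] = field(default_factory=list)


@dataclass
class Pattern:
    kind: str          # "source", "intermediate", "return" or "aggregate"
    root: PatternNode


@dataclass
class Hyperlink:
    kind: str          # "join", "projection", "aggregation", "condition"
    source: str        # "pattern/label" address (assumed convention)
    target: str


def walk(node: PatternNode) -> Iterator[PatternNode]:
    """Yield a node and all its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


@dataclass
class TGV:
    patterns: Dict[str, Pattern] = field(default_factory=dict)
    hyperlinks: List[Hyperlink] = field(default_factory=list)

    def constrained_nodes(self) -> List[Tuple[str, str, str]]:
        """List (pattern name, node label, constraint) for constrained nodes."""
        return [
            (name, node.label, node.constraint)
            for name, pattern in self.patterns.items()
            for node in walk(pattern.root)
            if node.constraint is not None
        ]


# Source pattern: $b bound to book, with a constraint on price.
source = Pattern("source", PatternNode("book", variable="$b", children=[
    PatternNode("title"),
    PatternNode("price", constraint="> 30"),
]))
# Return pattern: the constructed <expensive> element with a projected title.
result = Pattern("return", PatternNode("expensive", children=[
    PatternNode("title"),
]))

tgv = TGV(
    patterns={"stp": source, "rtp": result},
    hyperlinks=[Hyperlink("projection", "stp/title", "rtp/title")],
)
print(tgv.constrained_nodes())
print([(h.kind, h.source, h.target) for h in tgv.hyperlinks])
```

In this reading, the `where` clause becomes a value constraint on a node of the source pattern, and the `return` clause becomes a return pattern fed by a projection hyperlink.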
