TGV: Annotation Model

The more information we have about the data a request applies to, the more efficiently we can define and apply optimization rules on the TGV.

A Generic Annotation Model

We define a generic annotation model that lets us annotate the TGV with (a) any granularity of information and (b) any type of information (cost models and statistics, constraints, accuracy, security, rule traceability).

  • Granularity: in a distributed heterogeneous environment, information such as cost is more or less available depending on the source. Some manageable sources can provide precise information (statistics, algorithms) for each operator and for the data it handles. At the other extreme, very autonomous sources (e.g. a web site) provide no statistics at all, and we can only estimate the global behavior of the system.
  • Type: characterizes the kind of information annotated. It can be a cost (for example time, resource, price or energy cost) or other information such as accuracy or constraints, as presented in work on weighted tree patterns [], fuzzy queries [] or flexible queries [].

Our annotation model is generic and allows any type of information.
We define an annotation layer as follows:

Annotation layer = 1 set of TGV elements + 1 annotation

 
The set of TGV elements indicates the annotation granularity.
A TGV element can be a node name, a father-son link, a predicate on a node, or a hyperlink.
To represent annotations visually on the TGV, we use different colors.
Each annotation frame is associated with a unique color, so every element of an annotation frame has the same color.

Annotated information can take any form: formulae, systems of equations, files, strings, etc. It is given, in the associated color, next to the TGV model.
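
As an illustration only, the definition above (one set of TGV elements plus one annotation, identified by a color) could be sketched as a small data structure. The class names, the string encoding of elements, and the sample annotations are all hypothetical; they are not taken from XLive.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, List

@dataclass(frozen=True)
class TGVElement:
    # Hypothetical encoding: kind is "node", "link", "predicate" or "hyperlink".
    kind: str
    ident: str

@dataclass(frozen=True)
class AnnotationLayer:
    elements: FrozenSet[TGVElement]  # the set of elements fixes the granularity
    annotation: Any                  # formula, file path, string, ...
    color: str                       # one unique color per layer

def layers_for(element: TGVElement, layers: List[AnnotationLayer]) -> List[AnnotationLayer]:
    """Return every layer whose element set contains the given element."""
    return [layer for layer in layers if element in layer.elements]

# Made-up TGV fragment: a "book" node with a "title" child.
book = TGVElement("node", "book")
title = TGVElement("node", "title")
link = TGVElement("link", "book/title")

layers = [
    # Coarse layer: one cost formula for the whole fragment.
    AnnotationLayer(frozenset({book, title, link}), "cost = card(book) * 0.2", "red"),
    # Fine layer: a security note on a single node.
    AnnotationLayer(frozenset({title}), "access: public", "blue"),
]

colors_on_title = [layer.color for layer in layers_for(title, layers)]
colors_on_book = [layer.color for layer in layers_for(book, layers)]
```

In this sketch an element may belong to several layers at once, which matches the idea of several colored layers overlapping on the same TGV.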


Figure 1: Different annotation layers

 
Figure 1 illustrates several annotation layers on a given TGV.

In the XLive system, we have mainly concentrated on cost model annotation.
To estimate the cost of the execution plans generated by the previous transformations, and to let physical rules use physical information, the optimizer is given information such as source statistics, data cardinalities, network speed, etc.
This information is annotated on the TGV.
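
To make the idea concrete, here is a minimal sketch of how an optimizer might read cost annotations at different granularities. It is not the XLive cost model: the transfer-time formula, the dictionary keys, the "finest annotation wins" rule, and all the numbers are assumptions chosen for illustration.

```python
def estimate_transfer_seconds(stats):
    """Toy estimate (an assumption): bytes to ship divided by network speed."""
    total_bytes = stats["cardinality"] * stats["avg_item_bytes"]
    return total_bytes / stats["network_bytes_per_s"]

def pick_stats(element, annotations, default):
    """Pick the finest annotation (smallest element set) covering the element.

    Falls back to a global default, as for a very autonomous source
    that exposes no statistics.
    """
    covering = [(elems, stats) for elems, stats in annotations if element in elems]
    if not covering:
        return default
    return min(covering, key=lambda pair: len(pair[0]))[1]

# Hypothetical annotations: (set of TGV element names, statistics).
annotations = [
    (frozenset({"book", "title"}),
     {"cardinality": 1000, "avg_item_bytes": 200, "network_bytes_per_s": 100_000}),
    (frozenset({"title"}),
     {"cardinality": 500, "avg_item_bytes": 100, "network_bytes_per_s": 100_000}),
]
# Global guess used when a source provides nothing.
global_default = {"cardinality": 10_000, "avg_item_bytes": 100, "network_bytes_per_s": 50_000}

title_cost = estimate_transfer_seconds(pick_stats("title", annotations, global_default))
book_cost = estimate_transfer_seconds(pick_stats("book", annotations, global_default))
author_cost = estimate_transfer_seconds(pick_stats("author", annotations, global_default))
```

The fallback to a global default mirrors the granularity discussion above: precise per-element statistics are used when a source provides them, and a global estimate is used otherwise.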
