Apache OODT's profile servers, product servers, and other components all use the same format for a query. It's encapsulated by the class org.apache.oodt.xmlquery.XMLQuery. In this tutorial, we'll look at this class and see how it represents queries. You'll need this knowledge both to make queries to OODT servers and to understand queries coming into OODT servers.
Capturing various aspects of a query is difficult to do in general, and OODT's implementation is not stellar or complete. But it has proved successful in a variety of applications, so let's see what concepts it encapsulates.
First, forget the fact that the XMLQuery has "XML" in its name. It doesn't mean you can query only XML resources. It's called XMLQuery probably because the person who came up with it thought XML was pretty cool, or that you can represent an OODT query in XML format.
While you can represent an XMLQuery in XML, you usually only use the Java representation, that is, you create and manipulate Java objects of the class org.apache.oodt.xmlquery.XMLQuery.
In theory, the XMLQuery can represent any query for information. It captures generic aspects of a query, such as the domain of the question being posed, the range in which the desired response should be formulated, and constraints on what selects the response. In XMLQuery parlance, we call these the "from element set" (domain), the "select element set" (range), and the "where element set" (constraints).
In practice, none of the current OODT implementations use any but the "where element set." And indeed, for most problems presented to OODT, that is sufficient. However, the framework is there to support more aspects of a query, and you're welcome to use them in your own deployments.
The XMLQuery concept captures metadata about a query as well, such as the title for the query, whether the query itself is secret or classified, how many results to return at most, how to propagate the query through a network, and so forth. In practice, though, none of these additional attributes are used in current deployments of OODT. Moreover, none of the current OODT components obey settings such as maximum number of results or propagation types.
As a result, you should ignore these aspects of the XMLQuery and merely use its default values. We'll see these shortly.
The following diagram shows the XMLQuery and related classes (note the diagram is outdated; "jpl.eda.xmlquery" should read "org.apache.oodt.xmlquery"):
A single XMLQuery object has three separate lists of QueryElement objects, representing the "from", "select", and "where" element sets. In practice, though, the "from" and "select" sets are usually empty, as mentioned. There's also a single QueryHeader object capturing query metadata. Within the XMLQuery itself is additional query metadata. Finally, there's exactly one QueryResult object which captures the results of the query so far.
The XMLQuery class uses lists of QueryElement objects to represent its "from", "select", and "where" element sets. The lists form a postfix boolean stack, with the zeroth element of the list being the top of the stack. Although you can populate these stacks by manipulating their corresponding java.util.Lists, the XMLQuery class provides a boolean expression language that lets you directly populate them.
The XMLQuery class also respects that some queries just cannot be formulated as a boolean expression. In these cases, you can pass in a string that the XMLQuery will otherwise carry unparsed. Note that your profile and product servers will then have the responsibility of handling that string in some appropriate way.
The query language that XMLQuery uses to generate postfix boolean stacks is a series of infix, not postfix, element-and-value expressions linked by boolean operators. Here's an example:
temperature > 36 AND latitude < 45
As you can see, these are triples linked in a logical expression. Each triple has the form (element, relation, literal). For example, the first triple has element = temperature, relation = GT (greater-than), and literal = 36. That triple is linked to the next one with the boolean AND operator.
The full set of relational operators is: = (EQ), != (NE), < (LT), <= (LE), > (GT), >= (GE), LIKE, and NOTLIKE. The logical operators are AND, &, OR, |, NOT, and !. You can use parentheses to group things too.
Here are a few more examples:
specimen = Blood
bac > 0.05 AND priors = 3
surname LIKE 'Simspon%' OR numChildren <= 3 AND RETURN = numEpisodes
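To make the (element, relation, literal) structure concrete, here is a minimal sketch that splits a single triple into its three parts and maps the operator symbol to the codes listed above. The class TripleSketch and its methods are hypothetical and are not part of OODT; the real parsing happens inside XMLQuery. The sketch assumes the three parts are separated by whitespace and does not handle logical operators or parentheses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; this is not the parser inside XMLQuery.
final class TripleSketch {
    // Relational operator symbols mapped to the codes listed above.
    private static final Map<String, String> RELOPS = new HashMap<String, String>();
    static {
        RELOPS.put("=", "EQ");
        RELOPS.put("!=", "NE");
        RELOPS.put("<", "LT");
        RELOPS.put("<=", "LE");
        RELOPS.put(">", "GT");
        RELOPS.put(">=", "GE");
        RELOPS.put("LIKE", "LIKE");
        RELOPS.put("NOTLIKE", "NOTLIKE");
    }

    final String element;
    final String relation;
    final String literal;

    private TripleSketch(String element, String relation, String literal) {
        this.element = element;
        this.relation = relation;
        this.literal = literal;
    }

    // Splits "element operator literal" on whitespace (an assumption of this sketch).
    static TripleSketch parse(String triple) {
        String[] parts = triple.trim().split("\\s+", 3);
        if (parts.length != 3 || !RELOPS.containsKey(parts[1])) {
            throw new IllegalArgumentException("Not a triple: " + triple);
        }
        return new TripleSketch(parts[0], RELOPS.get(parts[1]), parts[2]);
    }

    @Override
    public String toString() {
        return "(" + element + ", " + relation + ", " + literal + ")";
    }

    public static void main(String[] args) {
        System.out.println(parse("temperature > 36"));
        System.out.println(parse("latitude < 45"));
    }
}
```

Running the sketch prints (temperature, GT, 36) and (latitude, LT, 45), matching the way the first example expression was broken down above.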
The "where" element set is actually a java.util.List of org.apache.oodt.xmlquery.QueryElement objects, arranged in a boolean stack with the top of the stack as the zeroth element in the list. QueryElement objects themselves have two attributes, a role and a value.
The role tells what role the QueryElement is playing. It can be elemName for the element part of a triple, RELOP for the relation part of a triple, LITERAL for the literal part of a triple, or LOGOP for a logical operator linking triples together. The value tells what the element is, what the relational operator is, what literal value is being related, or what the logical operator is.
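As a rough illustration of role and value pairs, the sketch below uses a stand-in class (QueryElementSketch, hypothetical and not the real org.apache.oodt.xmlquery.QueryElement) to build the elements for one triple. The element order (elemName, LITERAL, RELOP) is copied from the command-line output shown later in this tutorial for donutsEaten > 5; it is not derived from the OODT source.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.oodt.xmlquery.QueryElement; hypothetical, for illustration only.
final class QueryElementSketch {
    final String role;   // elemName, RELOP, LITERAL, or LOGOP
    final String value;  // the element name, operator code, or literal

    QueryElementSketch(String role, String value) {
        this.role = role;
        this.value = value;
    }

    // Builds the elements for one triple in the order elemName, LITERAL, RELOP.
    static List<QueryElementSketch> forTriple(String element, String relop, String literal) {
        List<QueryElementSketch> stack = new ArrayList<QueryElementSketch>();
        stack.add(new QueryElementSketch("elemName", element));
        stack.add(new QueryElementSketch("LITERAL", literal));
        stack.add(new QueryElementSketch("RELOP", relop));
        return stack;
    }

    @Override
    public String toString() {
        return "QueryElement[role=" + role + ",value=" + value + "]";
    }

    public static void main(String[] args) {
        System.out.println(forTriple("donutsEaten", "GT", "5"));
    }
}
```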
The XMLQuery parses a query expression and generates a corresponding stack of QueryElements. Let's look at a couple of examples. The expression
latitude > 45
generates the "where" stack, listed here from the zeroth element: elemName = latitude, LITERAL = 45, RELOP = GT.
While the expression
artist = Bach AND NOT album = Poem OR track != Aria
generates the "where" stack
A special element is reserved by XMLQuery: RETURN. It's used to indicate what to select, and so any value specified with RETURN goes into the "select" set, not the "where" set.
Moreover, the RETURN element doesn't pay attention to how it's linked with boolean expressions in the rest of the query, or what relational operator is used with the literal value being returned. For example, that means all of the following expressions would generate identical XMLQueries:
specimen = Blood AND RETURN = volume
specimen = Blood OR RETURN = volume
specimen = Blood AND RETURN != volume
specimen = Blood AND RETURN < volume
specimen = Blood AND RETURN LIKE volume
All QueryElements from RETURN triples would go into the "select" instead of the "where" set.
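The routing rule for RETURN can be sketched as follows. This is a simplified illustration, not OODT's implementation: ReturnRoutingSketch is a hypothetical class, the triples are assumed to be already split into {element, relation, literal}, and the logical operators linking them are left out because RETURN ignores them anyway.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of RETURN handling; not OODT's implementation.
final class ReturnRoutingSketch {
    final List<String> where = new ArrayList<String>();
    final List<String> select = new ArrayList<String>();

    // Each triple is {element, relation, literal}.
    static ReturnRoutingSketch route(List<String[]> triples) {
        ReturnRoutingSketch sets = new ReturnRoutingSketch();
        for (String[] t : triples) {
            if ("RETURN".equals(t[0])) {
                // The relational operator is ignored; only the literal is kept.
                sets.select.add(t[2]);
            } else {
                sets.where.add(t[0] + " " + t[1] + " " + t[2]);
            }
        }
        return sets;
    }

    public static void main(String[] args) {
        List<String[]> triples = Arrays.asList(
            new String[] {"specimen", "=", "Blood"},
            new String[] {"RETURN", "!=", "volume"});
        ReturnRoutingSketch sets = route(triples);
        System.out.println("where:  " + sets.where);
        System.out.println("select: " + sets.select);
    }
}
```

Swapping != for =, <, or LIKE in the RETURN triple leaves both sets unchanged, which is the behavior described above.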
To construct a query, you'll use a Java constructor of the following form:
XMLQuery(String keywordQuery, String id, String title, String desc, String ddId, String resultModeId, String propType, String propLevels, int maxResults, java.util.List mimeAccept, boolean parseQuery)
The parameters are summarized below:
Parameter | Purpose | Sample values |
---|---|---|
keywordQuery | A string representing your query expression, in the query language described above, or in some other application-specific language. | numDonuts = 3, select volume_remaining from specimens where specimen_type = 4 |
id | An identifier for your query | query-1, 1.3.6.1.1316.4.1, myQuery, urn:ibm:sys:0x39ad930a |
title | A title for your query | My First Query, Query for Blood Specimens, Simpson's Query |
desc | Description of the query | H.J. Simpson is looking for donut shops |
ddId | Data dictionary ID. This identifies the data dictionary that provides definitions for the elements used in the query like "specimen" or "numDonuts". It's not used by any current OODT deployment or the OODT framework. | null |
resultModeId | Identifies what to return from the query. Defaults to ATTRIBUTE. Not used by any current OODT deployment or the OODT framework. | null |
propType | How to propagate the query, defaults to BROADCAST. It's not used by any current OODT deployment or the OODT framework. | null |
propLevels | How far to propagate the query, defaults to N/A. Not used by any current OODT deployment or the OODT framework. | null |
maxResults | At most how many results to return; not enforced by OODT framework. | 1, 100, Integer.MAX_VALUE, -6 |
mimeAccept | List of acceptable MIME types for returned products, defaults to */* | List types = new ArrayList(); types.add("text/xml"); types.add("text/html"); types.add("text/*"); |
parseQuery | Should the class parse the query as a boolean expression? True says to generate the boolean expression stacks. False says to just save the expression string. | true, false |
All of the values above can be set to null to use a default or non-specific value (except for maxResults and parseQuery, which are int and boolean types and can't be assigned null). For most applications, using null is perfectly acceptable. Since the OODT framework doesn't use maxResults, you can use any value. However, the query handlers of specific profile servers and product servers may pay attention to the value if so programmed.
The last parameter, parseQuery, tells if you want the XMLQuery class to parse your query and generate boolean expression stacks (discussed above) or not. Set to true, the class will parse the string as if in the XMLQuery language described above, and will generate the "from", "select", and "where" element boolean stacks. Set it to false and the class won't parse the string or generate the stacks. It will instead store the string for later use by a profile server's or product server's query handler.
For example, if you pass in the XML query language expression,
donutsEaten > 5 AND RETURN = episodeNumber
then set the parseQuery flag to true. As another example, suppose the query expression is
select episodeNumber from episodes where donutsEaten > 5
This is an SQL expression, probably targeted to a product server that can handle SQL expressions. In this case, set parseQuery to false.
The current OODT deployments for the Planetary Data System and the Early Detection Research Network both use parsed queries.
Internet standards for mail, web, and other applications use MIME types (described in RFC-2046 amongst other documents) to describe the content and media type of data. So does OODT. When you construct an XMLQuery, you can also pass in a list of MIME types that are acceptable to you for the format of any returned products, much in the same way your web browser tells a web server what media types it can display.
The list of acceptable MIME types is only used for product queries since products can come in any shape and flavor. Profile queries ignore the list; profiles are always returned as a list of Java org.apache.oodt.profile.Profile objects.
You've probably seen MIME types before, but here are some examples in case you haven't:
In the XMLQuery constructor, you can pass in a list of MIME types that shows your preference for returned products. Product servers' query handlers examine the query to see if they can provide a matching product, and they examine the list of MIME types to see if they can provide matching products in the format you desire.
As an example, suppose you create a MIME type list as follows:
List acceptableTypes = new ArrayList();
acceptableTypes.add("image/tiff");
acceptableTypes.add("image/png");
acceptableTypes.add("image/jpeg");
and you pass acceptableTypes as the mimeAccept parameter of the XMLQuery constructor. This tells query handlers receiving your query that you'd really prefer a TIFF format image. However, failing that, you'll accept a PNG format image. And, as a last resort, a JPEG will do.
You can also use wildcards in your MIME types. Suppose we did the following:
List acceptableTypes = new ArrayList();
acceptableTypes.add("image/tiff");
acceptableTypes.add("image/png");
acceptableTypes.add("image/*");
Now we tell query handlers in product servers that we really prefer TIFF format images. If a query handler can't do that, then a PNG format will be OK. And if a query handler can't do PNG, then any image format will be fine, even loathsome GIF.
If you pass a null or an empty list in the mimeAccept parameter, the OODT framework will convert it into a single-item list: */*, meaning any format is acceptable.
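How a query handler honors the preference list is up to the handler, but the selection logic described above might look something like the following sketch. MimePreferenceSketch and its methods are hypothetical and not part of OODT; the sketch assumes lowercase type strings and only handles the */* and type/* wildcard forms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of how a query handler might honor the preference list.
final class MimePreferenceSketch {
    // Mirrors the default described above: null or empty becomes a list holding "*/*".
    static List<String> normalize(List<String> accept) {
        if (accept == null || accept.isEmpty()) {
            return Collections.singletonList("*/*");
        }
        return accept;
    }

    // True if a concrete type such as "image/gif" satisfies a pattern such as "image/*".
    static boolean matches(String pattern, String type) {
        if (pattern.equals("*/*") || pattern.equals(type)) {
            return true;
        }
        if (pattern.endsWith("/*")) {
            String prefix = pattern.substring(0, pattern.length() - 1); // e.g. "image/"
            return type.startsWith(prefix);
        }
        return false;
    }

    // Returns the available type that satisfies the earliest acceptable pattern, or null.
    static String choose(List<String> accept, List<String> available) {
        for (String pattern : normalize(accept)) {
            for (String type : available) {
                if (matches(pattern, type)) {
                    return type;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> accept = Arrays.asList("image/tiff", "image/png", "image/*");
        System.out.println(choose(accept, Arrays.asList("image/gif", "image/png")));
        System.out.println(choose(accept, Arrays.asList("image/gif")));
        System.out.println(choose(null, Arrays.asList("text/xml")));
    }
}
```

With the wildcard list from the example, a handler that can produce GIF and PNG would pick image/png, a handler that can only produce GIF would fall through to image/gif, and a null list accepts whatever the handler offers.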
The XMLQuery class is also an executable class. By running it from the command-line, you can see how it generates its XML representation. It also lets you pass in a file containing an XML representation of an XMLQuery and parses it for validity.
Let's try just seeing that XML representation. (In these examples, we'll be using a Unix csh-like command environment. Users of other shells and non-Unix systems will have to adjust.)
First up, we'll need two components:
Download the binary distribution of each of these packages and extract their contents. Then, create a single directory and collect the jar files together in one place.
To generate the query, pass the command-line argument -expr. That tells the XMLQuery that the rest of the command line is the query expression. It will expect it to be in the XMLQuery query language (meaning that it will create an XMLQuery object with parseQuery set to true).
Here's an example:
% java -Djava.ext.dirs=. \
    org.apache.oodt.xmlquery.XMLQuery \
    -expr donutsEaten \> 5 AND RETURN = episodeNumber
kwdQueryString: donutsEaten > 5 AND RETURN = episodeNumber
fromElementSet: []
results: org.apache.oodt.xmlquery.QueryResult[list=[]]
whereElementSet: [org.apache.oodt.xmlquery.QueryElement[role=elemName,value=donutsEaten], org.apache.oodt.xmlquery.QueryElement[role=LITERAL,value=5], org.apache.oodt.xmlquery.QueryElement[role=RELOP,value=GT]]
selectElementSet: [org.apache.oodt.xmlquery.QueryElement[role=elemName,value=episodeNumber]]
======doc string=======
<?xml version="1.0" encoding="UTF-8"?>
<query>
. . .
The program prints out some fields of the XMLQuery such as the "from" element set, the current results (which should always be empty since we haven't passed this query to any product servers), the "where" element set, and the "select" element set. It then prints out the XML representation.
If you examine the XML representation closely, you'll see things like the list of acceptable MIME types:
<queryMimeAccept>*/*</queryMimeAccept>
This says that any type is acceptable. You'll also see the passed in query string:
<queryKWQString>donutsEaten > 5 AND RETURN = episodeNumber</queryKWQString>
Regardless of whether you passed true or false in the parseQuery parameter, the XMLQuery always saves the original query string. For unparsed queries, this is how the string is packaged on its way to a product server. For parsed queries, product servers will use the boolean stacks. (Since this was a parsed query, you'll also see the boolean stacks in XML format if you look closely. They're there.)
Alert readers will have noticed that the results of a query have a place in XMLQuery objects. This actually applies to product queries only. After sending an XMLQuery to a product server, the query object comes back adorned with zero or more matching results. You then call the XMLQuery object's methods to retrieve those results.
The following class diagram demonstrates the relationship (again, the diagram is outdated; "jpl.eda.xmlquery" should read "org.apache.oodt.xmlquery"):
As you can see, a single query has a single org.apache.oodt.xmlquery.QueryResult, which contains a java.util.List of org.apache.oodt.xmlquery.Result objects. Result objects may have zero or more Headers, and Result objects may actually be LargeResult objects.
To retrieve the list of Result objects, call the XMLQuery's getResults method, which returns the java.util.List directly.
Each result also includes
The headers of a result are optional. They're used for tabular style results to indicate column headings. Each Header object captures three strings, a name, a data type, and units.
For example, suppose you retrieved a product that was a table of temperatures at various locations on the Earth. There might be three headers in the headers list:
List Index | Header Name | Data Type | Units |
---|---|---|---|
0 | latitude | float | degrees |
1 | longitude | float | degrees |
2 | temperature | float | kelvins |
Suppose the product you get back is a picture of a tissue specimen. In this case, there would be no headers.
To retrieve the actual data comprising your product, call the Result object's getInputStream method. This returns a standard java.io.InputStream that lets you access the data. How you interpret that data, though, depends on the MIME type of the product, which you can get by calling the Result's getMIMEType method.
For example, if the MIME type was text/plain, then the byte stream would be a sequence of Unicode characters. If it were image/jpeg, then the bytes would be image data in JPEG/JFIF format.
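The sketch below shows one way to branch on the MIME type when consuming a product's byte stream. It does not use the OODT classes: a java.io.ByteArrayInputStream stands in for the stream a Result would hand you, ResultStreamSketch and its methods are hypothetical, and decoding text as UTF-8 is an assumption of this sketch rather than something the framework guarantees.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the streams here stand in for those returned by a Result.
final class ResultStreamSketch {
    // Reads the whole stream into memory; fine for a demo, not for very large products.
    static byte[] readAll(InputStream in) {
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Interprets the bytes according to the MIME type.
    static String summarize(String mimeType, InputStream in) {
        byte[] data = readAll(in);
        if (mimeType.startsWith("text/")) {
            // Assumption: textual products are decoded as UTF-8.
            return new String(data, StandardCharsets.UTF_8);
        }
        // Binary products are only measured here; a real client would decode them.
        return data.length + " bytes of " + mimeType;
    }

    public static void main(String[] args) {
        InputStream text = new ByteArrayInputStream(
            "295.4 kelvins".getBytes(StandardCharsets.UTF_8));
        System.out.println(summarize("text/plain", text));

        InputStream image = new ByteArrayInputStream(
            new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF});
        System.out.println(summarize("image/jpeg", image));
    }
}
```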
In this tutorial, we learned about the structure of the standard query component in OODT, the XMLQuery. We saw the query language that XMLQuery supports and how it generates postfix boolean expression stacks. You can also encode any query expression by using a special constructor argument that tells XMLQuery to not parse the query string. We also executed the XMLQuery class directly. Finally, we saw how product data is embedded in the XMLQuery and how to deal with such results.
As a client of the OODT framework, you can now create XMLQuery objects to query product servers from within your Java applications. As a server in the framework, you know how to deal with incoming query objects.