|
|||
$Id: //open/dt/dev/saffron/doc/overview.html#2 $ | |||
(C) Copyright 2002-2003 Disruptive Tech | |||
Author | Julian Hyde | ||
---|---|---|---|
Created | November 5th, 2002 |
Julian Hyde, November 5th, 2001.
Saffron is a seamless integration of SQL-like relational expressions into Java. As we shall see, the tight integration means that relational expressions are much more powerful than those offered by SQL.
exists
condition:if (exists (select from emp where emp.salary > 50000)) { ... }
An in
condition:
if ("Fred" in (select emp.ename from emp where emp.salary > 50000)) { ... }
A for
loop:
for (emp in (select from emp order by emp.salary)) { System.out.println(emp.ename + " earns " + emp.salary); }
And an array initialized using a relational expression:
Emp[] programmers = (select from emp where emp.title.equals("Programmer"));
select emp.ename.substring(ename.indexOf(".")) from emp
DECIMAL(8,2)
column to a java float
variable) are inefficient, error-prone, and can lose information.
The names of tables and columns are checked at compile time, so your programs
are more maintainable.true
or false
, whereas in SQL a
condition can sometimes evaluates to NULL.manufacturers
), and tables from two
different relational databases (orderDb
and productDb
).Should the query be executed on the orders database or the products database? And how should the manufacturer list be sent over: as a hard-coded in-list, or by restarting the query for each manufacturer in the list? It really doesn't matter: the Saffron optimizer will decide for you.String[] manufacturers = new String[] {"Heinz", "Guinness"}; int[] customers = ( select order.customerId from productDb.products as product join orderDb.orders as order on order.productId == productId where product.manufacturer in manufacturers);
We have already discussed how Saffron makes your life easier if you are embedding SQL statements in Java using JDBC. But Saffron can help even if you are not accessing data stored in a relational database. This is because in-memory data is often relational, and the algorithms to deal with it are often the familiar relational operations: sort, join, filter, union, intersect, minus.
Suppose I have two lists of customers, and I want to create a list of both, with duplicates eliminated. An experienced Java programmer would probably write something like this:
Customer[] addCustomerLists(Customer[] list1, Customer[] list2) { Hashtable hashtable = new Hashtable(); for (int i = 0; i < list1.length; i++) { Customer c = list1[i]; hashtable.put(c, c); } for (int i = 0; i < list2.length; i++) { Customer c = list2[i]; hashtable.put(c, c); } Customer[] list3 = new Customer[hashtable.size()]; hashtable.asList().copyInto(list3); return list3; }
They would test it, notice that the customers are now in arbitrary order, and add
java.util.Arrays.sort(list3);
just before returning the result. In Saffron, a programmer would write
Customer[] list3 = select from (list1 union list2) c order by c;
They don't need to know Hashtable is the 'correct' data structure to use. They tell the system what they want, and the system figures how to compute it efficiently.
This is a rather simple example. It gets better, because Saffron is a closed language. This means that you can use a Saffron expression anywhere
Definition. A set expression is
an expression which returns more than one item; namely, a query, an array, an
iterator or collection, or a Table
.
select
[{
[expression [as
alias]], ...}
| expression] [from
from-expression] [where
expression] [order by
[expression [asc
|desc
]], ...]expressionunion
expressionexpressionintersect
expressionexpressionminus
expression
The syntax for the from list from-expression is as follows:
expression [as alias]from-expression [left
|right
|full
| ]join
from-expression [on
condition]
Notes:
select from emp
has type
Emp
); otherwise, the result is a record, with one field per from item
(for example, each row from select from emp join dept
has type
{Emp emp; Dept dept}
; each row
from select
has type {}
).select three from new int[] {1, 2, 3} as three
is an int
. {
... }
) means create a record for each row. The
optimizer will create a synthetic class to describe the type of the row. Two
rows with the same types and aliases, in the same order, will have the same
synthetic class (and are therefore union-compatible). For example, each
row from select {three} from new int[] {1, 2, 3} as three
is
class Synthetic$124 { int three; }
; the type of select {emp.empno,
emp.ename as name} from emp
is class Synthetic$527 { int empno;
String name; }
.Emp emp; select emp,
dept.deptno, dept.dname.substring(3) from dept
are emp
,
deptno
, and substring
, respectively. Other
kinds of expression have no default alias, and therefore an alias must be
specified. Whether implied or explicit, the aliases in a select or from
list must be distinct.on
condition is omitted from a from-expression
it defaults to true
(hence, a cartesian product). The keywords
left
, right
and full
indicate that an
outer join is to be performed.We add the following boolean expressions to Java:
expressionin
expressionexists
(expression)
The right-hand expressions must be set expressions.
These expressions can, of course, occur inside relational expressions. The
sub-query so created is said to be correlated if it refers to rows
from the outer query. For example, the following query
is correlated on dept.deptno
:
select from
deptsas
deptwhere exists
(select from
empsas
empwhere
emp.deptno == dept.deptno && emp.gender.equals("F"))
We do not provide a not
operator to negate these. Use the
regular Java !
operator, for example, !("apple" in fruit)
or !exists (select from fruit where fruit.equals("apple")
.
for
([type] variablein
expression) { statement list }
Simple example:
select
emp.deptno, sum(emp.sal)as
sumSalgroup by
emp.deptnofrom
empsas
emp
Unlike SQL, the group by
clause comes before the from
clause, and is mandatory if you want to aggregate. The equivalent of the SQL select
sum(emp.sal) from emp
has an empty group by
clause in Saffron: select sum(emp.sal)
group by from emp
.
In Saffron, the where
clause is applied after aggregation
has been performed, more like a SQL having
clause than a
where
clause. Expressions in the where
clause can only
reference aggregations and grouping expressions, not ungrouped expressions. If
you want to filter the rows going into the aggregation, use a subquery in the
from list:
select
femp.deptno, sum(femp.sal)as
sumSalgroup by
femp.deptnofrom
(select
from
empsas
empwhere
emp.gender.equals("F"))as
fempwhere
avg(femp.sal) > 50000
There are 3 kinds of aggregation function:
count
) are treated
specially by the system, and generate optimized code.emp.sal
is of type double
, then
double sum(double[] a)
would match. Likewise, int
count(Iterator iter)
would match select count(emp) from emps as
emp
, because Emp can be up-cast to Object, the element type of Iterator.Now let's see how the Saffron system invokes a user-defined aggregation
while evaluating a query. Suppose we have written class MySum
,
which can total both int
and double
values, and
evaluate the query
select
MySum(i)from new int
[] {2, 3, 4}as
i
The Saffron system makes following calls. Note how the resolve
method deals with operator overloading. Note how int
s are wrapped
as Integer
s then in arrays to be passed to addOne
, and returned from
result
.
AggregatorClass sum = new MySum(); AggregationExtender intSum = sum.resolve(new Class[] {Integer.TYPE}); Object total = intSum.start(null); intSum.addOne(total, new Object[] {new Integer(2)}); intSum.addOne(total, new Object[] {new Integer(4)}); Object total2 = intSum.start(null); intSum.addOne(total2, new Object[] {new Integer(3)}); intSum.merge(total, total2); System.out.println("Result is " + ((Integer) intSum.result(total)).intValue());
Examples of more complex aggregations:
median
aggregation ({@link saffron.ext.Median}) requires a
variable-size workspace. A typical implementation would use an {@link java.util.ArrayList} to collect the values, sort them, then return the middle one.nth
aggregation {@link saffron.ext.Nth}) returns the nth item of a set. n
is passed to the start
method, so must be independent of any
particular row. Example:select
new Nth(3).aggregate(name)from
(select from
empsorder by
salary)
closest
aggregation computes the nearest of a set of
points to a given point. Its start
method receives the latitude
and longitude of the initial point (as new Object[] {new Double(42), new
Double(129)}
), and the addOne
method receives the latitude
and longitude of each subsequent point.select
closest(42, -129, office.latitude, office.longitude)from
officesas
office
Notes & issues:
select
nth(count() / 2, sorted)from
(select from new int
[] {2, 3, 4}as
iorder by
i)as
sorted
{@link saffron.AggregationExtender}.result()
method would have to be non-destructive.)One of Java's most important features is its strong type system. To take advantage of this, every relational expression is always has a result which is an array type.
This doesn't mean that the array is actually created. If you use the
for (... in ...)
construct, for example, the system will implement the loop
using an iterator, and is unlikely to construct the result as an array.
And the optimizer tries hard not to convert intermediate sets, such as that
created by an in
operator, into arrays.
Regular java expressions can also be used as relational expressions:
Collection
, including Vector
)Enumeration
, Iterator
)The problem with collections and iterators (until generics arrive in Java 1.4) is that we don't know what type of objects they contain. We allow the programmer to specify that type information using by a dummy cast to an array type. For example:
Vector fruit = new Vector(); fruit.addElement("apple"); fruit.addElement("banana"); fruit.addElement("orange"); fruit.addElement("mango"); String[] tropicalFruit = (select fruit from (String[]) fruit where fruit.indexOf("n") >= 0);
You can cast a collection or iterator to an array of primitive types, provided that the collection does not contain null values.
Some data sources, such as vectors, iterators and external data sources such as JDBC tables, contain very little type information, and this makes it difficult to write concise queries. For example,
Vector emps = new Vector(); emps.addElement(fred); emps.addElement(barney); for (emp in select from emps as emp where ((Emp) emp).deptno == 20) { print(((Emp) emp).name); }
Better to tell the system just once that emps
is a collection of
Emp
objects:
for (emp in select from (Emp[]) emps as emp where emp.deptno == 20) { print(emp.name); }
The cast does not necessarily cause the vector to be converted to an array: we are simply providing the Saffron compiler with more type information, and allowing it to choose the most efficient plan.
By default, a relational expression returns an array. You can produce results in a different format by using a dummy cast, like this:
Iterator emps; Iterator femaleEmps = (Iterator) (select from (Emp[]) emps as emp where emp.gender.equals("F"));
This is particularly useful where the output is large (even infinite!) and you to iterate over the first few rows without computing the whole result.
The available types are {@link java.util.Iterator}, {@link
java.util.Enumeration}, {@link java.util.Collection}, {@link java.util.Vector},
and type[]
. In future, we may add {@link java.sql.ResultSet}.
The interfaces Table, Schema and Connection allow you to access data sources
which have arbitrary access methods. A table (interface {@link
saffron.Table}
) represents a set of relational data, and it provides
methods to describe the data source, discover its access methods, and so
forth. Schemas (interface {@link saffron.Schema}
) make it easy to
expose a set of tables and the information necessary to compile programs which
use them. If you can to access data in a JDBC database, derive a class from
class {@link saffron.ext.ReflectSchema}
and create a member of
type {@link saffron.ext.JdbcTable} for each table your wish to access
(class {@link sales.Sales}
is an example of this.)
Let's look at what happens when you compile a program which uses a schema.
sales.emps
) which
accesses a member or method of an object (sales
) which implements
interface {@link saffron.Connection}
.getSchema
on that
object. (Why call a static method, not a method in the interface?
Well, remember that this is happening at compile time. The real object
does not exist yet.)getMemberTable
or
getMethodTable
method, to retrieve a Table
. If there is no
such table, these methods may throw openjava.mop.NoSuchMemberException
,
which will be reported as a compilation error.toRel
method, passing the
piece of parse tree which represents the expression which will yields the
schema. (The real schema object only gets created at run time,
remember.)In particular, we provide abstract class {@link saffron.ext.JdbcTable}
implements {@link saffron.Table}
to make it easy to access tables via
JDBC. We recommend that you group tables into a schema class, which extends
class {@link saffron.ext.ReflectSchema}
. For
example,
class SalesConnection extends saffron.ext.JdbcConnection implements saffron.Connection { public SalesConnection(String connect) { super(connect); } // required for saffron.Connection public static saffron.Schema getSchema() { return new SalesSchema(); } }; class SalesSchema extends saffron.ext.ReflectSchema implements saffron.Schema { public static class Emp { public int empno; public String name; public int deptno; }; public static class Dept { public int deptno; public String name; }; public saffron.Table emp = new saffron.ext.JdbcTable( this, "EMP", Emp.class); public saffron.Table depts = new saffron.ext.JdbcTable( this, "DEPT", Dept.class); }; SalesConnection sales = new SalesConnection("jdbc:odbc:MyDatabase"); for (emp in sales.emps) { some code }
The Saffron compiler creates an instance of the schema, and interrogates it
regarding the objects it contains. When it sees the expression sales.emps
,
it invokes SalesConnection.getSchema().getTableForMember("emps")
to
get a {@link saffron.Table} which describes emps
and, in particular,
that it returns rows of type Emp
. The translator then calls
the table's toRel()
method to create a relational expression node;
in this case a {@link saffron.rel.TableAccess}. (The translator actually converts
sales.emp
into a brief intermediate expression (Emp[])
sales.contentsAsArray("emps")
. The contentsAsArray
function
is a placeholder which has special meaning to the translator. Note how an array
cast is used to provide type information.)
As well as type information, the schema could in future provide information about volumetrics, indexes, and even custom rules for optimizing the query.
todo: Expose Map
s (including Hashtable
s) as
relations with 2 columns.
A schema could be used to present non-relational data in a relational format. It would know that 'getDept(40)' was the same as 'select from dept where deptno = 40' and 'dept.getEmployees()' is the same as 'select from emp where emp.deptno == dept.deptno'. We would need a mechanism to construct an extent (the set of all instances of a given entity). The schema could declare an explicit method to find the instances of an entity; another way is to use mandatory relationships (for example, since every employee must work in a department, given all departments, we can find all employees by traversing the 'works in' relationship). It helps if the relationship is many- (or one-) to-one, since we don't need to eliminate duplicates. There could be more than one access path to an entity.
How to represent such mappings? Views: dept.getEmployees() is equivalent to select from emp where emp.deptno == dept.deptno. Better, parameterized views. Example {@link sales.Reflect} presents Java's type system ({@link java.lang.Class}, {@java.lang.reflect.Field}, et cetera) as relations.
todo: Could allow a type (e.g. boolean
,
int
) to serve as a relation. Could implement by iterating over all
possible values of that type (rarely practical) or joining to an attribute of a
more finite relation. Example: select from int as i where i in (select
emp.deptno from emp)
.
Saffron uses Java, not SQL, null semantics. null
is always the Java null value. Every boolean expression evaluates to
either true
or false
. And yes, null ==
null
evaluates to true
.
In Saffron, as in Java, the ==
operator
compares addresses, not content. The expression "abc" == "a"
+ "bc"
may, on some Java systems, evaluate to false, so you should
write "abc".equals("a" + "bc")
instead.
The in
operator expands to expressions which use the
equals()
and hashCode()
methods, so you should only uses it
on classes for which these methods are well-behaved. And the expression
null in new String[] {"apple", "orange"}
will give a NullPointerException
.
The order by
operator requires that objects implement the
interface java.lang.Comparable
. (todo: Compile- or run-time error if
not?)
Primitive types are compared using the primitive operators ==
,
<
etc.
Two instances of a synthetic classes are equal if and only if each of their components are equal. Synthetic classes are compared lexicographically. Example:
select from (select from twos join threes) as i join (select from twos join threes) as j where i < j
Saffron accepts primitive values, such as int
, char
,
double
, in most places that it accepts objects. A couple of
exceptions:
select {twos, threes} from new int[] {2, 4, 6, 8} as twos left join new int[] {3, 6, 9} as threes on twos == threeswould produce
2, 0 4, 0 6, 6 8, 0
select from new int[] {2, 4, 6, 8} as twos minus select from new int[] {3, 6, 9} as threesIt automatically boxes and unboxes primitive values.
The Saffron system runs in static and dynamic mode.
In static mode, relational expressions are embedded into Java programs. A pre-processor checks that the relational expressions are valid, and converts them into regular java. There are many ways to evaluate a relational expression,
A query optimizer chooses an evaluation plan.
If a relational expression has more than one expression in its select
clause, it produces rows which are not atomic. For example,
int[] twos = new int[] {2,4,6,8}; int[] threes = new int[] {3,6,9}; for (i in (select twos, threes twos cross join threes)) { ... }
creates rows which are pairs of int
s. It is translated to
int[] twos = new int[] {2,4,6,8}; int[] threes = new int[] {3,6,9}; for (int x = 0; x < twos.length; x++) { for (int y = 0; y < threes.length; y++) { Synth1 i = new Synth1(twos[x], threes[y]); ... } }
This illustrates two issues with composite rows. First, these rows need
a type. The Saffron compiler solves this issue by generating a synthetic
class (in this case, class Synth1
). We don't know what this
class will be called, so we use for (variable in ...)
syntax, which lets the system infer the type for us.
Second, the system creates a new object for every row, which might be expensive. In this case, they are emitted from the query, so we can't avoid creating them. If the composite rows are not emitted, the Saffron optimizer will do its best to avoid creating them.
The job of the optimizer is to convert a parse tree containing Saffron expressions into a regular parse tree. Its input and output, therefore, are both parse trees.
In between times, it searches for Saffron expressions, and converts them to
tree of intermediate relational expressions (class {@link saffron.rel.Rel}
).
These relational expressions still contain fragments of regular Java expressions for select expressions, method calls, and so forth. The Java expressions are translated into internal form. Except for references to variables outside the Saffron expression, variables tend to disappear, to be replaced by mysterious-looking expressions. For example, in the expression
select from location join (select from emp join dept on emp.deptno == dept.deptno) on location.state.equals(dept.state)
the join condition location.state.equals(dept.state)
becomes $input0.state.equals($input1.$f1.state)
.
Here, $input0
and $input1
represent the 2
inputs to the join node (which is the context for this Java fragment); $f1
is the first field of the composite row produced by the inner query. That
field's type is Dept, and therefore it has a state
field.
The optimizer uses a dynamic programming approach based upon that used by the Volcano optimizer. Starting with the initial expression, it applies transformation rules to convert it, and its sub-expressions, into semantically equivalent expressions. It estimates the cost of each expression, and cultivates variants of expressions until it finds cheap implementations of them.
An expression can be implemented using several calling conventions. Calling conventions include array, vector, iterator. The most important (and efficient) calling convention is inline. Inline means that a piece of code will be executed for every row in the expression; no collection of objects is created, just stuff happens. An inline implementation of
for (o in (select from emp join dept on emp.deptno == dept.deptno)) { System.out.println(o.emp.ename + " works in " + o.dept.dname); }
would be
for (int i = 0; i < emp.length; i++) { for (int j = 0; j < dept.length; j++) { if (emp[i].deptno == dept[j].deptno) { Synth1 o = new Synth1(emp[i], dept[j]); System.out.println(o.emp.ename + " works in " + o.dept.dname); } } }
This is rather tricky to implement, because the stuff in the middle of the for loop corresponds to stuff at the top of Rel tree. Each Rel node's implement method has to handle a callback from a child saying 'I have placed a row in variable xyz; please generate code to do something with it.'
If two expressions have the same meaning but different calling conventions, they are not immediately interchangeable. It may be cheaper to evaluate a particular expression as an array, but another algorithm may find it more convenient if that expression is a vector. For this reason, there are special kinds of rule called conversion rules. They can convert from one convention to another, but they have a non-zero cost. The optimizer may decide that it is cheaper to build a vector in the first place, rather than paying the cost of converting it to a vector later.
There is a special kind of node called a Terminator
(class {@link saffron.rel.java.Terminator}
).
After a tree has been optimized, a Terminator is placed on the root to drive
code-generation. It behaves somewhat like a Rel node (in that it has
inputs, and can be called-back to generate code), but it does not actually
produce any result. (todo: Maybe Terminator should be a calling
convention?)
group by
There are several problems with a SQL-like aggregation:
select
statement contains a group by
clause,
the validation rules for expressions within that statement are altered: only
expressions which are in the group by
can be used outside of an
aggregate function.group by
clause present, aggregation
implicitly occurs if an aggregate function occurs anywhere in the select
statement. For example, every expression inselect emp.gender.toLowerCase(), emp.name from emp where emp.name.startsWith("P")
becomes illegal if an aggregation is added:
select emp.gender.toLowerCase(), emp.name from emp where emp.name.startsWith("P") && min(emp.salary) > 50000
This syntax problem could be fixed fairly easily, by requiring an empty
group by
clause in such cases.
int count(Object[] a) { int n = 0; for (int i = 0; i < a.length; i++) { if (a[i] != null) { n++; } } return n; } class Emp { String name; Person[] dependents; }; select count(emp.dependents) from emp;
It is not clear whether Saffron should return a row for each employee with
the number of the dependents that employee has, or a single number, the number
of employees whose dependents
field is not null.
The underlying problem is that SQL aggregation makes a scalar expression behave like a set, and this undermines the usual rules of expression validation and type-checking. We can make the set explicit, but the resulting expression
select deptno, sum(select emp.sal from emps as emp where emp.deptno == deptno) as sumSal, min(select emp.sal from emps as emp where emp.deptno == deptno) as minSal from (select distinct deptno from emps) as deptno
is more verbose -- the underlying set, emps
, is referenced three
times -- and perhaps less efficient.
We need to make grouping an aggregation explicit.
QUEL aggregations are more like scalar sub-queries. They are more verbose. They are also more powerful: note how we can sum only female employees'
salaries. In this syntax, the set of values to be aggregated over (in this case,
select ... from emp
) is formed for each aggregate expression; the
optimizer recognizes common expressions and evaluates them only once. Example:
select distinct
emp.deptno, min(select
emp.salfrom
empsas
emp2where
emp2.deptno == emp.deptno)as
minSal sum(select
emp.salfrom
empsas
emp2where
emp2.deptno == emp.deptno && gender.equals("F"))as
sumFemaleSalfrom
empsas
emp
Jim Gray et al. [GBLP96] classify aggregation operators into 3 categories, in increasing order of complexity:
sum
, count
, min
,
max
. To merge two partial sums, take sum(S1 u S2) =
sum({sum(S1), sum(S2)})
. (Here, S1
and S2
are
the sets of values which formed the partial sums, and S1 u S2
is
their union, the set of all values.) min
and max
work the same. count
is different, in that it cannot be used to
merge two sums; instead, you use +
: count(S1 u S2) =
count(S1) + count(S2)
.avg
,
stddev
.median
, countDistinct
.Some algebraic aggregates can be decomposed into distributive aggregates, as follows: avg(S) = sum(S) / count(S). We could provide better support for these; maybe a new method <code>Expression {@link saffron.rel.Aggregation}.preExpand()</code>, which is called before the parse tree is converted into a {@link saffron.rel.Aggregate} relational operator.
It is hard to implement holistic aggregations efficiently. For example,
median
can be translated into an efficient relational expressions by
hand:
select {nth(countEmp / 2, salary) as medianSalary} group by {emp.deptno} from (select from emp order by emp.deptno) empSorted join (select {emp.deptno, count() as empCount} group by {emp.deptno} from emp) empCounted on empSorted.deptno == empCounted.deptno
Similarly distinct count
:
select { empGrouped.deptno, count(empGrouped.gender) as countDistinctGender} group by {empGrouped.deptno} from ( select distinct {emp.deptno, emp.gender} from emps as emp) as empGrouped
We should supply an extender which can transform either the parse tree or the relational operator tree in such a way.
Aggregations can take parameters per query, per group, and per row. Examples:
One way to provide per-group parameters is for aggregations to have two sets of parameters: the first passed at start time, the other passed per row.
But custom aggregations can also do the job. They are a little less efficient, and the syntax is a little more verbose, but what's going on is clearer.
Custom aggregation to do locale-specific min:
select {emp.dept.deptno, new LocaleMin(emp.dept.locale).apply(name) minName} group by {emp.dept} from emps as emp
The ideas in this section are not implemented. So, if your iterator is infinite, the optimizer may generate a plan which never finishes. And it may generate a plan which requires iterators to be restarted.
Iterators (and enumerations) present peculiarities that finite data sources such as JDBC tables, arrays, and collections don't present.
Restarting an iterator
First, they need to be restarted. Suppose the optimizer implemented
Iterator nouns, verbs; for (i in (String[]) verbs intersect (String[]) nouns) { print(verb + " is both a noun and a verb"); }
as a nested loops join,
while (nouns.hasNext()) { String noun = (String) nouns.next(); while (verbs.hasNext()) { String verb = (String) verbs.next(); if (verb.equals(noun)) { print(verb " is both a noun and a verb"); } } }
This doesn't work. The problem is, the second time around the outer loop,
verbs.hasNext()
has already returned false
.
If possible, you should use a class which implements {@link java.util.Collection}
or {@link saffron.runtime.Iterable}; the optimizer calls the iterate()
method each time it needs to start an iteration.
If you supply an {@link java.util.Iterator}, the optimizer wraps it in a {@link saffron.runtime.BufferedIterator}.
Enhancement: If the plan never requires that the iterator is restarted, the optimizer should leave the iterator unwrapped.
Infinite iterators
The second problem with iterators is that they can be infinite. This is a bad idea if you're copying the results to an array, but you can achieve spectacular results if you're prepared to go with the whole streaming thing. Here's an example:
class MultiplesOf implements Iterator { int i, n; MultiplesOf(int n) { this.n = n; this.i = 0; } public boolean hasNext() { return true; } public Object next() { return i += n; } public void remove() { throw new UnsupportedOperationException(); } } Iterator twos = new MultiplesOf(2); Iterator tens = (Iterator) (select from twos where twos % 5 == 0);
We have created one infinite iterator from another!
The problem is that the optimizer might be tempted to do stupid things, like storing the results in an intermediate array. We tell it not to by implementing an empty interface, {@link saffron.runtime.Infinite}.
We have not implemented support for infinite iterators. Expressions involving infinite iterators might just go on for ever.
List:
select
statements with a group by
clause). Example #1:
select {deptno, count(select from emps as e2 where e2.deptno ==
e1.deptno) as countEmp} from emps as e1
. Example #2: if (count(select from
emps as e2 where e2.deptno == e1.deptno
.End $Id: //open/dt/dev/saffron/doc/overview.html#2 $ |