 | Level: Intermediate Serge Rielau (srielau@ca.ibm.com), Senior Software Developer, IBM
13 Oct 2005 Using examples, this article describes step-by-step how to map CONNECT BY syntax and its associated pseudo columns to common table expressions that work in DB2® Universal Database™ (DB2 UDB) for Linux®, UNIX®, and Windows®.
Motivation
When porting or migrating an application from Oracle to DB2 UDB for Linux, UNIX, and Windows, recursive queries present a significant obstacle.
A developer who understands and uses Oracle-style recursion often does not naturally understand DB2 recursion and the same is true the other way around.
On first blush, the reason appears to be that DB2 implements recursion as it is defined in the SQL standard using common table expressions (CTE) and UNION ALL.
Oracle, on the other hand, uses a CONNECT BY clause and a set of so-called pseudo columns and system procedures to define recursion.
The difference, however, goes beyond syntax. The underlying problem is that DB2 processes recursion by essentially producing
one level of recursion at a time (breadth first), while Oracle uses depth first.
The later approach naturally produces an output that matches an org chart.
In this article, I use a running example that matches CONNECT BY and the related pseudo columns
to DB2 recursion, one step at a time.
Installation
To perform some of the queries, you must download and expand the CONNECT_BY.zip file available in
the Download section at the end of this article.
If you are working on a Windows 32-bit, a Linux on Intel 32-bit, or an AIX® 64-bit system, all you
need to do is copy the connect_by (connect_by.dll on Windows) executable found in the
appropriate subdirectory of the CONNECT_BY.zip file into your sqllib/function directory.
If you are working on any other operating system or hardware platform with an installed C compiler,
you can copy the files from the Connect_By/FunctionSource directory into sqllib/samples/c.
After changing to that directory, simply run bldrtn connect_by. This command will compile the
provided C-source code. bldrtn will also copy the produced executable into sqllib/function and you are ready
to go.
For convenience, running the following files in CONNECT_BY.zip when connected to a test database
will reproduce all the DB2 SQL in this article:
-
db2 -tvf
connect_by_functions.sql
-
db2 -tvf
connect_by_sample.sql
Setting up the example
To start, we will define and fill an employee table. The table has an employee ID, which is the primary key,
a manager ID, which links the employee to his manager, and name and salary columns:
1 CREATE TABLE emp(empid INTEGER NOT NULL PRIMARY KEY,
2 name VARCHAR(10),
3 salary DECIMAL(9, 2),
4 mgrid INTEGER);
5 INSERT INTO emp
6 VALUES ( 1, 'Jones', 30000, 10),
7 ( 2, 'Hall', 35000, 10),
8 ( 3, 'Kim', 40000, 10),
9 ( 4, 'Lindsay', 38000, 10),
10 ( 5, 'McKeough', 42000, 11),
11 ( 6, 'Barnes', 41000, 11),
12 ( 7, 'O''Neil', 36000, 12),
13 ( 8, 'Smith', 34000, 12),
14 ( 9, 'Shoeman', 33000, 12),
15 (10, 'Monroe', 50000, 15),
16 (11, 'Zander', 52000, 16),
17 (12, 'Henry', 51000, 16),
18 (13, 'Aaron', 54000, 15),
19 (14, 'Scott', 53000, 16),
20 (15, 'Mills', 70000, 17),
21 (16, 'Goyal', 80000, 17),
22 (17, 'Urbassek', 95000, NULL);
|
Looking at the inserted rows, it is apparent that employee 'Urbassek' is the top manager for whom 'Goyal' and 'Mills' work.
'Scott' in turn works for 'Goyal' and has no employees of his own. 'Monroe,' on the other hand, manages 'Hall,' 'Kim,' and 'Lindsay'
and is an employee of 'Mills.'
A reasonable question to ask is: "Who works directly or indirectly for 'Goyal'?"
A simple recursive query
To answer the question of who works for 'Goyal,' an Oracle developer might write the following query:
1 SELECT name
2 FROM emp
3 START WITH name = 'Goyal'
4 CONNECT BY PRIOR empid = mgrid
|
START WITH denotes the seed of the recursion while CONNECT BY describes the recursive step.
That is how to get from step n to step (n + 1). Since it is important to distinguigh between the nth and the (n + 1)th
step during name resolution, PRIOR is used to show that empid belongs to the nth step while mgrid belongs
to step (n + 1)th. So with empid being 16 for step 1, mgrid must be 16 as well and hence step 2 produces
'Scott,' 'Henry,' and 'Zander.' Their empids will now serve as PRIOR to step 3, and so on and so forth.
The Oracle syntax is very concise. The SQL standard and DB2 syntax does not use any recursion-specific keywords and uses
regular SQL to describe the exact same relationships. As you will see, it is more verbose, but equally straightforward:
1 WITH n(empid, name) AS
2 (SELECT empid, name
3 FROM emp
4 WHERE name = 'Goyal'
5 UNION ALL
6 SELECT nplus1.empid, nplus1.name
7 FROM emp as nplus1, n
8 WHERE n.empid = nplus1.mgrid)
9 SELECT name FROM n;
|
The WITH clause allows for the definition of a named query within a statement, which can be referred
to at a later point in the same statement. Its technical term is a common table expression (CTE).
CTEs are by now supported in most major SQL-based DBMS, including Oracle 9i, SQL Server 2005, and, of course, in DB2.
What makes this CTE special is that it is referred to within its very own defintion. This is what distinguishes a
regular CTE from a recursive CTE. Here I named the CTE n to correlate to the recursive steps.
A recursive CTE consists of two parts combined with a UNION ALL:
- The seed or step 1 of the recursion. This is what is described in Oracle using START WITH.
In a recursive CTE, it is simply any query providing a set of rows. In this case we query the emp table and filter
for 'Goyal.' We select the name of course and also the empid, because we need it for the recursive step.
-
The recursive step going from n to (n + 1). Here we refer to step n (the CTE n) and join in step (n + 1) using the
same predicate used in CONNECT BY. Instead of PRIOR, regular correlation names are used to distinguish n from (n + 1).
It is noteworthy to look at the output of this query:
NAME
----------
SQL0347W The recursive common table expression "SRIELAU.N"
may contain an infinite loop. SQLSTATE=01605
Goyal
Zander
Henry
Scott
McKeough
Barnes
O'Neil
Smith
Shoeman
9 record(s) selected with 1 warning messages printed.
|
DB2 does not perform any automatic cycle checking.
That is why it will return this warning when it determines that cycles are possible.
Those DB2 developers who care usually add a level column that increments and is limited by a fixed value.
DB2 recognises this pattern and will suppress the warning.
For the purpose of mapping CONNECT BY, we will ignore the warning from now on since we will
add a cycle check later on.
Further, it is interesting to note that the order in which DB2 has returned the results exposes how the recursion
is processed one level at a time, instead of using a tree-walk.
Before moving on to pseudo columns and more complex examples, I want to briefly explain where the WHERE predicate
of an Oracle recursion needs to be placed. We will modify the example to return all employees, including their salaries, working for
'Goyal' who earn more than 40,000.
1 SELECT name, salary
2 FROM emp
3 WHERE salary > 40000
4 START WITH name = 'Goyal'
5 CONNECT BY PRIOR empid = mgrid
|
It is interesting to note how the WHERE clause preceded the recursive specification. For a DB2 developer, it
might seem that the predicate belongs to the set of rows being considered to begin with. This, however, is not correct.
Instead, the WHERE filters the final result and belongs at the end of the matching DB2 query:
1 WITH n(empid, name, salary) AS
2 (SELECT empid, name, salary
3 FROM emp
4 WHERE name = 'Goyal'
5 UNION ALL
6 SELECT nplus1.empid, nplus1.name, nplus1.salary
7 FROM emp as nplus1, n
8 WHERE n.empid = nplus1.mgrid)
9 SELECT name FROM n WHERE salary > 40000;
NAME SALARY
---------- -----------
Goyal 80000.00
Zander 52000.00
Henry 51000.00
Scott 53000.00
McKeough 42000.00
Barnes 41000.00
6 record(s) selected
|
Be aware that what is true for the WHERE clause is not true for the ON clause.
I will get back to this issue in a later example. First we will dive into pseudo columns.
LEVEL pseudo column
The most trivial of the pseudo columns is LEVEL. The purpose of this column is to show the number of
the recursive step n that produced the row. In our running example, it indicates the levels of management between 'Goyal' and the employee plus 1
(because LEVEL starts with 1). Here is the original Oracle example enhanced with LEVEL:
1 SELECT LEVEL, name
2 FROM emp
3 START WITH name = 'Goyal'
4 CONNECT BY PRIOR empid = mgrid
|
The SQL standard saw no need to add syntax for this feature because it can be expressed using regular SQL:
1 WITH n(level, empid, name) AS
2 (SELECT 1, empid, name
3 FROM emp
4 WHERE name = 'Goyal'
5 UNION ALL
6 SELECT n.level + 1, nplus1.empid, nplus1.name
7 FROM emp as nplus1, n
8 WHERE n.empid = nplus1.mgrid)
9 SELECT level, name FROM n;
LEVEL NAME
----------- ----------
1 Goyal
2 Zander
2 Henry
2 Scott
3 McKeough
3 Barnes
3 O'Neil
3 Smith
3 Shoeman
9 record(s) selected
|
All I have done here is to introduce a level column, which starts with 1 and increments by 1. Of course, any
semantics is possible, but this one happens to provide the same semantics as LEVEL.
It is up to the developer to choose which information to pass along and how as the next example will show.
The CONNECT_BY_ROOT expression
CONNECT_BY_ROOT operates on a column and returns the value of the root ancestor of the row.
To show what it does and how it gets translated, I change the example to use 'Goyal's' direct reports as root:
1 SELECT CONNECT_BY_ROOT name AS root, name
2 FROM emp
3 START WITH name IN ('Zander', 'Henry', 'Scott')
4 CONNECT BY PRIOR empid = mgrid
|
To map to DB2, all that is required is to pass along the root unmodified.
1 WITH n(empid, root, name) AS
2 (SELECT empid, name, name
3 FROM emp
4 WHERE name IN ('Zander', 'Henry', 'Scott')
5 UNION ALL
6 SELECT nplus1.empid, n.root,
7 nplus1.name
8 FROM emp as nplus1, n
9 WHERE n.empid = nplus1.mgrid)
10 SELECT root, name FROM n;
ROOT NAME
---------- ----------
Zander Zander
Henry Henry
Scott Scott
Zander McKeough
Zander Barnes
Henry O'Neil
Henry Smith
Henry Shoeman
8 record(s) selected
|
Next, I will show how to present the whole management chain and not just the root.
The SYS_CONNECT_BY_PATH() procedure
A common question when running queries is: "How does this element relate to the beginning of the recursion?" or,
in other words, "What are the ancestors of this row?" In Oracle, SYS_CONNECT_BY_PATH() can be used to concatenate values
from each recursive step and thus build an ancestor path. Find below the by now well-known example showing for each employee
of 'Goyal' the "reports to" chain:
1 SELECT SYS_CONNECT_BY_PATH(name, ':') AS chain
2 FROM emp
3 START WITH name = 'Goyal'
4 CONNECT BY PRIOR empid = mgrid
|
In this example, each value is separated by a ':'. To answer this question in DB2, the same technique is used as for LEVEL.
The only difference is that concatenation is used instead of addition.
1 WITH n(empid, chain) AS
2 (SELECT empid, CAST(name AS VARCHAR(30))
3 FROM emp
4 WHERE name = 'Goyal'
5 UNION ALL
6 SELECT nplus1.empid,
7 n.chain || ':' || nplus1.name
8 FROM emp as nplus1, n
9 WHERE n.empid = nplus1.mgrid)
10 SELECT chain FROM n;
CHAIN
------------------------------
Goyal
Goyal:Zander
Goyal:Henry
Goyal:Scott
Goyal:Zander:McKeough
Goyal:Zander:Barnes
Goyal:Henry:O'Neil
Goyal:Henry:Smith
Goyal:Henry:Shoeman
9 record(s) selected
|
There is complete freedom on how to build the chain. For instance, the concatenation could be reversed to put 'Goyal'
at the end of the chain. Also note the CAST that has been performed. It is needed for string data types so DB2
knows how long the chain can grow. The rules for recursive CTE specify that the length of the data types shall be defined
by the seed, or the first half of the UNION ALL query.
So far mapping from CONNECT BY to UNION ALL has been a breeze. But now that you are comfortable
with recursive CTE, it is time to crank up the volume and attack the harder problems.
One such problem is ordering. If all that is required is to order the result in the same way as the elements in the path
that is selected back anyway, then an ORDER BY on the path in the final SELECT will suffice:
1 WITH n(empid, chain) AS
2 (SELECT empid, CAST(name AS VARCHAR(30))
3 FROM emp
4 WHERE name = 'Goyal'
5 UNION ALL
6 SELECT nplus1.empid,
7 n.chain || ':' || nplus1.name
8 FROM emp as nplus1, n
9 WHERE n.empid = nplus1.mgrid)
10 SELECT chain FROM n
11 ORDER BY chain;
CHAIN
------------------------------
Goyal
Goyal:Henry
Goyal:Henry:O'Neil
Goyal:Henry:Shoeman
Goyal:Henry:Smith
Goyal:Scott
Goyal:Zander
Goyal:Zander:Barnes
Goyal:Zander:McKeough
9 record(s) selected
|
This is one form of the common org-chart. But what if the required order is on some other column
and it is not a string?
ORDER SIBLINGS BY expression
In Oracle ORDER SIBLINGS BY defines the order in which siblings under the same parent shall be returned.
To order all employees recursively by their salary using CONNECT BY, the query is formulated like this:
1 SELECT name, level, salary
2 FROM emp
3 START WITH name = 'Goyal'
4 CONNECT BY PRIOR empid = mgrid
5 ORDER SIBLINGS BY salary
|
The SQL Standard has specific syntax for this requirement.
However, DB2 UDB for LUW has not (yet) implemented this part of the SQL standard.
To map this feature in DB2, it is time to get inventive.
On a case-by-case basis, it is possible to build a path again
and then sort the string, but this can get a bit ugly since numeric values need to be converted to strings, which increases
the length of the path. Also subsequent mappings will show that it is beneficial to provide a generic solution.
At this point, I therefore introduce a user defined function (UDF). A UDF is an extension to DB2's
capabilities. In this case the UDF (and those following below) is implemented in C and the executable is available to DB2, if you have followed the Installation
chapter above.
1 CREATE FUNCTION CONNECT_BY_POS
2 (POS VARCHAR(256) FOR BIT DATA,
3 N INTEGER,
4 POSPLUS1 BIGINT)
5 RETURNS VARCHAR(256) FOR BIT DATA
6 SPECIFIC CONNECT_BY_POS
7 EXTERNAL NAME 'connect_by!connect_by_pos'
8 NOT FENCED RETURNS NULL ON NULL INPUT
9 DETERMINISTIC NO SQL NO EXTERNAL ACTION
10 LANGUAGE C PARAMETER STYLE SQL ALLOW PARALLEL;
|
This function has two purposes:
-
The function will take in the BIGINT value of the global position of the row posplus1 and translate this,
given the pattern pos of the parent into the pattern for the next row. The level of the parent n is used to
speed this process up. In principle the function does nothing other than building a binary-encoded path. The provided implementation
uses integers internally. So, the arbitrarily chosen length of 256 for the binary string allows for 64 nesting levels, which can
be trivially expanded.
-
As an additional feature, the function also tests whether the global position of the new row already appears in the parent path.
If that is the case, a runtime error is raised. Oracle does not allow cycles unless explicitly specified. So this feature makes
the DB2 query compatible.
The usage is best described by showing the example matching the CONNECT BY version above:
1 WITH source(empid, name, salary, mgrid, rownum) AS
2 (SELECT empid, name, salary, mgrid,
3 ROW_NUMBER() OVER(ORDER BY salary)
4 FROM emp),
5 n(empid, name, salary, level, pos) AS
6 (SELECT empid, name, salary, 1,
7 CONNECT_BY_POS('', 0, rownum)
8 FROM source
9 WHERE name = 'Goyal'
10 UNION ALL
11 SELECT nplus1.empid, nplus1.name,
12 nplus1.salary, level + 1,
13 CONNECT_BY_POS(n.pos, level, rownum)
14 FROM source AS nplus1, n
15 WHERE n.empid = nplus1.mgrid)
16 SELECT name, level, salary
17 FROM n
18 ORDER BY pos;
NAME LEVEL SALARY
---------- ----------- -----------
Goyal 1 80000.00
Henry 2 51000.00
Shoeman 3 33000.00
Smith 3 34000.00
O'Neil 3 36000.00
Zander 2 52000.00
Barnes 3 41000.00
McKeough 3 42000.00
Scott 2 53000.00
9 record(s) selected
|
What just happened? Adding another CTE source allows us to issue a global order for all the rows
according to the one requested by ORDER SIBLINGS BY. Once this is done, the recursive CTE n refers to
source to pick up the rownum. The rest of the query has already been explained. In essence, CONNECT_BY_POS() produces a
PATH as a binary string not unlike what SYS_CONNECT_BY_PATH() does as a regular string. In the end, the query gets ordered by the path
that provides the position within the hierarchy pos.
At this point, I want to come back to the possibility of an ON-clause in the CONNECT BY query.
Any joins, or even more complex queries, can easily be expressed in the same way by simply pushing the composition of the source for the recursion into a CTE.
The NOCYCLE keyword
Now that I have introduced the capability to record the path of a recursion and to detect recursion at runtime, it
is possible to map the NOCYCLE keyword of Oracle's CONNECT BY query syntax. In short, NOCYCLE
prevents the recursion from entering a cycle. Any candidate row for step (n + 1) that already exists in the ancestor path
is simply ignored. In Oracle, the usage looks like this:
1 SELECT name, level, salary
2 FROM emp
3 START WITH name = 'Goyal'
4 CONNECT BY NOCYCLE PRIOR empid = mgrid
|
Given the existence of the ancestor path, mapping the semantics is pretty simple. A second UDF will take care of it.
The UDF will return 0 if a cycle is detected. If none is detected, the value 1 is returned. With the registration
of the UDF I will also introduce a cycle into the management chain. 'Urbassek' shall now report to 'O'Neil.'
1 CREATE FUNCTION CONNECT_BY_NOCYCLE
2 (N VARCHAR(256) FOR BIT DATA,
3 LEVEL INTEGER,
4 NPLUS1 BIGINT)
5 RETURNS SMALLINT
6 SPECIFIC CONNECT_BY_NOCYCLE
7 EXTERNAL NAME 'connect_by!connect_by_nocycle'
8 NOT FENCED RETURNS NULL ON NULL INPUT
9 DETERMINISTIC NO SQL NO EXTERNAL ACTION
10 LANGUAGE C PARAMETER STYLE SQL ALLOW PARALLEL;
11 UPDATE emp SET mgrid = 7 WHERE name = 'Urbassek';
12 WITH source(empid, name, salary, mgrid, rownum) AS
13 (SELECT empid, name, salary, mgrid,
14 ROW_NUMBER() OVER(ORDER BY salary)
15 FROM emp),
16 n(empid, name, salary, level, pos) AS
17 (SELECT empid, name, salary, 1,
18 CONNECT_BY_POS('', 0, rownum)
19 FROM source
20 WHERE name = 'Goyal'
21 UNION ALL
22 SELECT nplus1.empid, nplus1.name,
23 nplus1.salary, level + 1,
24 CONNECT_BY_POS(n.pos, level, rownum)
25 FROM source AS nplus1, n
26 WHERE n.empid = nplus1.mgrid
27 AND CONNECT_BY_NOCYCLE(n.pos, level, rownum) = 0
27 SELECT name, level, salary
28 FROM n
29 ORDER BY pos;
NAME LEVEL SALARY
---------- ----------- -----------
Goyal 1 80000.00
Henry 2 51000.00
Shoeman 3 33000.00
Smith 3 34000.00
O'Neil 3 36000.00
Urbassek 4 95000.00
Mills 5 70000.00
Monroe 6 50000.00
Jones 7 30000.00
Hall 7 35000.00
Lindsay 7 38000.00
Kim 7 40000.00
Aaron 6 54000.00
Zander 2 52000.00
Barnes 3 41000.00
McKeough 3 42000.00
Scott 2 53000.00
17 record(s) selected
|
As you can see, 'Urbassek' has been added to the output, including all of 'Mills's' employees.
'Goyal' has not been added again, however.
This is pretty good, but how do you find out where the cycle is?
CONNECT_BY_ISCYCLE
For the purpose of "coloring" a cycle in the recursion, Oracle provides yet another pseudo column, CONNECT_BY_ISCYCLE.
This column returns 1 if the row is part of a cycle and 0 if it is not.
1 SELECT name, level,
2 CONNECT_BY_ISCYCLE AS cycle
3 FROM emp
4 START WITH name = 'Goyal'
5 CONNECT BY NOCYCLE PRIOR empid = mgrid
|
To map this feature to DB2, I will once more introduce a UDF. CONNECT_BY_ISCYCLE() detects a cycle,
just like CONNECT_BY_NOCYCLE() did.
But when CONNECT_BY_ISCYCLE() detects a cycle, it also compares the given path to that of another path.
If the two paths overlap within the cycle, a 1 is returned; otherwise a 0 is returned.
Here is the declaration of the UDF:
1 CREATE FUNCTION CONNECT_BY_ISCYCLE
2 (POS VARCHAR(256) FOR BIT DATA,
3 LEVELPOS INTEGER,
4 NPLUS1 BIGINT,
5 POSANCESTOR VARCHAR(256) FOR BIT DATA,
6 LEVELANCESTOR INTEGER)
7 RETURNS SMALLINT
8 SPECIFIC CONNECT_BY_ISCYCLE
9 EXTERNAL NAME 'connect_by!connect_by_iscycle'
10 NOT FENCED RETURNS NULL ON NULL INPUT
11 DETERMINISTIC NO SQL NO EXTERNAL ACTION
12 LANGUAGE C PARAMETER STYLE SQL ALLOW PARALLEL;
|
To achieve the goal, the new query has to:
- Return all rows from the recursion
- Join in child rows of each row for comparions
- Drive one recursive step for each child row to detect whether that step leads to a cycle
In the end, what is needed is a three-way join:
one self join on the output of the recursion, and a reference to the numbered source,
similar to what was done in the original recursion.
There is one advantage, though:
It is sufficient to find any child row that introduces recursion.
There is no need to find them all, and indeed, it would be harmful, because we want to return
each row only once.
Note that with a small change, one could count in how many cycles a given row participates.
I'll leave that riddle for the interested reader to figure out on his own.
1 WITH source(empid, name, salary, mgrid, rownum) AS
2 (SELECT empid, name, salary, mgrid,
3 ROW_NUMBER() OVER(ORDER BY salary)
4 FROM emp),
5 n(empid, name, mgrid, level, pos) AS
6 (SELECT empid, name, mgrid, 1,
7 CONNECT_BY_POS('', 0, rownum)
8 FROM source
9 WHERE name = 'Goyal'
10 UNION ALL
11 SELECT nplus1.empid,
12 nplus1.name,
13 nplus1.mgrid,
14 level + 1,
15 CONNECT_BY_POS(n.pos, level, rownum)
16 FROM source AS nplus1, n
17 WHERE n.empid = nplus1.mgrid
18 AND CONNECT_BY_NOCYCLE(n.pos, level, rownum) = 0),
19 cycles(name, level, pos, iscycle) AS
20 (SELECT name, level, pos,
21 COALESCE((SELECT 1 AS iscycle
22 FROM n
23 LEFT OUTER JOIN source
24 ON n.mgrid = source.empid
25 WHERE CONNECT_BY_ISCYCLE(n.pos,
26 n.level,
27 source.rownum,
28 ancestor.pos,
29 ancestor.level) = 1
30 FETCH FIRST ROW ONLY), 0) AS iscycle
31 FROM n AS ancestor)
32 SELECT name, level, iscycle
33 FROM cycles
34 ORDER BY pos;
NAME LEVEL ISCYCLE
---------- ----------- -----------
Goyal 1 1
Henry 2 1
Shoeman 3 0
Smith 3 0
O'Neil 3 1
Urbassek 4 1
Mills 5 1
Monroe 6 0
Jones 7 0
Hall 7 0
Lindsay 7 0
Kim 7 0
Aaron 6 0
Zander 2 0
Barnes 3 0
McKeough 3 0
Scott 2 0
17 record(s) selected
|
OK: This one was a piece of work, but at least now you know why you read that far.
As in any good story, it is now time to add the Hollywood ending, where everything is nice and easy.
CONNECT_BY_ISLEAF
Compared to CONNECT_BY_ISCYCLE, CONNECT_BY_ISLEAF is indeed lighter fare.
All this pseudo column does is to return 1 if a given row is a leaf in recursion.
In other words: The row did not produce any further rows.
In this example, a leaf is an employee who is not a manager.
1 SELECT name, level,
2 CONNECT_BY_ISLEAF AS isleaf
3 FROM emp
4 START WITH name = 'Goyal'
5 CONNECT BY PRIOR empid = mgrid
|
Indeed, all that is required here is to take out most of the previous additions.
For the purpose of readability, we start fresh and 'Urbassek' becomes her own boss again.
Also note that I took out the NOCYCLE option above to concentrate on the matter at hand.
After all, there is no need to dwell on the past, especially when it was successfully mastered.
By now the query practically writes itself.
1 UPDATE emp SET mgrid = NULL WHERE name = 'Urbassek';
2 WITH n(empid, name) AS
3 (SELECT empid, name
4 FROM emp
5 WHERE name = 'Goyal'
6 UNION ALL
7 SELECT nplus1.empid, nplus1.name
8 FROM emp as nplus1, n
9 WHERE n.empid = nplus1.mgrid)
10 SELECT name,
11 COALESCE((SELECT 0 FROM emp
12 WHERE n.empid = emp.mgrid
13 FETCH FIRST ROW ONLY), 1) AS isleaf
11 FROM n;
NAME ISLEAF
---------- -----------
Goyal 0
Zander 0
Henry 0
Scott 1
McKeough 1
Barnes 1
O'Neil 1
Smith 1
Shoeman 1
9 record(s) selected
|
This was indeed quite painless.
Unless Oracle has introduced yet another pseudo column or keywords that I'm not aware of,
all of Oracle's recursive features have now been mapped to DB2 UDB for Linux, UNIX, and Windows.
Conclusion
In this article I provided generic mappings from the Oracle style CONNECT BY recursive query
syntax to DB2's standard compliant recursive common table expressions using UNION ALL.
All the individual mappings work well together and I have shown that DB2's recursion feature is as least
as expressive.
While the Oracle syntax is less verbose because it provides keywords for various common semantics, this also
means that it is less expressive since little new semantics can be added without changes to the DBMS.
As an ISV or end-customer trying to port an application from Oracle to DB2, you probably do not care too much about that
at this point. All you want and need to know now is how to translate from Oracle to DB2.
Showing this has been the main purpose of this article.
Once you get comfortable with the SQL standard syntax, I'm confident you will find value in DB2's
expressiveness, even if it means typing a few more lines.
Acknowledgements
Thanks to the countless developers, both the IBMers and customers,
who share their SQL challenges with me.
I learn more from you than you from me.
Download | Description | Name | Size | Download method |
|---|
| Sample code | CONNECT_BY.zip | 22 KB | FTP | HTTP |
|---|
Resources Learn
Get products and technologies
Discuss
About the author  | 
|  | Serge Rielau is part of the DB2 Solutions Development team, where he closely works with customers and business partners to port or migrate their applications from competitive RDBMS to DB2 for Linux. UNIX, and Windows. Prior to this role, he worked for 7 years as a team lead and technical manager in the DB2 SQL Compiler Development team. As an expert in the SQL language, Serge is an active participant in the comp.databases.ibm-db2 newsgroup. |
Rate this page
|  |