All there was were columns.
But then along came routines. Routines had parameters, local variables and FOR loops.
Next came global variables, then modules and ultimately pseudo-columns.
All of these objects share one and the same name space..
In this post I will try to illuminate the rules by which DB2 decides who is who.
Let's start with a simple example:
Presumably to no-one's surprise.CREATE TABLE T(c1 VARCHAR(10));
INSERT INTO T VALUES 'Table T';
SELECT c1 FROM T;
C1
-----------
Table T
1 record(s) selected.
Let's complicate matters as bit:
There are two columns C1 that could be resolved here. But DB2 will always resolve to the one in the innermost scope.CREATE TABLE S(c1 VARCHAR(10));
INSERT INTO S VALUES 'Table S';
SELECT (SELECT c1 FROM T) FROM S;
C1
-----------
Table T
1 record(s) selected.
The innermost scope is always the immediate FROM of a given query.
Two lessons are being taught by the example above.SELECT (SELECT c1 FROM T AS X(c2)) FROM S;
C1
----------
Table S
1 record(s) selected.
First when the column is renamed in the FROM clause is renamed then the original name becomes inaccessible.
That is in the above example neither T nor T.c1 are accessible. They are now known as X and X.c2 respectively.
Second when c1 cannot be found within the innermost FROM clause DB2 will look in the surrounding context.
What defines "outer"? Let's try some more scenarios:
This didn't go so well.SELECT (SELECT res FROM T, (VALUES c1) AS X(res)) FROM S;
SQL0204N "C1" is an undefined name. SQLSTATE=42704
DB2 didn't pick up T.c1 here because T is not an outer scope to the VALUES. It is "lateral" to VALUES.
But DB2 also didn't jump two levels up to S.c1. I suppose it could but, alas, it does not.
A small modification:
By telling DB2 that we want to allow LATERAL correlation it will find T.c1.SELECT (SELECT res FROM T, LATERAL(VALUES c1) AS X(res)) FROM S;
RES
----------
Table T
1 record(s) selected.
But will it also be able to find S.c1?
Indeed! So LATERAL not only allows DB2 to refer to laterally correlated columns it also allows it to look further out in the scope.SELECT (SELECT res FROM T AS Y(c2), LATERAL(VALUES c1) AS X(res)) FROM S;
RES
----------
Table S
1 record(s) selected.
Let's add another twist
So DB2 can look outside of an EXISTS to correlate it without problem.SELECT 1 FROM T WHERE EXISTS(SELECT c1 FROM S AS X(c2));
1
-----------
1
1 record(s) selected.
Let's summarize:
- WIthin an SQL Statement local columns reference are preferred over outer column references
- To access columns in the same FROM from clause as the subquery where a column is referenced the LATERAL keyword needs to be used.
The column c1 wins over the variable because the variable is "further out" than the column.SET SERVEROUTPUT ON;
--#SET TERMINATOR @
BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
CALL DBMS_OUTPUT.PUT_LINE((SELECT c1 FROM T));
END @
Table T
Let's validate that we can get to the variable.
FOR loops also produce a form of local variables.BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
CALL DBMS_OUTPUT.PUT_LINE((SELECT c1 FROM T AS X(c2)));
END @
Var
So the FOR loop introduces a level of nesting here and therefore eclipses the local variable.BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
FOR loop AS SELECT 'For' AS c1 FROM SYSIBM.SYSDUMMY1 DO
CALL DBMS_OUTPUT.PUT_LINE((SELECT c1 FROM T AS X(c2)));
END FOR;
END @
For
Note that compounds can be nested:
If we want to refer to the for loop we need to qualify:BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
FOR loop AS SELECT 'For' AS c1 FROM SYSIBM.SYSDUMMY1 DO
INNER: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Inner';
CALL DBMS_OUTPUT.PUT_LINE(c1);
END;
END FOR;
END @
Var Inner
Let's move further out. The next level out are parameters of routines and transition variables of trigger.BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
FOR loop AS SELECT 'For' AS c1 FROM SYSIBM.SYSDUMMY1 DO
INNER: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Inner';
CALL DBMS_OUTPUT.PUT_LINE(loop.c1);
END;
END FOR;
END @
For
I'll limit myself to demonstrating functions here:
The pattern is becoming pretty obvious now the local variable is inside the compound block and it eclipses the parameter with the same nameCREATE OR REPLACE FUNCTION foo(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
outer: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Outer';
RETURN c1;
END@
VALUES foo()@
1
----------
Var Outer
1 record(s) selected.
If we want to reference the parameter we qualify it with the routine name
There is one more interesting scenario within functions.CREATE OR REPLACE FUNCTION foo(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
outer: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Outer';
RETURN foo.c1;
END@
VALUES foo()@
1
----------
Parm Foo
1 record(s) selected.
It is not possible to see a variable from compound that a statement isn't itself within.CREATE OR REPLACE FUNCTION foo(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
outer: BEGIN
inner: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Outer';
END inner;
RETURN c1;
END@
VALUES foo()@
1
----------
Parm Foo
1 record(s) selected.
Let's summarize:
- Within compound blocks local variables within an inner block eclipse those with the same name of an outer block
You must use compound labels to disambiguate in the same fashion as you use table names or correlation names within SQL. - Still within an SQL statement columns will always eclipse variables
- FOR loops are treated like compound blocks where the loop name is the qualifier and the column names of the cursor are the variables
- Parameters compose the outermost block and an be referenced by the routine name.
- In case of triggers the columns of the transition variables can be qualified by the transition variable name.
Again we are going inside out. The parameter wins over the module variable, but the module variable can be referred to by qualifying it with the module name or unqualified if it is not eclipsed:CREATE OR REPLACE MODULE mod@
ALTER MODULE mod ADD VARIABLE c1 VARCHAR(10) DEFAULT 'Var Module'@
ALTER MODULE mod PUBLISH FUNCTION FOO(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
RETURN c1@
VALUES mod.foo()@
1
----------
Parm Foo
1 record(s) selected.
Note that within a module context even variables which are not published can be accessed as long as they belong to the same module as the routine.ALTER MODULE mod DROP FUNCTION FOO(VARCHAR())@
ALTER MODULE mod PUBLISH FUNCTION FOO(c2 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
RETURN c1@
VALUES mod.foo()@
1
----------
Var Module
1 record(s) selected.
ALTER MODULE mod DROP FUNCTION FOO(VARCHAR())@
ALTER MODULE mod PUBLISH FUNCTION FOO(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
RETURN mod.c1@
VALUES mod.foo()@
1
----------
Var Module
1 record(s) selected.
We are slowly digging digging ourselves out from the depth of name resolution to the outer layers.
Next we deal with global variables.
By now I trust you will believe me if I tell you that any local module variable, parameter, local variable or column will eclipse a global variable. No need for proof.
But we are now starting to encounter a new curious beast. The PATH.CREATE OR REPLACE VARIABLE schema.c1 VARCHAR(10) DEFAULT 'Var Schema'@
CREATE OR REPLACE FUNCTION FOO() RETURNS VARCHAR(10) RETURN c1@
SQL0206N "C1" is not valid in the context where it is used. LINE NUMBER=1. SQLSTATE=42703
Since variables reside in schemas we can only see them if their schema is on the PATH
OK, now as far as DB2 would be concerned we should be done now. But there are some rather curious behaviors that our beloved competition has concocted that, like it or not, DB2 must support for the sake of portability. The first one are PSEUDOCOLUMsSET PATH = CURRENT PATH, SCHEMA@
CREATE OR REPLACE FUNCTION FOO() RETURNS VARCHAR(10) RETURN c1@
VALUES FOO()@
1
----------
Var Schema
1 record(s) selected.
A pseudo column looks like a column but it isn't one at all. The three most common pseudo columns are NEXTVAL, CURRVAL and ROWNUM.
Let's investigate:
In Oracle S.NEXTVAL has the same meaning as NEXT VALUE FOR S.CREATE OR REPLACE VARIABLE S.NEXTVAL VARCHAR(10) DEFAULT 'Var Global'@
CREATE OR REPLACE SEQUENCE S@
VALUES (NEXT VALUE FOR S, S.NEXTVAL)@
1 2
----------- ----------
1 Var Global
1 record(s) selected.
DB2 will prefer referencing to anything else first before, as a last resort try it's luck with a pseudo column
OK, this post certainly making up for my six weeks of silence in sheer length. There is one last curiosity that I want to discuss for completeness.DROP VARIABLE S.NEXTVAL@
VALUES (NEXT VALUE FOR S, S.NEXTVAL)@
1 2
----------- -----------
2 2
1 record(s) selected.
Hold it.... wait for it....CREATE OR REPLACE FUNCTION C1() RETURNS VARCHAR(10) RETURN 'Func'@
VALUES C1@
1
----------
Var Schema
1 record(s) selected.
Tadah!DROP VARIABLE SCHEMA.C1@
VALUES C1@
1
----------
Func
1 record(s) selected.
Again, thanks to our friends at Redwood Shores you may think you are referring to a column, or a variable, or a parameter or whatever, but really you are referring to a function.
DB2 will only do this if it has exhausted absolutely every other possibility.
Now I could talk about the rules for qualified names, but perhaps I reserve this for another day.
Let's do a final summary:
The order of resolution for simple identifiers is as such:
- Column belonging to the local FROM clause
- Column belonging to a lateral table in the parent's FROM clause if LATERAL is specified and the reference is a subquery in that FROM clause
- Column belonging a FROM clause in the parent's FORM clause
- local variable or FOR loop variable resolving from the innermost block or FOR loop to the outside
- Parameter of a routine or transition variable column of a trigger
- Variable in the module in which the cursor or routine is defined
- Global variable if the variable is on the PATH
- Pseudo column or a function without parameters or a function with defaults for all parameters if the function is on the PATH