I have come across a strange error in DB2.Well, of course it would be a bug if that is what really happened - but it didn't - or did it?
By accident I omitted the predicate in a WHERE clause and DB2 actually compiled it.
This should be a bug, shouldn't it?
Hmm, what's going on here? I claim there is no bug because there is no empty WHERE clause, yet there it is, an open and shut case so to speak.CREATE TABLE T(c1 INT);
INSERT INTO T VALUES 1;
SELECT * FROM T WHERE;
1 record(s) selected.
Let me give you a hint:
DB2 thinks WHERE is a table name! Why?SELECT WHERE.* FROM T WHERE;
1 record(s) selected.
There are two reasons:
- The AS keyword which is used to rename a table is optional.
So while I recommend to use FROM T AS MY_T it is not mandatory to specify AS
- DB2 has very few hard keywords.
A hard keyword is a keyword which cannot be used as an identifier without escaping it.
For example <= is a hard keyword. If you want to have a table named <= you will need to double-quote it "<=":
WHERE is not a hard keyword. It's is a soft keyword.
CREATE TABLE <=(c1 INT);
SQL0104N An unexpected token "<=" was found following "CREATE TABLE ".
Expected tokens may include: "<table_name>". SQLSTATE=42601
CREATE TABLE "<="(c1 INT);
DB20000I The SQL command completed successfully.
- A literal string or number like 7.5 or 'Hello'
- Keywords like WHERE, SELECT, <=
- identifiers like T or "S"
What makes T or S an identifier? The fact that they are nothing else or in the case of "S" the fact, that they are double-quoted.
It does that by reading the string left to right and matching it to the defined grammar.
When it run's into a problem such as:
The parser literally takes a step back.SELECT * FROM T WHERE
It thinks (sometimes it sure feels like it does think anyway), well there is nothing coming after WHERE, but the grammar says there ought to be a predicate.
Instead of raining a -104 syntax error right away it will re-classify the last token WHERE into an identifier "WHERE" and try if it has more luck that way.
We call this "soft keyword" recovery.
Now, why in the name of Codd would DB2 do such a thing? Every release the SQL language grows. Sometimes just a tiny bit, and sometimes by a lot.
Nearly every time that happens new keywords are being added. If these keywords would be hard keywords. I.e. they cannot be re-classified as identifiers then applications written to backlevel versions of DB2 could suddenly cease to work. That is a big no-no and must never happen.
For example in DB2 9.7 the rules for where NULL can be specified where loosened significantly.
It turned out that that there are column names called "NULL" even in DB2's own catalogs (e.g. SYSCAT.COLUMNS.NULL)..
Now, there are some humorous things one can do with DB2's SQL that way. For example:
Why doesCREATE TABLE WHERE(SELECT INT);
SELECT SELECT FROM WHERE WHERE;
0 record(s) selected.
not work?SELECT SELECT FROM FROM WHERE WHERE;
The soft keyword recovery today steps back only one token.
Can anyone come up with a palindrome?