Motivation Fact: "John's birthday is Oct 12, 1965" If you know John and you know his birthday to be Oct 12, 1965, then never mind. If you know John and you thought his birthday was a different one, then there is a conflict. If you know John and you didn't know his birthday, then you just learned something new about John. If you didn't know John then you now know of his existence and his birthday. How does this relate to databases, DB2 and SQL? When ingesting new data into a table it is very common that you encounter duplicates and... [More]
UPDATE: Feb 11, 2012 I have added a TRACE procedure and renamed ENABLE/DISABLE_EXPLAIN_ACTUALS to ENABLE/DISABLE_ACTIVITY_MONITOR. The TRACE procedure returns a formatted trace of all SQL monitored by the activity monitor. This includes nested compiled triggers, procedure calls and function invocations as well as error conditions. Next on my list: trace-backs for errors and also tracing of parameters. For the past months I have been tinkering on and of with the "new monitoring API" which has been introduced in DB2 9.7. Here are the... [More]
DB2 has had the FETCH FIRST n ROWS ( FFnR ) clause for many years.
FFnR is one of those tricky features that does actually exactly what it claims to do - surprise! A simple test: CREATE TABLE emp(name VARCHAR(10), salary INTEGER); INSERT INTO emp VALUES ('Joe' , 20000), ('Martin', 25000), ('Murphy', 30000), ('Sophie', 18000), ('Brett' , 22000), ('Joel' , 31000); Selecting the first three rows can be done like this: SELECT * FROM emp FETCH FIRST 3 ROWS ONLY; NAME SALARY ---------- ----------- Joe 20000... [More]
Background The discussion about the pros and cons of NULL in SQL has been raging for as long as SQL has been around. In this post I won't dare wading into this controversy. For those interested in such matters I point to the paper "Much ado about nothing" by C.J. Date. Here, instead I want to discuss the implications of NULL, given that it is part of DB2's SQL. What is NULL? NULL represents the absence of a value. Technically it is not a value in itself. However we often use the phrase... [More]
PIVOT The meaning of pivoting a table is to turn n-rows into n-columns. For example given a ales table with a composite primary key of year and quarter you may want to get a result that shows only one row for each year, but a separate column for each quarter. CREATE TABLE Sales(Year INTEGER NOT NULL, Quarter INTEGER NOT NULL, Results INTEGER); CREATE UNIQUE INDEX PK_IDX ON Sales(Year, Quarter) INCLUDE(Results); ALTER TABLE Sales ADD PRIMARY KEY (Year, Quarter); INSERT INTO Sales VALUES (2004, 1, 20),... [More]
Motivation Yesterday one of my coworkers from India asked for my help in migrating the following SQL statement from Sybase to DB2. SELECT DISTINCT c1, c2 FROM t ORDER BY c3 When this statement is run in DB2 (assuming the appropriate definition of "T") the following error is being returned: SQL0214N An expression in the ORDER BY clause in the following position, or starting with "C3" in the "ORDER BY" clause is not valid. Reason code = "2". Why does DB2 raise this error? The reason is simple: It's not... [More]
Motivation It is very common nowadays for SQL statements to be generated by the application itself for by some productivity middle ware which abstracts the database. On the server however, within routines, triggers or anonymous blocks most SQL is fixed and will execute just the way the definer of the object has specified it. There are, however exceptions to this rule: Executing DDL statements such as CREATE TABLE or DROP INDEX While DB2 supports select DDL statements, generally DDL is not part of the SQL PL syntax Execution of highly... [More]
DB2 provides several means to generate unique identifiers depending on ones need. For example: IDENTITY column That's a column of a table which is generated automatically, typically in an ever increasing fashion.
CREATE TABLE emp(empid INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY, name VARCHAR(20), salary DECIMAL(10, 2)); INSERT INTO emp VALUES(DEFAULT, 'Jones', 20000); VALUES IDENTITY_VAL_LOCAL(); 1 --------------------------------- 1. 1 record(s) selected. An... [More]
Motivation There are multiple ways in which DB2 supports recursion in SQL. One of which is recursion using the ANSI SQL recursive UNION ALL approach. The other approach is using the CONNECT BY clause. Actually I like to add explanation of recursion into my "SQL on Fire" talks because the topic never seems to get old. In this blog however I will not talk about recursion in SQL itself. Instead I want to answer a frequent email question about how to achieve recursion in SQL PL (or PL/SQL). Fibonacci number A commonly used example when... [More]
Motivation In this blog I last discussed how to generate unique timestamps and earlier I introduced a Java UDF to generate globally unique identifiers (uuid) . The former naturally produces timestamps, while the later produces a binary string. But what if what you need is a number? Of course both timestamps and binary strings can be represented as numbers. But these will be very long.For example a naive representation of a timestamp could be the number 20120303070659765345. This is too long even for a bigint. Certainly you would not want to... [More]
Having demonstrated how to aggregate strings using XML in an earlier post I have been asked about how to do the inverse. Giving a a string and a defined separator, produce a table with one row per substring. So here is what I came up with: CREATE OR REPLACE FUNCTION PK_BASE.SPLIT(text VARCHAR(32000), split VARCHAR(10)) RETURNS TABLE(column_values VARCHAR(60)) RETURN WITH rec(rn, column_value, pos) AS (VALUES (1, VARCHAR(SUBSTR(text, 1, DECODE(INSTR(text, split, 1), ... [More]
In this entry I'll illuminate techniques on how to pick out specific rows from a set. Sound easy right? Well we shall see. Let's start simple. What is the biggest salary made by an employee in my department? CREATE TABLE emp(empid INTEGER, name VARCHAR(20), salary INTEGER, dept INTEGER); INSERT INTO emp VALUES (1, 'Jones', 20000, 100), (2, 'Smith', 25000, 100), (3, 'Friedman', 22000, 101), (4, 'Muller', 21000, 101), (5, 'Holmes', 30000, 101); To get the maximum salary we use MAX(), easy:
SELECT MAX(salary) FROM emp WHERE dept = 101; 1... [More]
The Problem I've just finished debugging some quality assurance (QA) scenarios which failed just on Windows. The failure ended up being in the test case implementation rather than DB2. The kind of failure, however was interesting although I have seen the pattern repeatedly since my early days in DB2 development both internally as well as in customer code. Therefore I figure it's a good topic to blog about. Imagine a table recording events - perhaps for auditing, logging. Anything of the sorts: CREATE TABLE event(stamp TIMESTAMP NOT NULL... [More]
When I started leading the project to add Oracle's PL/SQL to DB2 it became quickly clear that perhaps amongst the top five features needed to claim a reasonable level of compatibility were autonomous transactions . An autonomous transaction ("#PRAGMA AUTONOMOUS" in PL/SQL speak) is a transaction which executes in total isolation from its invocation context. While this feature has seen a lot of uptake with the Oracle to DB2 realm I have rarely seen it used in a traditional DB2 environment. The reasons for that may be: Poor... [More]
Some of you may know that I am engaged in enabling Oracle applications to DB2. One of the frequent "issues" we see are differences between the way how indexing works in Oracle and DB2 and, more specifically about different semantics for unique indexes. In DB2 a unique index requires that no two keys are the same. Sounds simple and clear on the surface, but not so clear when you consider NULLs. Generally speaking a NULL cannot be compared to a NULL and thus it is debatable whether one should allow more than one NULL key in a unique... [More]
Introduction Recently I discussed the motivation behind using anonymous blocks. The reasoning was centered around the need to execute complex scripts on the server without being able or willing to define a routine in the database's catalog. Inside of an anonymous block you can then define all the logic flow you need. You can also declare local variables to hold temporary data. DB2 10 expands beyond that by allowing you to declare local types and local procedures. Local types When ARRAY types were introduced in DB2 9.5 we discussed whether there... [More]
Motivation As part of my job I work in a group called the SQL Language council (SLC). The purpose of that group is, roughly three fold: Ensure any SQL added to DB2 (and to some extend IDS, Netezza and Derby) remains compatible between the products. Bring SQL extensions forward for standardization by ANSI and ISO Ensure that SQL Extensions are added with a vision that extends beyond the immediate product or business requirement. Needless to say meetings of the SLC can be quite exciting as different code-bases clash. I like those. On the flip... [More]
Background Have you ever seen: "SQL0670N The row length of the table exceeded a limit of "<length>" bytes. (Table space "<tablespace-name>".)" When creating or altering a table DB2 imposes limits on the total worst case size of a row. This size depends on the page size of the tablespace in which the table resides: 4005 bytes in a table space with a 4K page size 8101 bytes in a table space with an 8K page size 16293 bytes in a table space with an 16K page size 32677 bytes in a table space with an 32K... [More]
Motivation Remember the days when DB2 re-entered the TPC-C fight after a long hiatus? The time was DB2 8.1 FP4. I seriously do not remember years.. Time passes in releases. While my colleagues were tuning code path, bufferpools disks my team was looking at something more fundamental. What can we do to make SQL as efficient as possible for TPC-C. And ideally how can we make SQL more efficient for any class of OLTP workload. In other words how can we pour the most bang into the least SQL for a typical OLTP transaction? The result was what we... [More]
Background When our team added compatibility for DB2 with Oracle applications we learned that no two functions were compatible between Oracle and DB2 as far as date-time arithmetic was concerned.
We ended up adding an entirely new library to achieve compatibility.
It appears the difference between DB2 and MS SQL Server is equally profound in this area given that I was recently asked to provide a match for the DATEADD() function. DATEADD() This TSQL is a bit of an odd one.
It takes three arguments: A "measure" The measure can be... [More]
Prelude I have have been engaged in discussions on how to use DB2 for LUW for as long as I have been employed with IBM. Until not too long ago, before I started blogging, the primary platform for such discussion was usenet. And one of the very oldest debates I recall was with Bernard D. For at least a decade he has been tirelessly trying to convince us to extend DB2 for LUW to support comparisons of "row value constructors" (rvc). In fact the oldest record I can find on Google on this debate dates back to October 21, 2000. Well,... [More]
One of the most popular blog entries on SQLTips4DB2 is the one on Aggregating Strings (October, 29 2010) . In this post I describe a means to sum strings into a a single comma separated text using XML. When I composed this post I was already working on a built-in function called LISTAGG . But somehow I forgot to blog about it assuming that first blog entry already contained its description. I must be getting old. So, without further delay here is a quick overview of the LISTAGG function introduced in DB2 9.7.4. We use the same sample data as... [More]
What seems like eons ago we introduced IDENTITY columns into DB2. This was in DB2 UDB V7.1 to be precise. As you may know IDENTITY columns are special forms of SEQUENCEs which are tight to a specific column in a specific table. The purpose of an IDENTITY column is to supply an abstract primary key for the table which generally is maintained by DB2. For example: CREATE TABLE emp (empid INT NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name VARCHAR(20), salary DECIMAL(10, 2)); You can now insert employees and let DB2 take care of... [More]
One of the FAQs I see in newsgroups is the request on how to aggregate strings. That is, how do I collapse rows of a string columns into e.g. a comma separated list. DB2 (as of 9.7.3) does not have any built-in function for this, but there are interesting ways to get the job done. Let's use an employee table as an example:
CREATE TABLE emp( empid INT NOT NULL PRIMARY KEY, name VARCHAR(20), salary DECIMAL(10, 2), dept VARCHAR(20)); INSERT INTO emp VALUES (1, 'Jones' , 20000, 'Research'), (2, 'Schmidt' , 23000, 'Research'),... [More]
In this post I want to introduce a handy procedure which can be used to suspend a session for a specified amount of time.
A naive approach to wait is to simply execute a tight loop in a stored procedure such as this:
--#SET TERMINATOR @ CREATE OR REPLACE PROCEDURE SLEEP(seconds INTEGER) BEGIN DECLARE end TIMESTAMP; SET end = CURRENT TIMESTAMP + seconds SECONDS; wait: LOOP IF CURRENT TIMESTAMP >= end THEN LEAVE wait; END IF; END LOOP wait; END @ --#SET TERMINATOR ; VALUES CURRENT TIMESTAMP; 1 --------------------------... [More]
A co-worker just walked in and asked how to subtract timestamps in SQL. Easy, you just use the TIMESTAMPDIFF() function. Now, personally I just plain hate this function because it has some serious shortfalls: It approximates the difference between timestamps assuming 365 day years and 30 day months It takes some really odd arguments. So, I think there is a need for an exact and generic function which subtracts dates or timestamps. There are two ingredients which DB2 supplies natively which come in handy: JULIAN_DAY() Given a DATE or TIMESTAMP... [More]
Yesterday I got an interesting request for a MEDIAN function in DB2. After a quick look at Wikipedia MEDIAN I figured this can't be too hard and the result may be of interest to others. At the beginning there is a table, preferably with some data CREATE TABLE T(c1 INT); INSERT INTO T VALUES 10, 12, 30, 33, 50; Now the MEDIAN is the middle value of a set of values. So in the case above that would be 30. We can get to the middle value by counting the number of values, number them while we do that and then take the one in the middle. SELECT c1 ... [More]
Background This week a co-worker converting an Oracle application to DB2 contacted me and requested a REVERSE function. REVERS() is a function which reverses strings by reordering all the characters in reverse order. For example: 'Hello World!' becomes '!dlroW olleH'. Not a terribly useful function within an application, generally speaking. However reversing strings can reduce contention on index pages when inserting rows into a table in ascending order. At any rate, I have learned over the years that conversions are much more successful when... [More]
It is by now reasonably known that DB2 9.7 speaks to a high degree Oracle. and there are articles of mine on developer works which speak to that. The question I keep hearing often from DB2 customers however is: What's in it for me? Can I mix Oracle and DB2 dialects? So in this entry I'll try to separate what is acceptable Oracle-speak around your DB2 applications and what is, well, language you should not use around the dinner table. You can classify DB2 9.7 Oracle speak into four categories: Extensions to DB2's functionality These are... [More]
The most satisfying work in developing SQL to me is whenever we add infrastructure to DB2 which makes DB2 more extensible. For example adding a function like DECODE which is syntactic sugar for CASE expressions is nice, but it only solves a very specific issue. Contrast that with what we did in DB2 9.7 when we introduced global variables. Here are some of the key properties of global variables They are schema qualified That is you can have multiple variables with the same name in different schemas Variables are resolved by PATH So you can make... [More]