Full reverse: A REVERSE function that handles unicode
Comments (6) Visits (12906)
This week a co-worker converting an Oracle application to DB2 contacted me and requested a REVERSE function.
REVERS() is a function which reverses strings by reordering all the characters in reverse order.
Not a terribly useful function within an application, generally speaking.
However reversing strings can reduce contention on index pages when inserting rows into a table in ascending order.
At any rate, I have learned over the years that conversions are much more successful when one does not demand the customer to redesign whatever it is they have been doing for reasons that may elude me.
REVERSE written in SQL
It is fairly trivial to write a function in SQL PL which walks an input strings and spits it out in reverse order.
However, such a function, if written in inline SQL PL will be effectively limited to non DPF scenarios since inline SQL PL executes on the coordinator.
In fact my coworker had written such a beast and found it to be too slow for that reason.
Writing a function in compiled SQL PL won't solve the DPF problem and would make it likely even slower.
So the first question is:Can the function be written as SQL without the need for BEGIN.. END.
Here is one possible solution using recursion:
CREATE OR REPLACE FUNCTION REVERSE(INSTR VARCHAR(4000)) RETURNS VARCHAR(4000) SPECIFIC REVERSE DETERMINISTIC NO EXTERNAL ACTION CONTAINS SQL RETURN WITH rec(pos, res) AS (VALUES (1, CAST('' AS VARCHAR(4000))) UNION ALL SELECT pos + 1, SUBSTR(INSTR, pos , 1) || res
OK, we need to have system temp >4KB to hold teh temp table for the recursion.
CREATE BUFFERPOOL bp32 PAGESIZE 32K; CREATE SYSTEM TEMPORARY TABLESPACE tsp32 PAGESIZE 32K BUFFERPOOL bp32; VALUES reverse('Hello World!'); 1 ----
We may stop here. However this SQL UDF has a couple of drawbacks:
While this logic is certainly quite fast it is a lot of machinery just to reverse a string.
If this function is called a lot, then it could become a performance drag.
REVERSE written in C
When executing heavy logic with no SQL in need for high performance unfenced C-UDF are the better choice.
A C-UDF is quite easy to write and deploy, assuming you have a C-compiler available.
In this case we need three file
#include <sqludf.h> #ifdef __cplusplus extern "C" #endif void SQL_API_FN Reve
This file should be copied into the sqllib/samples/c dire
LIBRARY UDFREVERSE DESCRIPTION 'Library for DB2 REVERSE function' EXPORTS ReverseSBCP
This file is needed by Windows only to export the entry points of the function.
It too needs to be copied to sqllib/samples/c.
If you are on a Unix or Linux system you use the following file instead:
Now run the bldrtn make file to compile, link and copy the executable into sqll
That being done we can register the function and test it:
CREATE OR REPLACE FUNCTION REVE
You may have noted that I named the C function ReverseSBCP for "Single Byte Code Page".
The problem is that the function, as implemented, only works if we use a single byte code page.
In a Unicode database, which is default in DB2, this would only work for ASCII characters as we can easily see when we add some German umlaute:
VALUES REVERSE('führen'); 1 ----
Back to the drawing board...
REVERSE written in C with Unicode support
In Unicode UTF-8 which is what is being used in a DB2 Unicode database for the VARCHAR data type every character is between one and 4 bytes long.
The first few bits of the first byte encode how many more bytes are needed to encode the entire character.
That means our implementation will look like this:
I gave the function a different name so we need to adjust the .def and .exp files acco
LIBRARY UDFREVERSE DESCRIPTION 'Library for DB2 REVERSE function EXPORTS Reverse1208
Or, if you are on a Unix or Linux system you use the following file:
Now run the bldrtn again:
And we need to re-register to the new entry point:bldrtn udfreverse
CREATE OR REPLACE FUNCTION REVE
In the actual attached files:
So you can pick which suits you best based on the usage requirements.
udfreverse.sql contains all three SQL definitions to choose from.