Topic
  • 3 replies
  • Latest Post - ‏2013-09-22T01:02:52Z by MatthiasNicola
Praveen Kulkarni
Praveen Kulkarni
2 Posts

Pinned topic Need Help on XQuery

‏2013-08-29T16:34:20Z |

Hello,

      I need help on one of the XQueries which I am struggling with. 

The table structure as below:

 
CREATE TABLE "SRCH "."CLAIM_SRCH"  
(
                  "TRC_ID" BIGINT NOT NULL ,
                  "PLNCD" CHAR(3) NOT NULL ,
                  "SYS_CREATE_DT" DATE NOT NULL ,
                  "FMAT_VER_ID" SMALLINT NOT NULL ,
                  "INB_IND" CHAR(1) NOT NULL ,
                  "REAL_TM_IND" CHAR(1) NOT NULL ,
                  "RTA_MLE_CD" CHAR(1) NOT NULL ,
                  "TRAN_DOC" XML INLINE LENGTH 16000 NOT NULL
);
 
I have two Date Attibutes in the XML Column:

SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_FROM_DATE

and

SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_TO_DATE.

I need a XQuery to find all the data >=DOS_FROM_DATE and <=DOS_TO_DATE. But the complexity is DOS_TO_DATE can have the date in MMDD format or MMDDYYY Format where as DOS_FROM_DATE is always DDMMYY.

Therefore, Before I can do the comparison I need to check if length of DOS_TO_DATE is 4 then concatenate with the year(YY) from DOS_FROM_DATE and then do the search where as if its length is 6 just go ahead with comparison.

Please help me with the below:

1. How can I check the length of DOS_TO_DATE  in the where clause in a if statement or a case statement. I was trying to use the fn:string-length function and was able to do that if validate if the DOS_TO_DATE is of length 4 with the below query:

select trc_id from srch.CLAIM_SRCH_UNION where xmlexists('$d/.[fn:string-length(fn:string-join(*:SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVI
CE/DOS_TO_DATE,""))=4]' passing srch.CLAIM_SRCH_UNION_VW.TRAN_DOC as "d") ORDER BY TRC_ID WITH UR;

But not sure how to get the length of the element and push that to a conditional check.

2. Can I use the  fn:substring and fn:concat to extract the YY part of DOS_FROM_DATE and concatenate it with MMDD of DOS_TO_DATE.

 

It would be of great help if someone could help me with the above scenario. Thanks a lot in advance

  • MatthiasNicola
    MatthiasNicola
    322 Posts

    Re: Need Help on XQuery

    ‏2013-08-30T06:07:54Z  

    Hi Praveen,

    in your case it is really unfoirtunate that DOS_TO_DATE can have the date in MMDD format or MMDDYYY format whereas DOS_FROM_DATE is always DDMMYY. That is not a very clean data design.

    Such data quality issues are best addressed at the source that produces the data. If that is not possible, you may want to consider cleaning the data (i.e. converting to a proper and common date format) before or upon insert into your DB2 database.  Then you incur the burden of dealing with the "bad" date formats only once, and not for every query.

    Yes, you can use the functions fn:substring, fn:concat, etc. to perform the required operations. http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrfncategory.html

     

    Also, here are some coding examples for a similar (but not identical) problem secenario with poorly formatted dates. These example are not exact soluytion to you rcase but might give you saome ideas what else yopu can try:
     http://nativexmldatabase.com/2011/03/18/xquery-and-sqlxml-how-to-convert-a-date-that-is-not-a-date-into-a-date/

    In this example described in the blog, the function fn:tokenize is used to split the date value along a known delimiter such as a space. Without a delimiter you would have to use the substring function.

    Matthias

     

    Updated on 2013-08-30T06:08:06Z at 2013-08-30T06:08:06Z by MatthiasNicola
  • Praveen Kulkarni
    Praveen Kulkarni
    2 Posts

    Re: Need Help on XQuery

    ‏2013-09-21T10:33:13Z  

    Hi Praveen,

    in your case it is really unfoirtunate that DOS_TO_DATE can have the date in MMDD format or MMDDYYY format whereas DOS_FROM_DATE is always DDMMYY. That is not a very clean data design.

    Such data quality issues are best addressed at the source that produces the data. If that is not possible, you may want to consider cleaning the data (i.e. converting to a proper and common date format) before or upon insert into your DB2 database.  Then you incur the burden of dealing with the "bad" date formats only once, and not for every query.

    Yes, you can use the functions fn:substring, fn:concat, etc. to perform the required operations. http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrfncategory.html

     

    Also, here are some coding examples for a similar (but not identical) problem secenario with poorly formatted dates. These example are not exact soluytion to you rcase but might give you saome ideas what else yopu can try:
     http://nativexmldatabase.com/2011/03/18/xquery-and-sqlxml-how-to-convert-a-date-that-is-not-a-date-into-a-date/

    In this example described in the blog, the function fn:tokenize is used to split the date value along a known delimiter such as a space. Without a delimiter you would have to use the substring function.

    Matthias

     

    Hello Matthias,

            Thanks for providing details on your previous reply. I went through the details in one of your post (http://www.ibm.com/developerworks/data/library/techarticle/dm-0708nicola/) and used the XMLTABLE function to get the required XQuery. But wanted some inputs from you to improve the same.

    Here are the different combinations of Date formats we have in the XML:

     

    FROM_DATE                                       TO_DATE

    20130901(YYYYMMDD)                    20130931(YYYYMMDD)

    053112 (MMDDYY)                             061412 (MMDDYY)

    053112(MMDDYY)                              0614 (MMDD)

     

    Below is the query I built using XMLTABLE option to suit all the date formats:

    WITH TEMP AS (
    SELECT TRC_ID,TRAN_DOC,FMAT_VER_ID,
     CASE WHEN LENGTH(X.FDOS)=6
     THEN '20'||SUBSTR(X.FDOS,5,2)||SUBSTR(X.FDOS,1,4)
     ELSE X.FDOS END AS FRDOS,
    CASE WHEN LENGTH(X.TDOS)=4
    THEN '20'||SUBSTR(X.FDOS,5,2)||X.TDOS
    WHEN LENGTH(X.TDOS)=6
    THEN '20'||SUBSTR(X.TDOS,5,2)||SUBSTR(X.TDOS,1,4)
    ELSE X.TDOS END AS TODOS
    FROM SRCH.CLAIM_SRCH_UNION_VW,XMLTABLE('$d/SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_TO_DATE'  passing tran_doc as "d"
    COLUMNS
    FDOS varchar(100) path '../DOS_FROM_DATE',
    TDOS varchar(100) path '.'
    ) as X
    )
    SELECT T.TRC_ID,T.TRAN_DOC FROM TEMP T WHERE FMAT_VER_ID=2
    AND
    INTEGER(T.FRDOS)>=20120930 AND INTEGER(T.TODOS)<=20120930 WITH UR;

    I am using the TEMP Table to convert the Dates to YYYYMMDD format and then getting the required results. But the above query works only when the year is in 20xx. Can you please let me know the below:

    1. Can the above be achieved with XQueries only and without converting the XML to Relational Data? Is there any performance impact by using the XMLTABLE option. I checked the Explain Plan of the above Query and its very high

    2. In XML can we perform date comparisons on dates in MMDDYY Format.

    3. Can we use Where clause on the table that is used in the XMLTABLE Function. For instance, in the below query with XMLTABLE Function:

    SELECT TRC_ID, TRAN_DOC, FMAT_VER_ID, X.FODS, X.TDOS
    FROM SRCH.CLAIM_SRCH_UNION_VW,XMLTABLE('$d/SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_TO_DATE'  passing tran_doc as "d"
    COLUMNS
    FDOS varchar(100) path '../DOS_FROM_DATE',
    TDOS varchar(100) path '.'
    ) as X;

    I would like to include a where condition on the FMAT_VER_ID. I would like to run the query and fetch the data only for FMAT_VER_ID=2. Is there a option to include that?

    Regrets for lengthy questions but wanted to keep my doubts clear and did not want to create confusions. Once again Thanks a lot for your support on this. Its being very helpful for me to fix things on this and I am confident that your response to this post will surely help me to improve the Query.

     

     

    Updated on 2013-09-21T10:48:50Z at 2013-09-21T10:48:50Z by Praveen Kulkarni
  • MatthiasNicola
    MatthiasNicola
    322 Posts

    Re: Need Help on XQuery

    ‏2013-09-22T01:02:52Z  

    Hello Matthias,

            Thanks for providing details on your previous reply. I went through the details in one of your post (http://www.ibm.com/developerworks/data/library/techarticle/dm-0708nicola/) and used the XMLTABLE function to get the required XQuery. But wanted some inputs from you to improve the same.

    Here are the different combinations of Date formats we have in the XML:

     

    FROM_DATE                                       TO_DATE

    20130901(YYYYMMDD)                    20130931(YYYYMMDD)

    053112 (MMDDYY)                             061412 (MMDDYY)

    053112(MMDDYY)                              0614 (MMDD)

     

    Below is the query I built using XMLTABLE option to suit all the date formats:

    WITH TEMP AS (
    SELECT TRC_ID,TRAN_DOC,FMAT_VER_ID,
     CASE WHEN LENGTH(X.FDOS)=6
     THEN '20'||SUBSTR(X.FDOS,5,2)||SUBSTR(X.FDOS,1,4)
     ELSE X.FDOS END AS FRDOS,
    CASE WHEN LENGTH(X.TDOS)=4
    THEN '20'||SUBSTR(X.FDOS,5,2)||X.TDOS
    WHEN LENGTH(X.TDOS)=6
    THEN '20'||SUBSTR(X.TDOS,5,2)||SUBSTR(X.TDOS,1,4)
    ELSE X.TDOS END AS TODOS
    FROM SRCH.CLAIM_SRCH_UNION_VW,XMLTABLE('$d/SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_TO_DATE'  passing tran_doc as "d"
    COLUMNS
    FDOS varchar(100) path '../DOS_FROM_DATE',
    TDOS varchar(100) path '.'
    ) as X
    )
    SELECT T.TRC_ID,T.TRAN_DOC FROM TEMP T WHERE FMAT_VER_ID=2
    AND
    INTEGER(T.FRDOS)>=20120930 AND INTEGER(T.TODOS)<=20120930 WITH UR;

    I am using the TEMP Table to convert the Dates to YYYYMMDD format and then getting the required results. But the above query works only when the year is in 20xx. Can you please let me know the below:

    1. Can the above be achieved with XQueries only and without converting the XML to Relational Data? Is there any performance impact by using the XMLTABLE option. I checked the Explain Plan of the above Query and its very high

    2. In XML can we perform date comparisons on dates in MMDDYY Format.

    3. Can we use Where clause on the table that is used in the XMLTABLE Function. For instance, in the below query with XMLTABLE Function:

    SELECT TRC_ID, TRAN_DOC, FMAT_VER_ID, X.FODS, X.TDOS
    FROM SRCH.CLAIM_SRCH_UNION_VW,XMLTABLE('$d/SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_TO_DATE'  passing tran_doc as "d"
    COLUMNS
    FDOS varchar(100) path '../DOS_FROM_DATE',
    TDOS varchar(100) path '.'
    ) as X;

    I would like to include a where condition on the FMAT_VER_ID. I would like to run the query and fetch the data only for FMAT_VER_ID=2. Is there a option to include that?

    Regrets for lengthy questions but wanted to keep my doubts clear and did not want to create confusions. Once again Thanks a lot for your support on this. Its being very helpful for me to fix things on this and I am confident that your response to this post will surely help me to improve the Query.

     

     

    Hi Praveen,

    let me try to answer your questions one by one:

    >> 1. Can the above be achieved with XQueries only and without converting the XML to Relational Data? Is there any performance impact by using the XMLTABLE option. I checked the Explain Plan of the above Query and its very high

    Yes, the same can be achieved in XQuery without the use of SQL or the XMLTABLE function. XQuery has all necessary functions for length, substring, and other string manipulation. Instead of the CASE expession in SQL you would use the XQuery if-then-else expression.

    http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrcondexp.html
    http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrfncategory.html

    In terms of performance there should be no big difference. The key question is whether your query can use indexes for your date predicates or not. If you apply string functions to the date values (either in SQL or in XQuery) then your query won't use indexes for the predicates.

     

    >> 2. In XML can we perform date comparisons on dates in MMDDYY Format.

    No. The data type xs:date in XQuery defines date values to be of the following form: yyyy-mm-dd.

    See:  http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrdate.html

    Again, my recommendation is to fix this data quality problem at the source of the data or when/before you insert the data into the database. The reason is the following: if you use string functions and case expressions to in your queries to convert the format of the date values, then your query will most likley not be able to use indexes on these dates, i.e. your query will always do a table scan which is not optimal if the table is large and you want to select only  a subset of the dates.

     

    >> 3. Can we use Where clause on the table that is used in the XMLTABLE Function. For instance, in the below query with XMLTABLE Function:

    Yes. You can simply add a WHERE clause. For example:

    SELECT TRC_ID, TRAN_DOC, FMAT_VER_ID, X.FODS, X.TDOS

    FROM SRCH.CLAIM_SRCH_UNION_VW,

               XMLTABLE('$d/SEARCH_SUBMITTED_REMITTED_CLAIM/SERV_LINES/SERV_LINE/DATE_OF_SERVICE/DOS_TO_DATE'

                                        passing tran_doc as "d"

            COLUMNS            FDOS varchar(100) path '../DOS_FROM_DATE',

                                          TDOS varchar(100) path '.') as X

    WHERE FMAT_VER_ID=2;

     

    Best regards,

    - Matthias

     

     

     

    Updated on 2013-09-22T01:06:06Z at 2013-09-22T01:06:06Z by MatthiasNicola