pureXMLDevotee2012

Updated 5/26/13 6:01 PM by sumalaika

pureXML Sessions in 2012

We are running some pureXML telcons in 2012

Please contact Susan Malaika if you have suggestions for topics you would like to hear about, or if you would like to set up or be part of a local pureXML interest group, possibly as a special chapter of an existing DB2 user community.


pureXML Devotee Community Call 11 December 2012 at 11am US Eastern

Talk Title : An Introduction to DB2 RDF

Slides : DB2_NoSQL_GraphStore.v03.pdf

Recording : DB2RDF.11Dec2012.560779494.mp3

RDF (Resource Description Framework) is a way of representing information in triples, as subject-predicate-object expressions. RDF is one of a collection of specifications in support of the Semantic Web. In this talk, Mario Briggs introduced DB2's RDF support and described how RDF compares with other ways of representing information such as relational tables, XML, and JSON. There was some discussion during the call and email follow-ups. The feedback from the talk was good, e.g.:

  • Very nice presentation, thanks.
  • Very useful/helpful to better understand and position
  • Yes, thank you for the presentation.
  • Very good talk, thanks
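As a small illustration of the triple model described in the abstract above, the following is a minimal Python sketch, not DB2's RDF API. The `ex:` names, the `Triple` type, and the `match` helper are all hypothetical and exist only for this example. It shows how facts are held as subject-predicate-object triples and how a pattern with wildcards selects them.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical sample data: every fact is one subject-predicate-object triple.
triples = [
    Triple("ex:talk1", "ex:title", "An Introduction to DB2 RDF"),
    Triple("ex:talk1", "ex:speaker", "ex:MarioBriggs"),
    Triple("ex:MarioBriggs", "ex:worksOn", "ex:DB2"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything known about one subject.
about_talk = match(triples, subject="ex:talk1")

# All objects linked by one predicate.
speakers = [t.obj for t in match(triples, predicate="ex:speaker")]
```

Unlike a relational row, a new kind of fact needs no schema change here: it is simply one more triple in the list.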

Some suggestions for topics for next year include:

  • Performance and tuning ; Indexing ; Storage
  • Experiences
  • Design Considerations
  • JSON

Speaker:

Mario Briggs is a senior technical staff member leading the RDF support in DB2 and also the Open Source offerings for IBM DB2 and IBM Informix, including PHP, Ruby/Rails, Python/Django, Perl, and Java data access frameworks. Mario has about 14 years of experience in software development, with many of those years spent in the area of data access, relational engines, and application-database performance.


pureXML Devotee Community Call 10 October 2012 at 10am US Eastern (daylight savings / summer time)

Talk Title 1 : Data Normalization Reconsidered

Slides:  DataNormalizationReconsidered.2012-10-10.v04.pdf

Recording: DataNormalizationReconsidered.2012.10.10.502414320.mp3

In this talk, Matthias Nicola and Susan Malaika  presented "Data Normalization Reconsidered". Relational databases have been fundamental to business systems for more than 25 years. Data normalization is a methodology that minimizes data duplication to safeguard databases against logical and structural problems, such as data anomalies. Relational database normalization continues to be taught in universities and practiced widely. Normalization was devised in the 1970s when the assumptions about computer systems were different from what they are today.

This talk provided a review of record keeping inside and outside of computer systems. Based on this background, the talk examined the problems associated with data normalization, such as complexity and the difficulty of mapping business records to normalized data in a changing world. The talk described how the World Wide Web has impacted the creation and exchange of non-normalized business records. Alternative data representations, such as XML, JSON, and RDF to overcome normalization issues or to introduce schema flexibility, were covered.
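The trade-off described above can be sketched in a few lines of Python. This is an illustrative example with made-up customer and order data, not material from the talk. It contrasts a normalized layout, where each fact is stored once, with self-contained order records of the kind an XML or JSON document might hold.

```python
# Normalized form: each fact is stored once and referenced by key.
customers = {1: {"name": "Acme Ltd", "city": "London"}}
orders = [
    {"order_id": 100, "customer_id": 1, "item": "widget"},
    {"order_id": 101, "customer_id": 1, "item": "gadget"},
]

# Non-normalized form: each order is a self-contained business record,
# so the customer details are repeated in every document.
order_documents = [
    {"order_id": 100, "customer": {"name": "Acme Ltd", "city": "London"}, "item": "widget"},
    {"order_id": 101, "customer": {"name": "Acme Ltd", "city": "London"}, "item": "gadget"},
]

# One update in the normalized form changes the single stored fact,
# and every order sees the new value through the key lookup.
customers[1]["city"] = "Leeds"
cities_via_join = {customers[o["customer_id"]]["city"] for o in orders}

# Updating only one document leaves the copies inconsistent.
order_documents[0]["customer"]["city"] = "Leeds"
cities_in_documents = {d["customer"]["city"] for d in order_documents}
```

After the updates, `cities_via_join` holds one city while `cities_in_documents` holds two. Whether that divergence is a data anomaly or an accurate record of what each order said when it was created depends on the business meaning of the record.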

The talk is based on these articles on developerWorks:

About 25 people were on the call. The majority of the attendees in the session confirmed that they use data normalization. During the talk some of the attendees pointed out that tools are still oriented towards relational data, which can make it difficult to work with alternative data storage representations. The feedback was good, e.g.:

  • yes it was good - covered some of the things we are doing with relational and XML tables
  • Really appreciate the additional material in the slide deck

There were some suggestions for further talks including:

  • NoSQL and JSON
  • RDF and Linked Data
  • IMS and XML

Speakers:

Matthias Nicola is a senior technical staff member in the DB2 team at IBM's Silicon Valley Lab, in San Jose, CA, USA. Matthias also works closely with customers and business partners to help them design, optimize and implement DB2 solutions. Previously Matthias worked on data warehouse performance with Informix Software.



Susan Malaika is a senior technical staff member specializing in web and data.

 


pureXML Devotee Community Call 12 September 2012 at 10am US Eastern (daylight savings / summer time)

Many thanks to the two speakers Yonghua Ding and Bill Carey. We had very nice feedback:

  • Thank you very much Yonghua Ding and Bill Carey for the helpful and informative talks
  • Very nice presentations....thank you

Talk Title 1 : Using COBOL with pureXML on DB2 z/OS

Slides: Using COBOL with pureXML on DB2 for zOS.Sep2012b.pdf

Recording: cobol.zos.xml.12Sep2012.476036766.sho.mp3 (First Part)

In this talk, Yonghua Ding presented the COBOL samples to help application developers understand DB2 for z/OS pureXML features. The following is a list of major categories of XML features in DB2 for z/OS which were referred to in the presentation:

  • Inserting XML data to an XML Column
  • Retrieving an XML document from an XML Column
  • SQL/XML functions: XMLQUERY, XMLEXISTS, XMLTABLE, XMLCAST
  • Updating an XML document in an XML Column
  • Deleting rows depending on an XML value via XMLEXISTS
  • Triggering relational data extraction from XML Data using XMLTABLE
  • Analyzing access plan with or without XML indexes
  • Using XML HOST VARIABLES (XML AS BLOB/CLOB/DBCLOB, XML AS BLOB-FILE/CLOB-FILE/DBCLOB-FILE), LOB AND LOB LOCATOR HOST VARIABLES used for XML Data
  • Issuing the DESCRIBE Statement referring to an XML column
  • Issuing FETCH WITH CONTINUE and FETCH CURRENT CONTINUE on an XML COLUMN
  • Accessing the XML Schema Repository and performing validation
  • XML sub-document update via SQL/XML function XMLMODIFY
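To give a feel for what the existence-test and row-extraction features in the list above do, here is a rough analogy using Python's standard `xml.etree.ElementTree` module. It is not DB2, COBOL, or SQL/XML code, and the sample order document is hypothetical. The existence check plays the role of an XMLEXISTS predicate, and the list of tuples plays the role of the rows that XMLTABLE would produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, standing in for the value of an XML column.
document = """
<order id="100">
  <customer>Acme Ltd</customer>
  <item sku="W1" qty="2">widget</item>
  <item sku="G7" qty="1">gadget</item>
</order>
""".strip()

root = ET.fromstring(document)

# Existence test, in the spirit of XMLEXISTS: is there an item with this sku?
has_gadget = root.find("item[@sku='G7']") is not None

# Row extraction, in the spirit of XMLTABLE: one tuple per item element,
# with the quantity converted to an integer column.
rows = [
    (root.get("id"), item.get("sku"), int(item.get("qty")), item.text)
    for item in root.findall("item")
]
```

Each tuple in `rows` corresponds to one repeating element in the document, which is the same shape of result that relational extraction from XML aims for.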

     

Talk Title 2 : XML Processing on z/OS

Slides:  XML-Parsing-on-ZOS-Overview.PureXML-Devotee-Call-Sept-2012.pdf

Recording: cobol.zos.xml.12Sep2012.476036766.sho.mp3 (Second Part)

In this talk, Bill Carey presented an overview of options for the processing of XML data on z/OS with a particular emphasis on options that directly or indirectly utilize System z zAAP and zIIP specialty engines.

Bios:



Yonghua Ding has been working on DB2 for z/OS since 2004. Previously he worked on the DB2 parser, precompiler, and coprocessor. Since 2009 he has focused more on XML development, including the XQuery parser, XQuery semantic transformation, XML indexes, and regular expressions in XML. He received his Ph.D. in Computer Science from Purdue University in 2004.





Bill Carey is a Senior Software Engineer in the System z Technology and Strategy organization. His 39 years with IBM include time focusing on the IPCS, TSO/E, and ISPF products and strategies related to systems management, file and print serving, XML processing, and other technology areas. Bill received his BS degree from Manhattan College and earned his Masters degree at RPI.


pureXML Devotee Community Call 13 June 2012 at 10am US Eastern (daylight savings / summer time)

Tina Lee gave a detailed talk on XML Indexing for DB2 on UNIX and Windows. The session included DB2 10 indexing features. Mallarswami (Mailar) Nonvinkere participated in the call, answering questions. About 35 people attended the call, representing various industries including banking, insurance, medical, and government.

Here are some of the questions and answers from the session:

  • Question: Would it be fair to say that AS SQL VARCHAR HASHED will perform better than VARCHAR?
    • Answer: VARCHAR HASHED is good for saving space and if queries are only using equality predicates
  • Question: Can I see the HASHED value?
    • Answer: db2dart can show the hashed value
  • Question: What is the value of treating blanks at the end of varchars as significant? Is there a use case?
    • Answer: For XML VARCHAR indexes trailing blanks are significant, but for SQL they are not. It is a difference between the standards. XQuery provides fn:normalize-space to remove unneeded spaces: it returns the value of $arg with leading and trailing whitespace removed, and sequences of internal whitespace reduced to a single space. For example, 'a' eq normalize-space('a ') returns true.
  • Question: Do indexes require the creation of a schema first?
    • Answer: Schema is not necessary for index creation
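Two of the answers above can be sketched in plain Python. This is a conceptual illustration only: `hashed_key` uses SHA-256 as a stand-in and is an assumption, not the hash DB2 uses, and `normalize_space` is a hand-written imitation of fn:normalize-space for a plain string. The sketch shows why hashed keys suit equality predicates but not range predicates, and how normalizing whitespace makes a value with a trailing blank compare equal.

```python
import bisect
import hashlib
import re

# XML whitespace is limited to space, tab, carriage return, and line feed.
_XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def normalize_space(value: str) -> str:
    """Imitate XQuery fn:normalize-space for a plain string argument."""
    return _XML_WHITESPACE.sub(" ", value).strip(" ")


def hashed_key(value: str) -> str:
    """Fixed-size stand-in for a hashed index key (illustrative only)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


values = ["pear", "apple", "fig", "banana"]

# Hashed keys are compact and fixed-size, but their order is unrelated to
# the order of the original strings, so only equality lookups make sense.
hashed_index = {hashed_key(v): v for v in values}
found = hashed_index.get(hashed_key("fig"))

# Ordered keys support range predicates through binary search.
ordered_index = sorted(values)
low = bisect.bisect_left(ordered_index, "b")
high = bisect.bisect_left(ordered_index, "g")
in_range = ordered_index[low:high]  # values v with "b" <= v < "g"

# Trailing blanks matter in a plain comparison but not after normalization.
plain_equal = "a" == "a "
normalized_equal = "a" == normalize_space("a ")
```

The fixed key length is where the space saving comes from: a long string and a short one both occupy the same number of bytes in the hashed form.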

At the end of the session, there was a discussion about composite indexes and index ANDing (which performs multiple index scans and combines the results). Use cases were provided to illustrate the need for composite indexes:

  • One patient per XML record : Search for patient id and patient facility; or patient id and patient date of birth
  • Multiple trades per XML record : Search for all trades for a given date
  • Multiple policies per XML record : Search for all policies within a particular date range
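The difference between index ANDing and a composite index can be sketched with in-memory Python dictionaries. This is a conceptual model based on the first use case above, with hypothetical patient data. It does not reflect how DB2 stores or scans its indexes.

```python
from collections import defaultdict

# Hypothetical patient records: (row id, patient id, facility).
records = [
    (1, "P001", "North"),
    (2, "P002", "North"),
    (3, "P001", "South"),
]

by_patient = defaultdict(set)
by_facility = defaultdict(set)
by_patient_and_facility = defaultdict(set)

for row_id, patient_id, facility in records:
    by_patient[patient_id].add(row_id)
    by_facility[facility].add(row_id)
    by_patient_and_facility[(patient_id, facility)].add(row_id)

# Index ANDing: scan two single-key indexes, then intersect the row ids.
anded = by_patient["P001"] & by_facility["North"]

# Composite index: a single lookup on the combined key.
composite = by_patient_and_facility[("P001", "North")]
```

Both approaches return the same row, but the ANDing path touches every row for patient P001 and every row for facility North before intersecting, while the composite lookup goes straight to the matching entry.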

There were also discussions which continued off-line about when XDA objects are created temporarily, and about MicroXML.

These were suggestions for future sessions:

  • pureXML caching in memory (Linux/UNIX) in bufferpools or other memory heaps, and exactly how that works.
  • XQuery update : performance tips and logging
  • XQuery vs. XMLTABLE: which is better for performance?

With many thanks to Tina Lee for her very informative presentation, and to Mailar Nonvinkere for his assistance with the questions. Here are samples of some of the appreciation from the attendees:

  • Great job, Tina
  • Excellent! Thanks Tina
  • Thx
  • Good chat / thanks
  • Session was useful, thanks Tina.
  • Great, thank you

Talk Title: pureXML Indexing Overview (DB2 9, 9.5, 9.7 and 10 for Linux, Unix, and Windows)

Here are the materials:

Abstract:

This presentation provides a technical overview of pureXML indexing in DB2 9 and then covers the enhancements available in DB2 9.5, DB2 9.7, and DB2 10, such as index compression, partitioned indexes, and case-insensitive search. Learn how to create XML indexes so you can improve the performance of your queries on XML documents and pick up valuable tips to avoid common errors.



Speaker: Tina Lee

 

Bio:

Christina (Tina) Lee is a senior software engineer at IBM's Silicon Valley Lab, in San Jose, CA, USA. She is currently the manager of the Data Warehouse Storage and Indexing Development team for DB2 for Linux, Unix, and Windows. Prior to her current position, she was the XML indexing technical lead and developed software in the kernel area of DB2 (memory management, buffer pools, process model, etc.) for the pureScale feature. Before moving to DB2 for Linux Unix & Windows, Tina developed software in DB2 for z/OS in the Data Manager, LOB Manager, and Index Manager components.



Participant: Mallarswami (Mailar) Nonvinkere



Bio:

Mallarswami Nonvinkere is working as a DB2 consultant with IBM's India Software Lab in Bangalore. He specializes in DB2 pureXML technology. He works with IBM customers, business partners and independent software vendors to help them understand the use of DB2 technology and develop high performance applications. He has been helping customers with best practices to get best performance from XML based solutions. As part of the IBM team, he has been actively supporting DB2 migration projects for various customers and business partners. He has been actively involved in DB2 beta programs and has been answering customer queries on DB2 and educating them as and when required.

Notes for the indexing session:

  • 1. Be able to create XML indexes to improve query performance
  • 2. Understand how the XML data is converted to the index data type
  • 3. Know how to use DB2 9.5 CREATE INDEX options to control index behavior
  • 4. Learn what types of queries can use indexes on XML columns
  • 5. XML indexing enhancements for DB2 9.7 and DB2 10
    • a. Online Index Create and Reorg support, Range Partitioned Table support, Compression
    • b. New Data Type Support, Case Insensitive Search, Existential Predicates

pureXML Telcon session on Mar 14, 2012 - 10am US Eastern (daylight savings / summer time)

We had the following sessions on pureXML:

About 30 people attended the session: A mix of DB2 z/OS users, and DB2 UNIX users. The first hour of the session was about when to store XML, and when to store relational. The second hour covered DB2 z/OS pureXML and what's new for pureXML in DB2 z/OS Version 10.

Some of the topics mentioned in discussion included:

Here are some of the questions and answers from the session:

  • Question: A high-level XML question: after heavy insert / delete activity, an offline longlob reorg is needed; will there be an online option? Is there anything in the design to avoid this?
    • We are discussing this question off-line
  • Question: Is FRV unload likely to require ZFS, which takes a lot to set up in our shop?
    • Reply: FRV unload does not require ZFS
  • Question: Are XML data types in version 9 of DB2 z/OS compatible in version 10 without changes to code?
    • Reply: Yes.
    • Follow-Up: That is good to know. You would need to recode only if you want to take advantage of the new features.
    • More Follow-Up: If you want to use the new features in V10, then your application needs to be changed.
  • Question: Is it recommended to define XML tablespaces with compression on for DB2 z/OS? Aux tablespaces, that is
    • Reply: yes, compression on! we started out without it, and it improved after compression was turned on

The presenters received some very nice Thank Yous from the participants:

  • Thank you Philip for taking the time out of your busy schedule to give this talk that puts XML storage in perspective : Much appreciated ...
  • Nice job, Phil. This was useful.
  • Great information, thanks Phil!
  • Thanks Phil.
  • Useful. Good overview.
  • Thank you Phil, good perspective on XML vs relational
  • Thank you, Phil, for outlining the decisions for determining if XML is a good method to store data.
  • Thank you very much Yonghua for all your preparations - and for getting up well before 6am to be at work to give this talk
  • Thanks Yonghua, good overview of XML
  • Thanks Rick - good presentation
  • Thanks Yonghua, and Rick!
  • Lots more: Thanks ; Thanks a lot and Thank you Yonghua and Rick

When to use XML; when to use relational; when to shred if you use XML ?

Platform: Cross-platform

Abstract:

Phil Nelson will discuss when it would be appropriate to store data as native XML and when it would be better to use a relational database design.



Bio: Philip Nelson is a DB2 specialist with nearly a quarter of a century of experience in the field. By day he is a "Senior Subject Matter Expert" (basically an internal consultant) for Lloyds Banking Group. In his spare time he also works on developing and administering DB2-based systems for the SMB (small/medium business) market. He has a reputation for exploring using DB2 with new technologies and in recent times has had fun with Ruby on Rails, Amazon EC2 cloud instances and Android mobile devices. He has been using pureXML from the beginning and, as a reviewer, can claim to have read every word of the DB2 pureXML Cookbook !!!

More information here:

An Introduction to pureXML on DB2 for z/OS

Platform: z/OS

Abstract:

Yonghua Ding and Rick Chang will give an introduction to pureXML for DB2 for z/OS. The session will include the following topics:

  • storage
  • index
  • schema registration & validation
  • SQL/XML - XMLQuery, XMLExists, XMLTable
  • Utilities
  • DB2 V10 new features
  • XQuery
  • Q & A

Bios:



Yonghua Ding has been working on DB2 for z/OS since 2004. Previously he worked on the DB2 parser, precompiler, and coprocessor. Since 2009 he has focused more on XML development, including the XQuery parser, XQuery semantic transformation, XML indexes, and regular expressions in XML. He received his Ph.D. in Computer Science from Purdue University in 2004.





Rick Chang has been working as a software engineer for over 20 years. Rick has worked on different projects in the areas of storage management, e-commerce, content management, and recently DB2. Rick joined the DB2 XML team in 2005 and worked on the XML publishing function and on XPath scanning and matching of an XML document.


Possible Future Sessions

  • What's coming for XML talk
  • Migrating from LOB to pureXML
  • pureXML and pureScale
  • DB2 for z/OS
  • IMS and XML
  • CICS and XML
  • ISO20022, FpML, XBRL
  • Stylesheets with pureXML
  • Using pureXML with Cobol
  • pureXML and analytics
  • XML Schema Validation on z/OS
  • XML Extender Migration
  • XML Namespaces with pureXML
  • User Experiences
  • pureXML caching in memory (Linux/UNIX) in bufferpools or other memory heaps, and exactly how that works.
  • XQuery update : performance tips and logging
  • XQuery vs. XMLTABLE: which is better for performance?
  • NoSQL and JSON
  • RDF and Linked Data

DB2 pureXML Cookbook

See https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/pureXML/page/DB2%20pureXML%20Cookbook

pureXML Devotees