Even though I don't work daily on DB2 any more, I thought this might be a helpful post.
Last night I was online, shopping coincidentally on Cyber Monday, for presents for both my Dad's birthday and my nephew's birthday.
A work colleague happened to catch me online to ask some questions. Now, these answers are not from a book, just what I said right out of my head -- JUST as I would if you and I had this conversation. Here is our Lotus Sametime conversation, verbatim, no editing. updated: one minor edit marked by  -- OK, and I also put in the html tags and made the requester anonymous, of course. So -- sure, I could have said more about the virtues of data sharing and WLM. They are many! Forgive me, it was nighttime and I wasn't in a "marketing" mood.
In any case, please enjoy the truthiness of this exchange.
Dec 1, 2008
8:25:22 PM coworker Hi, do you have time to give me a 101 lesson on db2 z stuff?
8:28:11 PM me right now?
8:28:19 PM coworker yes
8:28:38 PM me what do you need to know?
8:29:07 PM coworker ok here come the stupid questions...
8:29:10 PM what is data sharing?
8:29:58 PM me it's 2 DB2 subsystems that share one set of data, for availability and scaling beyond one machine
8:30:06 PM 2 or more that is
8:31:02 PM coworker ok so that leads me to another question, can you explain subsystems, lpars ? what is it like in windows world?
8:31:46 PM me a subsystem is a DB2 installation
8:32:18 PM an lpar is a "logical partition", kind of like a VM on windows...
8:32:59 PM coworker can you install anything else on a subsytstem?
8:33:12 PM how do subsystems and lpars relate to each other?
8:34:34 PM me a subsystem is just db2's code, it runs in an operating system which is in an LPAR. other programs (like IMS and websphere) can be installed in that same operating system instance in that same LPAR
8:36:49 PM coworker oh oi see, i'm getting confused with the subsystem term to mean a z/OS OS term
8:36:50 PM got it
8:36:57 PM so tell me about hypersockets?
8:37:56 PM me they are a "fast pipe" for TCP/IP between LPARs on a physical z box - like a 1GB network but faster (don't know the exact numbers)
8:39:10 PM coworker so is this DRDA or no?
8:39:15 PM or using db2 connect
8:40:30 PM me DRDA communication messages travel over over TCP/IP, so if both sides are in LPARs, then they can go over the hypersockets - such as Linux on z on one side in an LPAR, and DB2 z on the other side in an LPAR
8:41:27 PM coworker so when 2 lpars communicate they do not use TCp/IP, they use hypersockets
8:41:30 PM is this correct?
8:42:32 PM me they do use TCP/IP, which travels over the hypersockets as a physical path - like a network wire
8:42:53 PM now... 2 datasharing members don't use either, if that's what you're getting at
8:43:12 PM they use the coupling facility to communicate
8:43:21 PM coworker is this built into z/OS or db2?
8:43:26 PM me z/OS
8:43:26 PM coworker the copuling faciloity
8:43:29 PM i see
8:43:50 PM me it's a separate feature of z - parallel sysplex, which is what datasharing is built on
8:44:25 PM coworker how come a customer would want to use data sharing? benefits? cons?
8:45:26 PM me benefits are availability, since all of db2 maintenance can go on one member at a time so they can keep their data available
8:45:49 PM also scalable, since the number of threads to connect to one db2 is limited, the other threads can go to another db2 -- they share all the data
8:45:51 PM coworker o i c
8:46:02 PM ok got it
8:46:08 PM me good!
8:46:13 PM coworker wait one more question hahaha
8:46:15 PM me k
8:46:17 PM coworker tell me about workload mgmt
8:46:22 PM tha'ts a z/OS thing right?
8:47:05 PM me yes - it helps balance when there's more than one subsystem on the LPAR, also helps balance priority between different db2 threads and even helps choose which data sharing subsystem gets a connection based on how busy the systems are
8:47:57 PM coworker how come this is such a big advantage over windows let's say
8:48:06 PM doesn't all operation systems do this
8:48:56 PM me no - windows doesn't let you set the priority of different programs/users/etc (as far as i know)
8:50:52 PM coworker yes you can via task manager
8:50:55 PM although not sure how smart it is
8:51:15 PM what makes z/os workload mgmt so sophisticated
8:51:21 PM i'm just wondering what's the big deal
8:52:28 PM me it's got a pretty complex set of rules and levels of priority, frequent sampling, decaying priority over time, different rules for different times of day, etc. That said, I didn't know it was considered that big of a deal
8:54:30 PM coworker hahahaha
8:54:36 PM thanks for your db2 z/os 101
8:54:43 PM me any time
I often get asked about an architecture where a stored procedure is used for a single SQL statement. This is one of the most common errors in designs using stored procedures. It's always best to amortize the CALL overhead by including several SQL statements and even some business logic in every stored procedure. But folks who come to DB2 for z/OS from other DBMSs still do this.
A new twist on this is with our DB2 9 for z/OS support for native SQL procedures. Folks ask me, well now is it OK to have a single SQL statement in a native SQL procedure?
The thing is, the native SQL procedure is still a package for DB2 to load or at least switch to. So there really still is overhead, not to mention the network time to get over to DB2, as most apps will have to invoke several stored procedures to accomplish their logic. And guess what? Those applications typically invoke the same stored procedures in the same order with the same application logic in between. So... why not make that whole set a single stored procedure?
Bottom line: even with native SQL procedures executing in the DB2 engine rather than the WLM-managed address space, I still recommend an architecture with multiple SQL statements and some business logic rather than a single SQL statement per procedure.
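To make the amortization argument concrete, here is a back-of-the-envelope sketch. The function name and every timing in it are assumptions I made up for illustration, not measurements from DB2 for z/OS; the point is only that the per-CALL cost (network round trip plus the package load or switch) is paid once per CALL, while the statement cost is paid either way.

```python
# Illustrative cost model only: all timings are made-up assumptions,
# not measurements from DB2 for z/OS.

def elapsed_ms(num_calls, statements_per_call,
               network_round_trip_ms=2.0,  # assumed client <-> DB2 round trip
               call_overhead_ms=0.5,       # assumed cost to load or switch to the package
               statement_ms=0.25):         # assumed cost of running one SQL statement
    """Estimate elapsed time for num_calls CALLs of statements_per_call statements each."""
    per_call = network_round_trip_ms + call_overhead_ms
    total_statements = num_calls * statements_per_call
    return num_calls * per_call + total_statements * statement_ms


# Six SQL statements, one per stored procedure: six CALLs.
one_statement_each = elapsed_ms(num_calls=6, statements_per_call=1)

# The same six statements combined into one stored procedure: one CALL.
combined = elapsed_ms(num_calls=1, statements_per_call=6)

print(f"one statement per procedure: {one_statement_each:.2f} ms")
print(f"combined procedure:          {combined:.2f} ms")
```

With these placeholder numbers the statement work is identical in both cases; the whole difference comes from paying the per-CALL cost six times instead of once. Plug in your own measured values to see how it plays out for your application.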
If you do have an architecture with single SQL statements in stored procedures, that's not to say it won't work - it will work, just not as efficiently as it would if you follow the above advice. And even if the single-threaded app performs OK, it won't scale as well as having more SQL statements and logic in the stored procedure, due to longer-held locks over network transmissions as well as some database engine serialization required for external SQL procedures (that part at least is gone in DB2 9).
SQL procedures are absolutely strategic and I do recommend an architecture based on them. The advice I've given other customers is to determine the ones that are invoked most often, analyze the patterns of when they are invoked together from the same application(s), and work to at least combine those into a single stored procedure with several SQL statements and some logic. If you implement SQL procedures on V8, the switch to DB2 9 native SQL procedures is a snap - drop and create and go, with no change to the code.
So, go, create, reuse, and be happy. :-)
I had the pleasure this week of participating in a couple of live IBM Academy of Technology events - the first for me in several years. It was great to reconnect with some of my colleagues, and get a chance to talk over IBM technology and client engagements.
Our academy president, Rashik Parmar, had arranged for a session with Bill Gajda from Visa. Bill's role at Visa includes mobile strategy as well as global innovation. We had a frank and open discussion about Visa's use of data, who their customers are, and IBM's technology role.
Please note that any errors in here are likely mine. I didn't take notes; I was far too engaged and inspired thinking about all the data (!) during the discussion.
Clearly, credit card transactions are "big data". I don't recall the exact numbers, but here's an article from over 5 years ago which references 300 million transactions a day. Bill Gajda told us that Visa keeps around 5-7 years of past transactions. They primarily use this information for real-time fraud scoring of individual transactions. So, when you charge something, Visa gets the approval request, and attaches a score which indicates the likelihood that the transaction is fraudulent. This is based on several factors, such as your usual spending patterns and the location of the transaction. That is the primary use for the historical data.
Bill also told us of a partnership they did with The Gap, where if a Gap customer in their loyalty program opted in with their phone number, Visa would tell The Gap when a purchase was being made near a Gap store, and then The Gap could text an offer to their customer for a discount at The Gap. I was able to find an article describing that, and was surprised to see it was from 2011!
Now, if you're a geek like me, you'll think for a second about how the data flows. (This is just Peggy speculating, I have no further knowledge of the internals of this...) So say you charge your lunch on your Visa card, and the approval data flows to Visa, then after Visa processes the approval, it also looks at the zip code and compares it to a list of zip codes it has from The Gap of their store locations (I'm making this easy by using zip code rather than a latitude/longitude based proximity lookup). When there's a match, Visa initiates a message to The Gap to tell them you're close to a Gap store... and The Gap then sends you a text message with a promotion. Phew!
Do you think Visa also has an indicator on your credit card that you're a Gap customer? They must... I can't imagine they do this on "every" Visa transaction... ! Actually, I guess there is a "Gap Visa" card, so it's probably those which are targeted.
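For the geeks, here is roughly what that speculated flow could look like as code. To be clear, this is purely a hypothetical sketch of my guess above: the names, the zip codes, the card IDs, and the phone number are all invented, and I have no idea how Visa or The Gap actually implement any of it.

```python
# Hypothetical sketch of the speculated flow; none of these names or
# values come from Visa or The Gap.

GAP_STORE_ZIP_CODES = {"94103", "10001"}            # made-up store locations
OPTED_IN_PHONE_BY_CARD = {"card-123": "555-0100"}   # made-up loyalty opt-in list


def offer_target(card_id, merchant_zip, approved):
    """Return the phone number to text an offer to, or None if no offer applies."""
    if not approved:
        return None  # only approved transactions trigger anything
    if merchant_zip not in GAP_STORE_ZIP_CODES:
        return None  # no store in the zip code where the purchase happened
    # Only cards whose holders opted in have a phone number on file.
    return OPTED_IN_PHONE_BY_CARD.get(card_id)


print(offer_target("card-123", "94103", approved=True))  # opted in, store nearby
print(offer_target("card-123", "60601", approved=True))  # no store in that zip code
print(offer_target("card-999", "94103", approved=True))  # card not opted in
```

Checking the zip code first and the opt-in list second is just my arbitrary choice; if only a small set of cards is targeted, checking the card first would probably be the cheaper filter.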
I don't know about you, but I have fun thinking about this kind of stuff. Some other interesting facts I learned are that Visa doesn't have your name or personal information - that data all resides with the issuing bank. One of my colleagues asked Bill Gajda whether Visa has information across cards when a person has multiple Visa cards. At first, he answered "yes", but then with another colleague's question about matching the names, he said "no" - Visa has no idea that you are the same person when you have multiple cards. It's an interesting thought: could we perhaps match spending patterns to identify who might be the same person? That might not be something Visa wants to do, of course; I was thinking of it more as a theoretical exercise. Also, Visa doesn't really consider cardholders customers, despite its advertising budget - its customers are the issuing banks and the accepting retailers, so they are the ones who would be likely consumers of the volume of data or aggregated insights from it. Though I guess the banks already have it, too.
In any case, again, data about people and their habits is big data and big business. I personally don't find any of this "scary" but I guess some people might. I just like people, so data about people is interesting data to me. I think those who do find it scary might just have to take a closer look at some of those tiny print privacy agreements!
I've received a few questions about text search in DB2 9 for z/OS lately, so I thought I'd share the basic information here.
Prior to DB2 9, there was an offering called "DB2 Text Extender". This was an early attempt at text indexing that runs on z/OS. It is dependent on some z/OS code called "Text Search". Later on, the team that worked on the extenders also released something called "DB2 Net Search Extender", aka NSE. DB2 for Linux, Unix, and Windows had a significant upgrade with NSE, but that same upgrade was not shipped for DB2 for z/OS. So DB2 for z/OS customers have not had a significant upgrade to text search since "DB2 Text Extender" in DB2 V7.
In DB2 9, there is a completely new text search solution. Text search is provided by a built-in engine function called CONTAINS. This solution requires an external text search server that runs on a Windows or Linux operating system, which is provided as part of the DB2 Accessories suite. It is not a direct replacement of Text Extender function, so applications and administration policies need to be updated for this change. The best source for detailed information about this is in the information center topic Administering IBM text search for DB2 for z/OS. And I have an earlier blog entry about the announcement.
What about the DB2 Text Extender? It's not available in DB2 9 for z/OS. That means you won't see an equivalent for DB2 V8 FMID JDB881C. And you won't need IBM Text Search FMID-HIMN230 in DB2 9 for z/OS either, because that was a prerequisite for the DB2 Text Extender.
The above is all specific to DB2 for z/OS. DB2 for LUW 9.5 FP 1 ships the same Omnifind text search server, and DB2 LUW still ships support for their NSE as well.
Now that we've got that all cleared up :-), next we'll evaluate some use cases for text search applicability.
Modified by PeggyZ
I'm really enjoying being back working at IBM's Silicon Valley Lab, after spending some time on assignment in IBM's Global Business Services division. Since November, I've been working as one of the product architects for IBM's Big Insights product, focusing on Text Analytics.
This week we were working closely with our top-notch design team, and it occurred to me that the four big areas of data activity we were discussing spelled out LOVE.
Load, Organize, Visualize, Export.
Since we started just after Valentine's Day, we decided this was appropriate. And for me, this has all truly been a labor of LOVE. I'm trying to make it stick as a code name. You heard it here first!
I found out about this one last week, and I think it sounds pretty interesting. IBM is holding a data management 'virtual conference' on February 25th. This one sounds like it's a lot more than just a webinar, as there is a show floor with virtual Expo Pedestals as well as speakers and the chance to chat with experts.
The screen shots that I saw look intriguing, although I admit that I don't yet have an exact sense of what it will be like to "be there". What I do know is that travel budgets are likely to range from tight to non-existent for everyone, but the need for knowledge and personal technical contacts is greater than ever.
Here's the agenda:
8:00 AM ET Show floor opens
11:00 AM Understanding the Foundations of The Information Agenda
12:00 Noon Chat with the experts
1:30 PM Integrated Data Management Revolution with Merv Adrian of Forrester Research
2:30 PM Chat with the experts
6:00 PM Show floor closes
If you haven't seen much on the Information Agenda, it's worth a look. It's all about trusted information and getting the right information out of the silos and working for the business. Over my years in the Information Management area of IBM, I've seen this message evolve and it continues to make more sense to me - how about you? The way I see it, if you're responsible for one silo of information, making that available for the business to benefit from should be one of your main objectives, well, that along with ensuring the security and reliability of that information.
Here's what the Expo Solution Pedestals will have:
- Information Agenda
- Lower Costs with IBM Data Servers
- Optim Data Growth
- Architect and Developer Productivity
- DBA Efficiency and Autonomics
Along with the fact that there's no travel cost, the virtual conference is free to attend! Interested? Register now! And let me know how you liked it, did it work? It's easy to say that this can't fully replace a real in-person conference experience (after all there's no beer), but I like the idea of supplementing the real ones with other ways to stay connected and informed. This should be a step up from a webinar, as there seem to be many opportunities to interact real-time.
Now... how are they going to handle free swag from those pedestals? :-)
Hi all, and welcome to my new blog!
All this week, I am busy at the International DB2 User's Group conference here in Athens, Greece. The two topics I'm presenting are "Native SQL stored procedures in DB2 9 for z/OS" and "Java stored procedures".
I was very glad to get the chance to get across this key point in yesterday's presentation on native SQL stored procedures -- yes, when they execute they are eligible for redirect to the zIIP processor, but only when they are invoked from a remote client, and then at the same percentage as other DDF work. Lots of presentations lately have stated this a bit more broadly, leaving people with the wrong impression. So, let's be clear - when a remote thread comes into DB2, it executes on an enclave SRB, and DB2 dials the zIIP redirect to a certain percentage. A DB2 9 native SQL procedure executes on the invoking execution block and not on a WLM-SPAS TCB - that's one of their big advantages. Thus, when a native SQL stored procedure is invoked from a remote client over a TCP/IP connection, it runs on the enclave SRB and thus picks up that same DDF zIIP redirect percentage. On the other hand, when a native SQL stored procedure is invoked locally on z/OS, it is executed on the TCB that PC'd in from CICS or batch, and that is not eligible for zIIP redirect.
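If it helps to see that rule as logic, here is a tiny sketch. The function name is mine, and the DDF redirect percentage is passed in as a parameter because I'm not quoting a specific number here; the 0.5 in the example calls is just a placeholder, not a real DB2 setting.

```python
def ziip_redirect_fraction(invoked_remotely, ddf_redirect_fraction):
    """Fraction of a native SQL procedure's work eligible for zIIP redirect.

    invoked_remotely: True when the CALL arrives from a remote client over
        TCP/IP (runs on the enclave SRB); False when it is invoked locally,
        for example from CICS or batch (runs on the caller's TCB).
    ddf_redirect_fraction: the share that DDF work in general gets, 0.0 to 1.0.
        The actual value is an input here, not something this sketch knows.
    """
    if not 0.0 <= ddf_redirect_fraction <= 1.0:
        raise ValueError("ddf_redirect_fraction must be between 0.0 and 1.0")
    # Remote invocation picks up the same share as other DDF work;
    # local invocation is not eligible at all.
    return ddf_redirect_fraction if invoked_remotely else 0.0


print(ziip_redirect_fraction(True, 0.5))   # remote client: same share as other DDF work
print(ziip_redirect_fraction(False, 0.5))  # local CICS or batch caller: nothing
```

Notice that nothing in there gives a native SQL procedure a zIIP share of its own: it either inherits the share of the DDF thread it runs under, or it gets none.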
Lots of other good stuff going on here - tomorrow is a DB2 for z/OS Special Interest Group as well as our IBM query panel.
I know that many of you are getting ready to attend the IOD conference next week in Las Vegas. Alas, I will not be there, but that doesn't stop me from giving you advice about how to get the most out of a conference. After all, you are spending your time there, and your company's money, so you might as well make the most of it.
Earlier this month, I was fortunate to attend the Grace Hopper Celebration of Women in Computing in Keystone, Colorado. I was notified of my participation only two weeks before, and I wasn't aware of this conference at all. I made my reservations and printed out the conference agenda, which I reviewed on the plane on the way there. Most conferences give you access to the agenda ahead of time, and it's a good idea to print that out and have a look at it.
As for me, after one quick pass through it, I started by putting a "dot" next to anything I might be interested in, and a "star" next to anything that was a 'must see'. Then I started to notice some trends emerging. I came up with these goals for the conference, based on what was in the agenda:
- meet other IBM women (this was, after all, a Women in Computing conference, and IBM was sending me)
- text analytics technologies
- women in technology issues, including attracting more students
- social networking and collaboration
- cool stuff other than the above
To that end, I circled the name of any IBMer, and labeled each of the rest of what I noted as one of the above. Now, I realize, this approach might seem a bit, well, organized -- particularly for me. But it really worked, particularly to keep my attention and also keep a balance of different topics, as well as provide a tiebreaker when there were multiple sessions at the same time - I just looked at the balance of the other sessions I'd been attending.
Another thing you want to do while you're there is talk to other people about what they've seen, what they are going to see, etc. It's so easy to miss something or misinterpret something in an agenda.
If you keep a good balance between "use right now" and "good to know for the future" topics, you can help keep your brain from overflowing - that's always a danger at a multi-day conference!
Too many blog posts start with "It's been a while since I've posted here", so I'm gonna skip that part and pretend that we've been talking regularly all along, OK?
Firstly, yes, I'll be at IOD this year! I've missed a couple due to being completely heads-down on helping get this cool new product out, so it will be great on a personal and professional networking level to reconnect with many of you. I'm there to help represent an IBM product called IBM Content Analytics, affectionately known as "ICA". It's part of our Enterprise Content Management suite. ICA integrates very well in an enterprise but at its core it is a standalone product - no prereqs. I hope that we get a chance to chat about it if we run into each other at IOD. I promise to keep it fun and friendly. :-)
The value proposition for ICA is one of unlocking insight with business value from your text content. Of course there is amazing technical capability behind it, and that's where my time is spent. Remember those old ads for the Evelyn Wood speed reading courses (kinda funny that it's still around, huh!)? Conceptually, it's like this ICA system knows how to do that, because it can just go and "read" anything and everything you've got, and then tell you what the important things are across the whole smash of documents. You aren't searching, you're being presented the hot terms, words, phrases, words that come from a list you provide, and how they are trending over time (you get to choose the date for analyzing trends - received date? incident date? date extracted from text? hire date? birth date? -- get the picture?). ICA also shows you how other terms, phrases, words correlate to what you've already identified. Right away out of the box, ICA will show you insight -- and it connects to all different enterprise sources. We've got a great suite of built-in capability for text analytics and intuitive visualizations for your insights. And then on top of that, ICA is completely customizable, too - make it find the insights/concepts you are interested in, display them how you need to, get instant results for a document, export analytics results to your favorite reporting tools, and oh, so much more!
My technical lead focus in 2010 has been on the ICA tooling and customization capabilities - how to add new custom text analytics, how to use the LanguageWare Resource Workbench to customize ICA, and how to make sure the sources of text are represented the way you want them. In that vein, I led a team of brilliant, capable students in an Extreme Blue project this summer who really helped us to show how you can unlock the value in your data using this platform. And I've been having a lot of geeky fun designing and giving demos to lots of different clients.
Come learn more if you'll be at IOD. It's easy to search this ECM roadmap for Content Analytics sessions, and that's where you'll find me hanging around, plus I'll be at the Expo pedestals. Or let's catch up over a beverage - it's been too long!
I was pointed to this interesting article from the New York Times, about a new technology invented by two software engineers, Jonathan Lindo and Jeffrey Daudel, to be able to "replay" the events that led up to a system crash. Not that I really want to see my "blue screen of death" from yesterday again, but if it would help identify the problem and get a fix, I could probably live through it a couple more times.
Reading the article, I was struck by a couple of points. They quote Lindo as saying that the inspiration came to them as "Wouldn't it be great if we could just TiVo this and replay it?" And then it says this:
Innovation by analogy is a powerful concept, says Giovanni Gavetti, an associate professor at the Harvard Business School who, with his colleague Jan W. Rivkin, has published research on how businesses can use analogic reasoning as a strategic tool. Human beings are analogy machines, he notes, dealing with new information by comparing it to things they already know something about.
That's true, I often try out analogies when I'm trying to understand or explain something. And I can really see how that could lead to innovations, as well as to some odd product evolutions. For a consumer example, I love how the iPhone lets me listen to my voicemail messages in any order, instead of sequentially, which must have been a leftover paradigm from when messages were stored on an analog tape. I can picture someone saying - "why can't I access my messages like I read my email?" - and voila - innovation.
Then I started wondering just how much you could tinker with the crash replay. Could you start eliminating concurrently-running applications, for example, to see if any of them contributed to the crash? And could you test a fix with the replay to see if it fixes the crash?
I also wonder whether IBM's customers would voluntarily seek out software like this to help them narrow down problems. It's not from IBM, and I really don't know any more about it than is in the article above. It's from a company called Replay Solutions, and it runs on several versions of the Microsoft Windows operating system. So, no mainframe support yet (grin). But you could ask them about it!