What? IBM has Text Analytics capability?
As many of you know, I changed jobs early in 2008 to switch my developer focus from DB2 for z/OS over to unstructured text technologies. Since many DB2 folks are heads-down in structured data, that whole "content" side is a bit of a mystery. Sure, you know about search, and you likely have a vague understanding of what it means to have a text index supporting keyword search. But really, there's so much more...
I've been learning a ton about all of this, which is to be expected after over a year (!) in this job. I have some terrific colleagues in text analytics, in research, development and services, who continually amaze me with their breadth and depth of knowledge, as well as their passion for the topic and their eagerness to help customers.
I happen to love linguistics, so this job is a great fit for me. I love to read, and the turn of a phrase in a book or a song lyric brings me joy. I like to think about the best way to phrase things and ways to interpret sentences. The more I interact with non-native English speakers, the more I appreciate both the beauty and the limitations of language, and the inherent difficulties in both generating and understanding sentences. I truly enjoyed all my study of French in school, too. It's always been such an intellectual snobbery to say something like "the French have a word for that", but anyone who knows more than one language knows it to be true -- language translation is never exact and concepts cannot always be expressed well, even in one's native language. Bottom line is that it's so interesting for me to dig into and help shape the technology and rules around extracting meaning from unstructured text.
Last month I was talking with a long-time friend and colleague who was here with her company at the IBM Silicon Valley Lab for a technology briefing. She and I have had several conversations at conferences over the years on topics like O-O databases, Java, and XML as they were emerging towards the mainstream. In the briefing, we talked a bit about unstructured data in the context of the Information Agenda, and one of their company's thought leaders said that unstructured data inclusion is implied. Cool, but, um, how exactly? Their (very reasonable) response when I probed a little further was that they needed to hook up with the business guys on that. YES! That's where I think we all absolutely have to start -- what is the unstructured data, and what questions do you need answered from it for business value? Then we go into more of the logistics around that.
Specifically, just this week I've worked on a couple of items that can help me bring some meaning to what text analytics is all about to folks who haven't been exposed to it deeply. The first was working on a report for a well-known analyst group, where we describe our information access technologies and offerings for unstructured data. And the second is a new offering that I hope will become available soon, to help quantify needs and specific business value that can be derived from unstructured data. If you are curious about any specifics, a great place to start digging into and even playing with some text analytics technology is the LanguageWare capabilities, system text and UIMA.
If you're interested in this kind of stuff, please let me know, or contact your local IBM rep and ask them (and tell them to ask me if they want a starting contact!). I'm passionate and eager to help! :-)
Jun 12 2009, 03:49:22 PM EDT
Permalink
|
coming soon - a virtual conference
I found out about this one last week, and I think it sounds pretty interesting. IBM is holding a data management 'virtual conference' on February 25th. This one sounds like it's a lot more than just a webinar, as there is a show floor with virtual Expo Pedestals as well as speakers and the chance to chat with experts.
The screen shots that I saw look intriguing, although I admit that I don't yet have an exact sense of what it will be like to "be there". What I do know is that travel budgets are likely to range from tight to non-existent for everyone, but the need for knowledge and personal technical contacts is greater than ever.
Here's the agenda:
8:00 AM ET Show floor opens
11:00 AM Understanding the Foundations of The Information Agenda
12:00 Noon Chat with the experts
1:30 PM Integrated Data Management Revolution with Merv Adrian of Forrester Research
2:30 PM Chat with the experts
6:00 PM Show floor closes
If you haven't seen much on the Information Agenda, it's worth a look. It's all about trusted information and getting the right information out of the silos and working for the business. Over my years in the Information Management area of IBM, I've seen this message evolve and it continues to make more sense to me - how about you? The way I see it, if you're responsible for one silo of information, making that available for the business to benefit from should be one of your main objectives, well, that along with ensuring the security and reliability of that information.
Here's what the Expo Solution Pedestals will have:
Information Agenda
Lower Costs with IBM Data Servers
Optim Data Growth
Architect and Developer Productivity
DBA Efficiency and Autonomics
Along with the fact that there's no travel cost, the virtual conference is free to attend! Interested? Register now! And let me know how you liked it. Did it work? It's easy to say that this can't fully replace a real in-person conference experience (after all, there's no beer), but I like the idea of supplementing the real ones with other ways to stay connected and informed. This should be a step up from a webinar, as there seem to be many opportunities to interact in real time.
Now... how are they going to handle free swag from those pedestals? :-)
Feb 09 2009, 01:32:27 PM EST
Permalink
|
A little bit of DB2 for z/OS 101
Even though I don't work daily on DB2 any more, I thought this might be a helpful post.
Last night I was online, shopping coincidentally on Cyber Monday, for presents for both my Dad's birthday and my nephew's birthday.
A work colleague happened to catch me online to ask some questions. Now, these answers are not from a book, just what I said right out of my head -- JUST as I would if you and I had this conversation. Here is our Lotus Sametime conversation, verbatim, no editing. updated: one minor edit marked by [] -- OK, and I also put in the html tags and made the requester anonymous, of course. So -- sure, I could have said more about the virtues of data sharing and WLM. They are many! Forgive me, it was nighttime and I wasn't in a "marketing" mood.
In any case, please enjoy the truthiness of this exchange.
Dec 1, 2008
8:25:22 PM coworker Hi, do you have time to give me a 101 lesson on db2 z stuff?
8:28:11 PM me right now?
8:28:19 PM coworker yes
8:28:38 PM me what do you need to know?
8:29:07 PM coworker ok here come the stupid questions...
8:29:10 PM what is data sharing?
8:29:58 PM me it's 2 DB2 subsystems that share one set of data, for availability and scaling beyond one machine
8:30:06 PM 2 or more that is
8:31:02 PM coworker ok so that leads me to another question, can you explain subsystems, lpars ? what is it like in windows world?
8:31:46 PM me a subsystem is a DB2 installation
8:32:18 PM an lpar is a "logical partition", kind of like a VM on windows...
8:32:59 PM coworker can you install anything else on a subsytstem?
8:33:12 PM how do subsystems and lpars relate to each other?
8:34:34 PM me a subsystem is just db2's code, it runs in an operating system which is in an LPAR. other programs (like IMS and websphere) can be installed in that same operating system instance in that same LPAR
8:36:49 PM coworker oh oi see, i'm getting confused with the subsystem term to mean a z/OS OS term
8:36:50 PM got it
8:36:57 PM so tell me about hypersockets?
8:37:56 PM me they are a "fast pipe" for TCP/IP between LPARs on a physical z box - like a 1GB network but faster (don't know the exact numbers)
8:39:10 PM coworker so is this DRDA or no?
8:39:15 PM or using db2 connect
8:40:30 PM me DRDA communication messages travel over over TCP/IP, so if both sides are in LPARs, then they can go over the hypersockets - such as Linux on z on one side in an LPAR, and DB2 z on the other side in an LPAR
8:41:27 PM coworker so when 2 lpars communicate they do not use TCp/IP, they use hypersockets
8:41:30 PM is this correct?
8:42:32 PM me they do use TCP/IP, which travels over the hypersockets as a physical path - like a network wire
8:42:53 PM now... 2 datasharing members don't use either, if that's what you're getting at
8:43:12 PM they use the coupling facility to communicate
8:43:21 PM coworker is this built into z/OS or db2?
8:43:26 PM me z/OS
8:43:26 PM coworker the copuling faciloity
8:43:29 PM i see
8:43:50 PM me it's a separate feature of z - parallel sysplex, which is what datasharing is built on
8:44:25 PM coworker how come a customer would want to use data sharing? benefits? cons?
8:45:26 PM me benefits are availability, since all of db2 maintenance can go on one member at a time so they can keep their data available
8:45:49 PM also scalable, since the number of threads to connect to one db2 is limited, the other threads can go to another db2 -- they share all the data
8:45:51 PM coworker o i c
8:46:02 PM ok got it
8:46:08 PM me good!
8:46:13 PM coworker wait one more question hahaha
8:46:15 PM me k
8:46:17 PM coworker tell me about workload mgmt
8:46:22 PM tha'ts a z/OS thing right?
8:47:05 PM me yes - it helps balance when there's more than one subsystem on the LPAR, also helps balance priority between different db2 threads and even helps choose which data sharing subsystem gets a connection based on how busy the systems are
8:47:57 PM coworker how come this is such a big advantage over windows let's say
8:48:06 PM doesn't all operation systems do this
8:48:56 PM me no - windows doesn't let you set the priority of different programs/users/etc (as far as i know)
8:50:52 PM coworker yes you can via task manager
8:50:55 PM although not sure how smart it is
8:51:15 PM what makes z/os workload mgmt so sophisticated
8:51:21 PM i'm just wondering what's the big deal
8:52:28 PM me it's got a pretty complex set of rules and levels of priority, frequent sampling, decaying priority over time, different rules for different times of day, etc. That said, I didn't know it was considered that big of a deal
[]
8:54:30 PM coworker hahahaha
8:54:36 PM thanks for your db2 z/os 101
8:54:43 PM me any time
Dec 02 2008, 01:42:39 PM EST
Permalink
|
How to get the most out of a conference
I know that many of you are getting ready to attend the IOD conference next week in Las Vegas. Alas, I will not be there, but that doesn't stop me from giving you advice about how to get the most out of a conference. After all, you are spending your time there, and your company's money, so you might as well make the most of it.
Earlier this month, I was fortunate to attend the Grace Hopper Celebration of Women in Computing in Keystone, Colorado. I was notified of my participation only two weeks before, and I wasn't aware of this conference at all. I made my reservations and printed out the conference agenda, which I reviewed on the plane on the way there.
Most conferences give you access to the agenda ahead of time, and it's a good idea to print that out and have a look at it.
As for me, after one quick pass through it, I started by putting a "dot" next to anything I might be interested in, and a "star" next to anything that was a 'must see'. Then I started to notice some trends emerging. I came up with these goals for the conference, based on what was in the agenda:
- meet other IBM women (this was, after all, a Women in Computing conference, and IBM was sending me)
- text analytics technologies
- women in technology issues, including attracting more students
- social networking and collaboration
- cool stuff other than the above
To that end, I circled the name of any IBMer, and labeled each of the rest of what I noted as one of the above. Now, I realize, this approach might seem a bit, well, organized -- particularly for me. But it really worked, particularly to keep my attention and also keep a balance of different topics, as well as provide a tiebreaker when there were multiple sessions at the same time - I just looked at the balance of the other sessions I'd been attending.
Another thing you want to do while you're there is talk to other people about what they've seen, what they are going to see, etc. It's so easy to miss something or misinterpret something in an agenda.
If you keep a good balance between "use right now" and "good to know for the future" topics, you can help keep your brain from overflowing - that's always a danger at a multi-day conference!
Oct 20 2008, 08:05:11 PM EDT
Permalink
|
What if you had to produce all relevant emails?
I think that most of you reading this work for large companies, and our U.S. large companies tend to have pretty active legal departments. One of the hot topics these days around litigation is the investigation of email to answer legal requirements for evidence. Yep, they're likely keeping all of your email, and are required to comply when asked to provide the relevant ones as part of a lawsuit. Getting that set right is a big deal.
Now, I'm not a lawyer. I do happen to come from a family of lawyers, but that's neither here nor there for this discussion. The group where I work in IBM's Information Management has just produced a pretty cool part of the eDiscovery puzzle. It's called eDiscovery Analyzer. As you can see in the announcement letter, it works in conjunction with other IBM products to analyze email content in a repository.
The cool part is what's under the hood here. Based on the open, unstructured information management architecture-based search and text analytics (known as UIMA to those who know and love it), this product processes the text inside as well as the associated information about all the emails. This processing in turn allows a legal email analyzer person to work with and filter based on extracted entities from the email, such as people and company names, and stuff like sender, recipient and date. Combine that with powerful free-text search and you really have some amazing capability to categorize, gather, flag... this really helps a legal staff when they're asked to provide exactly what's needed and no more.
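To make the "filter on extracted entities plus free-text search" idea a bit more concrete, here's a minimal Python sketch. It is only an illustration and not how eDiscovery Analyzer or UIMA is implemented: the `Email` class, the `find_relevant` function, and the sample mailbox are all made up for this example, and the entities are assumed to have been extracted already by some earlier analysis step.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Email:
    """Hypothetical email record: metadata plus already-extracted entities."""
    sender: str
    recipients: list
    sent: date
    body: str
    entities: set = field(default_factory=set)


def find_relevant(emails, *, entity=None, sender=None, text=None):
    """Return the emails that match every filter that was supplied."""
    hits = []
    for mail in emails:
        if entity is not None and entity not in mail.entities:
            continue  # extracted-entity filter (person or company name)
        if sender is not None and mail.sender != sender:
            continue  # metadata filter
        if text is not None and text.lower() not in mail.body.lower():
            continue  # naive stand-in for real free-text search
        hits.append(mail)
    return hits


# Invented sample data, purely for illustration.
MAILBOX = [
    Email("ann@example.com", ["bob@example.com"], date(2008, 9, 1),
          "Draft contract with Acme attached.", {"Acme"}),
    Email("bob@example.com", ["ann@example.com"], date(2008, 9, 2),
          "Lunch on Friday?"),
    Email("ann@example.com", ["cy@example.com"], date(2008, 9, 3),
          "Acme pricing is confidential.", {"Acme"}),
]
```

Calling `find_relevant(MAILBOX, entity="Acme", text="confidential")` narrows three messages down to the one a reviewer would probably want to flag, which is the "exactly what's needed and no more" idea in miniature.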
Now... what if you had this kind of capability on other information besides legal email repositories in your enterprise? What would you do with it? What other business problems could this kind of technology solve for you?
Sep 30 2008, 07:21:33 PM EDT
Permalink
|
This Wednesday, Oct 1 -- be part of the community
Some of my IBM colleagues have created a pretty cool idea - that we, the community of folks with an interest in IBM's information management technology, should designate a day to connect virtually online. This means not just reading content, but actually taking a step further and participating.
I've always seen online social networking tools as extensions of what is done better in person, and a pretty good substitute for when it's just not practical to be in person. This goes back years and years, to online forums, Prodigy (remember that?), etc. If you treat your participation online as much like an in-person event as you possibly can, you'll benefit the most.
Say, for example, if you attended a talk at a conference, and you gained a lot of useful knowledge from it, and then found yourself face-to-face with the speaker right afterwards, you'd say "thanks, I learned a lot from your talk". And if you were sitting at lunch and someone said "Do you know anyone here who can help me with an SQL issue", you'd point them across the room to where your favorite SQL expert sat. Or, you'd do your best to answer the SQL issue yourself.
What we're thinking is that perhaps if we picked a day and asked everyone to speak up in just one small way, we might get some folks more comfortable with participating online, and everyone would benefit - make some contacts, get some questions answered, reconnect with someone you met in person, etc.
So, this Wednesday October 1, get out there to your favorite Information Management online sites and find a way to speak up. There are more ideas and links mentioned here.
Sep 29 2008, 12:36:58 PM EDT
Permalink
|
Answering some questions about DB2 for z/OS text search
I've received a few questions about text search in DB2 9 for z/OS lately, so I thought I'd share the basic information here.
Prior to DB2 9, there was an offering called "DB2 Text Extender". This was an early attempt at text indexing that runs on z/OS. It is dependent on some z/OS code called "Text Search". Later on, the team that worked on the extenders also released something called "DB2 Net Search Extender", aka NSE. DB2 for Linux, Unix, and Windows had a significant upgrade with NSE, but that same upgrade was not shipped for DB2 for z/OS. So DB2 for z/OS customers have not had a significant upgrade to text search since "DB2 Text Extender" in DB2 V7.
In DB2 9, there is a completely new text search solution. Text search is provided by a built-in engine function called CONTAINS. This solution requires an external text search server that runs on a Windows or Linux operating system, which is provided as part of the DB2 Accessories suite. It is not a direct replacement of Text Extender function, so applications and administration policies need to be updated for this change. The best source for detailed information about this is in the information center topic Administering IBM text search for DB2 for z/OS. And I have an earlier blog entry about the announcement.
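For a feel of what a query looks like, here's a small hedged sketch in Python that just builds the SQL text. The `contains_query` helper and the table and column names are invented for illustration; the only thing borrowed from DB2 is the CONTAINS function itself, which as far as I understand returns 1 when the column's text matches the search terms. Check the information center topic above for the full search-argument syntax before relying on this.

```python
def contains_query(table, column, search_terms):
    """Build a SELECT that filters rows with the CONTAINS() text search function.

    Illustration only: identifiers are not validated, so real code should
    take table and column names from a trusted source.
    """
    # Double any single quotes so the terms stay inside the SQL string literal.
    escaped = search_terms.replace("'", "''")
    return f"SELECT * FROM {table} WHERE CONTAINS({column}, '{escaped}') = 1"


# Hypothetical table CLAIMS with a text-indexed column NOTES.
print(contains_query("CLAIMS", "NOTES", "water damage"))
```

The point of the sketch is that the application only changes a predicate in its SQL; the text index itself lives on the separate text search server.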
What about the DB2 Text Extender? It's not available in DB2 9 for z/OS. That means you won't see an equivalent for DB2 V8 FMID JDB881C. And you won't need IBM Text Search FMID-HIMN230 in DB2 9 for z/OS either, because that was a prerequisite for the DB2 Text Extender.
The above is all specific to DB2 for z/OS. DB2 for LUW 9.5 FP 1 ships the same OmniFind text search server, and DB2 LUW still ships support for their NSE as well.
Now that we've got that all cleared up :-), next we'll evaluate some use cases for text search applicability.
Categories
: [ database_text_search | db2zOS ]
Jul 23 2008, 07:48:32 PM EDT
Permalink
|
Wouldn't you be tempted to peek at this data?
News broke this week of further data access breaches at the UCLA medical center. You can read the L.A. Times article on it for a detailed description.
It seems that an employee accessed some data that he or she shouldn't have. And they are still dealing with the extent of the breach. Now... I admit that it sure would be tempting to peek at celebrity health records. I don't know why, I guess somehow every aspect of a famous person's life is interesting. However, this particular employee went on and disclosed information to the National Enquirer! I can't imagine that the money received could possibly have been worth enough to lose one's job over. But sure, it would be tempting to just look, or at least test my authority to peek if I knew I might find something to give me a chuckle or a smug feeling that I knew something others didn't. ("What? Cher had cosmetic surgery??") Now - the thought that I might be tracked or caught would absolutely shut down that temptation for me.
Clearly the press on this data access breach is horrible and condemning. But those of us in the information technology industry aren't shocked by this -- in general, I've seen that IT people can all access more than we should be able to, just to be able to do our job. And we're trustworthy, even if it's tempting!! But the systems we manage still need to change to prevent this access.
This latest blow-up is one more indication that scrutiny has to tighten around who can access what data and how to track it, and it's becoming a very public concern. Of course I see that IBM has lots of solutions around this - a search of ibm.com for 'data privacy' came up with over 200,000 hits.
Be careful out there - you don't want to end up on the news!
Categories
: [ data_governance | information_management ]
Apr 11 2008, 07:11:40 PM EDT
Permalink
|
Interesting technology to replay a system crash
I was pointed to this interesting article from the New York Times, about a new technology invented by two software engineers, Jonathan Lindo and Jeffrey Daudel, to be able to "replay" the events that led up to a system crash. Not that I really want to see my "blue screen of death" from yesterday again, but if it would help identify the problem and get a fix, I could probably live through it a couple more times.
Reading the article, I was struck by a couple of points. They quote Lindo as saying that the inspiration came to them as "Wouldn't it be great if we could just TiVo this and replay it?" And then it says this:
Innovation by analogy is a powerful concept, says Giovanni Gavetti, an associate professor at the Harvard Business School who, with his colleague Jan W. Rivkin, has published research on how businesses can use analogic reasoning as a strategic tool. Human beings are analogy machines, he notes, dealing with new information by comparing it to things they already know something about.
That's true, I often try out analogies when I'm trying to understand or explain something. And I can really see how that could lead to innovations, as well as to some odd product evolutions. For a consumer example, I love how the iPhone lets me listen to my voicemail messages in any order, instead of sequentially, which must have been a leftover paradigm from when messages were stored on an analog tape. I can picture someone saying - "why can't I access my messages like I read my email?" - and voila - innovation.
Then I started wondering just how much you could tinker with the crash replay. Could you start eliminating concurrently-running applications, for example, to see if any of them contributed to the crash? And could you test a fix with the replay to see if it fixes the crash?
I also wonder whether IBM's customers would voluntarily seek out software like this to help them narrow down problems. It's not from IBM, and I really don't know any more about it than is in the article above. It's from a company called Replay Solutions, and it runs on several versions of the Microsoft Windows operating system. So, no mainframe support yet (grin). But you could ask them about it!
Categories
: [ bug | recreate | software ]
Mar 31 2008, 04:26:37 PM EDT
Permalink
|
What grade would your organization get?
I heard an interesting story on the news last week, about how the individual states of the U.S. were graded on how they use information. The state I live in, California, got a C+. How can this be, with our advanced technology centers in Silicon Valley?
I found the article online here and found some interesting things, although nothing specific about California.
The article says:
When all is said and done, a state’s skill with information is found at the intersection of three distinct operations: the willingness to share data, the capacity to generate good information, and the ability to get those who should use the data to do so.
Well, that sounds a lot like stuff that I have talked about when describing IBM's Information on Demand strategy. Is your organization good at doing this? I particularly noted the last point in the article, because some of the states complain that their legislators just aren't interested in using the data! Maybe we information professionals have to make that easy (and fun?) to do.
What about the highest-graded states? The article had this to say about one of them:
In Washington State, Governor Christine Gregoire held a series of town hall meetings on the budget to communicate results to citizens and follow up on the budgetary priorities she had previously established with much citizen input. “We want to give concrete information about whether a difference has been made or hasn’t”
Yep... this is what everyone wants to know. What did we say we'd do? Did it make a difference? In fact, I've been trying to get this type of information from my financial analyst for some time!
What about states that were graded worse than California?
Some state employees in Rhode Island are still operating with typewriters—electric, of course, but still a far cry from the ability to share information in a database. New Hampshire has such weak data-sharing systems that it doesn’t know how much it spends each month—kind of like an average Joe who’s lost his checkbook.
And finally:
At the opposite end of the spectrum, there’s Wyoming. Its transportation department has linked geographic information systems to financial systems and now knows with exact specificity how money is being spent, down to the cost of the salt used between each mile marker on the state’s snowy roads.
OK, well, perhaps that is an example of too much information! :-)
Categories
: [ information_on_demand ]
Mar 11 2008, 05:20:11 PM EDT
Permalink
|
Wow - one-time charge for DB2 for z/OS!
Announced today: New pricing options for DB2 for z/OS running new workloads! All you data center folks who lament to us that pricing for "other" databases can't be compared to DB2 for z/OS - rejoice!!
Announced today, and already posted here, is this gem of a news tidbit:
IBM is also announcing the immediate availability of DB2 for z/OS Value Unit Edition, which provides a new one-time-charge offering that enables the deployment of new application workloads. This offering strengthens the role of System z as a cornerstone for key business initiatives such as SOA, Data Warehousing, Business Intelligence and packaged applications such as SAP. DB2 for z/OS Value Unit Edition and IBM Information Server enable System z clients to further deliver trusted information for their dynamic warehousing requirements.
Just updated: Here is where you can find the gory details.
Is this cool or what? Doesn't this just remove the last and final objection that the application architects have for leaving DB2 for z/OS out of the running for those new applications?
Now, lest you think I am somehow reflecting a non-developer perspective, look, I have spent most of my efforts in DB2 for z/OS developing the kinds of new technologies designed to attract new workloads, and since even I have heard the pricing objection, isn't it perfectly fair for me to mention this in my DW space? And heck, since I am a developer, not a pricing person by any stretch of the imagination, if this has gotten my attention, you know it's big news!
Bring on those new workloads! And then come to us in development and tell us what you need to bring more work onto z, OK?
Categories
: [ db2_z/OS | db2_z/OS_VUE ]
Feb 26 2008, 02:36:35 PM EST
Permalink
|
Native SQL stored procedures in DBM1 - What, me worry?
When I describe native SQL procedures in DB2 9 for z/OS, I often hear variations of these types of questions:
- Doesn't the external WLM-managed infrastructure provide some throttling of stored procedures? What's going to happen when this is gone?
- Can DBM1 handle the same number of concurrent stored procedures as multiple WLM-SPAS?
- User routines only use below the bar storage, so how much below the bar storage is available in DBM1 for these native SQL procedures?
In order to answer this, I have to explain a little bit about how DB2 handles native SQL procedures. They are simply packages, with "runtime structures" for the SQL statements to be executed. So, when you invoke a native SQL procedure, DB2 finds and loads the package and executes the statements.
In contrast, an external stored procedure with SQL needs a complete language environment for the user program, and then that external program comes back to DBM1 to get its package loaded and SQL statements executed. That's what needs to be "throttled" - the external program execution environments and their associated TCBs. When an incoming stored procedure request is queued for WLM, the DB2 thread is suspended in DBM1. Many customers have experienced delays and DBM1 storage problems when their WLM goals weren't adjusted properly and the queued requests built up. The solution is to either adjust the WLM goals, or else adjust the limit on DB2 threads (local and/or distributed).
With native SQL procedures, the thread will just switch packages when the call statement is processed and run the procedure - no queuing. The storage used for the local variables is above the bar and managed with efficient algorithms. The maximum number of concurrent first-level native SQL procedures is effectively the same as your setting for maximum DB2 threads. (What I mean by first-level is that a native SQL procedure may have a nested call to another native SQL procedure, so the actual number of concurrent native SQL procedures may be even higher).
So, I guess the way I'd answer the questions is:
- Yep. When it's gone, SPs will run much more efficiently
- Yep - in fact likely more
- n/a - SQL procedures aren't "really" user routines - they are a pre-defined set of SQL statements, and they don't use below the bar storage
Of course I recommend that you test your native SQL procedures in your environment and measure for yourself, and do capacity planning based on the results of your testing. Native SQL procedures will use some DBM1 storage, after all, and how much depends on what statements and what variables are used in the program.
Oh, and if you didn't recognize it, the "What, me worry?" is a reference to the signature quote from Alfred E. Neuman. It's more than a little tongue-in-cheek.
Feb 11 2008, 03:37:53 PM EST
Permalink
|
SQLPROCEDURECOLS - a metadata stored procedure
Also among the 'recommended practices' that I often present on DB2 for z/OS stored procedures is this one:
- Don't call the metadata stored procedures
Many invocations of DB2 for z/OS stored procedures come from a Java(TM) or a CLI application. The software stack for these programs accessing DB2 for z/OS is through a "driver" program. These driver programs have SQL packages bound to DB2 for z/OS, and in the case of the application invoking a stored procedure, there is a fair amount of code executed in the driver program.
For a CLI program (the term CLI is often used interchangeably with ODBC) -- this is usually something running from a Microsoft(TM) application accessing DB2 for z/OS. The DB2 Connect software that includes the driver for DB2 for z/OS has some smarts in it so that if the application is coded using incorrect data types for the stored procedure being invoked, the driver recovers and invokes the SQLPROCEDURECOLS metadata stored procedure on DB2 for z/OS to find out what the data types are and then re-sends the stored procedure call to DB2 for z/OS. Yes, you got it right, this means that a poorly coded application can invoke 3 stored procedure calls for every SQL CALL it's trying to do -- one to the original SP, one to SYSIBM.SQLPROCEDURECOLS, and then again to the original SP with the correct parm types! How do you recognize this? Well, you could run a client-side DRDA trace and it will show up there. Or you can look at statistics at the server. Or you can set the value DESCRIBEPARAM=0 in the db2cli.ini file on the client, and let the applications get the error SQLCODE -301 because now the driver won't do the metadata SP call and instead will let the application fail due to using the wrong datatype. Same result if you issue a -STOP PROCEDURE (SYSIBM.SQLPROCEDURECOLS) ACTION(REJECT) command on the DB2 for z/OS server.
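If it helps to see the arithmetic, here's a toy Python model of that call pattern. This is not driver code and it talks to nothing: `calls_for_one_request` and its parameters are invented names, and the `describe_param` flag only mimics the effect described above for DESCRIBEPARAM=0.

```python
def calls_for_one_request(app_types, proc_types, describe_param=True):
    """List the server-side calls triggered by one application-level CALL.

    Toy model of the behaviour described in this post, not real driver logic.
    """
    if app_types == proc_types:
        # Well-coded application: one round trip.
        return ["CALL original SP"]
    if not describe_param:
        # No metadata lookup; the application sees the failure instead.
        return ["CALL original SP (fails, SQLCODE -301)"]
    return [
        "CALL original SP (wrong parm types)",
        "CALL SYSIBM.SQLPROCEDURECOLS",
        "CALL original SP (corrected parm types)",
    ]


# An application binding a CHAR where the procedure expects an INTEGER.
for step in calls_for_one_request(["CHAR"], ["INTEGER"]):
    print(step)
```

Multiply that list of three by the call rate of a busy application and it's easy to see why the mismatch is worth fixing at the source.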
For a Java(TM) program, the current driver is the DB2 Universal Java Driver, and it will not invoke the metadata stored procedure. So this is an excellent reason to switch to the current driver, because the older version of the driver went through the CLI code path and had the same problem as described above.
Note that if you invoke a stored procedure from the command line (the CLP), that code will always invoke the SQLPROCEDURECOLS stored procedure since the command line doesn't provide anything for what data type the arguments are.
Now, if you are stuck with a CLI program that you can't modify, what can you do to improve the performance of SQLPROCEDURECOLS? Well, APAR PK57017 just shipped which reduces the size of the package for this stored procedure, so you can free up some EDM pool usage and get a small CPU usage improvement. You can also be sure you run RUNSTATS so that the data access for this SP is the most efficient it can be. I have also heard rumors of some customers creating additional indexes on the tables used by SQLPROCEDURECOLS, but I don't have any specifics on that, sorry.
Categories
: [ db2_z/OS | metadata_stored_procedures | stored_procedures ]
Jan 21 2008, 12:58:14 PM EST
Permalink
|
Max of 512 stored procedures in a WLMENV?
Among the 'recommended practices' that I often present on DB2 for z/OS stored procedures is this one:
- No more than 512 SP's in a WLM
Let me explain why I recommend this. It's actually at the bottom of the list, and that's because it doesn't come up that often. But it has, and when it does, it can cost in I/O. DB2 has a Language Environment table of load modules in each stored procedures address space. For stored procedures defined STAY RESIDENT YES, we only have room for 512 load modules in that table. A load module has to be in the table in order for DB2 to invoke it. So, starting with the 512th, we'll delete it from the table after we call it, even if it's STAY RESIDENT YES. And come to think of it, we have separate tables for TYPE MAIN and TYPE SUB.
So to be completely accurate, the recommendation could actually say something like this:
- No more than 512 different load modules for STAY RESIDENT YES SP's in a WLM application environment, that are all either PROGRAM TYPE MAIN or PROGRAM TYPE SUB and invoked during the lifetime of a single instance of a WLM-SPAS.
For that last bit, remember that different invokers of a stored procedure that end up classified in different WLM enclaves will not have their SPs run in the same instance of a WLM-SPAS.
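Here's a rough Python sketch of why going past the table size costs I/O. It's a simplified model under my own assumptions, not DB2 or Language Environment internals: `count_loads` and `resident_slots` are made-up names, the slot count is kept tiny so the effect is visible, and the exact boundary around the 512th module is glossed over.

```python
def count_loads(call_sequence, resident_slots):
    """Count load-module fetches for a sequence of stored procedure calls.

    Simplified model: modules that fit in the table stay resident,
    and anything beyond that is fetched again on every call.
    """
    resident = set()
    loads = 0
    for module in call_sequence:
        if module in resident:
            continue  # already in the table, no fetch needed
        loads += 1
        if len(resident) < resident_slots:
            resident.add(module)  # stays resident for later calls
        # otherwise the module is loaded, called, then dropped again
    return loads


# Two slots, three distinct modules: "C" never gets to stay resident.
print(count_loads(["A", "B", "C", "C", "C"], resident_slots=2))
```

With two slots, the repeated calls to the third module each trigger a fresh load, which is the same effect in miniature as running too many distinct STAY RESIDENT YES modules through one WLM-SPAS.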
What's a WLM-SPAS? It's what I use to abbreviate a "WLM-established stored procedures address space".
And this post has motivated me to get a more recent copy of my stored procedures recommended practices presentation out online!
Categories
: [ DB2_limits | WLM_application_environment | db2_z/OS | stored_procedures ]
Jan 17 2008, 07:15:19 PM EST
Permalink
|
Pointer to article on enterprise search
I found this article online today, which highlights the importance of enterprise search.
Some excerpts:
Company networks contain mountains of structured and unstructured data archived in numerous formats, some of them decades old and stored in secure servers.
IBM also is building a portfolio of enterprise search tools and services, under the OmniFind brand.
Of course you know that DB2 for z/OS data contains mountains of information! This is what our just-released text search support addresses for DB2 for z/OS data - character, binary, and XML. And it's built on OmniFind technology. With this support, you can do text search queries using the built-in CONTAINS() function. It's provided with DB2 9 for z/OS and the no-charge accessories suite.
Now, I know that this is just one piece of enterprise search. In fact, I joke with my colleagues that all of the work that we've put into this is "just an SQL statement". :-) But hey, it's an important piece - it can keep the DB2 for z/OS data where it is and "let the searches come to us".
Jan 14 2008, 05:28:20 PM EST
Permalink
|