Since I mentioned user-defined aggregates (UDAs), in the previous entry, it as good a time as ever to cover this subject.
Everybody that knows a little bit about (object) relational databases knows about aggregate functions such as AVG, SUM, and so on. User-defined aggregates allow you to create your own functions to aggregate your data. The thing is it can do more.
I had a situation once where people had a complex stored procedure that included two foreach loop that were executing SELECT statements. The goal of the procedure was to take a bunch of geo shapes and merge them into a multi-polygon. This procedure had multiple issues:
1) It was issuing multiple SELECT statements making it less than optimal2) It solved one and only one problem due to the specificity of the SELECT statements. (with IDS 11.50, we could build the SQL statements dynamically in the stored procedure)3) It was complex (82 lines long)
I was able to convert the procedure into a 22 lines UDA (including 2 lines of comments and 6 blank lines). The UDA could then be inserted into an SQL statement that would decide how to group rows. The advantages of the UDA are:
1) The UDA is simpler (22 lines without loops or SQL statements vs. 82 lines)2) The UDA is more flexible since it gets the grouping from the SQL statement3) Better performance, more streamlined
UDAs include an initialization function and parameter. Since the initialization parameter can be of any type, we can pass a row type that can include multiple values providing the flexibility required for any initialization. Any business processing that takes a large number of values to get to an answer can be converted from a client application to a user-defined aggregate. The benefits include:
1) reduced network traffic, translating into increased performance2) Simpler code since all the code related to issuing SQL statements, cursor manipulations and error testing can be eliminated3) That specific business processing becomes available to all applications
Some people are really passionate about following a strict approach. This can cause problems with such things as encapsulation that insures that the implementation of the object is opaque. Look at it a little bit as being very strict about following the highest possible normal form. My point is that you have to be careful about not offending people in their approach. Learn about their methodology before jumping into a passionate presentation of your approach: Take them from where they are to where you want them to be slowly, watching for resistance where comminucation could break down.
Looking back at the employee definition presented in my previous blog entry, note the following: A manager can have multiple employees working for her. This lead to a representation where a manager object includes a collection of employee objects. This lead to implementation performance problems where because all the objects were instantiated (created) it took a long time to create the object that included the collection of object. The concept of "lazy binding" was implemented to solve this. Basically, the object in a collection is not instantiated until it is accessed.
This is another area where database specialists can start a discussion to improve the overall performance. Now that I've set the premise, I'll cover it in more details another time.