Instantiation problem: Objects explosion
I was talking earlier about encapsulation and the collection of objects that can be found in another object. Let's look at another possibility:
A corporation has multiple regions, a region has multiple branches, a branch has multiple customers. To summarize:
corporation->regions->branches->customers
Let's say that the customers are loans taken by different types of companies. To find out the average amount of the loans given out by each branch, the strict approach would be that each branch has a method (function) that does the following:
customer_count = 0
total_loans = 0
for each customer
customer_count = customer_count + 1
total_loans = total_loans + customer.getLoanAmount()
end // for each customer
return(total_loans / customer_count)
We protect the encapsulation of customers by providing a method that returns the loan amount (getLoanAmount). The first problem we have relates to performance: All the customer objects for a branch need to be instantiated (created). That may require quite a bit of memory. The second performance problem is that each customer object instantiation requires one database call.
What about if we want to do this average at the region level instead of the branch level? Then, to preserve the encapsulation, we need to created additional methods to return totals and counts. I'll let you imagine the processing needed. On the performance side, we see that the number of objects instantiated and the number of database calls increase with the number of branches and customer objects processed.
If you can convince the architects and programmers to relax their encapsulation requirements, you could add one method at the branch level, one at the region level, and even possibly one at the corporation level to return the desired average. Considering the average for a region, the method would implement the one SQL statement looking like:
SELECT AVG(loan) FROM customers
WHERE region_id = :region_num
GROUP BY region_id;
In this case, I don't instantiate all the customer (and branch) objects, saving processing and memory. It is pretty obvious that the performance of these requests will be greatly improved compared to the "strict" OO approach.
Having a method that uses the database to do the processing is one thing. What about more complex processing like the average risk taken by a branch on their loans?
IDS provides the ability to implement user-defined aggregates. It would be easy to implement the average risk function. The number of lines of code would be less than implementing it in the application and the performance would be better even if it was only because of the significant reduction in the volume of data transferred.
I hope that in the last few blog entries I gave you some things to think about to improve the overall performance of your systems. The bottom line is: get involved in the analysis and design phases of new projects. You can add a lot of value there.
May 12 2008, 01:46:32 PM EDT
Permalink
|