Perennially I stumble into the same question: when is the performance tuning effort finished? How do you know whether enough has been done or there is still further to go?
Personally, I think this should be dictated by Service Level Agreements (SLAs). If the throughput and response time are within the SLAs, then we are done tuning. Any more tuning should only be done if the application (in a new release) starts slipping its SLAs or the SLAs themselves change.
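As a concrete (if simplistic) illustration, here is what that decision looks like as an automated gate. The SLA thresholds and measured numbers below are made-up placeholders; in practice they would come from the agreed SLAs and the latest performance test results.

```java
// A minimal sketch of the "are we done tuning?" decision as an SLA gate.
// All numbers here are hypothetical placeholders, not real SLA values.
public class SlaGate {
    public static void main(String[] args) {
        double slaP95Millis = 500;      // agreed response-time SLA (assumed)
        double slaThroughputTps = 200;  // agreed throughput SLA (assumed)

        double measuredP95Millis = 420;      // from the latest performance test
        double measuredThroughputTps = 230;  // from the latest performance test

        boolean withinSla = measuredP95Millis <= slaP95Millis
                && measuredThroughputTps >= slaThroughputTps;

        // If we're inside the SLA, tuning stops; it resumes only when a new
        // release slips the SLA or the SLA itself changes.
        System.out.println(withinSla ? "Within SLA -- stop tuning."
                                     : "SLA miss -- keep tuning.");
    }
}
```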
But not everyone has defined or agreed-upon SLAs, which muddies the water. Say, for example, we're building a new application. We have a number of components that we're performance testing, but we don't know whether the results we're seeing are the best that can be achieved. That makes sense: if we have no experience working with a particular endpoint (e.g. a database or a messaging engine), then our lack of knowledge of that package, or of our own application code, will hamper the analysis.
This is where application monitoring comes in. Properly installed and configured, an application monitoring tool should provide enough information to tell whether we have the best achievable throughput and response time. Look at the metrics; compare CPU and memory utilization. Make sure all the resources under load are as close to maxed out as possible without introducing contention. Watch the throughput and response-time graphs as load ramps up, and again after the system reaches steady state. Once all of these line up, you are probably ready for the next step: oversaturating the environment. Once you've proven that oversaturation has occurred (because of the noted degradation), there is probably no better tuning to be had. Getting to the point where you can saturate all the resources will likely take a lot of tuning work, through a variety of methods: larger thread pools, vertical/horizontal scaling, increased memory settings, garbage collection policy analysis, changes to application code, and so on.
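To make the ramp-up-to-saturation idea concrete, here is a minimal sketch of a load-ramp harness in Java. It steps up the thread count, measures throughput at each step, samples heap usage and system load via the standard JMX beans, and stops when doubling the load no longer buys meaningful throughput. `callEndpoint()` is a hypothetical stand-in for the component under test, and the thresholds are arbitrary; a real harness would also track response-time percentiles and error rates.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a load-ramp harness: step up concurrency, measure throughput
// at each step, and flag the knee where adding load stops helping.
public class RampHarness {

    // Hypothetical stand-in for the real work (DB call, message send, ...).
    static void callEndpoint() throws Exception {
        Thread.sleep(5);
    }

    static double measureThroughput(int threads, long durationMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        long deadline = System.currentTimeMillis() + durationMillis;
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                while (System.currentTimeMillis() < deadline) {
                    try { callEndpoint(); completed.incrementAndGet(); }
                    catch (Exception e) { /* a real harness would count failures */ }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(durationMillis + 5000, TimeUnit.MILLISECONDS);
        return completed.get() * 1000.0 / durationMillis; // requests per second
    }

    public static void main(String[] args) throws Exception {
        // Note: this samples the harness's own JVM; in practice you would
        // watch the application's JVMs through the monitoring tool.
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        double previous = 0;
        for (int threads = 8; threads <= 512; threads *= 2) {
            double tps = measureThroughput(threads, 30_000);
            long heapMb = mem.getHeapMemoryUsage().getUsed() / (1024 * 1024);
            System.out.printf("threads=%d tps=%.0f heapUsedMb=%d load=%.2f%n",
                threads, tps, heapMb,
                ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
            // If doubling the load yields little extra throughput, the
            // environment is saturated; pushing further only degrades.
            if (previous > 0 && tps < previous * 1.1) {
                System.out.println("Throughput plateaued -- saturation reached.");
                break;
            }
            previous = tps;
        }
    }
}
```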
However, is it worth it? Does an organization really want to pay for the most optimal tuning when it may not be necessary? These efforts can be long and drawn out (months instead of weeks), and they can tie up highly skilled technical people on work that yields an unknown, possibly marginal improvement. Let's look at the only scenario I can think of where it is justified (can you think of another?).
High-volume applications tend to be resource hogs, requiring gobs (hundreds or thousands) of JVMs and the supporting infrastructure. This gets expensive very quickly, especially once administration, configuration, and maintenance costs are factored in. In this one scenario I can see justification, even beyond the defined SLAs, for squeezing out the best possible performance in order to reduce the capacity needed to support the application.
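To see why, run the back-of-the-envelope numbers. The figures below are invented for illustration, but the shape of the argument holds: capacity, and the administration cost that scales with it, shrinks in proportion to per-JVM throughput gains.

```java
// Back-of-the-envelope capacity math with made-up numbers: a 25% per-JVM
// throughput improvement from tuning cuts the JVM count proportionally.
public class CapacityMath {
    public static void main(String[] args) {
        double peakLoadTps = 100_000;  // hypothetical peak demand
        double tpsPerJvmBefore = 100;  // per-JVM throughput before tuning
        double tpsPerJvmAfter = 125;   // after a 25% tuning improvement

        long jvmsBefore = (long) Math.ceil(peakLoadTps / tpsPerJvmBefore); // 1000
        long jvmsAfter = (long) Math.ceil(peakLoadTps / tpsPerJvmAfter);   // 800

        System.out.printf("JVMs before: %d, after: %d, saved: %d%n",
                jvmsBefore, jvmsAfter, jvmsBefore - jvmsAfter);
    }
}
```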