Many years ago we encountered an environment where the client wanted the old system refactored into the new. The "new" here being the Netezza platform and the "old" here being an overwhelmed RDBMS that couldn't hope to keep up with the workload. So the team landed on the ground with all hopes high. The client had purchased the equivalent of a 4-rack striper for production and a 1-rack Striper for development. Oddly, the same thing happened here as happens in many places. The 4-rack was dispatched to the protected production enclave and the 1-rack was dropped into the local data center with the developers salivating to get started. And get started they did.
The first team inherited about half a terabyte of raw data from the old system and started crunching on it. The second team, starting a week later, began testing on the work of the first team. A third team entered the fray, building out test cases and a wide array of number-crunching exercises. While these three teams dogpiled onto and hammered the 1-rack, the 4-rack sat elsewhere, humming with nothing to do.
We know that in any environment we encounter, with any technology we can name, the development machines are underpowered compared to the production environment. And while the production environment has a lot of growing priorities for ongoing projects, we don't have this scenario for our first project, do we? Our first project has a primary, overarching theme: it is a huge bubble of work that we need to muscle-through with as much power as possible. That "as much as possible" in our case, was the 4-rack sitting behind the smoked glass, mocking us.
And this is the irony - for a first project we have a huge "first-bubble" of work before us that will never appear again. the bubble includes all the data movement, management and backfilling of structures that we will execute only once, right? Really? I've been in places where these processes have to be executed dozens if not hundreds of times in a development or integration environment as a means to boil out any latent bugs prior to its maiden - and only - conversion voyage. But is this a maiden-and-only voyage? Hardly - typically the production guys will want to make several dry runs of the stuff too. We can multiply their need for dry runs with ours, because we have no intention of invoking such a large-scale movement of data without extensive testing.
And yet, we're doing it on the smaller machine. No doubt the 1-rack has some stuff - but I've seen cases where it might take us two weeks to wrap up a particularly heavy-lifting piece of logic. If we'd done this on the larger 4-rack, we would ahve finished it in days or less. Double the power, half the time-to-deliver (when the time is deliver is governed by testing)
In practically every case of a data warehouse conversion, the actual 'coding' and development itself is a nit compered to the timeline required for testing. I've noted this in a number of places and forms, in that the testing load for a data warehouse conversion is the largest and most protracted part of the effort. And if testing (as in our case) is largely loading, crunching and presenting the data, we need the strongest possible hardware to get past the first bubble. A data conversion project is a "testing" project more so than a "development" project, and with the volumes we'll ultimately crunch, hardware is king.
But I've had this conversation with more people than I can count. Why can't you deploy the production environment with all its power, for use in getting past the first bubble, then scratch the system and deploy for production? What is the danger here? I know plenty of people, some of them vendor product engineers, who would be happy to validate such a 'scratch' so that the production system arrives with nothing but its originally deployed default environment.
Yet another philosophy is that we would pre-configure the machine for production deployment, but nobody likes developers doing this kind of thing in a vacuum. They would rather see deployment/implementation scripts that "promote" the implementation. I'm a big fan of that, too, for the first and every following deployment. That's why I would prefer we used the production-destined system to get past the first-bubble-blues, then scratch it, and get the original environment standing up straight, and only then treat it as an operational production asset.
Most projects like this have a very short runway in their time-to-market, and we do a disservice to the hard-working folks who are doing their best to stand up this environment, They need all the power they can get, especially when they enter the testing cycle.
And for this, it's an 80/20 rule for every technical work product we will ever produce. Take a look sometime at what it takes to roll out a simple Java Bean, or a C# application, or a web site. Part of the time is spent in raw development, and part of it in testing. If I include the total number of minutes spent by the developer in unit testing, and then by hardcore testers in a UAT or QA environment, and it is clear that the total wall-clock hours spent in producing quality technology breaks into the 80/20 rule - 20 percent of the time is spent in development, and 80 percent in testing.
And if the majority of the time is spent in testing, what are we testing on Enzee space? The machine's ability to load, internally crunch and then publish the data. On a Netezza machine, this last operation is largely a function of the first two. But we have to test all the loading don't we? And when testing the full processing cycle we have to load-and-crunch in the same stream, no? What does it take to do this? Hardware, baby, and lots of it.
I can say that multiple small teams can get a lot of "ongoing" work done on a 1-rack, no doubt a very powerful environment. I can also say that a machine like this, for multiple teams in the first-bubble effort, will gaze longingly at the 4-rack in the hopes they can get to it soon, because so much testing is still before them, and they need the power to close.
What are some options to make this work? Typically the production controllers and operators don't like to see any "development" work in the machines that sit inside the production enclosure. They want tried-and-tested solutions that are production-ready while they're running. At the same time, they have no issues with allowing a pre-production instance into the environment because they know a pre-production instance is often necessary for performance testing. Here's the rub: the entire conversion and migration is one giant performance test! So designating the environment as pre-production isn't subtle, nuanced, disingenuous or sneaky - it accurately defines what we're trying to do. It's a performance-centric conversion of a pre-existing production solution, now de-engineered for the Netezza machine. As I noted, development is usually a nit, where the testing is the centerpiece of the work.
With that, Netezza gives us the power to close, to handily muscle-through this first-bubble without the blues - we only hurt ourselves with "policies" for the environment that are impractical for the first-bubble.
This brings us full-circle yet again to a common problem with environments assimilating a Netezza machine. The scales and protocols put pressure on policies, because those policies are geared for general-purpose environments. There's nothing wrong with the policies, they protect things inside those general-purpose environments. But the same policies that protect things in general-purpose realm actually sacrifice performance in the Netezza realm. Don't toss those policies - adapt them.