This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is a frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is IBM Cloud Private, and he will be delivering and supporting sessions at Think2019 and Storage Technical University on the value of IBM storage in this high-value IBM solution, which is part of the IBM Cloud strategy. Lloyd maintains Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Continuing my week in Washington DC for the annual [2010 System Storage Technical University], I presented a session on Storage for the Green Data Center, and attended a System x session on Greening the Data Center. Since they were related, I thought I would cover both in this post.
Storage for the Green Data Center
I presented this topic in four general categories:
Drivers and Metrics - I explained the three key drivers for consuming less energy, and the two key metrics: Power Usage Effectiveness (PUE) and Data Center Infrastructure Efficiency (DCiE).
Storage Technologies - I compared the four key storage media types: Solid State Drives (SSD), high-speed (15K RPM) FC and SAS hard disk, slower (7200 RPM) SATA disk, and tape. I had comparison slides that showed how IBM disk was more energy efficient than the competition. For example, the DS8700 consumes less energy than an EMC Symmetrix when compared with the exact same number and type of physical drives. Likewise, IBM LTO-5 and TS1130 tape drives consume less energy than comparable HP or Oracle/Sun tape drives.
Integrated Systems - IBM combines multiple storage tiers in a set of integrated systems managed by smart software. For example, the IBM DS8700 offers [Easy Tier] for smart data placement and movement across Solid-State drives and spinning disk. I also covered several blended disk-and-tape solutions, such as the Information Archive and SONAS.
Actions and Next Steps - I wrapped up the talk with actions that data center managers can take to help them be more energy efficient, from deploying the IBM Rear Door Heat Exchanger to improving the management of their data.
Greening of the Data Center
Janet Beaver, IBM Senior Manager of Americas Group facilities for Infrastructure and Facilities, presented on IBM's success in becoming more energy efficient. The price of electricity has gone up 10 percent per year, and in some locations, 30 percent. For every 1 Watt used by IT equipment, there are an additional 27 Watts for power, cooling and other uses to keep the IT equipment comfortable. At IBM, data centers represent only 6 percent of total floor space, but 45 percent of all energy consumption. Janet covered two specific data centers, Boulder and Raleigh.
At Boulder, IBM keeps 48 hours reserve of gasoline (to generate electricity in case of outage from the power company) and 48 hours of chilled water. Many power outages are less than 10 minutes, which can easily be handled by the UPS systems. At least 25 percent of the Computer Room Air Conditioners (CRAC) are also on UPS, so that there is some cooling during those minutes, within the ASHRAE guidelines of 72-80 degrees Fahrenheit. Since gasoline gets stale, IBM runs the generators once a month, which serves as a monthly test of the system, and clears out the lines to make room for fresh fuel.
The IBM Boulder data center is the largest in the company: 300,000 square feet (the equivalent of five football fields)! Because of its location in Colorado, IBM enjoys "free cooling" using outside air temperature 63 percent of the year, resulting in a PUE rating of 1.3. Electricity is only 4.5 US cents per kWh. The center also uses 1 million kWh per year of wind energy.
The Raleigh data center is only 100,000 square feet, with a PUE rating of 1.4. The Raleigh area enjoys 44 percent "free cooling" and electricity costs 5.7 US cents per kWh. The Leadership in Energy and Environmental Design [LEED] has been updated to certify data centers. The IBM Boulder data center has achieved LEED Silver certification, and the IBM Raleigh data center has LEED Gold certification.
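For readers who want to relate the two metrics from my own session to these ratings: PUE is total facility power divided by IT equipment power, and DCiE is its reciprocal expressed as a percentage. Here is a minimal Python sketch using the two PUE ratings quoted above. The function names are my own, for illustration only.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie_percent(pue_value: float) -> float:
    """Data Center Infrastructure Efficiency: reciprocal of PUE, as a percentage."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 100.0 / pue_value


# PUE ratings quoted in the session for the two sites.
for site, rating in (("Boulder", 1.3), ("Raleigh", 1.4)):
    print(f"{site}: PUE {rating:.1f} -> DCiE {dcie_percent(rating):.1f}%")
```

A lower PUE means a higher DCiE, so Boulder's 1.3 rating works out to roughly 77 percent of the power reaching IT equipment, against roughly 71 percent for Raleigh.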
Free cooling, electricity costs, and disaster susceptibility are just three of the 25 criteria IBM uses to locate its data centers. In addition to the 7 data centers it manages for its own operations, and 5 data centers for web hosting, IBM manages over 400 data centers of other clients.
It seems that Green IT initiatives are more important to the storage-oriented attendees than the x86-oriented folks. I suspect that is because many System x servers are deployed in small and medium businesses that do not have data centers, per se.
Continuing my week in Washington DC for the annual [2010 System Storage Technical University], here is my quick recap of the keynote sessions presented Monday morning. Marlin Maddy, Worldwide Technical Events Executive for IBM Systems Lab Services and Training, served as emcee.
Jim Northington, IBM System x Business Line Executive, covered the IT industry's "Love/Hate Relationship" with the x86 platform. Many of the physical limitations that were previously a pain on this platform are now addressed, through a combination of IBM's new innovative eX5 architecture and virtualization technologies.
Jim also presented the [IBM CloudBurst] solution. IBM CloudBurst is one of the many "Integrated Systems" designed to help simplify deployment. Based on IBM BladeCenter, the IBM CloudBurst is basically a Private Cloud rack for those that are ready to deploy in their own data center.
Jim feels that server virtualization on x86 platforms is still in its infancy. IBM calls it the 70/30 rule: 70 percent of x86 workloads are running virtualized on 30 percent of the physical servers.
Maria Azua, IBM Vice President of Cloud Computing Enablement, presented on Cloud Computing. Technology is being adopted at faster rates. It took 40 years for radio to get 60 million listeners, 20 years for 60 million television viewers, 3 years to get 60 million surfers on the Internet, but it only took 4 months to get 60 million players on Farmville!
Maria covered various aspects of Cloud Computing: virtualization images, service catalog, provisioning elasticity, management and billing services, and virtual networks. With Cloud Computing, the combination of virtualization technologies, standardization, and automation can reduce costs and improve flexibility.
We've seen this happen before. Telcos transitioned from human operators to automated digital switches. Manufacturers went from having small teams of craftsmen to assembly lines of robots. Banks went from long lines of bank tellers to short lines at the ATM.
Maria said that companies are faced with three practical choices:
Do-it-Yourself, buy the servers, storage and switches and connect everything together.
Purchase pre-installed "integrated systems" to simplify deployment.
Subscribe to Cloud computing, allowing a service provider to do all this for you.
In countries where network access is not ubiquitous, IBM has developed tools for the cloud that work in "offline" mode. IBM has also developed or modified tools to run better in the cloud. Launching a compute instance from the service catalog in the cloud is so easy that your 5-year-old child can do it!
Want to see Cloud Computing in action? Check out [Innovation.ed.gov], the US Department of Education's website to foster innovation, which runs in the IBM cloud.
Whether you adopt a public, private or hybrid cloud computing approach, Maria suggests you take time to plan, test your applications for standardization, examine all risks, and explore new workloads that might be good candidates. Otherwise, moving to the cloud might just mean "More mess for less". Maria provided a list of applications that IBM considers a good fit for Cloud Computing today.
I heard several audience members indicate that this is the first time someone finally explained Cloud Computing to them in a way that made sense!
Continuing my week in Washington DC for the annual [2010 System Storage Technical University], here is my quick recap of the keynote sessions presented Monday morning. Marlin Maddy, Worldwide Technical Events Executive for IBM Systems Lab Services and Training, served as emcee.
Roland Hagan, IBM Vice President for IBM System x server platform, presented on how IBM is redefining the x86 computing experience. More than 50 percent of all servers are x86 based. These x86 servers are easy to acquire, enjoy a large application base, and can take advantage of a readily available skilled workforce for administration. The problem is that 85 percent of x86 processing power remains idle, energy costs are 8 times what they were 12 years ago, and management costs are now 70 percent of the IT budget.
IBM has the number one market share for scalable x86 servers. Roland covered the newly announced eX5 architecture that has been deployed in both rack-optimized models as well as IBM BladeCenter blade servers. These can offer 2x the memory capacity of competitive offerings, which is important for today's server virtualization, database and analytics workloads. This includes 40 and 80 DIMM models of blades, and 64 to 96 DIMM models of rack-optimized systems. IBM also announced eXFlash, internal Solid State Drives accessible at bus speeds. FlexNode allows a 4-node system to dynamically change to 2 separate 2-node systems.
By 2013, analysts estimate that 69 percent of x86 workloads will be virtualized, and that 22 percent of servers will be running some form of hypervisor software. By 2015, this grows to 78 percent of x86 workloads being virtualized, and 29 percent of servers running hypervisor.
Doug Balog, IBM Vice President and Disk Storage Business Line Executive, presented how the growth of information results in a "perfect storm" for the storage industry. Storage Admins are focused on managing storage growth and the related costs and complexity, proper forecasting and capacity planning, and backup administration. IBM's strategy is to help clients in the following areas:
Storage Efficiency - getting the most use out of the resources you invest
Service Delivery - ensuring that information gets to the right people at the right time, and simplifying reporting and provisioning
Data Protection - protecting data against unethical tampering, unauthorized access, and unexpected loss and corruption
He wrapped up his talk covering the success of DS8700 and XIV. In fact, 60 percent of XIV sales are to EMC customers. The TCO of an XIV is less than half the TCO of a comparable EMC VMAX disk system.
Dave McQueeney, IBM Vice President for Strategy and CTO for US Federal, covered how IBM's Smarter Planet vision for smarter cities, smarter healthcare, smarter energy grid and smarter traffic is being adopted by the public sector. Almost every data center in the US Federal government is out of power, floor space and/or cooling capability. An estimated 80 percent of US Federal government IT budgets are spent on maintenance and ongoing operations, leaving very little left over for the big transformational projects that President Barack Obama wants to accomplish.
Who has the most active Online Transaction Processing (OLTP)? You might guess a big bank, but it is the US Department of Homeland Security (DHS), with a system processing 600 million transactions per day. Another government agency is #2, and the top Banking application is finally #3. The IBM mainframe solved problems 10 to 15 years ago that distributed systems are just now encountering. Worldwide, more than 80 percent of banks use mainframes to handle their financial transactions.
IBM's recent POWER7 set of servers are proving successful in the field. For example, Allianz was able to consolidate 60 servers to 1. Running DB2 on POWER7 server is 38 percent less expensive than Oracle on x86 Nehalem processors. For Java, running JVM on POWER7 is 73 percent better than JVM on x86 Nehalem.
The US federal government ingests a large amount of data. It has huge 10-20 PB data warehouses. In fact, the amount of GB received every year by the US federal government alone exceeds the production of all disk drives produced by all drive manufacturers. This means that all data must be processed through "data reduction" or it is gone forever.
The last keynote for Monday was given by Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist for System Storage. He started out shocking the audience with his view that the "disk drive industry is a train wreck". While R&D in disk drives enjoyed a healthy improvement curve up to about 2004, it has now slowed down, getting more difficult and more expensive to improve performance and capacity of disk drives. The rest of his presentation was organized around three themes:
Integrated Stacks - while newcomers like Oracle/Sun and the VCE coalition are promoting the benefits of integrated stacks, IBM has been doing this for the past five decades. New advancements in Server and Storage virtualization provide exciting new opportunities.
Integrated Systems - solutions like IBM Information Archive and SONAS, and new features like Easy Tier that help adopt SSD transparently. As it gets harder and harder to scale-up, IBM has moved to innovative scale-out architectures.
Integrated Data Center management - companies are now realizing that management and governance are critical factors of success, and that this needs to be integrated between traditional IT, private, public and hybrid cloud computing.
This was a great inspiring start for what looks like an awesome week!
By combining multiple components into a single "integrated system", IBM can offer blended disk-and-tape storage solutions. This provides the best of both worlds: high-speed access using disk, with lower costs and more energy efficiency from tape. According to a study by the Clipper Group, tape can be 23 times less expensive than disk over a 5 year total cost of ownership (TCO).
I've also covered Hierarchical Storage Management, such as my post [Seven Tiers of Storage at ABN Amro], and my role as lead architect for DFSMS on z/OS in general, and DFSMShsm in particular.
However, some explanation might be warranted on the use of these two terms with regard to SONAS. In this case, ILM refers to policy-based file placement, movement and expiration on internal disk pools. This is actually a GPFS feature that has existed for some time, and was tested to work in this new configuration. Files can be individually placed on either SAS (15K RPM) or SATA (7200 RPM) drives. Policies can be written to move them from SAS to SATA based on size, age and days non-referenced.
HSM is also a form of ILM, in that it moves data from SONAS disk to external storage pools managed by IBM Tivoli Storage Manager. A small stub is left behind in the GPFS file system indicating the file has been "migrated". Any reference to read or update this file will cause the file to be "recalled" back from TSM to SONAS for processing. The external storage pools can be disk, tape or any other media supported by TSM. Some estimate that as much as 60 to 80 percent of files on NAS have low reference and should be stored on tape instead of disk, and now SONAS with HSM makes that possible.
This distinction allows the ILM movement to be done internally, within GPFS, and the HSM movement to be done externally, via TSM. Both ILM and HSM movement take advantage of the GPFS high-speed policy engine, which can process 10 million files per node, run in parallel across all interface nodes. Note that TSM is not required for ILM movement. In effect, SONAS brings the policy-based management features of DFSMS for z/OS mainframe to all the rest of the operating systems that access SONAS.
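To make the ILM/HSM split concrete, here is a rough Python sketch of the kind of decision such policies encode. It is not GPFS policy syntax, and the thresholds, pool names and the `FileInfo` type are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    days_since_access: int


# Hypothetical thresholds; real policies would be tuned per workload.
SATA_AFTER_DAYS = 30         # ILM: move from SAS to SATA inside SONAS
TSM_AFTER_DAYS = 180         # HSM: migrate to an external TSM pool, leaving a stub
LARGE_FILE_BYTES = 1 << 30   # files of 1 GiB or more leave SAS sooner
LARGE_FILE_DAYS = 7


def choose_pool(f: FileInfo) -> str:
    """Return the pool a file would land in under these example rules."""
    if f.days_since_access >= TSM_AFTER_DAYS:
        return "external-tsm"   # HSM movement, done via TSM
    if f.days_since_access >= SATA_AFTER_DAYS:
        return "sata"           # ILM movement, internal to GPFS
    if f.size_bytes >= LARGE_FILE_BYTES and f.days_since_access >= LARGE_FILE_DAYS:
        return "sata"           # large and already cooling off
    return "sas"


for f in (
    FileInfo("report.doc", 200_000, 2),
    FileInfo("video.mpg", 3 << 30, 10),
    FileInfo("old-logs.tar", 50_000_000, 400),
):
    print(f"{f.name}: {choose_pool(f)}")
```

The first two outcomes correspond to internal ILM pools, while the `external-tsm` outcome is the case where a stub would be left behind and the file recalled on reference.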
HTTP and NIS support
In addition to NFS v2, NFS v3, and CIFS, SONAS v1.1.1 adds the HTTP protocol. IBM plans to add more protocols in subsequent releases. Let me know which protocols you are interested in, so I can pass that along to the architects designing future releases!
SONAS v1.1.1 also adds support for Network Information Service (NIS), a client/server based model for user administration. In SONAS, NIS is used for netgroup and ID mapping only. Authentication is done via Active Directory, LDAP or Samba PDC.
SONAS already had synchronous replication, which was limited in distance. Now, SONAS v1.1.1 provides asynchronous replication, using rsync, at the file level. This is done over Wide Area Network (WAN) across to any other SONAS at any distance.
Interface modules can now be configured with either 64GB or 128GB of cache. Storage now supports both 450GB and 600GB SAS (15K RPM) and both 1TB and 2TB SATA (7200 RPM) drives. However, at this time, an entire 60-drive drawer must be either all one type of SAS or all one type of SATA. I have been pushing the architects to allow each 10-pack RAID rank to be independently selectable. For now, a storage pod can have 240 drives, 60 drives of each type of disk, to provide four different tiers of storage. You can have up to 30 storage pods per SONAS, for a total of 7200 drives.
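The drive counts above are simple multiplication, sketched here in Python. The capacity figures are raw, decimal values before RAID, spares or file system overhead, so treat them as upper bounds rather than usable capacity. The variable and function names are my own.

```python
DRIVES_PER_DRAWER = 60
DRAWERS_PER_POD = 4    # 240 drives per pod, 60 per drawer
MAX_PODS = 30

# Drive sizes in GB, as listed above.
SAS_SIZES_GB = (450, 600)
SATA_SIZES_GB = (1000, 2000)


def raw_capacity_tb(drive_count: int, drive_size_gb: int) -> float:
    """Raw capacity in decimal TB, before RAID or spares."""
    return drive_count * drive_size_gb / 1000


drives_per_pod = DRIVES_PER_DRAWER * DRAWERS_PER_POD
max_drives = drives_per_pod * MAX_PODS

# One drawer of each drive type gives a four-tier pod.
four_tier_pod_tb = sum(
    raw_capacity_tb(DRIVES_PER_DRAWER, size)
    for size in SAS_SIZES_GB + SATA_SIZES_GB
)

print(f"Drives per pod: {drives_per_pod}")
print(f"Maximum drives: {max_drives}")
print(f"Four-tier pod, raw: {four_tier_pod_tb:.0f} TB")
print(f"All 2TB SATA, raw: {raw_capacity_tb(max_drives, 2000) / 1000:.1f} PB")
```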
An alternative to internal drawers of disk is a new "Gateway" iRPQ that allows the two storage nodes of a SONAS storage pod to connect via Fibre Channel to one or two XIV disk systems. You cannot mix and match; a storage pod is either all internal disk, or all external XIV. A SONAS gateway combined with external XIV is referred to as a "Smart Business Storage Cloud" (SBSC), which can be configured off premises and managed by third-party personnel so your IT staff can focus on other things.
See the Announcement Letters for the SONAS [hardware] and [software] for more details.
For those who are wondering how this positions against IBM's other NAS solution, the IBM System Storage N series, the rule of thumb is simple. If your capacity needs can be satisfied with a single N series box per location, use that. If not, consider SONAS instead. For those with non-IBM NAS filers that realize now that SONAS is a better approach, IBM offers migration services.
Both the Information Archive and the SONAS can be accessed from z/OS or Linux on System z mainframe, from "IBM i", AIX and Linux on POWER systems, all x86-based operating systems that run on System x servers, as well as any non-IBM server that has a supported NAS client.
Of course, EMC isn't the first, and won't be the last, vendor to [hear the sirens] of Cloud Computing and crash their ships on rocky shores. Just because you manufacture hardware or write software does not guarantee your success as a Cloud service provider.
(FTC disclaimer: I work for IBM. IBM is a successful public cloud service provider, offers products that can be used to deploy a private, hybrid or community cloud, and provides technology to other cloud service providers.)
An amusing excerpt from Steve Duplessie's post:
"Side Note: There is no such thing as a private cloud. A private cloud is called IT. We don’t need more terms for the same stuff."
I have to agree that when vendors like EMC say "Journey to the Private Cloud", skeptics hear "How to keep your IT administrator job by sticking with a traditional IT approach". Butchers, bakers, candlestick makers and the specialty shop "arms dealers" of Cloud Computing IT equipment may not want to see their market shrink down to a dozen or so service providers, and drum up the fear that "Public Cloud" deployments will "disintermediate" the IT staff.
But does that mean the use of term "Private Cloud" should be discontinued? The US National Institute of Standards and Technology [NIST] offers their cloud model composed of five essential characteristics, three service models, and four deployment models. Here's an excerpt:
Broad network access
Cloud Software as a Service (SaaS)
Cloud Platform as a Service (PaaS)
Cloud Infrastructure as a Service (IaaS)
Like traditional IT, a private cloud infrastructure is operated solely for an organization, so I can see how many might consider the term unnecessary. However, unlike traditional IT, a private cloud may be managed by the organization or a third party and may exist on premise or off premise.
How many traditional IT departments meet the five essential characteristics above? Instead of "on-demand self-service", many IT departments have complicated and lengthy procurement and change control procedures. A few might have "measured service" with a charge-back scheme, and a few others prefer to use a "show-back" approach instead, showing end users or managers how much IT resources are being consumed without assigning a monetary figure or other penalty. Rapid elasticity? Giving back any resource you asked for can be just as painful, because re-purposing that equipment follows the same complicated and lengthy change control procedures.
Just like the term "intranet" refers to a private network that employs Internet standards and technologies, I feel the term "private cloud" is useful, representing an infrastructure that meets the above criteria, employing Public Cloud standards and technologies, that can distinguish itself from traditional IT in key ways that provide business value.
What I do hope "vaporizes" is all the hype, and all the misuse of the Cloud terminology out there.
Well, I am off on a much-needed vacation. For my American readers, this weekend represents our "4th of July" Independence Day holiday. What better way to celebrate than to drive hundreds of miles from one side of the country to the other? In this case, from the North side down to the South side.
I am armed with two books on this subject. The first is part of a series on American Road Trips, which details the roadside attractions to be found along the Great River Road. We will start up in Minnesota, and work our way Southward, covering a total of eight states in eight days along the Mississippi River.
The second book is Alton Brown's "Feasting on Asphalt, the River Run". This book describes Alton's ride Northward up the Mississippi river, detailing the restaurants and foods he enjoyed, so I will have to read the chapters in reverse.
Special thanks to Roy Buol, mayor of Dubuque, Iowa, whom I [met in Scottsdale earlier this year], for the idea to come visit his fine city, considered one of the Smarter Cities in the USA, thanks to IBM technology.
I don't know if I will have internet access along the way, or have the time and/or energy to blog, tweet (@az990tony) or upload photos during the trip. We'll see.
Congratulations to my colleague and close friend, Harley Puckett, who celebrated his 25th anniversary of service here at IBM. This is known internally as joining the "Quarter Century Club" or QCC. This is not just a figure of speech; the members of this club hold get-togethers and barbeques throughout the year.
Here is Harley welcoming Ken Hannigan and others he worked with back in Tivoli Storage Manager (TSM) software development.
Our manager, Bill Terry, presenting Harley with a plaque.
Continuing my saga for my [New Laptop], I have gotten all my programs operational, transferred and organized all my data, and am now ready for testing. You can read my previous posts on this series: [Day 1], [Day 2], [Day 3], [Day 4].
At this point, you might be thinking, "Testing? Just use your laptop already, deal with problems as you find them!" In my case, I need to sign off that the new laptop meets my needs, and then send back my previous laptop, wiped clean of all passwords and data. I have until the end of June to do this.
The value of testing is to avoid problems later, perhaps at an inconvenient time such as a business trip or client briefing. It is better to work out any issues while I am still in the office, connected to the internal IBM intranet on a high-speed wired connection. Also, I plan to do a Physical-to-Virtual (P-to-V) conversion of my Windows XP C: drive to run as a virtual guest OS on Linux, so I want to make sure the image is in working order before the conversion. That said, here is what my testing encountered.
Of the 134 applications I had identified as being installed on my old laptop, I determined that I only needed about 70 of them. The others I did not bother to install on the new one.
I had not thought about "addons" and "plugins" that I have that attach themselves inside browsers or other applications. I made sure that Flash, Shockwave and Java worked correctly on all three browsers: IE6, Firefox and Opera.
One of my "plugins" is an application called [iSpring Pro], which plugs into Microsoft PowerPoint. I thought I had Microsoft Office installed, but found out the standard IBM build had only the viewers. I installed Microsoft Office 2003 Standard Edition with PowerPoint, Excel and Word. I then realized that I did not have the original v4.3 installation file for iSpring Pro, so I downloaded the latest v5 from their website. However, my license key is only for version 4. A quick email got this resolved, and the nice folks at iSpring Solutions sent me the v4.3 installation file.
Shameless Plug: We use iSpring Pro to record our voices with PowerPoint slides to generate web videos for the [IBM Virtual Briefing Center] which we use to complement face-to-face briefings. This allows attendees to review introductory materials to prepare for their visit to Tucson, or to stay up-to-date on products and features in between annual visits. If you have not checked out the IBM Virtual Briefing Center, now is a good time to see what videos and other resources we have out there. You can even request to schedule a briefing in Tucson!
Testing out iSpring Pro, I realized that there are no jacks for my headset. On my old ThinkPad T60, I had two jacks, one green for headphone and one pink for microphone. My headset has two cables, one for each, which I then use for the recordings. I also use this for online webinars and training sessions. Apparently, ThinkPad T410 went for a single 3.5mm "Combo" audio jack that handles both roles. Fortunately, there is a [Headset Buddy] adapter that merges the two cables from my headset to the combo jack on my new laptop. I ordered one which will arrive some time next week.
My new laptop doesn't fit my old docking station either. I had set the docking station aside while I had the two laptops latched together for the file transfers, but now that I am done with the old laptop, I discovered that my new T410 doesn't fit. I ordered a new one.
Using find, grep, awk, sort and uniq, I was able to generate a list of all the file extensions in my Documents folder. I was able to find old Lotus 123, Freelance Graphics, and Wordpro files. I thought Lotus Symphony would handle these, but it does not. I was able to install an old version of Lotus Smartsuite that includes these programs so that I can process these files.
I also found pptx, docx and xlsx files in the extensions list, which represent the new Microsoft Office 2007 formats. I installed the "Format Compatibility Pack" that allows Office 2003 to read these files.
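I did this with shell tools, but for anyone who prefers a script, a rough Python equivalent of the extension inventory might look like the sketch below. The `count_extensions` name and the `Documents` location under the home directory are assumptions for illustration.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root: Path) -> Counter:
    """Count files under root by lower-cased extension ('' means no extension)."""
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


if __name__ == "__main__":
    docs = Path.home() / "Documents"   # assumed location of the folder
    if docs.is_dir():
        # Most common extensions first, like sort | uniq -c | sort -rn.
        for ext, n in count_extensions(docs).most_common():
            print(f"{n:6d}  {ext or '(none)'}")
```

Lower-casing the suffix groups `.PPTX` and `.pptx` together, which matters on a drive that started life on Windows.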
Lastly, I installed a few programs that support a wide variety of file formats. VideoLAN's [VLC] plays a variety of audio and video files. [7-Zip] packs and unpacks a variety of archive files. (Note: Another program, BitZipper, also supports a variety of archive formats, but the install will corrupt your Firefox and IE browsers with new tool bars, change your search engine default, and install a lot of other unwanted software. Cleaning up the mess can be time-consuming. You have been warned!) I also installed [MadEdit], a binary/hex/text editor that will open any file to see what kind of format it has inside. From this, I was able to determine that some of my extension-less files were GIF, RTF or PDF format, and rename them accordingly.
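What I did by eye in the hex editor can also be scripted. GIF files begin with `GIF87a` or `GIF89a`, PDF files with `%PDF-`, and RTF files with `{\rtf`. The Python sketch below checks only those three signatures and only suggests a new name rather than renaming anything. The function names are my own.

```python
from pathlib import Path
from typing import Optional

# Leading byte signatures for the three formats mentioned above.
SIGNATURES = {
    b"GIF87a": ".gif",
    b"GIF89a": ".gif",
    b"%PDF-": ".pdf",
    b"{\\rtf": ".rtf",
}


def sniff_extension(head: bytes) -> Optional[str]:
    """Return the extension matching the first bytes of a file, if known."""
    for magic, ext in SIGNATURES.items():
        if head.startswith(magic):
            return ext
    return None


def suggest_rename(path: Path) -> Optional[Path]:
    """Suggest a new name for an extension-less file, or None to leave it alone."""
    if path.suffix:
        return None   # already has an extension
    with path.open("rb") as fh:
        ext = sniff_extension(fh.read(16))
    return path.with_name(path.name + ext) if ext else None
```

Calling `suggest_rename(Path("scan0001"))` on a file that starts with `%PDF-` would return `scan0001.pdf`, and the actual rename can then be reviewed before it is applied.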
With the testing done, I am ready to go wipe my old system of all passwords and data!
Continuing my saga for my [New Laptop], I have gotten all my programs operational, and now it is a good time to re-evaluate how I organize my data. You can read my previous posts on this series: [Day 1], [Day 2], [Day 3].
I started my career at IBM developing mainframe software. The naming convention was simple: dataset names (DSN) could be up to 44 characters long, divided into qualifiers separated by periods. Each qualifier could be up to 8 characters long. The first qualifier was called the "high level qualifier" (HLQ) and the last one was the "low level qualifier" (LLQ). Standard naming conventions helped with ownership and security (RACF), catalog management, policy-based management (DFSMS), and data format identification. For example:
In the first case, we see that the HLQ is "PROD" for production, the application is PAYROLL and this file holds job control language (JCL). The LLQ often identified the file type. The second can be a version for testing a newer version of this application. The third represents user data, in which case my userid PEARSON would have my own written TEST JCL. I have seen successful naming conventions with 3, 4, 5 and even 6 qualifiers. The full dataset name remains the same, even if it is moved from one disk to another, or migrated to tape.
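Here is a minimal Python sketch of those length rules. The three dataset names are assumptions reconstructed from the description above, and the function names are my own. The sketch checks only the 44-character and 8-character limits; z/OS also restricts which characters a qualifier may contain, which is not checked here.

```python
MAX_DSN_LENGTH = 44
MAX_QUALIFIER_LENGTH = 8


def split_dsn(dsn: str) -> list:
    """Split a dataset name into qualifiers, enforcing the length limits only."""
    if not dsn or len(dsn) > MAX_DSN_LENGTH:
        raise ValueError(f"dataset name must be 1 to {MAX_DSN_LENGTH} characters")
    qualifiers = dsn.split(".")
    for q in qualifiers:
        if not 1 <= len(q) <= MAX_QUALIFIER_LENGTH:
            raise ValueError(f"qualifier {q!r} must be 1 to {MAX_QUALIFIER_LENGTH} characters")
    return qualifiers


def hlq(dsn: str) -> str:
    """High level qualifier: the first one."""
    return split_dsn(dsn)[0]


def llq(dsn: str) -> str:
    """Low level qualifier: the last one, often the file type."""
    return split_dsn(dsn)[-1]


# Assumed example names, matching the three cases described above.
for name in ("PROD.PAYROLL.JCL", "TEST.PAYROLL.JCL", "PEARSON.TEST.JCL"):
    print(f"{name}: HLQ={hlq(name)} LLQ={llq(name)}")
```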
(We had to help one client who had all their files with single qualifier names, no more than 8 characters long, all in the Master Catalog (root directory). They wanted to implement RACF and DFSMS, and needed help converting all of their file names and related JCL to a 4-qualifier naming convention. It took seven months to make this transformation, but the client was quite pleased with the end result.)
While the mainframe has a restrictive approach to naming files, the operating systems on personal computers provide practically unlimited choices. File systems like NTFS or EXT3 support filenames as long as 255 characters, and pathnames up to roughly 32,000 characters. The problem is that when you move a file from one disk to another, or even from one directory structure to another, the pathname will change. If you rely on the pathname to provide critical information about the meaning or purpose of a file, that information could get lost when moving the files around.
I found several websites that offered organization advice. On The Happiness Project blog, Gretchen Rubin [busts 11 myths] about organization. On the Zenhabits blog, Leo Babauta offers [18 De-cluttering tips].
Peter Walsh's [Tip No. 185] suggests using nouns to describe each folder. Granted, these tips are about physical objects in your home or office, but some of the concepts can apply to digital objects on your disk drive.
"Use the computer’s sorting function. Put “AAA” (or a space) in front of the names of the most-used folders and “ZZZ” (or a bullet) in front of the least-used ones, so the former float to the top of an alphabetical list and the latter go to the bottom."
Personally, I hate spaces anywhere in directory and file names, and the thought of putting a space at the front of one to make it float to the top is even worse. Rather than resorting to naming folders with AAA or ZZZ, why not just limit the total number of files or directories so they are all visible on the screen? I often sort by date to reach my most frequently accessed or most recently updated files.
Of all the suggestions I found, Peter Walsh's "Use Nouns" seemed to be the most useful. Wikipedia has a fascinating article on [Biological Classification]. Certainly, if all living things can be put into classifications with only seven levels, we should not need more than seven levels of file system directory structure either! So, this is how I decided to organize my files on my new ThinkPad T410:
Windows XP operating system programs and applications. I have structured this so that if I had to replace my hard disk entirely while traveling, I could get a new drive and restore just the operating system on this drive, and a few critical data files needed for the trip. I could then do a full recovery when I was back in the office. If I was hit with a virus that prevented Windows from booting up, I could re-install the Windows (or Linux) operating system without affecting any of my data.
This will be for my most active data, files and databases. I have the Windows "My Documents" point to D:\Documents directory. Under Archives, I will keep files for events that have completed, projects that have finished, and presentations I used that year. If I ever run out of space on my disk drive, I would delete or move off these archives first. I have a single folder for all Downloads, which I can then move to a more appropriate folder after I decide where to put them. My Office folder holds administrative items, like org charts, procedures, and so on.
As a consultant, many of my files relate to Events, which could be Briefings, Conferences, Meetings or Workshops. These usually last one to five days, so here I keep background materials for the clients involved, agendas, my notes on what transpired, and so on. I keep my Presentations separately, organized by topic. I am also involved with Projects that might span several months, or ongoing tasks and assignments. I keep my Resources separately as well; these could be templates, training materials, marketing research, whitepapers, and analyst reports.
A few folders I keep outside of this structure on the D: drive. [Evernote] is an application that provides "folksonomy" tagging. This is great in that I can access it from my phone, my laptop, or my desktop at home. Install-files holds all those ZIP and EXE files needed to install applications after a fresh Windows install. If I ever had to wipe clean my C: drive and re-install Windows, I would then have this folder on the D: drive to upgrade my system. Finally, I keep my Lotus Notes database directory on my D: drive. Since these are database (NSF) files accessed directly by Lotus Notes, I saw no reason to put them under the D:\Documents directory structure.
This will be for my multimedia files. These don't change often, are mostly read-only, and could be restored quickly as needed.
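The seven-level ceiling borrowed from biological classification is easy to check with a script. The sketch below is one possible way to do it in Python, using the standard os.walk function. The function name folders_deeper_than and the choice to count levels relative to a starting folder (for example D:\Documents) are assumptions made for illustration.

```python
import os


def folders_deeper_than(root, max_levels=7):
    """Return folders nested more than max_levels below root.

    Only the first offending folder on each branch is reported;
    anything beneath it is deeper still, so it is not walked.
    """
    root = os.path.abspath(root)
    too_deep = []
    for current, subfolders, _files in os.walk(root):
        relative = os.path.relpath(current, root)
        depth = 0 if relative == "." else len(relative.split(os.sep))
        if depth > max_levels:
            too_deep.append(current)
            subfolders[:] = []  # stop os.walk from descending further
    return too_deep
```

An empty list means every folder under the starting point sits within the limit.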
I'll give this new re-organization a try. Since I have to take a fresh backup to Tivoli Storage Manager anyway, now is the best time to re-organize the directory structure and update my dsm.opt options file.