Skip to main content

If you don't have an IBM ID and password, register here.

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. This profile includes the first name, last name, and display name you identified when you registered with developerWorks. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

All information submitted is secure.

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

Improve performance with self-configuring distributed systems

Research into dynamic hardware-resources allocation to handle variable workloads

Jonathan Wildstrom (jwildstr@us.ibm.com), Software Engineer, IBM, Software Group
Jonathan Wildstrom is a third-year Ph.D. student at the University of Texas at Austin. He completed his undergraduate work in Computer Science and Discrete Mathematics at Carnegie Mellon University in 1999. Immediately following that, he began work at IBM on the AIX operating system kernel, where he focuses on the areas of process management and memory allocators. His current research focuses on self-optimizing distributed systems and autonomous reallocation of resources.
Peter Stone (pstone@cs.utexas.edu), Assistant Professor, University of Texas at Austin
Dr. Peter Stone is an Alfred P. Sloan Research Fellow and Assistant Professor in the Department of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in Computer Science in 1998 from Carnegie Mellon University. From 1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles Research Department at AT&T Labs - Research. Peter's research interests include machine learning, multiagent systems, robotics, and e-commerce. In 2003, he won a CAREER award from the National Science Foundation for his research on learning agents in dynamic, collaborative, and adversarial multiagent environments. In 2004, he was named an ONR Young Investigator for his research on machine learning on physical robots.
Emmett Witchel (witchell@cs.utexas.edu), Assistant Professor, University of Texas at Austin
Emmett Witchel is an Assistant Professor in computer science at the University of Texas at Austin. His award-winning doctoral dissertation at MIT was on Mondriaan Memory Protection, an efficient, fine-grained memory-protection system. While at MIT he published work on reducing energy consumption in caches, and low-power instruction sets. In 1997 he co-founded a company that developed a multiplatform static instrumentation technology for efficient monitoring of program control flow. Before MIT he published several papers as part of Stanford University's SimOS project, including Embra, still the fastest reported full machine simulator. He is interested in computer architecture and how the architecture is used by operating systems and compilers, and in applying machine learning to systems.
Raymond J. Mooney (mooney@cs.utexas.edu), Professor, University of Texas at Austin
Raymond J. Mooney is a Professor in the Department of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in 1988 from the University of Illinois at Urbana/Champaign. He is an author of more than 100 published research papers, primarily in the area of machine learning. He is program co-chair for the 2006 National Conference on Artificial Intelligence, a former co-chair of the 1990 International Conference on Machine Learning, a former editor of the Machine Learning journal, and a Fellow of the American Association for Artificial Intelligence. His recent research has focused on learning for natural-language processing, text mining, relational learning, semi-supervised learning, bioinformatics, and autonomic computing.
Mike Dahlin (dahlin@cs.utexas.edu), Associate Professor, University of Texas at Austin
Mike Dahlin received his Ph.D. in Computer Science from the University of California at Berkeley in 1995 under the supervision of professors Tom Anderson and Dave Patterson, and he joined the faculty at the University of Texas at Austin in 1996. He received the NSF CAREER award in 1998, the Alfred P. Sloan Research Fellowship in 2000, a departmental Faculty Fellowship in Computer Science in 1999, a Cisco University Research Award in 2000, and IBM Faculty Partnership Awards in 2000, 2001, 2002, and 2004. His research interests include Internet- and large-scale services, fault tolerance and security, operating systems, distributed systems, and filesystems.

Summary:  Recent advances in hardware development have made adaptive hardware configuration possible. For example, processors and memory can be dynamically added to or removed from a running system. This article establishes that such adaptive configuration can improve system performance when workloads vary.

Date:  06 Dec 2005
Level:  Intermediate

Comments:  

Introduction

This article presents the initial steps toward a distributed system that can optimize its performance by learning to reconfigure CPU and memory resources in reaction to current workload. We present a learning framework that uses standard system-monitoring tools to identify preferable configurations and their quantitative performance effects. The framework requires no instrumentation of the middleware or of the operating system. Using results from an implementation of the TPC Benchmark™ W (TPC-W) online transaction-processing benchmark, we demonstrate a significant performance benefit to reconfiguration in response to workload changes.

Our research was supported in part by NSF CAREER award IIS-0237699, a DARPA ACIP grant, and an IBM faculty award.

As an experimental testbed for our research, we constructed an implementation of the TPC-W benchmark (see Resources) using commonly available hardware and software. TPC-W provides a well-defined simulation of an online store that lets you vary the workload and evaluate the system under varying demands.

Our findings represent three important steps toward the eventual goal of constructing a fully adaptive online system:

  1. We establish that dynamically reconfiguring hardware in response to workload changes has the potential to improve performance.
  2. Using uninstrumented middleware and given only raw, low-level system statistics, we show that it's possible to predict which of two configurations will outperform the other at any given time.
  3. We extend this prediction capability to make precise numerical predictions of the quantitative change in performance when the system is switched to each alternative configuration. With such information, performance gains can be traded off against inevitable reconfiguration costs.

This article describes our experimental setup, details our work in establishing the importance of autonomous reconfiguration, reports our results, and discusses the potential impact of our work.


Experimental setup

Large servers are now available that can be partitioned into one or more logical subsystems (see Resources). Each logical system has memory and processors available to it, so it can operate as if it were an independent physical machine. Letting each logical subsystem run its own instance of the operating system (OS) prevents them from interfering with one another through resource contention.

These servers can also be flexibly configured to allocate different amounts of memory and processing resources to the logical subsystems. But allocation of resources to the subsystems that maximizes performance for one set of workloads might be suboptimal for other workloads. Real-time reconfiguration might be needed to maximize performance under a variable workload.

We simulated reconfiguration of logical subsystems on multiple desktop computers to perform our research. The remainder of this section details the testbed setup.

TPC-W

The TPC-W Benchmark is a standardized benchmark designed to determine the relative performance of a System Under Test (SUT) when used to run an online bookstore. The benchmark operates by having an external machine or set of machines -- the Remote Browser Emulators (RBEs) -- run a set of Emulated Browsers (EBs). The EBs represent the bookstore's individual customers. The customers can browse through the store, view products, perform searches, and sometimes place orders.

The TPC-W specification defines the relative probabilities of the customers' actions. It defines three workloads, called mixes:

  • The shopping mix represents normal operation of the system. 80% of the accesses are users browsing the available catalog, and 20% of the accesses are orders being placed.
  • The browsing mix represents a slow commerce period. 95% of the users are browsing, and only 5% are ordering.
  • The ordering mix represents a rush on the latest hot book. Browsing and ordering users are evenly split.

A total of 14 Web pages can be retrieved. They are divided into six browsing pages and eight ordering pages. The probability of a customer moving from a given page to any other page is well defined by the specification, and each page has its own expected response time.

Pages are generated dynamically in response to user queries, and some pages require significantly more processing than others. For example, because some of the ordering pages update the prices and stock in a highly used database, they consume more resources than simply pulling up the home page. Results are normally measured in Web Interactions per Second (WIPS) -- the average number of page requests that return in a second.

Commercially built, tested, and published systems often have one main database server and many independent Web servers. They also often have distinct Web cache servers, image servers, and load balancers. For simplicity, our setup uses one database server (the back end) and one Web server (the front end), as illustrated in Figure 1. The WIPS numbers our system produces, though three orders of magnitude less than those of a large commercial system, are not out of the ordinary for experimental systems.


Figure 1. The three machines used in the physical setup
Physical setup

In Figure 1, the thick, dashed rectangles represent physical machines. The dotted rectangles are processes, and the innermost rounded rectangles are logical units. Network connections are shown as lines and come together at the point where they are managed. Apache Tomcat (see Software) handles routing of connections to the application and image servers, while the physical machine coalesces the individual EBs' network connections.

Software

A TPC-W implementation requires three software modules to support the SUT and drive the benchmark: a database server, an application server, and an image server. We use PostgreSQL 7.4.6 as the database server. The front end uses Apache Tomcat 5.5.4 as a combined application server and image server. The Java™ code the application server runs to generate the Web pages (and interface with the database) is derived from the code freely available from the University of Wisconsin PHARM project (see Resources). This code implements both the servlets and a Java RBE, which is used to run the benchmark. We made slight modifications to this code to let it work with Tomcat and PostgreSQL (see Resources). The RBE was modified to retry any inabilities to connect to the front end, rather than treat them as fatal errors.

Hardware

The system's physical setup uses three identical desktop workstations with a 2.8 GHz Pentium 4 processor and 2 GB of RAM each. The machines are networked using built-in gigabit Ethernet interfaces and a gigabit Ethernet switch. As you can see in Figure 1, one machine acts as the back-end database machine, one is the front-end Web server, and a third machine drives the benchmark by hosting the RBE.

These physically distinct computers are meant to represent logical partitions of a single reconfigurable computer with a total of 2.8 GHz processing power and 2 GB of RAM. Memory and CPU power are artificially constrained on each machine to simulate partitioning of one such machine into a front-end and back-end machine. Overall, one full 2.8 GHz processor and 2 GB of RAM are available for use by the front end and back end combined. By constraining CPU power and memory simultaneously, we can simulate any desired hardware configuration. The processes are also designed to be reconfigurable on the fly. We can simulate reconfiguring the system to give more memory or CPU power to one machine by first constraining the other to the new requirement, and then unconstraining the newly available memory or CPU power on the target machine.


Adaptive configuration matters

It's not self-evident that adaptive configuration is necessary to maximize the throughput of a TPC-W system. Our first goal was to verify that that the best configuration is a function of workload. We identified two workloads and two configurations such that when running workload x, configuration A gives better results, and when running workload y, configuration B gives better results (see Table 1).


Table 1. WIPS results for each configuration and workload
Configuration AConfiguration B
Workload xq>r
Workload ys<t

One way to achieve the dynamic resource allocation for workloads and configurations with the relationship shown in Table 1 is to run both the Web server and database in the same virtual machine and let the OS manage the processor and memory. Such a system is unlikely to outperform our split system unless great care is taken with the CPU and memory allocation priorities of the database and Web server processes. Commercial systems isolate the database on a single machine because database performance is sensitive to resource availability. When the OS takes resources from the database, it usually harms database performance because the OS lacks sufficient information to choose exactly the resources that will not harm performance. We plan to run an experiment to verify this effect as part of future work.

In Table 1, q, r, s, and t are the WIPS results for the given configuration and workload.

For all workloads in TPC-W, the database does more processing than the Web server. Our initial experiments showed that the configurations that maximized throughput dedicated much of the CPU to the database back-end machine. Detailed experimentation on the shopping mix indicated that with about 7/8 of the CPU (about 2.5 GHz) on the back end, and the remaining 1/8 on the front end, the system is roughly in balance.

After high-level analysis of 15 workloads run on 15 configurations with roughly 7/8 of the CPU on the back end and various splits of memory, we identified five likely workloads and two configurations for further investigation. The first configuration (Configuration 1) had 27/32 of the CPU and 3/8 of the memory on the back end. Configuration 1 maximized the throughput of the workload with 200 EBs running the ordering mix, and workloads of 250 and 300 EBs running the shopping mix. The second configuration (Configuration 2) had 30/32 of the CPU and 5/8 of the memory on the back end. Configuration 2 led to higher performance for workloads of 350 and 400 EBs running the browsing mix (see Table 2).


Table 2. Mean WIPS for chosen configurations over a variety of workloads (standard deviations in parentheses)
Workload (EBs)
200 ordering250 shopping300 shopping 350 browsing400 browsing
Configuration 119.61(0.39)16.30(0.22)16.46(0.26)12.73(0.27) 12.22(0.51)
Configuration 218.79(0.35)14.96(0.12)14.86(0.21)13.10(0.22)13.10(0.25)

In order to establish conclusively that configuration matters in this case, we tested each configuration and workload independently 53 times. To eliminate any interference between tests, the database and servlet engine were started before and shut down after each test, and the database files were copied from originals. This prevented any growth in the databases, or servlet persistence issues, from having an impact on the results.

Analysis of the data confirmed that there were statistically significant differences between the two configurations, shown in Table 2. The probability that the differences were due to random chance was less than 10-10 in all cases. Note especially the large differences in WIPS between the two configurations with 250 and 300 EBs running the shopping mix, and the difference in WIPS with 400 EBs running the browsing mix. These large differences conclusively confirm that neither configuration is optimal in all situations. This result establishes the need for adaptive system configuration in order to take advantage of the optimal allocation of resources for the current workload.


Learning the best configuration

The results so far show that the system can improve its performance if it can adapt its configuration as the workload changes. But in practice the workload isn't an observable quantity to the system. This section presents results indicating that the optimal configuration can be determined from low-level OS statistics with no customized instrumentation. The lack of instrumentation lets this approach work for any TPC-W implementation, regardless of the software used to implement the database, Web server, and so on.

In addition to collecting WIPS results during experiments, the front-end and back-end machines also collect low-level system statistics. The statistics are collected in parallel on both machines through the vmstat command, a commonly available system activity reporting utility (see Table 3). In order to determine the currently optimal configuration, we aimed to create a model that maps current system state, as represented by vmstat, to the optimal configuration. Using the results reported in Table 2 as training data, standard machine-learning methods can be used to learn such a model.


Table 3. Statistics reported by vmstat
Processes (number)RunnableBlocked
Memory (KB)VM usedIdle
InactiveActive
Swapping (KB/s)Swapped inSwapped out
I/O (blocks/s)ReceivedSent
System (per second)InterruptsContext switches
CPU (%)UserSystem
IdleWaiting

The Weka package

The Weka package implements many machine-learning algorithms for exactly this purpose (see Resources). In order to obtain human-understandable output, we applied the JRip rule learner (see Resources) to the training data. As a baseline for analyzing the learned rules, accuracy can be compared to the model that always predicts the most likely outcome (Configuration 1 in this case). The accuracy of this baseline learner is 61.9%. By comparison, JRip learned the rules shown in Figure 2, yielding a prediction accuracy of 93.0%.


Figure 2. JRip rules learned by WEKA
1. If (Number of front-end system interrupts <= 1392.8)
and (Number of blocks sent by the front end to devices <= 201.2)
then Best configuration = Configuration 2

2. If (Number of front-end context switches <= 422.0)
and (Number of runnable processes on the back end >= 18.4)
then Best configuration = Configuration 2

3. If (Number of blocks received by the front end 
   from devices >= 499.5)
then Best configuration = Configuration 2

4. If (Percentage of CPU time spent by the back end
   in the kernel <= 12.4%)
then Best configuration = Configuration 2

5. Else Best configuration = Configuration 1

The evaluation of JRip's rules was performed using stratified 10-fold cross validation, in which results are averaged over 10 separate trials where the learning algorithm is training on 90% of the data and tested on the remaining 10% held out as independent test cases.

As desired, the rules learned by JRip are human-interpretable. Rules 1 through 4 indicate cases in which Configuration 2 is the preferred configuration, while Rule 5 classifies all remaining cases as preferring Configuration 1. These four rules can be divided into two sets: Rules 1 and 2 help identify situations where the front-end system has excess CPU available that the back end could use. Rules 3 and 4 determine that the back end is overutilized.

Of the first two rules, Rule 1 indicates a situation where the front system appears to be underutilized. Weka finds thresholds for the number of system interrupts taken by the front end and how many blocks it is sending to assorted block devices (most likely the network sockets). If the front end falls under both thresholds, the back end is the bottleneck. The second rule chooses a different method -- the number of context switches per second -- for determining if the front end is underloaded. However, it also uses a threshold on the back-end machine to determine that there are more processes runnable than can be handled with the current configuration and that the additional CPU would be helpful.

The rules in Figure 2 yield accurate predictions based entirely on low-level system data that's readily available independently of the system's software components. This feature lets our approach generalize to a wide variety of scenarios, including different TPC-W implementations and potentially to more-varied distributed-system scenarios.

The third rule indicates that the back end is receiving blocks (both from the network and the disk) at a fast enough rate that it would benefit from more CPU and memory. Finally, the fourth rule indicates that the CPU is spending very little time handling kernel-level work. This rule is a little odd. However, Weka is likely determining that the back end does not have enough CPU to handle the necessary system-space work, and is trying to handle too many things in user-space code simultaneously.


Learning the benefit of switching configurations

Even more useful than predicting the best configuration is an ability to predict the actual benefit of switching configurations, in terms of increased (or decreased) WIPS. This ability is important because some cost is always involved in switching configurations. This cost needs to be taken into account when considering a configuration change.

For example, consider moving a CPU from one logical machine to another. Even when this move can be done without impacting the actual current work, the CPU will still be unavailable for a period of time. A similar problem applies to moving memory between machines. In a worst-case scenario, the machine might be temporarily unavailable while the system reallocates the hardware.

In Table 5, the mean absolute error is the average of the absolute values of the prediction errors, and the root mean squared error is the square root of the average of the squares of the prediction errors. These results were determined using 10-fold cross validation, as before. The differences that we tried to predict are fairly small (around 1 WIPS). However, not only is the accuracy given by the M5P trees good enough that it can be used to predict if a configuration switch would help, but the standard deviations are even smaller than most of those associated with the training data. This implies that the error in the WIPS predictions could easily be masked by normal noise.

The Weka package includes an algorithm for M5P model trees that can learn function approximations (see Resources). We used this algorithm to predict the changes in WIPS when changing configuration. As before, the learner had access to only the averaged raw system-level data from both systems and the current configuration. The change in WIPS was defined to be 0 for a change to the current configuration. Accuracy of the M5P trees was compared to a baseline learner that always predicted a constant change in WIPS equal to the overall average.

Statistical analysis of the results indicates a sizeable improvement over the baseline learner. The complete M5P trees are too complex to display, but the small degree of prediction error (see Table 4) shows that the change in WIPS is a learnable function, without the need for internal instrumentation in the middleware.


Table 4. Error results of predicting the change in WIPS from one configuration to another
TargetLearnerMean abs.Root mean sq.
Configuration 1Baseline0.610.78
M5P0.180.37
Configuration 2Baseline0.610.78
M5P0.180.49

Potential impact

The differences between the results of the winning and losing configurations are fairly small (about 1 WIPS). Two aspects of the original PHARM code hamper its ability to get high performance out of a system and lead to these small disparities. First, the code always searches the database, rather than returning a previously generated copy of a page. This places an undue amount of workload on the database, particularly for the browsing and shopping mixes.

Second, the code that processes an order serializes all code that writes to the database, thereby preventing the database from being properly loaded (or overloaded) when running the ordering mix. These two properties of PHARM contributed to the difficulty of locating the breaking point for the CPU. They also make some results counterintuitive. For example, the ordering mix is designed to be far more back-end intensive than the browsing mix. But because the browsing mix constantly accesses the database in parallel to retrieve records, while the front end serializes many ordering accesses to the database in the ordering mix, the back-end is stressed much more in the browsing mixes. This explains why Configuration 2 wins on the two browsing workloads, when intuition says that it should lose. Modifying the PHARM code to eliminate these two problems is an area of future work.

The work this article reports is a step toward building a self-configuring system. An implementation of this approach on real configurable hardware remains to be done, but it is possible to analyze its potential impact. For example, consider a workload that alternates evenly between 300 EBs running the shopping mix and 400 EBs running the browsing mix. The data in Table 3 indicates that Configuration 1 would have an approximate throughput of 14.34 WIPS, while Configuration 2 would have an approximate throughput of 13.98 WIPS. By comparison, a system running adaptively, using the JRip learner, would have a throughput of 14.69 WIPS, if switching time is negligible. This computation assumes that 93.0% of the time, the winning WIPS value is attained, and 7.0% of the time, the losing WIPS value is attained. These numbers represent a 2.4% performance gain over Configuration 1 and a 5.1% gain over Configuration 2. As the disparities between the configurations grow (for example, if the differences were 5 WIPS instead of 1), this gain will also grow.

Our work makes two main contributions. First, it demonstrates that there is a need for self-configuring systems. Some systems might have a clear optimal division of resources to maximize throughput, but in certain situations this division is workload-dependent. Because of this reliance on the workload, the learners we have outlined here appear to have a real use in distributed systems.

Second, having learners able to predict the change in WIPS is a critical step toward a fully functional self-configuring system able to maximize performance on the TPC-W benchmark. Because the cost involved in changing configurations can vary, this prediction allows a threshold to be set that would control when enough benefit would result from a configuration change to overcome the temporary cost of moving resources around.

A key feature of this work is that no instrumentation of any code is necessary. All learning was done based upon easily available raw system statistics. Because no middleware instrumentation is necessary, it is possible to change any or all parts of the system, including the front-end and back-end software, the TPC-W implementation, or even the OS or physical hardware. Retraining would be necessary, but no additional code modifications would be needed. Finally, the entire system could be replaced with another benchmark, such as Sun's Java Pet Store Web-based store simulation (see Resources). In contrast, most other adaptive systems appear to need instrumentation built into the applications and OS.


Conclusion

The rapid development of reconfigurable servers indicates that they will become more commonly used. As this hardware is deployed, it can be used for distributed applications where isolating the front and back ends is desirable, but where extra hardware is unneeded. In these cases, the server will need some form of adaptability to deal with changing workloads.

We've presented preliminary research into methods to handle variable workloads by dynamically reallocating hardware resources between the machines. In addition to showing that autonomous reconfiguration has the potential to improve performance, two learners were presented that predict preferable configuration changes with high accuracy. Our ongoing research agenda includes fully automating the adaptive process under dynamically changing workloads, first on our simulated reconfigurable machines and eventually on true reconfigurable hardware.


Resources

Learn

Get products and technologies

  • TPC-W: TPC Benchmark™ W (TPC-W) is a transactional Web benchmark that measures system performance in a controlled Internet commerce environment that simulates the activities of a business-oriented transactional Web server.

  • Java TPC-W Implementation Distribution: A publicly available implementation of the TPC-W benchmark, written completely in the Java programming language, from the PHARM Project at the University of Wisconsin.

  • Weka: A collection of machine-learning algorithms for data-mining tasks.

  • Java pet store demo: A sample application demonstrating how to use the capabilities of the Java EE 1.3 platform to develop scalable, cross-platform enterprise applications.

About the authors

Jonathan Wildstrom

Jonathan Wildstrom is a third-year Ph.D. student at the University of Texas at Austin. He completed his undergraduate work in Computer Science and Discrete Mathematics at Carnegie Mellon University in 1999. Immediately following that, he began work at IBM on the AIX operating system kernel, where he focuses on the areas of process management and memory allocators. His current research focuses on self-optimizing distributed systems and autonomous reallocation of resources.

Peter Stone

Dr. Peter Stone is an Alfred P. Sloan Research Fellow and Assistant Professor in the Department of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in Computer Science in 1998 from Carnegie Mellon University. From 1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles Research Department at AT&T Labs - Research. Peter's research interests include machine learning, multiagent systems, robotics, and e-commerce. In 2003, he won a CAREER award from the National Science Foundation for his research on learning agents in dynamic, collaborative, and adversarial multiagent environments. In 2004, he was named an ONR Young Investigator for his research on machine learning on physical robots.

Emmett Witchel

Emmett Witchel is an Assistant Professor in computer science at the University of Texas at Austin. His award-winning doctoral dissertation at MIT was on Mondriaan Memory Protection, an efficient, fine-grained memory-protection system. While at MIT he published work on reducing energy consumption in caches, and low-power instruction sets. In 1997 he co-founded a company that developed a multiplatform static instrumentation technology for efficient monitoring of program control flow. Before MIT he published several papers as part of Stanford University's SimOS project, including Embra, still the fastest reported full machine simulator. He is interested in computer architecture and how the architecture is used by operating systems and compilers, and in applying machine learning to systems.

Raymond J. Mooney

Raymond J. Mooney is a Professor in the Department of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in 1988 from the University of Illinois at Urbana/Champaign. He is an author of more than 100 published research papers, primarily in the area of machine learning. He is program co-chair for the 2006 National Conference on Artificial Intelligence, a former co-chair of the 1990 International Conference on Machine Learning, a former editor of the Machine Learning journal, and a Fellow of the American Association for Artificial Intelligence. His recent research has focused on learning for natural-language processing, text mining, relational learning, semi-supervised learning, bioinformatics, and autonomic computing.

Mike Dahlin

Mike Dahlin received his Ph.D. in Computer Science from the University of California at Berkeley in 1995 under the supervision of professors Tom Anderson and Dave Patterson, and he joined the faculty at the University of Texas at Austin in 1996. He received the NSF CAREER award in 1998, the Alfred P. Sloan Research Fellowship in 2000, a departmental Faculty Fellowship in Computer Science in 1999, a Cisco University Research Award in 2000, and IBM Faculty Partnership Awards in 2000, 2001, 2002, and 2004. His research interests include Internet- and large-scale services, fault tolerance and security, operating systems, distributed systems, and filesystems.

Report abuse help

Report abuse

Thank you. This entry has been flagged for moderator attention.


Report abuse help

Report abuse

Report abuse submission failed. Please try again later.


developerWorks: Sign in

If you don't have an IBM ID and password, register here.


Forgot your IBM ID?


Forgot your password?
Change your password


By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. This profile includes the first name, last name, and display name you identified when you registered with developerWorks. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

(Must be between 3 – 31 characters.)


By clicking Submit, you agree to the developerWorks terms of use.

 


Rate this article

Comments

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Tivoli
ArticleID=99815
ArticleTitle=Improve performance with self-configuring distributed systems
publish-date=12062005
author1-email=jwildstr@us.ibm.com
author1-email-cc=
author2-email=pstone@cs.utexas.edu
author2-email-cc=
author3-email=witchell@cs.utexas.edu
author3-email-cc=
author4-email=mooney@cs.utexas.edu
author4-email-cc=
author5-email=dahlin@cs.utexas.edu
author5-email-cc=

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

For articles in technology zones (such as Java technology, Linux, Open source, XML), Popular tags shows the top tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), Popular tags shows the top tags for just that product zone.

For articles in technology zones (such as Java technology, Linux, Open source, XML), My tags shows your tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), My tags shows your tags for just that product zone.

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).