Introduction
Modeling is a powerful way to solve complex problems. According to the book Modeling Languages in Mathematical Optimization (see Resources), models are used to:
 Explain phenomena
 Make predictions
 Assess key factors
 Identify extreme
 Analyze tradeoffs
In business, spreadsheet software like Microsoft® Excel is often the first choice for modeling problems. Now, spreadsheets are often very intuitive, but they can be limited in what they can ultimately do for largescale problems. If you're a developer, you might find it more expedient simply to write a script to solve a modeling problem, because then you can easily integrate the script into other systems. This article introduces the basics of linear optimization in Python using the Pyomo library.
You might be familiar with algebraic modeling languages such as AMPL, AIMMS, and GAMS. The advantage of using Pyomo is that its modeling objects are embedded within a highlevel programming language with a rich set of supporting libraries. Another key advantage to doing scientific or mathematical computing in Python is that the language can easily translate algebraic notation into an expression in code. In addition, there are many mature libraries you can leverage. An example is NumPy (see Resources), a library that provides additional data structures such as a multidimensional array. The NumPy library is used in a code example later in the article because it answers the requirement for a multidimensional data structure.
Getting started
To start, install Pyomo. Pyomo is a central component of Coopr, a collection of Python software packages. Download the coopr_install script, which creates a Python virtual environment when you run it with the Python interpreter.
To create a relative directory named "coopr":
noahsMacBookAir% python coopr_install
Use the following command to start Pyomo, which puts the relative directory and its executables in your path:
noahsMacBookAir% source coopr/bin/activate
Use the Pyomo "help
command to
get help for using pyomo:
(coopr)noahsMacBookAir% pyomo help
You need a compiler to use both the Virtual Python Environment builder (virtualenv) and Pyomo. On OS X, use the XCode Developer Tools commandline tools. On Linux, use the GNU Compiler Collection (GCC). Once this virtual environment is initialized, you can solve problems with Pyomo in one of two ways:
 Use the Pyomo commandline tool:
(coopr)noahsMacBookAir% pyomo my_problem.py solver=glpk
 Or, embed the initialization inside your script and run it via the Python interpreter:
Listing 1. Invoking Pyomo inside a script
#This is an optional code path that allows the script to be run outside of #pyomo commandline. For example: python wyndor.py if __name__ == '__main__': #This replicates what the pyomo commandline tools does from coopr.opt import SolverFactory opt = SolverFactory("glpk") instance = model.create() results = opt.solve(instance) #sends results to stdout results.write()
Pyomo assumes you have at least one solver installed. The GNU Linear Programming Kit (glpk) is the default. Refer to the installation instructions for the solver you want to use. Then make sure Pyomo is able to find this solver on its path.
Modeling strategies
There are two ways to create a data model with Pyomo: Abstract and concrete. The key difference is whether to separate the model and the data.
 In an abstract model, you separate the model and the data.
 In a concrete model, you define the data inside the model itself. For example, in a Pyomo script that uses a concrete model, everything is defined inside a single script file. For simpler problems, especially teaching problems, this can be an efficient approach. On the other hand, larger problems often have larger data sets, which makes it impractical to embed the data in the same script as the code needed to work on the data.
At a high level, a Pyomo model consists of variables, objectives, and constraints.
 Variables are values that are calculated during the optimization.
 Objective are Python functions that take data and variables, and is either maximized or minimized.
 Constraints are a way of defining an expression that limits the values a variable can assume.
When creating a Pyomo model, you need to undertand the Python method of notating algebraic expressions. Here is a quick summary that compares expressions between a spreadsheet, algebra, and Python:
Figure 1. Table of notation comparisons
Pyomo application example: Wyndor
To illustrate a simple linear optimization problem using Pyomo, let's take a productmix problem from the book Introduction to Management Science (see Resources). The Wyndor factory produces doors and windows. Each of three plants with different available hours can produce either doors or windows. To determine how to maximize the profit available in the limited amount of hours of production, write the following script. It has three main sections: model, objective, and constraint.
 The model — it's a concrete model — instantiates the data of the problem, such as the hours available from plants.
 The objective (which can be either to maximize or minimize) is to maximize profit.
 The constraint uses a CapacityRule function, written using a Python idiom called list comprehension.
Here's the script for the Wyndor optimization problem:
Listing 2. Wyndor Pyomo basic example
"""Wyndor model from Hillier and Hillier *Introduction to Management Science* One way to do it... Original Author dlw Modified ndg To run this you need pyomo and the glpk solver installed, and need to be inside of the virtual environment. When these dependencies are installed you can solve this way (glpk is default): pyomo wyndor.py """ #Explicit importing makes it more clear what a user needs to define #versus what is included within the pyomo library from coopr.pyomo import (ConcreteModel, Objective, Var, NonNegativeReals, maximize, Constraint) Products = ['Doors', 'Windows'] ProfitRate = {'Doors':300, 'Windows':500} Plants = ['Door Fab', 'Window Fab', 'Assembly'] HoursAvailable = {'Door Fab':4, 'Window Fab':12, 'Assembly':18} HoursPerUnit = {('Doors','Door Fab'):1, ('Windows', 'Window Fab'):2, ('Doors','Assembly'):3, ('Windows', 'Assembly'):2, ('Windows', 'Door Fab'):0, ('Doors', 'Window Fab'):0} #Concrete Model model = ConcreteModel() #Decision Variables model.WeeklyProd = Var(Products, within=NonNegativeReals) #Objective model.obj = Objective(expr= sum(ProfitRate[i] * model.WeeklyProd[i] for i in Products), sense = maximize) def CapacityRule(model, p): """User Defined Capacity Rule Accepts a pyomo Concrete Model as the first positional argument, and and a plant index as a second positional argument """ return sum(HoursPerUnit[i,p] * model.WeeklyProd[i] for i in Products) <= HoursAvailable[p] This statement is what Pyomo needs to generate one constraint for each plant model.Capacity = Constraint(Plants, rule = CapacityRule) #This is an optional code path that allows the script to be run outside of #pyomo commandline. For example: python wyndor.py if __name__ == '__main__': #This replicates what the pyomo commandline tools does from coopr.opt import SolverFactory opt = SolverFactory("glpk") instance = model.create() results = opt.solve(instance) #sends results to stdout results.write()
Let's run the script:
(coopr)noahsMacBookAir% pyomo wyndor.py
Pyomo returns the following output, which shows one feasible solution that achieves a $3,600 profit:
Listing 3. One feasible solution with $3,600 profit
(coopr)noahsMacBookAir% pyomo wyndor.py [ 0.00] Setting up Pyomo environment [ 0.00] Applying Pyomo preprocessing actions [ 0.00] Creating model [ 0.00] Applying solver [ 0.01] Processing results Number of solutions: 1 Solution Information Gap: 0.0 Status: feasible Function Value: 3600.0 Solver results file: results.json [ 0.02] Applying Pyomo postprocessing actions [ 0.02] Pyomo Finished
Let's find out which ratio of products you should produce. To do that,
cat
the output file results.json. The results show this:
"Variable": { "WeeklyProd[Doors]": { "Id": 0, "Value": 2.0 }, "WeeklyProd[Windows]": { "Id": 1, "Value": 6.0
From this output, you see that producing six windows and two doors gives you a profit maximization of $3,600.
Another way you can solve the script is to use Python because of the conditional statement added. This invokes Pyomo inside of the script, instead of using Pyomo to invoke the script.
Heuristic optimization: The greedy randomized traveling salesman
The book In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation by Bill Cook contains a great example of a problem that doesn't always scale well as a linear program. It's the Traveling Salesman Problem, or TSP: Given a list of cities, find the shortest possible route that visits each city exactly once and returns to the original city. The worstcase running time that solves the traveling salesman problem increases exponentially with the number of cities.
Figure 2 is a graph that simulates the amount of time necessary to solve for every possible route, assuming it takes a few seconds to solve for a couple of cities. As you can see, it rapidly becomes impossible to solve the problem in your lifetime.
Figure 2. Traveling salesman exponential time to solve
You need another strategy to find a good result. Often a good, but not a guaranteed optimal, solution is achieved by running simulations and then taking the best results of those simulations. An example of this approach is shown in the Python code below. Randomly select a starting city, and then greedily select the next city with the shortest distance. This simulation can then be run N, or multiple times, and the shortest path among the simulations will be run.
Listing 4. Greedy shortestdistance selections
noahsMacBookAir% python tsp_greedy_random_start.py 20 Running simulation 20 times Shortest Distance: 128 Optimal Route: [ ('PCG', 'GPS', 1), ('GPS', 'URS', 1), ('URS', 'WFC', 0), ('WFC', 'MCK', 2), ('MCK', 'SFO', 16), ('SFO', 'ORCL', 20), ('ORCL', 'HPQ', 12), ('HPQ', 'GOOG', 6), ('GOOG', 'AAPL', 11), ('AAPL', 'INTC', 8), ('INTC', 'CSCO', 6), ('CSCO', 'EBAY', 0), ('EBAY', 'SWY', 32), ('SWY', 'CVX', 13) ]
Listing 5. TSP greedy random start
#!/usr/bin/env python """ Traveling salesman solution with random start and greedy path selection You can select how many iterations to run by doing the following: python tsp_greedy_random_start.py 20 #runs 20 times """ import sys from random import choice import numpy as np from routes import values dt = np.dtype([('city_start', 'S10'), ('city_end', 'S10'), ('distance', int)]) data_set = np.array(values,dtype=dt) def all_cities(mdarray): """Takes a multidimensional array in this format and finds unique cities array([["A", "A"], ["A", "B"]]) """ cities = {} city_set = set(data_set['city_end']) for city in city_set: cities[city] = "" return cities def randomize_city_start(all_cities): """Returns a randomized city to start trip""" return choice(all_cities) def get_shortest_route(routes): """Sort the list by distance and return shortest distance route""" route = sorted(routes, key=lambda dist: dist[2]).pop(0) return route def greedy_path(): """Select the next path to travel based on the shortest, nearest path""" itinerary = [] cities = all_cities(data_set) starting_city = randomize_city_start(cities.keys()) #print "starting_city: %s" % starting_city cities_visited = {} #we want to iterate through all cities once count = 1 while True: possible_routes = [] distance = [] #print "starting city: %s" % starting_city for path in data_set: if starting_city in path['city_start']: #we can't go to cities we have already visited if path['city_end'] in cities_visited: continue else: #print "path: ", path possible_routes.append(path) if not possible_routes: break #append this to itinerary route = get_shortest_route(possible_routes) #print "Route(%s): %s " % (count, route) count += 1 itinerary.append(route) #add this city to the visited city list cities_visited[route[0]] = count #print "cities_visited: %s " % cities_visited #reset the starting_city to the next city starting_city = route[1] #print "itinerary: %s" % itinerary return itinerary def get_total_distance(complete_itinerary): distance = sum(z for x,y,z in complete_itinerary) return distance def lowest_simulation(num): routes = {} for i in range(num): itinerary = greedy_path() distance = get_total_distance(itinerary) routes[distance] = itinerary shortest_distance = min(routes.keys()) route = routes[shortest_distance] return shortest_distance, route def main(): """runs everything""" if len(sys.argv) == 2: iterations = int(sys.argv[1]) print "Running simulation %s times" % iterations distance, route = lowest_simulation(iterations) print "Shortest Distance: %s" % distance print "Optimal Route: %s" % route else: #print "All Routes: %s" % data_set itinerary = greedy_path() print "itinerary: %s" % itinerary print "Distance: %s" % get_total_distance(itinerary) if __name__ == '__main__': main()
The TSP script uses a NumPy multidimensional array as the source of data. If you want to run this example, you need to install NumPy (see Resources). The array is defined in a file called routes.py. Here is what those routes look like.
Listing 6. NumPy multidimensional array containing citytocity routes
values = [ ("AAPL", "CSCO", 14), ("AAPL", "CVX", 44), ("AAPL", "EBAY", 14), ("AAPL", "GOOG", 14), ("AAPL", "GPS", 59), ("AAPL", "HPQ", 14), ("AAPL", "INTC", 8), ("AAPL", "MCK", 60), ("AAPL", "ORCL", 26), ("AAPL", "PCG", 59), ("AAPL", "SFO", 46), ("AAPL", "SWY", 37), ("AAPL", "URS", 60), ("AAPL", "WFC", 60), ("CSCO","AAPL" ,14), ("CSCO", "CVX", 43), ("CSCO", "EBAY",0), ("CSCO", "GOOG",21), ("CSCO", "GPS",67), ("CSCO", "HPQ",26), ("CSCO", "INTC",6), ("CSCO", "MCK",68), ("CSCO", "ORCL",37), ("CSCO", "PCG",68), ("CSCO", "SFO",57), ("CSCO", "SWY",32), ("CSCO", "URS",68), ("CSCO", "WFC",68), ("CVX","AAPL" ,44), ("CVX", "CSCO",43), ("CVX", "EBAY",43), ("CVX", "GOOG",36), ("CVX", "GPS",43), ("CVX", "HPQ",40), ("CVX", "INTC",41), ("CVX", "MCK",46), ("CVX", "ORCL",39), ("CVX", "PCG",44), ("CVX", "SFO",45), ("CVX", "SWY",13), ("CVX", "URS",44), ("CVX", "WFC",44), ("EBAY","AAPL" ,14), ("EBAY", "CSCO",0), ("EBAY", "CVX",43), ("EBAY", "GOOG",21), ("EBAY", "GPS",67), ("EBAY", "HPQ",26), ("EBAY", "INTC",6), ("EBAY", "MCK",68), ("EBAY", "ORCL",37), ("EBAY", "PCG",68), ("EBAY", "SFO",57), ("EBAY", "SWY",32), ("EBAY", "URS",68), ("EBAY", "WFC",68), ("GOOG","AAPL",11), ("GOOG", "CSCO",21), ("GOOG", "CVX",36), ("GOOG", "EBAY",21), ("GOOG", "GPS",48), ("GOOG", "HPQ",6), ("GOOG", "INTC",15), ("GOOG", "MCK",49), ("GOOG", "ORCL",16), ("GOOG", "PCG",48), ("GOOG", "SFO",36), ("GOOG", "SWY",32), ("GOOG", "URS",49), ("GOOG", "WFC",49), ("GPS","AAPL" ,59), ("GPS", "CSCO",67), ("GPS", "CVX",43), ("GPS", "EBAY",67), ("GPS", "GOOG",48), ("GPS", "HPQ",45), ("GPS", "INTC",62), ("GPS", "MCK",03), ("GPS", "ORCL",34), ("GPS", "PCG",01), ("GPS", "SFO",18), ("GPS", "SWY",53), ("GPS", "URS",01), ("GPS", "WFC",01), ("HPQ","AAPL" ,14), ("HPQ", "CSCO",26), ("HPQ", "CVX",40), ("HPQ", "EBAY",26), ("HPQ", "GOOG",6), ("HPQ", "GPS",45), ("HPQ", "INTC",20), ("HPQ", "MCK",46), ("HPQ", "ORCL",12), ("HPQ", "PCG",46), ("HPQ", "SFO",32), ("HPQ", "SWY",37), ("HPQ", "URS",46), ("HPQ", "WFC",46), ("INTC","AAPL",8), ("INTC","CSCO",6), ("INTC", "CVX",41), ("INTC", "EBAY",6), ("INTC", "GOOG",15), ("INTC", "GPS",62), ("INTC", "HPQ",20), ("INTC", "MCK",63), ("INTC", "ORCL",31), ("INTC", "PCG",62), ("INTC", "SFO",51), ("INTC", "SWY",32), ("INTC", "URS",63), ("INTC", "WFC",63), ("MCK", "AAPL",60), ("MCK", "CSCO",68), ("MCK", "CVX",46), ("MCK", "EBAY",68), ("MCK", "GOOG",49), ("MCK", "GPS",03), ("MCK", "HPQ",46), ("MCK", "INTC",63), ("MCK", "ORCL",34), ("MCK", "PCG",3), ("MCK", "SFO",16), ("MCK", "SWY",56), ("MCK", "URS",3), ("MCK", "WFC",2), ("ORCL", "AAPL",22), ("ORCL", "CSCO",37), ("ORCL", "CVX",39), ("ORCL", "EBAY",37), ("ORCL", "GOOG",16), ("ORCL", "GPS",34), ("ORCL", "HPQ",12), ("ORCL", "INTC",31), ("ORCL", "MCK",34), ("ORCL", "PCG",35), ("ORCL", "SFO",20), ("ORCL", "SWY",40), ("ORCL", "URS",35), ("ORCL", "WFC",35), ("PCG", "AAPL",59), ("PCG", "CSCO",68), ("PCG", "CVX",44), ("PCG", "EBAY",68), ("PCG", "GOOG",48), ("PCG", "GPS",01), ("PCG", "HPQ",46), ("PCG", "INTC",62), ("PCG", "MCK",03), ("PCG", "ORCL",35), ("PCG", "SFO",18), ("PCG", "SWY",54), ("PCG", "URS",01), ("PCG", "WFC",01), ("SFO", "AAPL",46), ("SFO", "CSCO",57), ("SFO", "CVX",45), ("SFO", "EBAY",57), ("SFO", "GOOG",36), ("SFO", "GPS",18), ("SFO", "HPQ",32), ("SFO", "INTC",51), ("SFO", "MCK",16), ("SFO", "ORCL",20), ("SFO", "PCG",18), ("SFO", "SWY",52), ("SFO", "URS",18), ("SFO", "WFC",18), ("SWY", "AAPL",37), ("SWY", "CSCO",32), ("SWY", "CVX",13), ("SWY", "EBAY",32), ("SWY", "GOOG",32), ("SWY", "GPS",53), ("SWY", "HPQ",37), ("SWY", "INTC",32), ("SWY", "MCK",56), ("SWY", "ORCL",40), ("SWY", "PCG",54), ("SWY", "SFO",52), ("SWY", "URS",54), ("SWY", "WFC",54), ("URS", "AAPL",60), ("URS", "CSCO",68), ("URS", "CVX",44), ("URS", "EBAY",68), ("URS", "GOOG",49), ("URS", "GPS",01), ("URS", "HPQ",46), ("URS", "INTC",63), ("URS", "MCK",03), ("URS", "ORCL",35), ("URS", "PCG",01), ("URS", "SFO",18), ("URS", "SWY",54), ("URS", "WFC",0), ("WFC", "AAPL",60), ("WFC", "CSCO",68), ("WFC", "CVX",44), ("WFC", "EBAY",68), ("WFC", "GOOG",49), ("WFC", "GPS",01), ("WFC", "HPQ",46), ("WFC", "INTC",63), ("WFC", "MCK",02), ("WFC", "ORCL",35), ("WFC", "PCG",01), ("WFC", "SFO",18), ("WFC", "SWY",54), ("WFC", "URS",0), ]
Conclusion
In this article, I covered the basics of doing optimization with Pyomo and Python. This should be enough to get you started on your own. In the next two articles in this series, I walk through some more detailed examples and cover some pragmatic aspects of building larger programs and dealing with scaling issues.
Acknowledgements
Special thanks to Dr. David L. Woodruff for reviewing this article.
Resources
Learn
 These two books are a highly recommended source of information on optimization:
 Pyomo – Optimization Modeling in Python, by W. E. Hart et al.
 Introduction to Management Science: A Modeling and Case Studies Approach with Spreadsheets, by David Anderson et al.
 Join the discussion at the Pyomo mailing list.
 Learn more about modeling and solving real world optimization problems in the book Modeling Languages in Mathematical Optimization, edited by Joseph Kallrath.
 In the book In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation, the author William J. Cook explores the various algorithms and branches of mathematics used to solve the traveling salesman problem, including the branch of mathematics known as linear programming.
 Study some notation examples from Dr. David L. Woodruff.
 Learn more about cloud computing technologies at cloud at developerWorks.
 Access IBM SmartCloud Enterprise.
 Follow developerWorks on Twitter.
 Watch developerWorks demos ranging from product installation and setup demos for beginners, to advanced functionality for experienced developers.
Get products and technologies
 Pyomo is a central component of Coopr, a collection of Python software packages. Download the Pyomo installer.
 Download the Virtual Python Environment builder (virtualenv).
 Read about and download NumPy, the fundamental package for scientific computing with Python.
 Download the GLPK (GNU Linear Programming Kit) package, used to solve largescale linear programming (LP), mixed integer programming (MIP), and other related problems.
 Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.
Discuss
 Get involved in the developerWorks community. Connect with other developerWorks users while exploring the developerdriven blogs, forums, groups, and wikis.
Comments
Dig deeper into Cloud computing on developerWorks

developerWorks Premium
Exclusive tools to build your next great app. Learn more.

Cloud newsletter
Crazy about Cloud? Sign up for our monthly newsletter and the latest cloud news.

Try SoftLayer Cloud
Deploy public cloud instances in as few as 5 minutes. Try the SoftLayer public cloud instance for one month.