Linear optimization in Python, Part 1: Solve complex problems in the cloud with Pyomo

Write a script to solve a modeling problem

Explore ways to model optimization applications in Python using Python Optimization Modeling Objects (Pyomo), an open source tool. You can use Pyomo to define symbolic problems, create concrete problem instances, and solve these instances with standard solvers. This article series shows how to leverage Pyomo's ability to integrate with Python to model optimization applications. This first article covers the basics. Part 2 shows how to add more tools and build a scalable architecture.

Noah Gift (noah.gift@giftcs.com), Founder, CTO, Giftcs

Noah GiftNoah Gift is an experienced technical leader and software developer at AT&T Interactive. He solves interesting problems in a variety of languages including Python/Iron Python, Erlang, F#, C#, and JavaScript. (He's also worked at Caltech, Disney Feature Animation, Sony Imageworks, and Weta Digital.) A member of the Python Software Foundation, he is also an author of many developerWorks articles and the co-author of Python For Unix and Linux System Administration. He earned a BS in Nutritional Science from Cal Poly San Luis Obispo, an MS in Computer Information Systems from CSULA, and is an MBA Candidate at UC Davis specializing in business analytics, finance, and entrepreneurship. In his spare time, he composes for the piano and runs in marathons. Find him at his web site, on Twitter, or for consulting.



05 February 2013

Also available in Chinese

Introduction

Modeling is a powerful way to solve complex problems. According to the book Modeling Languages in Mathematical Optimization (see Resources), models are used to:

  • Explain phenomena
  • Make predictions
  • Assess key factors
  • Identify extreme
  • Analyze trade-offs

In business, spreadsheet software like Microsoft® Excel is often the first choice for modeling problems. Now, spreadsheets are often very intuitive, but they can be limited in what they can ultimately do for large-scale problems. If you're a developer, you might find it more expedient simply to write a script to solve a modeling problem, because then you can easily integrate the script into other systems. This article introduces the basics of linear optimization in Python using the Pyomo library.

You might be familiar with algebraic modeling languages such as AMPL, AIMMS, and GAMS. The advantage of using Pyomo is that its modeling objects are embedded within a high-level programming language with a rich set of supporting libraries. Another key advantage to doing scientific or mathematical computing in Python is that the language can easily translate algebraic notation into an expression in code. In addition, there are many mature libraries you can leverage. An example is NumPy (see Resources), a library that provides additional data structures such as a multi-dimensional array. The NumPy library is used in a code example later in the article because it answers the requirement for a multi-dimensional data structure.

Getting started

To start, install Pyomo. Pyomo is a central component of Coopr, a collection of Python software packages. Download the coopr_install script, which creates a Python virtual environment when you run it with the Python interpreter.

To create a relative directory named "coopr":

noahs-MacBook-Air% python coopr_install

Use the following command to start Pyomo, which puts the relative directory and its executables in your path:

noahs-MacBook-Air% source coopr/bin/activate

Use the Pyomo "--help command to get help for using pyomo:

(coopr)noahs-MacBook-Air% pyomo --help

You need a compiler to use both the Virtual Python Environment builder (virtualenv) and Pyomo. On OS X, use the XCode Developer Tools command-line tools. On Linux, use the GNU Compiler Collection (GCC). Once this virtual environment is initialized, you can solve problems with Pyomo in one of two ways:

  • Use the Pyomo command-line tool:
    (coopr)noahs-MacBook-Air% pyomo my_problem.py --solver=glpk
  • Or, embed the initialization inside your script and run it via the Python interpreter:
    Listing 1. Invoking Pyomo inside a script
    #This is an optional code path that allows the script to be run outside of
    #pyomo command-line.  For example:  python wyndor.py
    if __name__ == '__main__':
       
        #This replicates what the pyomo command-line tools does
        from coopr.opt import SolverFactory
        opt = SolverFactory("glpk")
        instance = model.create()
        results = opt.solve(instance)
        #sends results to stdout
        results.write()

Pyomo assumes you have at least one solver installed. The GNU Linear Programming Kit (glpk) is the default. Refer to the installation instructions for the solver you want to use. Then make sure Pyomo is able to find this solver on its path.

Modeling strategies

There are two ways to create a data model with Pyomo: Abstract and concrete. The key difference is whether to separate the model and the data.

  • In an abstract model, you separate the model and the data.
  • In a concrete model, you define the data inside the model itself. For example, in a Pyomo script that uses a concrete model, everything is defined inside a single script file. For simpler problems, especially teaching problems, this can be an efficient approach. On the other hand, larger problems often have larger data sets, which makes it impractical to embed the data in the same script as the code needed to work on the data.

At a high level, a Pyomo model consists of variables, objectives, and constraints.

  • Variables are values that are calculated during the optimization.
  • Objective are Python functions that take data and variables, and is either maximized or minimized.
  • Constraints are a way of defining an expression that limits the values a variable can assume.

When creating a Pyomo model, you need to undertand the Python method of notating algebraic expressions. Here is a quick summary that compares expressions between a spreadsheet, algebra, and Python:

Figure 1. Table of notation comparisons
Table of notation comparisons

Pyomo application example: Wyndor

To illustrate a simple linear optimization problem using Pyomo, let's take a product-mix problem from the book Introduction to Management Science (see Resources). The Wyndor factory produces doors and windows. Each of three plants with different available hours can produce either doors or windows. To determine how to maximize the profit available in the limited amount of hours of production, write the following script. It has three main sections: model, objective, and constraint.

  • The model — it's a concrete model — instantiates the data of the problem, such as the hours available from plants.
  • The objective (which can be either to maximize or minimize) is to maximize profit.
  • The constraint uses a CapacityRule function, written using a Python idiom called list comprehension.

Here's the script for the Wyndor optimization problem:

Listing 2. Wyndor Pyomo basic example
"""Wyndor model from Hillier and Hillier *Introduction to Management Science*

One way to do it...
Original Author dlw
Modified ndg 

To run this you need pyomo and the glpk solver installed, and need to be
inside of the virtual environment.
When these dependencies are installed you can solve this way (glpk is default):
    
    pyomo wyndor.py

"""

#Explicit importing makes it more clear what a user needs to define
#versus what is included within the pyomo library
from coopr.pyomo import (ConcreteModel, Objective, Var, NonNegativeReals,
                              maximize, Constraint)

Products = ['Doors', 'Windows']
ProfitRate = {'Doors':300, 'Windows':500}
Plants = ['Door Fab', 'Window Fab', 'Assembly']
HoursAvailable = {'Door Fab':4, 'Window Fab':12, 'Assembly':18}
HoursPerUnit = {('Doors','Door Fab'):1, ('Windows', 'Window Fab'):2,
                ('Doors','Assembly'):3, ('Windows', 'Assembly'):2,
                ('Windows', 'Door Fab'):0, ('Doors', 'Window Fab'):0}

#Concrete Model
model = ConcreteModel()
#Decision Variables
model.WeeklyProd = Var(Products, within=NonNegativeReals)

#Objective
model.obj = Objective(expr=
            sum(ProfitRate[i] * model.WeeklyProd[i] for i in Products),
            sense = maximize)

def CapacityRule(model, p):
    """User Defined Capacity Rule
    
    Accepts a pyomo Concrete Model as the first positional argument,
    and and a plant index as a second positional argument
    """
    
    return sum(HoursPerUnit[i,p] * model.WeeklyProd[i] for i in Products) 
	  <= HoursAvailable[p]

This statement is what Pyomo needs to generate one constraint for each plant
model.Capacity = Constraint(Plants, rule = CapacityRule)

#This is an optional code path that allows the script to be run outside of
#pyomo command-line.  For example:  python wyndor.py
if __name__ == '__main__':
   
    #This replicates what the pyomo command-line tools does
    from coopr.opt import SolverFactory
    opt = SolverFactory("glpk")
    instance = model.create()
    results = opt.solve(instance)
    #sends results to stdout
    results.write()

Let's run the script:

(coopr)noahs-MacBook-Air% pyomo wyndor.py

Pyomo returns the following output, which shows one feasible solution that achieves a $3,600 profit:

Listing 3. One feasible solution with $3,600 profit
    (coopr)noahs-MacBook-Air% pyomo wyndor.py  
[    0.00] Setting up Pyomo environment
[    0.00] Applying Pyomo preprocessing actions
[    0.00] Creating model
[    0.00] Applying solver
[    0.01] Processing results
    Number of solutions: 1
    Solution Information
      Gap: 0.0
      Status: feasible
      Function Value: 3600.0
    Solver results file: results.json
[    0.02] Applying Pyomo postprocessing actions
[    0.02] Pyomo Finished

Let's find out which ratio of products you should produce. To do that, cat the output file results.json. The results show this:

"Variable": {
                "WeeklyProd[Doors]": {
                    "Id": 0, 
                    "Value": 2.0
                }, 
                "WeeklyProd[Windows]": {
                    "Id": 1, 
                    "Value": 6.0

From this output, you see that producing six windows and two doors gives you a profit maximization of $3,600.

Another way you can solve the script is to use Python because of the conditional statement added. This invokes Pyomo inside of the script, instead of using Pyomo to invoke the script.

Heuristic optimization: The greedy randomized traveling salesman

The book In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation by Bill Cook contains a great example of a problem that doesn't always scale well as a linear program. It's the Traveling Salesman Problem, or TSP: Given a list of cities, find the shortest possible route that visits each city exactly once and returns to the original city. The worst-case running time that solves the traveling salesman problem increases exponentially with the number of cities.

Figure 2 is a graph that simulates the amount of time necessary to solve for every possible route, assuming it takes a few seconds to solve for a couple of cities. As you can see, it rapidly becomes impossible to solve the problem in your lifetime.

Figure 2. Traveling salesman exponential time to solve
Traveling salesman exponential time to solve

You need another strategy to find a good result. Often a good, but not a guaranteed optimal, solution is achieved by running simulations and then taking the best results of those simulations. An example of this approach is shown in the Python code below. Randomly select a starting city, and then greedily select the next city with the shortest distance. This simulation can then be run N, or multiple times, and the shortest path among the simulations will be run.

Listing 4. Greedy shortest-distance selections
noahs-MacBook-Air% python tsp_greedy_random_start.py 20 
Running simulation 20 times
Shortest Distance: 128
Optimal Route: [
 ('PCG', 'GPS', 1), 
 ('GPS', 'URS', 1),
 ('URS', 'WFC', 0),
 ('WFC', 'MCK', 2), 
 ('MCK', 'SFO', 16), 
 ('SFO', 'ORCL', 20), 
 ('ORCL', 'HPQ', 12), 
 ('HPQ', 'GOOG', 6), 
 ('GOOG', 'AAPL', 11), 
 ('AAPL', 'INTC', 8), 
 ('INTC', 'CSCO', 6), 
 ('CSCO', 'EBAY', 0), 
 ('EBAY', 'SWY', 32), 
 ('SWY', 'CVX', 13)
]
Listing 5. TSP greedy random start
#!/usr/bin/env python
"""
Traveling salesman solution with random start and greedy path selection
You can select how many iterations to run by doing the following:

python tsp_greedy_random_start.py 20 #runs 20 times


"""

import sys
from random import choice
import numpy as np
from routes import values

dt = np.dtype([('city_start', 'S10'), ('city_end', 'S10'), ('distance', int)])
data_set = np.array(values,dtype=dt)

def all_cities(mdarray):
    """Takes a multi-dimensional array in this format and finds unique cities
    
    array([["A", "A"],
    ["A", "B"]])
    
    """
    cities = {}
    city_set = set(data_set['city_end'])
    for city in city_set:
        cities[city] = ""
    return cities
    
def randomize_city_start(all_cities):
    """Returns a randomized city to start trip"""
    
    return choice(all_cities)

def get_shortest_route(routes):
    """Sort the list by distance and return shortest distance route"""

    route = sorted(routes, key=lambda dist: dist[2]).pop(0)
    return route

def greedy_path():
    """Select the next path to travel based on the shortest, nearest path"""

    itinerary = []
    cities = all_cities(data_set)
    starting_city = randomize_city_start(cities.keys())
    #print "starting_city: %s" % starting_city
    cities_visited = {}
    #we want to iterate through all cities once
    count = 1
    while True:
        possible_routes = []
        distance = [] 
        #print "starting city: %s" % starting_city
        for path in data_set:
            if starting_city in path['city_start']:
                #we can't go to cities we have already visited
                if path['city_end'] in cities_visited:
                    continue
                else:
                    #print "path: ", path
                    possible_routes.append(path)
        
        if not possible_routes:
            break
        #append this to itinerary
        route = get_shortest_route(possible_routes)
        #print "Route(%s): %s " % (count, route)
        count += 1
        itinerary.append(route)
        #add this city to the visited city list
        cities_visited[route[0]] = count
        #print "cities_visited: %s " % cities_visited
        #reset the starting_city to the next city
        starting_city = route[1]
        #print "itinerary: %s" % itinerary
    
    return itinerary

def get_total_distance(complete_itinerary):
    
    distance = sum(z for x,y,z in complete_itinerary)
    return distance

def lowest_simulation(num):
    
    routes = {}
    for i in range(num):
        itinerary = greedy_path()
        distance = get_total_distance(itinerary)
        routes[distance] = itinerary
    shortest_distance = min(routes.keys())
    route = routes[shortest_distance]
    return shortest_distance, route

def main():
    """runs everything"""
    
    if len(sys.argv) == 2:
        iterations = int(sys.argv[1])
        print "Running simulation %s times" % iterations
        distance, route = lowest_simulation(iterations)
        print "Shortest Distance: %s" % distance
        print "Optimal Route: %s" % route
    else:
        #print "All Routes: %s" % data_set
        itinerary = greedy_path()
        print "itinerary: %s" % itinerary
        print "Distance: %s" % get_total_distance(itinerary)
    

if __name__ == '__main__':
   main()

The TSP script uses a NumPy multi-dimensional array as the source of data. If you want to run this example, you need to install NumPy (see Resources). The array is defined in a file called routes.py. Here is what those routes look like.

Listing 6. NumPy multi-dimensional array containing city-to-city routes
values = [
("AAPL", "CSCO", 14),
("AAPL", "CVX", 44),
("AAPL", "EBAY", 14),
("AAPL", "GOOG", 14),
("AAPL", "GPS", 59),
("AAPL", "HPQ", 14),
("AAPL", "INTC", 8),
("AAPL", "MCK", 60),
("AAPL", "ORCL", 26),
("AAPL", "PCG", 59),
("AAPL", "SFO", 46),
("AAPL", "SWY", 37),
("AAPL", "URS", 60),
("AAPL", "WFC", 60),

("CSCO","AAPL" ,14),
("CSCO", "CVX", 43),
("CSCO", "EBAY",0),
("CSCO", "GOOG",21),
("CSCO", "GPS",67),
("CSCO", "HPQ",26),
("CSCO", "INTC",6),
("CSCO", "MCK",68),
("CSCO", "ORCL",37),
("CSCO", "PCG",68),
("CSCO", "SFO",57),
("CSCO", "SWY",32),
("CSCO", "URS",68),
("CSCO", "WFC",68),

("CVX","AAPL" ,44),
("CVX", "CSCO",43),
("CVX", "EBAY",43),
("CVX", "GOOG",36),
("CVX", "GPS",43),
("CVX", "HPQ",40),
("CVX", "INTC",41),
("CVX", "MCK",46),
("CVX", "ORCL",39),
("CVX", "PCG",44),
("CVX", "SFO",45),
("CVX", "SWY",13),
("CVX", "URS",44),
("CVX", "WFC",44),

("EBAY","AAPL" ,14),
("EBAY", "CSCO",0),
("EBAY", "CVX",43),
("EBAY", "GOOG",21),
("EBAY", "GPS",67),
("EBAY", "HPQ",26),
("EBAY", "INTC",6),
("EBAY", "MCK",68),
("EBAY", "ORCL",37),
("EBAY", "PCG",68),
("EBAY", "SFO",57),
("EBAY", "SWY",32),
("EBAY", "URS",68),
("EBAY", "WFC",68),

("GOOG","AAPL",11),
("GOOG", "CSCO",21),
("GOOG", "CVX",36),
("GOOG", "EBAY",21),
("GOOG", "GPS",48),
("GOOG", "HPQ",6),
("GOOG", "INTC",15),
("GOOG", "MCK",49),
("GOOG", "ORCL",16),
("GOOG", "PCG",48),
("GOOG", "SFO",36),
("GOOG", "SWY",32),
("GOOG", "URS",49),
("GOOG", "WFC",49),

("GPS","AAPL" ,59),
("GPS", "CSCO",67),
("GPS", "CVX",43),
("GPS", "EBAY",67),
("GPS", "GOOG",48),
("GPS", "HPQ",45),
("GPS", "INTC",62),
("GPS", "MCK",03),
("GPS", "ORCL",34),
("GPS", "PCG",01),
("GPS", "SFO",18),
("GPS", "SWY",53),
("GPS", "URS",01),
("GPS", "WFC",01),

("HPQ","AAPL" ,14),
("HPQ", "CSCO",26),
("HPQ", "CVX",40),
("HPQ", "EBAY",26),
("HPQ", "GOOG",6),
("HPQ", "GPS",45),
("HPQ", "INTC",20),
("HPQ", "MCK",46),
("HPQ", "ORCL",12),
("HPQ", "PCG",46),
("HPQ", "SFO",32),
("HPQ", "SWY",37),
("HPQ", "URS",46),
("HPQ", "WFC",46),

("INTC","AAPL",8),
("INTC","CSCO",6),
("INTC", "CVX",41),
("INTC", "EBAY",6),
("INTC", "GOOG",15),
("INTC", "GPS",62),
("INTC", "HPQ",20),
("INTC", "MCK",63),
("INTC", "ORCL",31),
("INTC", "PCG",62),
("INTC", "SFO",51),
("INTC", "SWY",32),
("INTC", "URS",63),
("INTC", "WFC",63),

("MCK", "AAPL",60),
("MCK", "CSCO",68),
("MCK", "CVX",46),
("MCK", "EBAY",68),
("MCK", "GOOG",49),
("MCK", "GPS",03),
("MCK", "HPQ",46),
("MCK", "INTC",63),
("MCK", "ORCL",34),
("MCK", "PCG",3),
("MCK", "SFO",16),
("MCK", "SWY",56),
("MCK", "URS",3),
("MCK", "WFC",2),

("ORCL", "AAPL",22),
("ORCL", "CSCO",37),
("ORCL", "CVX",39),
("ORCL", "EBAY",37),
("ORCL", "GOOG",16),
("ORCL", "GPS",34),
("ORCL", "HPQ",12),
("ORCL", "INTC",31),
("ORCL", "MCK",34),
("ORCL", "PCG",35),
("ORCL", "SFO",20),
("ORCL", "SWY",40),
("ORCL", "URS",35),
("ORCL", "WFC",35),

("PCG", "AAPL",59),
("PCG", "CSCO",68),
("PCG", "CVX",44),
("PCG", "EBAY",68),
("PCG", "GOOG",48),
("PCG", "GPS",01),
("PCG", "HPQ",46),
("PCG", "INTC",62),
("PCG", "MCK",03),
("PCG", "ORCL",35),
("PCG", "SFO",18),
("PCG", "SWY",54),
("PCG", "URS",01),
("PCG", "WFC",01),

("SFO", "AAPL",46),
("SFO", "CSCO",57),
("SFO", "CVX",45),
("SFO", "EBAY",57),
("SFO", "GOOG",36),
("SFO", "GPS",18),
("SFO", "HPQ",32),
("SFO", "INTC",51),
("SFO", "MCK",16),
("SFO", "ORCL",20),
("SFO", "PCG",18),
("SFO", "SWY",52),
("SFO", "URS",18),
("SFO", "WFC",18),

("SWY", "AAPL",37),
("SWY", "CSCO",32),
("SWY", "CVX",13),
("SWY", "EBAY",32),
("SWY", "GOOG",32),
("SWY", "GPS",53),
("SWY", "HPQ",37),
("SWY", "INTC",32),
("SWY", "MCK",56),
("SWY", "ORCL",40),
("SWY", "PCG",54),
("SWY", "SFO",52),
("SWY", "URS",54),
("SWY", "WFC",54),

("URS", "AAPL",60),
("URS", "CSCO",68),
("URS", "CVX",44),
("URS", "EBAY",68),
("URS", "GOOG",49),
("URS", "GPS",01),
("URS", "HPQ",46),
("URS", "INTC",63),
("URS", "MCK",03),
("URS", "ORCL",35),
("URS", "PCG",01),
("URS", "SFO",18),
("URS", "SWY",54),
("URS", "WFC",0),

("WFC", "AAPL",60),
("WFC", "CSCO",68),
("WFC", "CVX",44),
("WFC", "EBAY",68),
("WFC", "GOOG",49),
("WFC", "GPS",01),
("WFC", "HPQ",46),
("WFC", "INTC",63),
("WFC", "MCK",02),
("WFC", "ORCL",35),
("WFC", "PCG",01),
("WFC", "SFO",18),
("WFC", "SWY",54),
("WFC", "URS",0),

]

Conclusion

In this article, I covered the basics of doing optimization with Pyomo and Python. This should be enough to get you started on your own. In the next two articles in this series, I walk through some more detailed examples and cover some pragmatic aspects of building larger programs and dealing with scaling issues.

Acknowledgements

Special thanks to Dr. David L. Woodruff for reviewing this article.

Resources

Learn

Get products and technologies

Discuss

  • Get involved in the developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Cloud computing on developerWorks


  • Bluemix Developers Community

    Get samples, articles, product docs, and community resources to help build, deploy, and manage your cloud apps.

  • Cloud digest

    Complete cloud software, infrastructure, and platform knowledge.

  • DevOps Services

    Software development in the cloud. Register today to create a project.

  • Try SoftLayer Cloud

    Deploy public cloud instances in as few as 5 minutes. Try the SoftLayer public cloud instance for one month.

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Cloud computing, Open source
ArticleID=857492
ArticleTitle=Linear optimization in Python, Part 1: Solve complex problems in the cloud with Pyomo
publish-date=02052013