Linear optimization in Python, Part 1

# Solve complex problems in the cloud with Pyomo

Write a script to solve a modeling problem

### Content series:

## This content is part # of # in the series: Linear optimization in Python, Part 1

## This content is part of the series:Linear optimization in Python, Part 1

Stay tuned for additional content in this series.

Modeling is a powerful way to solve complex problems. According to the book
*Modeling Languages in Mathematical Optimization* (see Related topics), models are used to:

- Explain phenomena
- Make predictions
- Assess key factors
- Identify extreme
- Analyze trade-offs

In business, spreadsheet software like Microsoft® Excel is often the first choice for modeling problems. Now, spreadsheets are often very intuitive, but they can be limited in what they can ultimately do for large-scale problems. If you're a developer, you might find it more expedient simply to write a script to solve a modeling problem, because then you can easily integrate the script into other systems. This article introduces the basics of linear optimization in Python using the Pyomo library.

You might be familiar with algebraic modeling languages such as AMPL, AIMMS, and GAMS. The advantage of using Pyomo is that its modeling objects are embedded within a high-level programming language with a rich set of supporting libraries. Another key advantage to doing scientific or mathematical computing in Python is that the language can easily translate algebraic notation into an expression in code. In addition, there are many mature libraries you can leverage. An example is NumPy (see Related topics), a library that provides additional data structures such as a multi-dimensional array. The NumPy library is used in a code example later in the article because it answers the requirement for a multi-dimensional data structure.

## Getting started

To start, install Pyomo. Pyomo is a central component of Coopr, a collection of Python software packages. Download the coopr_install script, which creates a Python virtual environment when you run it with the Python interpreter.

To create a relative directory named "coopr":

noahs-MacBook-Air% python coopr_install

Use the following command to start Pyomo, which puts the relative directory and its executables in your path:

noahs-MacBook-Air% source coopr/bin/activate

Use the Pyomo `"--help `

command to
get help for using pyomo:

(coopr)noahs-MacBook-Air% pyomo --help

You need a compiler to use both the Virtual Python Environment builder (virtualenv) and Pyomo. On OS X, use the XCode Developer Tools command-line tools. On Linux, use the GNU Compiler Collection (GCC). Once this virtual environment is initialized, you can solve problems with Pyomo in one of two ways:

- Use the Pyomo command-line tool:
(coopr)noahs-MacBook-Air% pyomo my_problem.py --solver=glpk

- Or, embed the initialization inside your script and run it via the Python interpreter:
##### Listing 1. Invoking Pyomo inside a script

#This is an optional code path that allows the script to be run outside of #pyomo command-line. For example: python wyndor.py if __name__ == '__main__': #This replicates what the pyomo command-line tools does from coopr.opt import SolverFactory opt = SolverFactory("glpk") instance = model.create() results = opt.solve(instance) #sends results to stdout results.write()

Pyomo assumes you have at least one solver installed. The GNU Linear Programming Kit (glpk) is the default. Refer to the installation instructions for the solver you want to use. Then make sure Pyomo is able to find this solver on its path.

## Modeling strategies

There are two ways to create a data model with Pyomo: Abstract and concrete. The key difference is whether to separate the model and the data.

- In an abstract model, you separate the model and the data.
- In a concrete model, you define the data inside the model itself. For example, in a Pyomo script that uses a concrete model, everything is defined inside a single script file. For simpler problems, especially teaching problems, this can be an efficient approach. On the other hand, larger problems often have larger data sets, which makes it impractical to embed the data in the same script as the code needed to work on the data.

At a high level, a Pyomo model consists of variables, objectives, and constraints.

- Variables are values that are calculated during the optimization.
- Objective are Python functions that take data and variables, and is either maximized or minimized.
- Constraints are a way of defining an expression that limits the values a variable can assume.

When creating a Pyomo model, you need to undertand the Python method of notating algebraic expressions. Here is a quick summary that compares expressions between a spreadsheet, algebra, and Python:

##### Figure 1. Table of notation comparisons

## Pyomo application example: Wyndor

To illustrate a simple linear optimization problem using Pyomo, let's take a product-mix
problem from the book *Introduction to Management Science* (see Related topics). The Wyndor factory produces doors and windows. Each of three plants with different available hours can produce either doors or windows. To determine how to maximize the profit available in the limited amount of hours of production, write the following script. It has three main sections: model, objective, and constraint.

- The model — it's a concrete model — instantiates the data of the problem, such as the hours available from plants.
- The objective (which can be either to maximize or minimize) is to maximize profit.
- The constraint uses a CapacityRule function, written using a Python idiom called
*list comprehension*.

Here's the script for the Wyndor optimization problem:

##### Listing 2. Wyndor Pyomo basic example

"""Wyndor model from Hillier and Hillier *Introduction to Management Science* One way to do it... Original Author dlw Modified ndg To run this you need pyomo and the glpk solver installed, and need to be inside of the virtual environment. When these dependencies are installed you can solve this way (glpk is default): pyomo wyndor.py """ #Explicit importing makes it more clear what a user needs to define #versus what is included within the pyomo library from coopr.pyomo import (ConcreteModel, Objective, Var, NonNegativeReals, maximize, Constraint) Products = ['Doors', 'Windows'] ProfitRate = {'Doors':300, 'Windows':500} Plants = ['Door Fab', 'Window Fab', 'Assembly'] HoursAvailable = {'Door Fab':4, 'Window Fab':12, 'Assembly':18} HoursPerUnit = {('Doors','Door Fab'):1, ('Windows', 'Window Fab'):2, ('Doors','Assembly'):3, ('Windows', 'Assembly'):2, ('Windows', 'Door Fab'):0, ('Doors', 'Window Fab'):0} #Concrete Model model = ConcreteModel() #Decision Variables model.WeeklyProd = Var(Products, within=NonNegativeReals) #Objective model.obj = Objective(expr= sum(ProfitRate[i] * model.WeeklyProd[i] for i in Products), sense = maximize) def CapacityRule(model, p): """User Defined Capacity Rule Accepts a pyomo Concrete Model as the first positional argument, and and a plant index as a second positional argument """ return sum(HoursPerUnit[i,p] * model.WeeklyProd[i] for i in Products) <= HoursAvailable[p] This statement is what Pyomo needs to generate one constraint for each plant model.Capacity = Constraint(Plants, rule = CapacityRule) #This is an optional code path that allows the script to be run outside of #pyomo command-line. For example: python wyndor.py if __name__ == '__main__': #This replicates what the pyomo command-line tools does from coopr.opt import SolverFactory opt = SolverFactory("glpk") instance = model.create() results = opt.solve(instance) #sends results to stdout results.write()

Let's run the script:

(coopr)noahs-MacBook-Air% pyomo wyndor.py

Pyomo returns the following output, which shows one feasible solution that achieves a $3,600 profit:

##### Listing 3. One feasible solution with $3,600 profit

(coopr)noahs-MacBook-Air% pyomo wyndor.py [ 0.00] Setting up Pyomo environment [ 0.00] Applying Pyomo preprocessing actions [ 0.00] Creating model [ 0.00] Applying solver [ 0.01] Processing results Number of solutions: 1 Solution Information Gap: 0.0 Status: feasible Function Value: 3600.0 Solver results file: results.json [ 0.02] Applying Pyomo postprocessing actions [ 0.02] Pyomo Finished

Let's find out which ratio of products you should produce. To do that,
`cat`

the output file results.json. The results show this:

"Variable": { "WeeklyProd[Doors]": { "Id": 0, "Value": 2.0 }, "WeeklyProd[Windows]": { "Id": 1, "Value": 6.0

From this output, you see that producing six windows and two doors gives you a profit maximization of $3,600.

Another way you can solve the script is to use Python because of the conditional statement added. This invokes Pyomo inside of the script, instead of using Pyomo to invoke the script.

## Heuristic optimization: The greedy randomized traveling salesman

The book *In Pursuit of the Traveling Salesman: Mathematics at the Limits of
Computation* by Bill Cook contains a great example of a problem that doesn't always scale well as a linear program. It's the Traveling Salesman Problem, or TSP: Given a list of cities, find the shortest possible route that visits each city exactly once and returns to the original city. The worst-case running time that solves the traveling salesman problem increases exponentially with the number of cities.

Figure 2 is a graph that simulates the amount of time necessary to solve for every possible route, assuming it takes a few seconds to solve for a couple of cities. As you can see, it rapidly becomes impossible to solve the problem in your lifetime.

##### Figure 2. Traveling salesman exponential time to solve

You need another strategy to find a *good* result. Often a *good*, but
not a guaranteed optimal, solution is achieved by running simulations and then taking the best results of those simulations. An example of this approach is shown in the Python code below. Randomly select a starting city, and then *greedily* select the next city with the shortest distance. This simulation can then be run N, or multiple times, and the shortest path among the simulations will be run.

##### Listing 4. Greedy shortest-distance selections

noahs-MacBook-Air% python tsp_greedy_random_start.py 20 Running simulation 20 times Shortest Distance: 128 Optimal Route: [ ('PCG', 'GPS', 1), ('GPS', 'URS', 1), ('URS', 'WFC', 0), ('WFC', 'MCK', 2), ('MCK', 'SFO', 16), ('SFO', 'ORCL', 20), ('ORCL', 'HPQ', 12), ('HPQ', 'GOOG', 6), ('GOOG', 'AAPL', 11), ('AAPL', 'INTC', 8), ('INTC', 'CSCO', 6), ('CSCO', 'EBAY', 0), ('EBAY', 'SWY', 32), ('SWY', 'CVX', 13) ]

##### Listing 5. TSP greedy random start

#!/usr/bin/env python """ Traveling salesman solution with random start and greedy path selection You can select how many iterations to run by doing the following: python tsp_greedy_random_start.py 20 #runs 20 times """ import sys from random import choice import numpy as np from routes import values dt = np.dtype([('city_start', 'S10'), ('city_end', 'S10'), ('distance', int)]) data_set = np.array(values,dtype=dt) def all_cities(mdarray): """Takes a multi-dimensional array in this format and finds unique cities array([["A", "A"], ["A", "B"]]) """ cities = {} city_set = set(data_set['city_end']) for city in city_set: cities[city] = "" return cities def randomize_city_start(all_cities): """Returns a randomized city to start trip""" return choice(all_cities) def get_shortest_route(routes): """Sort the list by distance and return shortest distance route""" route = sorted(routes, key=lambda dist: dist[2]).pop(0) return route def greedy_path(): """Select the next path to travel based on the shortest, nearest path""" itinerary = [] cities = all_cities(data_set) starting_city = randomize_city_start(cities.keys()) #print "starting_city: %s" % starting_city cities_visited = {} #we want to iterate through all cities once count = 1 while True: possible_routes = [] distance = [] #print "starting city: %s" % starting_city for path in data_set: if starting_city in path['city_start']: #we can't go to cities we have already visited if path['city_end'] in cities_visited: continue else: #print "path: ", path possible_routes.append(path) if not possible_routes: break #append this to itinerary route = get_shortest_route(possible_routes) #print "Route(%s): %s " % (count, route) count += 1 itinerary.append(route) #add this city to the visited city list cities_visited[route[0]] = count #print "cities_visited: %s " % cities_visited #reset the starting_city to the next city starting_city = route[1] #print "itinerary: %s" % itinerary return itinerary def get_total_distance(complete_itinerary): distance = sum(z for x,y,z in complete_itinerary) return distance def lowest_simulation(num): routes = {} for i in range(num): itinerary = greedy_path() distance = get_total_distance(itinerary) routes[distance] = itinerary shortest_distance = min(routes.keys()) route = routes[shortest_distance] return shortest_distance, route def main(): """runs everything""" if len(sys.argv) == 2: iterations = int(sys.argv[1]) print "Running simulation %s times" % iterations distance, route = lowest_simulation(iterations) print "Shortest Distance: %s" % distance print "Optimal Route: %s" % route else: #print "All Routes: %s" % data_set itinerary = greedy_path() print "itinerary: %s" % itinerary print "Distance: %s" % get_total_distance(itinerary) if __name__ == '__main__': main()

The TSP script uses a NumPy multi-dimensional array as the source of data. If you want to run this example, you need to install NumPy (see Related topics). The array is defined in a file called routes.py. Here is what those routes look like.

##### Listing 6. NumPy multi-dimensional array containing city-to-city routes

values = [ ("AAPL", "CSCO", 14), ("AAPL", "CVX", 44), ("AAPL", "EBAY", 14), ("AAPL", "GOOG", 14), ("AAPL", "GPS", 59), ("AAPL", "HPQ", 14), ("AAPL", "INTC", 8), ("AAPL", "MCK", 60), ("AAPL", "ORCL", 26), ("AAPL", "PCG", 59), ("AAPL", "SFO", 46), ("AAPL", "SWY", 37), ("AAPL", "URS", 60), ("AAPL", "WFC", 60), ("CSCO","AAPL" ,14), ("CSCO", "CVX", 43), ("CSCO", "EBAY",0), ("CSCO", "GOOG",21), ("CSCO", "GPS",67), ("CSCO", "HPQ",26), ("CSCO", "INTC",6), ("CSCO", "MCK",68), ("CSCO", "ORCL",37), ("CSCO", "PCG",68), ("CSCO", "SFO",57), ("CSCO", "SWY",32), ("CSCO", "URS",68), ("CSCO", "WFC",68), ("CVX","AAPL" ,44), ("CVX", "CSCO",43), ("CVX", "EBAY",43), ("CVX", "GOOG",36), ("CVX", "GPS",43), ("CVX", "HPQ",40), ("CVX", "INTC",41), ("CVX", "MCK",46), ("CVX", "ORCL",39), ("CVX", "PCG",44), ("CVX", "SFO",45), ("CVX", "SWY",13), ("CVX", "URS",44), ("CVX", "WFC",44), ("EBAY","AAPL" ,14), ("EBAY", "CSCO",0), ("EBAY", "CVX",43), ("EBAY", "GOOG",21), ("EBAY", "GPS",67), ("EBAY", "HPQ",26), ("EBAY", "INTC",6), ("EBAY", "MCK",68), ("EBAY", "ORCL",37), ("EBAY", "PCG",68), ("EBAY", "SFO",57), ("EBAY", "SWY",32), ("EBAY", "URS",68), ("EBAY", "WFC",68), ("GOOG","AAPL",11), ("GOOG", "CSCO",21), ("GOOG", "CVX",36), ("GOOG", "EBAY",21), ("GOOG", "GPS",48), ("GOOG", "HPQ",6), ("GOOG", "INTC",15), ("GOOG", "MCK",49), ("GOOG", "ORCL",16), ("GOOG", "PCG",48), ("GOOG", "SFO",36), ("GOOG", "SWY",32), ("GOOG", "URS",49), ("GOOG", "WFC",49), ("GPS","AAPL" ,59), ("GPS", "CSCO",67), ("GPS", "CVX",43), ("GPS", "EBAY",67), ("GPS", "GOOG",48), ("GPS", "HPQ",45), ("GPS", "INTC",62), ("GPS", "MCK",03), ("GPS", "ORCL",34), ("GPS", "PCG",01), ("GPS", "SFO",18), ("GPS", "SWY",53), ("GPS", "URS",01), ("GPS", "WFC",01), ("HPQ","AAPL" ,14), ("HPQ", "CSCO",26), ("HPQ", "CVX",40), ("HPQ", "EBAY",26), ("HPQ", "GOOG",6), ("HPQ", "GPS",45), ("HPQ", "INTC",20), ("HPQ", "MCK",46), ("HPQ", "ORCL",12), ("HPQ", "PCG",46), ("HPQ", "SFO",32), ("HPQ", "SWY",37), ("HPQ", "URS",46), ("HPQ", "WFC",46), ("INTC","AAPL",8), ("INTC","CSCO",6), ("INTC", "CVX",41), ("INTC", "EBAY",6), ("INTC", "GOOG",15), ("INTC", "GPS",62), ("INTC", "HPQ",20), ("INTC", "MCK",63), ("INTC", "ORCL",31), ("INTC", "PCG",62), ("INTC", "SFO",51), ("INTC", "SWY",32), ("INTC", "URS",63), ("INTC", "WFC",63), ("MCK", "AAPL",60), ("MCK", "CSCO",68), ("MCK", "CVX",46), ("MCK", "EBAY",68), ("MCK", "GOOG",49), ("MCK", "GPS",03), ("MCK", "HPQ",46), ("MCK", "INTC",63), ("MCK", "ORCL",34), ("MCK", "PCG",3), ("MCK", "SFO",16), ("MCK", "SWY",56), ("MCK", "URS",3), ("MCK", "WFC",2), ("ORCL", "AAPL",22), ("ORCL", "CSCO",37), ("ORCL", "CVX",39), ("ORCL", "EBAY",37), ("ORCL", "GOOG",16), ("ORCL", "GPS",34), ("ORCL", "HPQ",12), ("ORCL", "INTC",31), ("ORCL", "MCK",34), ("ORCL", "PCG",35), ("ORCL", "SFO",20), ("ORCL", "SWY",40), ("ORCL", "URS",35), ("ORCL", "WFC",35), ("PCG", "AAPL",59), ("PCG", "CSCO",68), ("PCG", "CVX",44), ("PCG", "EBAY",68), ("PCG", "GOOG",48), ("PCG", "GPS",01), ("PCG", "HPQ",46), ("PCG", "INTC",62), ("PCG", "MCK",03), ("PCG", "ORCL",35), ("PCG", "SFO",18), ("PCG", "SWY",54), ("PCG", "URS",01), ("PCG", "WFC",01), ("SFO", "AAPL",46), ("SFO", "CSCO",57), ("SFO", "CVX",45), ("SFO", "EBAY",57), ("SFO", "GOOG",36), ("SFO", "GPS",18), ("SFO", "HPQ",32), ("SFO", "INTC",51), ("SFO", "MCK",16), ("SFO", "ORCL",20), ("SFO", "PCG",18), ("SFO", "SWY",52), ("SFO", "URS",18), ("SFO", "WFC",18), ("SWY", "AAPL",37), ("SWY", "CSCO",32), ("SWY", "CVX",13), ("SWY", "EBAY",32), ("SWY", "GOOG",32), ("SWY", "GPS",53), ("SWY", "HPQ",37), ("SWY", "INTC",32), ("SWY", "MCK",56), ("SWY", "ORCL",40), ("SWY", "PCG",54), ("SWY", "SFO",52), ("SWY", "URS",54), ("SWY", "WFC",54), ("URS", "AAPL",60), ("URS", "CSCO",68), ("URS", "CVX",44), ("URS", "EBAY",68), ("URS", "GOOG",49), ("URS", "GPS",01), ("URS", "HPQ",46), ("URS", "INTC",63), ("URS", "MCK",03), ("URS", "ORCL",35), ("URS", "PCG",01), ("URS", "SFO",18), ("URS", "SWY",54), ("URS", "WFC",0), ("WFC", "AAPL",60), ("WFC", "CSCO",68), ("WFC", "CVX",44), ("WFC", "EBAY",68), ("WFC", "GOOG",49), ("WFC", "GPS",01), ("WFC", "HPQ",46), ("WFC", "INTC",63), ("WFC", "MCK",02), ("WFC", "ORCL",35), ("WFC", "PCG",01), ("WFC", "SFO",18), ("WFC", "SWY",54), ("WFC", "URS",0), ]

## Conclusion

In this article, I covered the basics of doing optimization with Pyomo and Python. This should be enough to get you started on your own. In the next two articles in this series, I walk through some more detailed examples and cover some pragmatic aspects of building larger programs and dealing with scaling issues.

### Acknowledgements

Special thanks to Dr. David L. Woodruff for reviewing this article.

#### Downloadable resources

#### Related topics

- These two books are a highly recommended source of information on optimization:
*Pyomo – Optimization Modeling in Python*, by W. E. Hart et al.*Introduction to Management Science: A Modeling and Case Studies Approach with Spreadsheets*, by David Anderson et al.

- Pyomo is a central component of Coopr, a collection of Python software packages. Download the Pyomo installer.
- Download the Virtual Python Environment builder (virtualenv).
- Read about and download NumPy, the fundamental package for scientific computing with Python.
- Download the GLPK (GNU Linear Programming Kit) package, used to solve large-scale linear programming (LP), mixed integer programming (MIP), and other related problems.
- Join the discussion at the Pyomo mailing list.
- Learn more about modeling and solving real world optimization problems in the book
*Modeling Languages in Mathematical Optimization*, edited by Joseph Kallrath. - In the book
*In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation*, the author William J. Cook explores the various algorithms and branches of mathematics used to solve the traveling salesman problem, including the branch of mathematics known as linear programming. - Study some notation examples from Dr. David L. Woodruff.
- Access IBM SmartCloud Enterprise.
- Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.