Patricia Selinger
She invented a way to make relational databases perform — and built relationships to help teams perform better

Having spent a decade enduring the brutal winters of Cambridge, Massachusetts, while earning her bachelor’s, master’s and PhD from Harvard University, Patricia Selinger had a clear goal upon graduation. She wanted some sunshine.

So, in 1975, having been aggressively recruited by IBM Research in San Jose, she decided to take the leap. The company’s West Coast lab had by then become a locus of database research. She had never even taken a database class and knew she would have a lot of catching up to do on a team charged with exploring new terrain in databases. But she told herself that the fringe benefits, at least, would be worth the effort. “I applied other places, but what I intended was to work in California,” she said. 

As it turned out, Selinger’s studies in applied mathematics and computer science — as well as her fluency in computer compilers and operating systems — would provide a solid basis for her explorations. She made for an easy fit on the team, and together they would change the way databases operate. 

During her time at IBM Research, Selinger would invent a novel method for optimizing database queries that factored in “cost,” or processing time. She and the team would advance a relational database prototype based on a new conceptual data model proposed by a San Jose colleague, computer scientist Edgar Codd. System R, as it was known, became the forerunner of all of IBM’s relational-database products and led to the invention of SQL (Structured Query Language), for decades the industry standard language for relational databases.

A new approach to data

Selinger’s path to math and computers was a case of interest over aptitude. She excelled in English and social studies but always preferred math. She had absorbed math concepts at the side of her father, an electrical engineer who finished his college degree while Selinger was in elementary school. “We got to study whatever he was studying,” she said.

In college she pursued pure math, a highly theoretical discipline at Harvard, but discovered programming on a whim after a class on logic didn’t work out. It was “not so much career planning but good taste or good luck or something,” she said.

Luck happened to intervene again in San Jose. She joined just as IBM Research was creating a prototype to test Codd’s concept of relational data. This new approach was simpler than the complex hierarchical data tables most commonly used at the time, but no one knew whether it could work in a real-world application.

Key to proving the concept was showing that searches could happen quickly. Until that point, databases relied on teams of programmers to pre-load instructions for grabbing data. This relational model needed a more flexible way for algorithms, not people, to decide how to get the data.

The cost-based optimizer

Selinger went to work. She invented a technique to identify all the ways to build and execute a query in this new data structure and then select the most efficient option to run. This “cost-based optimizer” worked by estimating the overall processing time and machine resources required to complete the query and then deciding how best to retrieve the data — all without the aid of navigational coding. “Cost-based optimization made relational work,” Selinger said. “It made relational databases perform. And it made it practical to code in this higher level, SQL.”
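To make the idea concrete, here is a minimal sketch of choosing a join order by estimated cost. It is an illustration only, not the System R optimizer: the table names, row counts, selectivities and the toy cost formula (total rows produced by intermediate joins) are all assumptions invented for this example, and it enumerates every ordering by brute force, whereas production optimizers use richer cost models and smarter search.

```python
from itertools import permutations

# Hypothetical table sizes (row counts); illustrative values, not from the article.
ROWS = {"orders": 1_000_000, "customers": 50_000, "regions": 20}

# Hypothetical join selectivities: the fraction of row pairs that match.
SELECTIVITY = {
    frozenset({"orders", "customers"}): 1 / 50_000,
    frozenset({"customers", "regions"}): 1 / 20,
}


def join_selectivity(joined, new_table):
    """Combined selectivity of joining new_table to the tables joined so far."""
    sel = 1.0
    for table in joined:
        # A missing entry means no join predicate, i.e. a cross product (1.0).
        sel *= SELECTIVITY.get(frozenset({table, new_table}), 1.0)
    return sel


def plan_cost(order):
    """Toy cost: total estimated rows produced by each join, left to right."""
    joined = [order[0]]
    rows = ROWS[order[0]]
    cost = 0.0
    for table in order[1:]:
        rows = rows * ROWS[table] * join_selectivity(joined, table)
        cost += rows
        joined.append(table)
    return cost


def choose_plan(tables):
    """Enumerate every join order and return the one with the lowest estimate."""
    return min(permutations(tables), key=plan_cost)


if __name__ == "__main__":
    for order in permutations(ROWS):
        print(" -> ".join(order), f"estimated cost = {plan_cost(order):,.0f}")
    print("chosen:", " -> ".join(choose_plan(list(ROWS))))
```

Under these made-up numbers, the orderings that join the two small tables first and bring in the large `orders` table last come out cheapest, while orderings that start with a cross product of `orders` and `regions` are estimated to be roughly twenty times more expensive. The person writing the query never has to specify any of this; the choice is made from the estimates.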

The System R project ran from 1975 to 1979 and was a massive success. In 1981, IBM announced its first relational database product, SQL/DS. In 1983, the company launched DB2 for mainframes, based on the same foundational developments.

By then, Selinger was already on to the next thing. In mid-1978, she had started research on extending System R concepts to multiple machines with individual databases, or distributed computing, under the R* project. She had also begun her ascent through various management roles and by 1983 was a functional manager running a computer science department of 150 people for IBM Research in San Jose.

An IBM Fellow and other awards
1980

Selinger was the recipient of a stream of company accolades. IBM granted her a Research Division Outstanding Contribution Award for her work on System R.

1985

She earned an Outstanding Innovation Award for her work extending query optimization and execution to a distributed environment through R*.

1988

She shared with three colleagues the Software System Award given by the Association for Computing Machinery (ACM) for their work on System R.

1990

She joined the IBM Academy of Technology.

1994

Four years later, IBM CEO Lou Gerstner honored Selinger at the company’s annual Corporate Recognition Event for its top tech innovators. She was named an IBM Fellow, providing her the freedom to pursue research and technical projects of her choosing for at least the next five years.

1999

The National Academy of Engineering elected her a member, in recognition of her contributions to relational database technology.

Bridging the research/developer gap

Despite the significance of her work on System R, Selinger said in a 2003 interview that her most important career contribution was building bridges between IBM teams. In 1986, she formed the Database Technology Institute (DBTI) to strengthen the connections between the research and development groups and so improve product delivery. Disconnects and inefficiencies between such groups were a common problem across the industry and often led to failed products, she believed. “The Database Technology Institute that I formed solved that problem,” she said.

DBTI worked across IBM functions to deliver a number of fundamental technologies in DB2, originally a family of relational data products. When those technologies shipped in 1997, Selinger moved into development as director of DB2 integration, where she began to explore how to incorporate vast amounts of unstructured data, such as pictures and email, into database models.

She was also a prolific mentor, actively advising 30 or more people at a time. In this spirit, the company created a PhD fellowship in her name in 2008, awarded to a female PhD student worldwide with special focus on database design and management. In 2010, IBM launched an effort to better understand, through data, the actions needed to improve human health. Selinger joined experts across a variety of fields to study societal impacts on childhood obesity.

She retired in 2018.
