Life as a Site Reliability Engineer at IBM

We recently had the opportunity to speak with three Site Reliability Engineer (SRE) colleagues from different IBM locations: Angelica from Costa Rica, Kaushal from India, and Pawel from Poland.

Both Angelica and Pawel joined IBM last year and offer a fresh perspective on being an SRE at IBM, while Kaushal is a seasoned professional with over eight years of experience in this field.

We asked them what it is like to work in this role, why they chose it, and what impact their technology solutions have on our clients.

Read on to learn about their unique perspectives.


Why work as an SRE at IBM?

Angelica is a Planning Analytics SRE in Costa Rica who studied electrical engineering with an emphasis on computers and networks. She was interested in software but didn’t want to become a developer; she preferred maintaining the infrastructure behind applications and services. “I learned about the SRE role when researching DevOps, and I found out that it had a little brother called SRE,” she said. “I read more about it, watched many videos, and realized that it was something I wanted to do.”

Kaushal is a Planning Analytics on Cloud SRE in India whose motivation to pursue a career as an SRE comes from the desire to gain exposure at all levels of the Software Development Life Cycle (SDLC). “This role allows me to deepen my technical understanding of products, while also enabling me to work closely with customers and stakeholders to understand the business impact of our systems,” he said.


Daily responsibilities of a Site Reliability Engineer

Pawel, an SRE for Watson AI in Poland, says his goal is to ensure our systems are reliable. He explains further: “Each team member takes part in an on-call rotation for our services to respond to any arising issues. If we are dealing with a known scenario, which cannot be automated, then we use runbooks (a set of instructions) that guide us through the process. However, we always try to automate, so that our tools can solve issues for us. Otherwise, we learn and root cause new issues, so that we can avoid or remediate them in the future.”

SREs configure the infrastructure that applications need to work correctly. This includes everything from provisioning and customization to creating network rules. Angelica says SREs use automation to complete these processes as quickly and efficiently as possible. She says SREs also monitor alerts from servers to ensure they are always up and running.


Evolving your career at IBM

Kaushal appreciates that SREs at IBM are not limited to working only on production systems. In fact, they are encouraged to engage in automation and side projects that use cutting-edge technologies. This not only facilitates individual growth, but also contributes to the company’s overall development. “As an SRE, we can engage in various domains, including DevOps, architectural projects, and the exploration of emerging AI technologies through IBM’s watsonx AI and data platforms,” he said.

As an IBMer, Angelica says she has access to different learning platforms, including Udemy, cognitive classes, and the internal IBM learning platform, where she could take a specific certification called “IBM Certified Professional SRE.”


IBMers discuss their favorite thing about working as an SRE at IBM

Pawel says he finds satisfaction in solving issues and in being able to split his work between development and hands-on system debugging.

Angelica’s favorite thing is working directly on provisioning the application and setting up how clients subsequently access it. For her, it is of utmost importance that clients feel satisfied with the service provided.

Kaushal likes being at the forefront of ensuring system reliability and performance for customers. He says it is rewarding to know that SREs’ efforts directly help businesses succeed.


Tips for aspiring SREs and future IBMers

Angelica – “The first thing is to understand the role, look for information, videos, blogs about what an SRE does. Understand the difference from a DevOps role and always try to be up to date with current technologies.”

Kaushal – “While technical skills and the ability to manage critical scenarios are often emphasized, it is essential to recognize that the most crucial SRE skill is analytical problem-solving. The ability to connect the dots and solve complex issues is paramount in this fast-paced work environment.”



Become a Site Reliability Engineer at IBM

If you would like to learn about opportunities as a Site Reliability Engineer, where you can create innovative solutions for the clients that Angelica, Pawel, and Kaushal work with, visit our careers website to find a job that matches your skills.

If you don’t see a role that aligns with your skills and interests at this time, we invite you to join our talent network to receive updates on career opportunities and events from IBM.