Senior Site Reliability Engineer
4 days ago
**The Elevator Pitch: Why will you enjoy this new opportunity?**
VMware's Cross-Cloud SaaS Platform team provides the public cloud infrastructure and Managed Kubernetes clusters that host all of the VMware SaaS products consumed by our customers. The platform is globally distributed and built using a combination of industry-standard open-source solutions and VMware products. We host VMware's most significant SaaS products in production, including VMware Cloud on AWS, Carbon Black, Networking (NSBU) and Cloud Management Business Unit (CMBU) services. The Site Reliability Engineering team is responsible for ensuring the uptime and reliability of our Managed Platform and for providing value-add services to our customers, the internal VMware engineering teams. We get to play with the latest cloud-native technologies to support the business, and inclusivity is critical: everyone can have a voice and make a difference. You will be part of a vital team providing the platform integral to VMware's SaaS transformation; we are essential in increasing VMware's Annual Recurring Revenue (ARR) by ensuring our services are stable and secure. We are a passionate, hard-working, friendly, supportive, culturally diverse global team.
**Success in the Role: What are the performance goals over the first 6-12 months you will work toward completing?**
- You will assist the team in continuing to ensure that our uptime and reliability are aligned with the defined SLOs. You will design solutions, automate everything, reduce toil, and be a fantastic collaborator who works well in a globally distributed team.
- Experience in managing and automating Public Cloud (AWS/GCP/Azure) infrastructure and Kubernetes is required. You are inquisitive and determined to get to the root cause of a problem so that we don't see repetitive issues, or better yet, you have the foresight to address them before they occur. Critical areas of focus will include assisting with our Observability stack: logging, metrics, tracing and dashboards.
- We have three functional pillars within the Platform Team:
- **Managed Platform** - Deploy, manage and support the uptime and reliability of our SaaS platform, including our public cloud configurations and tens of thousands of pods running on hundreds of Kubernetes clusters.
- **Managed Services** - Provide value-add cloud-native services that run on Kubernetes. The team currently supports VMware's Managed Kafka, Observability stacks, logging services, secrets management and more. Golang experience is ideal for developing Kubernetes Custom Resource Definitions.
- **Security & Compliance** - The platform is certified for FedRAMP High, PCI, HIPAA, SOC2 and more. We are constantly innovating, building new services and automating to secure our platform and strengthen our security posture with least-privilege principles and zero trust.
**The Work: What type of work will you be doing? What assignments, requirements, or skills will you be performing on a regular basis?**
This engineering team moves at lightning speed, adopting leading-edge technologies.
You'll be pulling things apart and tinkering, building new platforms, or playing in the cloud. Here, the engineering opportunities are endless. With this fast-paced, synergetic group, you'll be working together and across the organization to maintain our SaaS Platform running on Kubernetes and to ensure we meet our SLOs. Strong knowledge of SRE with experience as a Kubernetes administrator would be ideal. Ideally, you will have contributed to open source and be keen to develop services that all VMware SaaS engineering products and teams will consume. You must be driven, and you must understand and be able to demonstrate SRE best practices, with strong expertise in troubleshooting Kubernetes clusters. You should have a strong understanding of Observability and Release Management, exposure to Cloud services, and excellent hands-on expertise in Python and, ideally, Golang. This is an extremely exciting role working with a very strong team right at the heart of VMware's SaaS transformation journey.
As part of the VMware Developer Platform (VDP) Engineering Team, you will be:
- Managing a cross-cloud SaaS platform with high uptime, focused on Kubernetes management.
- Ensuring we maintain a high security posture and stay compliant with our varying accreditations (SOC2/PCI/HIPAA/FedRAMP High/GDPR).
- Applying strong DevOps experience and release best practices to help shorten the release cycle.
- Identifying opportunities to improve the efficiency of our CI/CD pipeline and optimize developer productivity.
- Developing tools and code to automate manual steps in the pipeline and optimize existing steps.
- Using tools such as GitLab, Rundeck, Terraform, Docker, SonarQube and Helm.
- Using Python, Bash and Golang for automation and to achieve operational excellence.
- Developing and managing Kubernetes Operators and Custom Resource Definitions.
- Assisting the team with operational
-
Site Reliability Engineer
7 months ago
Heredia, Costa Rica | Sysco Costa Rica | Full-time
**Requirements**: - Develop and refine strategy and process for all support issue tracking from intake through resolution in conjunction with senior members of the team. - Contribute to, and occasionally lead, strategic discussions to continue the evolution of flexibility and sustainability of the entire product suite. - Partner with Level 1 support teams,...
-
Site Reliability Engineer
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...
-
Site Reliability Engineer
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's...
-
Site Reliability and Automation Engineer
1 week ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's...
-
Site Reliability Engineer Manager
1 week ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's...
-
Site Reliability and Platform Support
6 days ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's...
-
Senior Site Reliability Engineer
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** As an SRE Engineer, you will work in an agile, collaborative environment to build, deploy, configure and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting, to deploying workarounds or fixes. Working closely with our worldwide teams, you will...
-
Compute Operations Site Reliability Engineer
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...
-
Senior Site Reliability Engineer Manager
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** IBM Software is seeking a talented and motivated SRE Manager to lead and manage a team of engineers focused on a global Cloud Platform solution serving multiple IBM offerings. **Your Role and Responsibilities** - Manage and lead a team of SRE engineers. This involves hiring, training, and mentoring team members, assigning tasks,...
-
Site Reliability Engineer
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** As an SRE Engineer, you will work in an agile, collaborative environment to build, deploy, configure and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting, to deploying workarounds or fixes. Working closely with our worldwide teams, you...
-
Site Reliability Engineering Professional
8 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...
-
Mid Site Reliability Engineer
7 months ago
Heredia, Costa Rica | IBM | Full-time
**Introduction** As an SRE Engineer, you will work in an agile, collaborative environment to build, deploy, configure and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting, to deploying workarounds or fixes. Working closely with our worldwide teams, you will...