Cloud Reliability Engineer

3 days ago


San Francisco, Heredia, Costa Rica IBM Full-time

Job Overview

We are seeking a skilled Senior Software Developer to join our Production Reliability Engineering team. As a key member of this team, you will be responsible for ensuring the performance, reliability, and scalability of AI- and ML-driven voice agent microservices, Kubernetes clusters, AWS cloud infrastructure, network services, and storage layers.

Your Key Contributions

  1. You will work closely with other Watson Orders development teams in an embedded SRE model to help define and implement key metrics for uptime, reliability, and performance of these services and develop runbooks for incident management.
  2. You will design, implement, and support critical multi-region cloud infrastructure in AWS to support data platform, microservices, and development environments.
  3. You will develop deep service telemetry through metric collection, distributed tracing, visualization, and reporting via OpenTelemetry, Prometheus, and related tooling.
  4. You will participate in the definition and management of SLIs, SLOs, and error budgets for infrastructure and production services.
  5. You will design, develop, and maintain CI/CD pipelines for integration and Kubernetes clusters.

About Us

IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. We believe that our diverse workforce is one of our greatest strengths, and we strive to create a workplace where everyone can contribute and thrive.



  • San Francisco, Heredia, Costa Rica IBM Full-time

    About the Role: The Technical Reliability Engineer will play a critical role in ensuring the reliability and resiliency of our software systems. This involves specializing in automation, DevOps, and SRE principles to work closely with development teams to build, test, and deploy well-engineered information systems and ecosystems. Key responsibilities include...


  • San Francisco, Heredia, Costa Rica IBM Full-time

    Your Role and Responsibilities: We are looking for a site reliability engineer (SRE) to join a global team managing one of our leading security solutions. As a member of the team, you will be working in a fast-paced and rewarding environment. You will have access to the latest education, tools, and technology, and a limitless career path with the world's...


  • San Francisco, Heredia, Costa Rica Equifax Full-time

    Equifax is a leader in powering possible, and we are looking for a skilled Cloud Communication Engineer to join our team. As a Cloud Communication Engineer, you will play a key role in designing, building, and maintaining our cloud-based communication infrastructure. Your main responsibilities will include working closely with our Infrastructure Engineering...


  • San Francisco, Heredia, Costa Rica Databricks Full-time

    Responsibilities: You will be responsible for: Monitoring critical infrastructure and identifying potential issues. Collaborating with cross-functional teams to resolve incidents and improve platform reliability. Developing and implementing automated solutions to enhance platform monitoring and alerting. Contributing to software development efforts to improve...


  • San Francisco, Heredia, Costa Rica Databricks Full-time

    Job Description: We're a fast-growing organization, attracting top talent worldwide. Our unique blend of smart, curious, and quick thinkers drives our success. The Impact You Will Have: Monitor critical infrastructure and proactively identify incidents. Work with stakeholders to resolve incidents and propose solutions for platform reliability and...

  • Cloud Software Engineer

    23 hours ago


    San Francisco, Heredia, Costa Rica Cloud Software Group Full-time

    **About Cloud Software Group** We are a global cloud software provider, serving over 100 million users worldwide. Our team values diverse perspectives and encourages learning, dreaming, and innovation. **Job Description** We are seeking a highly skilled Networking Technical Support Engineer to deliver exceptional customer service and resolve technical issues...


  • San Francisco, Heredia, Costa Rica Databricks Full-time

    About the Role: Databricks is seeking a seasoned Cloud Operations Engineer to join our Engineering organization. As a key member of our team, you will be responsible for driving operations at scale, anticipating customer needs, and driving process improvements to ensure their success. Your Responsibilities: Work with our engineering teams and cloud partners to...


  • San Francisco, Heredia, Costa Rica Cloud Software Group Full-time

    **Job Description Summary** We are seeking a skilled Technical Cloud Engineer to join our team. As a member of our team, you will focus on in-depth problem analysis of Cloud Software Group products. You will ensure products are integrated successfully into customer environments and utilize fundamental troubleshooting skills and technical knowledge. Your primary...


  • San Francisco, Heredia, Costa Rica VMware Full-time

    **The Elevator Pitch: Why will you enjoy this new opportunity?** VMware's Cross-Cloud SaaS Platform team provides the public cloud infrastructure and Managed Kubernetes clusters that host all of VMware's SaaS products consumed by our customers. The platform is globally distributed and built using a combination of industry-standard open-source solutions and...

  • Site Reliability Engineer

    3 weeks ago


    San Francisco, Heredia, Costa Rica Sysco Costa Rica Full-time

    **Requirements**: - Develop and refine strategy and process for all support issue tracking from intake through resolution in conjunction with senior members of the team. - Contribute to, and occasionally lead, strategic discussions to continue the evolution of flexibility and sustainability of the entire product suite. - Partner with Level 1 support teams,...


  • San Francisco, Heredia, Costa Rica IBM Full-time

    **Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...

  • Technical Cloud Engineer

    23 hours ago


    San Francisco, Heredia, Costa Rica Cloud Software Group Full-time

    **Company Overview** Cloud Software Group is a leading cloud software provider, serving over 100 million users worldwide. We value diverse perspectives and encourage our teams to learn, dream, and build the future of work. Our company is on the brink of significant growth, and we need experts like you to help us achieve it.


  • San Francisco, Heredia, Costa Rica Sysco Costa Rica Full-time

    **Company Overview:** Sysco Costa Rica is a leading provider of food and support services. Our team works diligently to ensure our customers receive the highest quality products and experiences. **Job Summary:** We are seeking an experienced Site Reliability Engineer to join our team. The successful candidate will be responsible for developing and refining...

  • Site Reliability Engineer

    2 weeks ago


    San Francisco, Heredia, Costa Rica IBM Full-time

    Introduction: At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...


  • San Francisco, Heredia, Costa Rica IBM Full-time

    Job Summary: We are seeking a skilled Cloud Security Engineer to join our team. As a Cloud Security Engineer, you will be responsible for designing, implementing, and maintaining secure cloud-based systems and applications. About Us: At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing, and blockchain. We are...


  • San Francisco, Heredia, Costa Rica ServiceNow Full-time

    About the Role: We are seeking a skilled Performance Engineer to join our team as a Staff Performance and Reliability Support Engineer. In this role, you will play a critical part in ensuring the stability and reliability of our customer instances. Your technical expertise and problem-solving skills will enable you to manage and resolve complex technical...


  • San Francisco, Heredia, Costa Rica Kyndryl Costa Rica, Sociedad De Responsabilidad Limitada Full-time

    Job Overview: The role of a Virtualization Specialist Citrix at Kyndryl Costa Rica, Sociedad De Responsabilidad Limitada involves designing and implementing virtualized desktop infrastructure solutions to meet the business needs of clients. This position requires strong technical expertise in areas such as VDI infrastructure layers, Citrix technologies, and...


  • San Francisco, Heredia, Costa Rica IBM Full-time

    About the Role: We are seeking a talented Cloud Infrastructure Engineer to join our team. As a Cloud Infrastructure Engineer, you will be responsible for designing, building, and maintaining scalable cloud-based systems for our clients. Key Responsibilities: Design and implement cloud infrastructure solutions using public cloud providers such as Amazon Web...


  • San Francisco, Heredia, Costa Rica Sysco Costa Rica Full-time

    **About the Role:** We are looking for a skilled Site Reliability Engineer to join our team. The ideal candidate will have a strong background in technical operations support and experience working with enterprise cloud platforms. **Key Responsibilities:** - Develop and implement strategies and processes for support issue tracking. - Collaborate with...


  • San Francisco, Heredia, Costa Rica IBM Full-time

    **Introduction** At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...