Cloud Reliability Engineer
3 days ago
Job Overview
We are seeking a skilled Senior Software Developer to join our Production Reliability Engineering team. As a key member of this team, you will be responsible for ensuring the performance, reliability, and scalability of AI- and ML-driven voice agent microservices, Kubernetes clusters, AWS cloud infrastructure, network services, and storage layers.
Your Key Contributions
- You will work closely with other Watson Orders development teams in an embedded SRE model to help define and implement key metrics for uptime, reliability, and performance of these services and develop runbooks for incident management.
- You will design, implement, and support critical multi-region cloud infrastructure in AWS to support data platform, microservices, and development environments.
- You will develop deep service telemetry through metric collection, distributed tracing, visualization, and reporting via OpenTelemetry, Prometheus, and related tooling.
- You will participate in the definition and management of SLIs, SLOs, and error budgets for infrastructure and production services.
- You will design, develop, and maintain CI/CD pipelines for integration and Kubernetes clusters.
About Us
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. We believe that our diverse workforce is one of our greatest strengths, and we strive to create a workplace where everyone can contribute and thrive.
-
Technical Reliability Engineer
3 days ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
About the Role
The Technical Reliability Engineer will play a critical role in ensuring the reliability and resiliency of our software systems. This involves specializing in automation, DevOps, and SRE principles to work closely with development teams to build, test, and deploy well-engineered information systems and ecosystems.
Key responsibilities include...
-
Cloud Infrastructure Reliability Expert
4 days ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
Your Role and Responsibilities
We are looking for a Site Reliability Engineer (SRE) to join a global team managing one of our leading security solutions. As a member of the team, you will be working in a fast-paced and rewarding environment. You will have access to the latest education, tools, and technology, and a limitless career path with the world's...
-
Cloud Communication Engineer
5 days ago
San Francisco, Heredia, Costa Rica | Equifax | Full-time
Equifax is a leader in powering possible, and we are looking for a skilled Cloud Communication Engineer to join our team. As a Cloud Communication Engineer, you will play a key role in designing, building, and maintaining our cloud-based communication infrastructure.
Your main responsibilities will include working closely with our Infrastructure Engineering...
-
Cloud Platform Reliability Expert
6 days ago
San Francisco, Heredia, Costa Rica | Databricks | Full-time
Responsibilities
You will be responsible for:
- Monitoring critical infrastructure and identifying potential issues.
- Collaborating with cross-functional teams to resolve incidents and improve platform reliability.
- Developing and implementing automated solutions to enhance platform monitoring and alerting.
- Contributing to software development efforts to improve...
-
Cloud Network Operations Engineer
6 days ago
San Francisco, Heredia, Costa Rica | Databricks | Full-time
Job Description
We're a fast-growing organization, attracting top talent worldwide. Our unique blend of smart, curious, and quick thinkers drives our success.
The Impact You Will Have:
- Monitor critical infrastructure and proactively identify incidents.
- Work with stakeholders to resolve incidents and propose solutions for platform reliability and...
-
Cloud Software Engineer
23 hours ago
San Francisco, Heredia, Costa Rica | Cloud Software Group | Full-time
**About Cloud Software Group**
We are a global cloud software provider, serving over 100 million users worldwide. Our team values diverse perspectives and encourages learning, dreaming, and innovation.
**Job Description**
We are seeking a highly skilled Networking Technical Support Engineer to deliver exceptional customer service and resolve technical issues...
-
Cloud Operations Engineer
5 days ago
San Francisco, Heredia, Costa Rica | Databricks | Full-time
About the Role
Databricks is seeking a seasoned Cloud Operations Engineer to join our Engineering organization. As a key member of our team, you will be responsible for driving operations at scale, anticipating customer needs, and driving process improvements to ensure their success.
Your Responsibilities
Work with our engineering teams and cloud partners to...
-
Cloud Software Specialist
20 hours ago
San Francisco, Heredia, Costa Rica | Cloud Software Group | Full-time
**Job Description Summary**
We are seeking a skilled Technical Cloud Engineer to join our team. As a member of our team, you will focus on in-depth problem analysis of Cloud Software Group products. You will ensure products are integrated successfully into customer environments and utilize fundamental troubleshooting skills and technical knowledge.
Your primary...
-
Senior Site Reliability Engineer
3 weeks ago
San Francisco, Heredia, Costa Rica | VMware | Full-time
**The Elevator Pitch: Why will you enjoy this new opportunity?**
VMware's Cross-Cloud SaaS Platform team provides the public cloud infrastructure and Managed Kubernetes clusters that host all of VMware's SaaS products consumed by our customers. The platform is globally distributed and built using a combination of industry-standard open-source solutions and...
-
Site Reliability Engineer
3 weeks ago
San Francisco, Heredia, Costa Rica | Sysco Costa Rica | Full-time
**Requirements**:
- Develop and refine strategy and process for all support issue tracking from intake through resolution in conjunction with senior members of the team.
- Contribute to, and occasionally lead, strategic discussions to continue the evolution of flexibility and sustainability of the entire product suite.
- Partner with Level 1 support teams,...
-
Site Reliability And Automation Engineer
3 weeks ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
**Introduction**
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...
-
Technical Cloud Engineer
23 hours ago
San Francisco, Heredia, Costa Rica | Cloud Software Group | Full-time
**Company Overview**
Cloud Software Group is a leading cloud software provider, serving over 100 million users worldwide. We value diverse perspectives and encourage our teams to learn, dream, and build the future of work. Our company is on the brink of significant growth, and we need experts like you to help us achieve it.
-
Site Reliability Expert
6 days ago
San Francisco, Heredia, Costa Rica | Sysco Costa Rica | Full-time
**Company Overview:**
Sysco Costa Rica is a leading provider of food and support services. Our team works diligently to ensure our customers receive the highest quality products and experiences.
**Job Summary:**
We are seeking an experienced Site Reliability Engineer to join our team. The successful candidate will be responsible for developing and refining...
-
Site Reliability Engineer
2 weeks ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
Introduction
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...
-
Cloud Security Engineer
5 days ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
Job Summary
We are seeking a skilled Cloud Security Engineer to join our team. As a Cloud Security Engineer, you will be responsible for designing, implementing, and maintaining secure cloud-based systems and applications.
About Us
At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing, and blockchain. We are...
-
Reliability Support Specialist
18 hours ago
San Francisco, Heredia, Costa Rica | ServiceNow | Full-time
About the Role
We are seeking a skilled Performance Engineer to join our team as a Staff Performance and Reliability Support Engineer. In this role, you will play a critical part in ensuring the stability and reliability of our customer instances. Your technical expertise and problem-solving skills will enable you to manage and resolve complex technical...
-
Cloud Virtualization Engineer
7 days ago
San Francisco, Heredia, Costa Rica | Kyndryl Costa Rica, Sociedad De Responsabilidad Limitada | Full-time
Job Overview
The role of a Virtualization Specialist Citrix at Kyndryl Costa Rica, Sociedad De Responsabilidad Limitada involves designing and implementing virtualized desktop infrastructure solutions to meet the business needs of clients. This position requires strong technical expertise in areas such as VDI infrastructure layers, Citrix technologies, and...
-
Cloud Infrastructure Engineer
7 days ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
About the Role
We are seeking a talented Cloud Infrastructure Engineer to join our team. As a Cloud Infrastructure Engineer, you will be responsible for designing, building, and maintaining scalable cloud-based systems for our clients.
Key Responsibilities:
Design and implement cloud infrastructure solutions using public cloud providers such as Amazon Web...
-
Cloud Operations Specialist
6 days ago
San Francisco, Heredia, Costa Rica | Sysco Costa Rica | Full-time
**About the Role:**
We are looking for a skilled Site Reliability Engineer to join our team. The ideal candidate will have a strong background in technical operations support and experience working with enterprise cloud platforms.
**Key Responsibilities:**
- Develop and implement strategies and processes for support issue tracking.
- Collaborate with...
-
Site Reliability And Platform Support
3 weeks ago
San Francisco, Heredia, Costa Rica | IBM | Full-time
**Introduction**
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most...