Senior Site Reliability Engineer

7 months ago


San José, Costa Rica · Encora · Full-time

**Important Information**

Experience: 5+ years

Job Mode: Full-time

Work Mode: Work from home

**Job Summary**

As a **_Senior Site Reliability Engineer (6632)_**, you will join a highly skilled, agile technology team, supporting and developing cutting-edge solutions that meet our business requirements. You will help accelerate our customers' business results by building innovative digital products.

Your responsibilities will include leading and actively participating in the design, development, and delivery of our software projects.

**Responsibilities and Duties**
- Design, implement, and maintain highly available and scalable cloud infrastructure on the AWS platform.
- Develop and implement automated monitoring, alerting, and incident response mechanisms to ensure proactive identification and resolution of system issues.
- Collaborate with software engineering teams to establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure system reliability and performance.
- Conduct regular performance analysis and capacity planning to anticipate and address scaling requirements.
- Implement and maintain disaster recovery and failover strategies to mitigate service disruptions and ensure business continuity.
- Lead incident response and post-mortem analysis to identify root causes and implement preventive measures.
- Continuously improve system reliability through automation, optimization, and implementation of best practices.
- Stay updated with the latest AWS services and technologies, and evaluate their applicability to enhance our infrastructure and operations.
- Mentor junior team members and foster a culture of collaboration, learning, and continuous improvement.

**Qualifications and Skills**
- Bachelor's degree in Computer Science, Engineering, or related field. Master's degree preferred.
- AWS Certified Solutions Architect - Professional or AWS Certified DevOps Engineer - Professional certification is required.
- 8+ years of experience in Site Reliability Engineering, DevOps, or related roles, with a focus on AWS cloud technologies.
- Strong understanding of cloud architecture principles and experience with AWS services such as EC2, S3, RDS, Lambda, DynamoDB, etc.
- Proficiency in scripting and automation using languages such as Python, Bash, or PowerShell.
- Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation for provisioning and configuration management.
- Hands-on experience with monitoring, logging, and observability tools such as CloudWatch, Prometheus, Grafana, ELK stack, etc.
- Solid understanding of CI/CD principles and experience with related tools like Jenkins, GitLab CI/CD, or AWS CodePipeline.
- Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
- Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams and influence stakeholders at all levels.

**About Encora**

Encora is the preferred digital engineering and modernization partner of some of the world's leading enterprises and digital native companies. With over 9,000 experts in 47+ offices and innovation labs worldwide, Encora's technology practices include Product Engineering & Development, Cloud Services, Quality Engineering, DevSecOps, Data & Analytics, Digital Experience, Cybersecurity, and AI & LLM Engineering.

At Encora, we hire professionals based solely on their skills and qualifications, and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.



  • San Francisco, Heredia, Costa Rica · IBM · Full-time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at IBM. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our systems and infrastructure. Key Responsibilities: Lead the problem resolution process for our clients, from analysis and troubleshooting to deploying workarounds...


  • San Francisco, Heredia, Costa Rica · IBM · Full-time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at IBM. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud infrastructure. Key Responsibilities: Identify and investigate issues with our cloud infrastructure. Develop and implement solutions to improve the...


  • San José, Costa Rica · Oracle · Full-time

    Site Reliability Engineer-230001K1 **Applicants are required to read, write, and speak the following languages**: English **Preferred Qualifications** Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and...


  • San José, Costa Rica · Oracle · Full-time

    Site Reliability Engineer-2200087I **Applicants are required to read, write, and speak the following languages**: English **Preferred Qualifications** Work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and...


  • San Francisco, Heredia, Costa Rica · Sysco Costa Rica · Full-time

    **Job Requirements**: We are seeking a highly skilled Site Reliability Engineer to join our team at Sysco Costa Rica. This position will be responsible for developing and refining strategies and processes for support issue tracking from intake through resolution. **Key Responsibilities**: Contribute to and lead strategic discussions to evolve the product...


  • San José, Costa Rica · Equifax · Full-time

    Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you. **What you’ll do**: - You will influence and design the infrastructure, architecture, standards, and methods for large-scale systems. - Will...


  • San José, Costa Rica · Nucleus Health · Full-time

    A U.S.-based company that is on a mission to develop the largest online marketplace and media platform in the world is looking for a Senior DevOps/SRE Engineer. The engineer will be working with cross-functional teams to raise system performance, reliability, and effectiveness. The company is developing a knowledge-commerce platform that connects clients and...


  • San José, Costa Rica · Hitachi Solutions Ltd · Full-time

    **Company Description** Hitachi Solutions is a global Microsoft solutions integrator passionate about developing and delivering industry-focused solutions that support our clients to deliver on their business transformation goals. Our industry focus, expertise, and intellectual property is what truly sets us apart. We have earned, and continue to maintain,...


  • San José, Costa Rica · Hitachi Solutions · Full-time

    Company Description Hitachi Solutions is a global Microsoft solutions integrator passionate about developing and delivering industry-focused solutions that support our clients to deliver on their business transformation goals. Our industry focus, expertise, and intellectual property is what truly sets us apart. We have earned, and continue to maintain, a...


  • San José, Costa Rica · Canonical - Jobs · Full-time

    **Site Reliability Engineer**: To become a member of this team, you need to be a software engineer fluent in Python, you need a genuine interest in the full open source infrastructure stack from metal to containers, and you need the ability to work in a high pressure operations environment with mission-critical services for global brand name customers. As a...


  • San Francisco, Heredia, Costa Rica · IBM · Full-time

    Job Summary: We are seeking a highly skilled Cloud Engineer to join our team as a Senior Site Reliability Engineer. As a key member of our infrastructure team, you will be responsible for designing, deploying, and maintaining large-scale cloud-based systems. Your expertise in cloud computing, DevOps, and system administration will enable you to identify and...


  • San José, Costa Rica · Equifax · Full-time

    Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you. - As a Site Reliability Engineer (SRE) you will combine software and systems engineering for building and running large-scale, distributed,...


  • San José, Costa Rica · Sysdig · Full-time

    Sysdig is driving the standard for securing the cloud and containers. We created Falco, the open standard for cloud-native threat detection, and consistently contribute to open source software projects. We are passionate, technical problem-solvers, continually innovating and delivering powerful solutions to secure the cloud from source to run. We value...


  • San José, Costa Rica · Scalable Systems · Full-time

    Scalable Systems is a USA-based Big Data, Analytics and Digital Transformation Company focused on vertical, innovative solutions. By providing next-generation technology solutions and services, we help organizations to identify risks & opportunities, achieve operational excellence, and gain an innovative edge. **Openings**: **Title**: Site Reliability...


  • San José, Costa Rica · Datasite · Full-time

    Datasite is where deals are made. We provide the data rooms and SaaS technology used in M&A and other high-value transactions, to deliver projects in more than 170 countries. Carrying that success into the future is all about you. Your useful skills, your unusual experience, your unique ideas. Everyone here brings something unexpected. What’s yours? Invest...


  • San José, Costa Rica · Equifax · Full-time

    Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles. - SREs in our team...


  • San Francisco, Heredia, Costa Rica · IBM · Full-time

    Introduction: We are seeking a highly skilled Site Reliability Engineer to join our global team managing one of IBM's leading security solutions. As a member of our team, you will be working in a fast-paced and rewarding environment. Your Role and Responsibilities: You will have access to the latest education, tools, and technology, and a limitless career path...


  • San José, Costa Rica · Equifax · Full-time

    **Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you.** **Cyber Security Site Reliability Engineer (SRE Intermediate)** is a discipline that combines software and systems engineering for building...


  • San Francisco, Heredia, Costa Rica · IBM · Full-time

    Overview: Welcome to IBM, where innovation meets reliability. As a Site Reliability Engineer, you will be at the forefront of building and maintaining systems that power our client business.