Senior Site Reliability Engineer

2 days ago


San José, Costa Rica Nucleus Health Full-time

A U.S.-based company on a mission to develop the largest online marketplace and media platform in the world is looking for a Senior DevOps/SRE Engineer. The engineer will work with cross-functional teams to improve system performance, reliability, and effectiveness. The company is developing a knowledge-commerce platform that connects clients and advisers through its customized online and telephonic technology solutions. The company has secured more than $288 million in funding so far.

**Responsibilities**:

- Architect, automate, and manage platforms for engineers and corporate users
- Contribute to the creation of plans for moving current environments and services to the cloud
- Create and maintain CI/CD pipelines for a range of on-premises and cloud apps

**Job Requirements**:

- Bachelor’s/Master’s degree in Engineering or Computer Science (or equivalent experience)
- 8+ years of relevant experience in DevOps, system administration, or SRE roles
- 5+ years of experience working with Azure
- 5+ years of experience working with virtualization platforms including VMware, Docker, and Kubernetes
- 3+ years of experience with configuration management tools like SaltStack, Ansible, or Puppet
- 3+ years of experience with Azure DevOps for CI/CD to multi-cloud and on-prem
- Thorough understanding of the operation and data flow of n-tier web apps
- Strong knowledge of a variety of web hosting technologies, including IIS, Nginx, and Apache
- Prior experience maintaining e-commerce websites around the clock
- Thorough awareness of DR, HA, and fault-tolerance best practices in a K8s-based cloud deployment
- Solid comprehension of the OSI model, DNS, NTP, and VLANs, among other networking essentials
- Extensive knowledge of KPI monitoring and alerting
- Extensive Windows, Linux, and cloud experience
- In-depth knowledge and experience in Linux administration, including Ubuntu, Debian, and CentOS
- Nice to have some experience with security standards like CIS, NIST, or SANS
- Prior experience using scripts and APIs as part of monitoring tools like Zabbix is desirable
- Some familiarity with SAML, particularly using Okta for SSO, is preferred
- Nice to have some knowledge of using Azure DevOps to automate tedious activities and create modern CI/CD pipelines
- Excellent spoken and written English communication skills

**Job Type**: Contract



  • San José, Costa Rica Oracle Full-time

    Site Reliability Engineer-230001K1 **Applicants are required to read, write, and speak the following languages**: English **Preferred Qualifications** Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and...

  • Senior Site Reliability

    2 weeks ago


    San José, San José, Costa Rica Canonical - Jobs Full-time

    Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...


  • San José, Costa Rica Hitachi Solutions Ltd Full-time

    **Company Description** Hitachi Solutions is a global Microsoft solutions integrator passionate about developing and delivering industry-focused solutions that support our clients to deliver on their business transformation goals. Our industry focus, expertise, and intellectual property is what truly sets us apart. We have earned, and continue to maintain,...


  • San José, Costa Rica Equifax Full-time

    Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you. As a Site Reliability Engineer (SRE) you will combine software and systems engineering for building and running large-scale, distributed,...

  • Site Reliability Engineer

    2 weeks ago


    San José, Costa Rica Sysdig Full-time

    Sysdig is driving the standard for securing the cloud and containers. We created Falco, the open standard for cloud-native threat detection, and consistently contribute to open source software projects. We are passionate, technical problem-solvers, continually innovating and delivering powerful solutions to secure the cloud from source to run. We value...


  • San José, Costa Rica Akamai Full-time

    **Do you have a passion for cutting edge technologies and tackling system problems?** **Are you a self-starting professional who thrives in a dynamic environment?** **Join our Site Reliability team** **Help us shape the future of the Internet** As a Site Reliability Engineer, you will be responsible for: - Deploying, managing, and operating scalable,...


  • San José, San José, Costa Rica Canonical - Jobs Full-time

    Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers,...

  • Site Reliability

    2 weeks ago


    San José, San José, Costa Rica Canonical - Jobs Full-time

    Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...


  • San José, San José, Costa Rica Equifax Full-time

    Site Reliability Engineering (SRE) is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering, security, and vulnerability management...


  • San José, Costa Rica BairesDev Full-time

    BairesDev is proud to be one of the fastest-growing companies in Latin America and a welcoming, highly rated employer (Glassdoor Employee Score: 4.3). With more than 3500 employees in 27 countries and world-class clients from start-ups to Fortune 500 companies, we’re only as strong as the multicultural teams at the heart of our business. BairesDev runs on...