System Reliability Engineer

7 days ago


San José, San José, Costa Rica Splunk Full-time

Splunk's Mission

Splunk is committed to making machine data accessible and usable for everyone. Our team is passionate about our product and strives to provide an exceptional experience for our customers. We are actively seeking a DevOps Engineer with a passion for automation to help build scalable tools for our distributed cloud-based systems.

Key Responsibilities:

  • Develop and maintain cloud account management systems and compliance monitoring & remediation systems.
  • Configure and manage servers using configuration management tooling such as Ansible or Puppet.
  • Join the on-call rotation to respond to high-priority incidents, minimizing downtime and mean time to recovery (MTTR).
  • Create and maintain scripts and tools to automate routine tasks using languages such as Python or Go.
  • Improve system reliability through automation and scripting.
  • Maintain up-to-date system procedures, configurations, troubleshooting guides, and best practices as reference material for the team.

Requirements:

  • 3+ years of experience in a DevOps / SRE focused environment.
  • Experience managing AWS or other Public Cloud platforms.
  • Knowledge of multiple Operating Systems, including Linux (Ubuntu/RHEL) and Windows.
  • Familiarity with Configuration Management tools like Ansible/Chef/Puppet.
  • Hands-on experience with CI/CD pipeline tools (e.g., Jenkins, GitLab).
  • Understanding of networking concepts and Internet protocols.
  • Ability to communicate complex technical concepts clearly to customers and upper management.
  • Strong desire to automate and solve issues with code.


  • San José, San José, Costa Rica Oracle Full-time

    **About the Job** We're seeking a highly skilled Cloud Reliability Engineer to join our team at Oracle. In this role, you will be responsible for ensuring the uptime and performance of our cloud-based systems. You will work closely with cross-functional teams to identify and resolve issues that impact our customers. Solve complex problems related to...


  • San José, San José, Costa Rica Equifax Full-time

    Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles. SREs in our team take...

  • Site Reliability Engineer

    2 weeks ago


    San José, San José, Costa Rica Oracle Full-time

    Site Reliability Engineer-230001K1 **Applicants are required to read, write, and speak the following languages:** English **Preferred Qualifications** Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of...


  • San José, San José, Costa Rica Pfizer Full-time

    **Job Summary:** We are seeking a highly skilled Site Reliability Engineer Leader to join our Digital Command Operations team. In this role, you will be responsible for ensuring the robustness, reliability, and performance of Pfizer's critical digital solutions. **Key Responsibilities:** Act as focal point for day-to-day operation of Cloud services at...


  • San José, San José, Costa Rica Modus Create Full-time

    **About Modus Create** Modus Create is a fast-growing, remote-first company that specializes in emerging technologies. We are seeking an experienced and enthusiastic DevOps/SRE Engineer (Tooling and Site Reliability Engineer) to join our team. This senior-level position requires expertise in optimization and automation, as well as experience with software...


  • San José, San José, Costa Rica Intel Full-time

    The System Quality and Reliability team owns the development and execution of system stress test experiments required for product release to market and for post launch support activities to meet the Quality and Reliability goals across multiple market segments. This position will be part of the System Customer and Content team that supports product pre and...


  • San José, San José, Costa Rica Akamai Full-time

    **Do you have a passion for cutting edge technologies and tackling system problems?** **Are you a self-starting professional who thrives in a dynamic environment?** **Join the Akamai SRE Infrastructure team** As Site Reliability Engineer II, you'll be responsible for the operational stability and performance of critical systems and services. Part of a Global team...

  • Site Reliability Engineer

    2 weeks ago


    San José, San José, Costa Rica Fullstack Labs Full-time

    FullStack is the fastest-growing software consultancy in the Americas. We help organizations like Uber, GoDaddy, MGM, Siemens, Stanford University, and the State of California build distributed software development teams and deliver transformational digital solutions. As an employee-first company, we focus on hiring the most talented software designers and...


  • San José, San José, Costa Rica Fullstack Labs Full-time

    **About Us** FullStack Labs is a software consultancy with a strong presence in the Americas. Our team helps organizations build distributed software development teams and deliver transformational digital solutions. We focus on creating a positive, respectful, and supportive work environment where our employees can thrive. We're proud of: Offering life-changing...

  • Site Reliability Engineer

    2 weeks ago


    San José, San José, Costa Rica Hitachi Solutions Ltd Full-time

    **Company Description** Hitachi Solutions is a global Microsoft solutions integrator passionate about developing and delivering industry-focused solutions that support our clients to deliver on their business transformation goals. Our industry focus, expertise, and intellectual property is what truly sets us apart. We have earned, and continue to maintain, a...

  • Site Reliability Engineer

    2 weeks ago


    San José, San José, Costa Rica Hitachi Solutions Full-time

    **Company Description** Hitachi Solutions is a global Microsoft solutions integrator passionate about developing and delivering industry-focused solutions that support our clients to deliver on their business transformation goals. Our industry focus, expertise, and intellectual property is what truly sets us apart. We have earned, and continue to maintain, a...


  • San José, San José, Costa Rica Datasite Full-time

    Datasite is where deals are made. We provide the data rooms and SaaS technology used in M&A and other high-value transactions, to deliver projects in more than 170 countries. Carrying that success into the future is all about you. Your useful skills, your unusual experience, your unique ideas. Everyone here brings something unexpected. What's yours? Invest your...


  • San José, San José, Costa Rica Emerson Full-time

    Overview: We are looking for a System Operations Engineer to join our team at Emerson. As a System Operations Engineer, you will be responsible for ensuring the availability and performance of our Windows and Linux servers. You will work closely with our vendors and datacenter technicians to troubleshoot and resolve issues related to hardware installation,...


  • San José, San José, Costa Rica Akamai Full-time

    **Do you have a passion for cutting edge technologies and tackling system problems?** **Are you a self-starting professional who thrives in a dynamic environment?** **Join our Site Reliability team** **Help us shape the future of the Internet** As a Site Reliability Engineer, you will be responsible for: - Deploying, managing, and operating scalable, highly...


  • San José, San José, Costa Rica Akamai Full-time

    **Do you have a passion for cutting edge technologies and tackling system problems?** **Are you a self-starting professional who thrives in a dynamic environment?** **Join the Akamai SRE Infrastructure team** As Site Reliability Engineer Senior II, you'll be responsible for the operational stability and performance of critical systems and services. Part of a...


  • San José, San José, Costa Rica Micro Focus Full-time

    As a member of our team, you'll work closely with colleagues across the Micro Focus delivery organization to deliver exceptional service. **Responsibilities:** - System component evaluation for availability, reliability, and performance. - Troubleshooting and resolving system issues identified by monitoring systems. - System performance analysis to ensure high...

  • Site Reliability Engineer

    2 weeks ago


    San José, San José, Costa Rica Bairesdev Full-time

    BairesDev is proud to be one of the fastest-growing companies in Latin America and a welcoming, highly rated employer (Glassdoor Employee Score: 4.3). With more than 3500 employees in 27 countries and world-class clients from start-ups to Fortune 500 companies, we're only as strong as the multicultural teams at the heart of our business. BairesDev runs on...


  • San José, San José, Costa Rica Scalable Systems Full-time

    Scalable Systems is a USA-based Big Data, Analytics and Digital Transformation Company focused on vertical, innovative solutions. By providing next-generation technology solutions and services, we help organizations to identify risks & opportunities, achieve operational excellence, and gain an innovative edge. **Openings**: **Title**: Site Reliability...


  • San José, San José, Costa Rica Oracle Full-time

    Oracle's Cloud Deployment Engineering group is responsible for the delivery of underlying services in our data centers, whether physical or cloud-based. The team is highly technical and empowers itself and peers to deliver services better, faster, and more effectively. We are seeking a skilled Cloud Systems Engineer to join our team. The ideal candidate will...


  • San José, San José, Costa Rica Intel Full-time

    Intel is driving innovation in computing with AI, High Performance Computing and remote connectivity. **Demand for Efficient Platforms** Data growth and the shift to remote work have created a high demand for efficient platforms that can handle this demand. The PC business has been revitalized by remote work and learning, placing new demands on our platform and...