Site Reliability Engineer

3 weeks ago


San José, Costa Rica Oracle Full-time

Site Reliability Engineer-230001K1

**Applicants are required to read, write, and speak the following languages**: English

**Preferred Qualifications**

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.

Solid knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security and compliance.

Experience running large-scale, customer-facing web services.

Solid understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies.

Work involves defining and documenting technical architecture of complex and highly scalable products.

A minimum of 2 years of experience.

**Responsibilities**:
As a member of our world-class Site Reliability team, you will bring your expertise and passion for learning to monitoring, backups, infrastructure, and systems architecture, including the mean time to resolution of any issue that may impact our customer-facing production environment.

The SRE's foremost responsibility is to ensure that the Oracle NetSuite NSGBU Cloud Operations systems are operational, eliminating impact to our customers by using all available resources to identify and resolve issues or escalate them to the appropriate person or team.

You will need to collaborate actively with many other engineering teams in NetSuite Cloud Operations (system engineering, network team, infrastructure engineering, security, maintenance team) to support design and implementation, as well as tooling and automation platforms.

On a daily basis, resolve site incidents at various levels of infrastructure, from hardware, network, and OS to application issues

**Work in Linux terminal**

Work with monitoring and analytic tools like Kibana, Icinga, and Prometheus/Grafana to resolve incidents and identify problems

Cooperate with multiple teams to build systems and services that improve operational efficiency and drive reliability, scalability, resilience, security, and performance across Oracle NSGBU products.

Participate in NSGBU SRE 24x7 follow-the-sun operational coverage.

**Qualifications and Education Requirements**

As a member of the NSGBU Global SRE team, you will need knowledge and understanding in:
Linux systems internals, monitoring, networking, and core cloud concepts

Standard internet services, such as DNS, TCP/IP, NFS, and Global Load Balancing

Performance troubleshooting/tuning experience

Understanding of web technologies, Apache, HTTPS/SSL, Web sessions

Understanding of the software development lifecycle process

Knowledge of database environments and requirements for high-availability environments

Solid analytical skills for troubleshooting problems

Recognize architectural patterns for distributed systems and understand the composition of reliable services from unreliable components and cloud computing at scale

Excellent communication skills in English

BS in Computer Science or a related field

**Preferred Skills**

Excellent troubleshooting skills

Motivated to work quickly and accurately under pressure in time-critical situations

A self-starter who takes pride in their job


