Site Reliability Engineer
3 weeks ago
Site Reliability Engineer-2200087E
**Applicants are required to read, write, and speak the following languages**: English
**Preferred Qualifications**
Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas.
Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
Hold authority for end-to-end performance and operability.
Partner with development teams in defining and implementing improvements in service architecture.
Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio.
Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack.
Demonstrate a clear understanding of automation and orchestration principles.
Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
Use a deep understanding of service topology and its dependencies to troubleshoot issues and define mitigations.
Understand and explain the effect of product architecture decisions on distributed systems.
Bring professional curiosity and a desire to develop a deep understanding of services and technologies.
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence.
Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services.
Develop designs, architectures, standards, and methods for large-scale distributed systems.
Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
**Requirements**:
- A BS or MS in Computer Science, or equivalent.
- Solid knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance.
- Experience running large-scale, customer-facing web services.
- Solid understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies.
- Work involves defining and documenting technical architecture of complex and highly scalable products.
- A minimum of 2 years' experience.
**Languages**: Bash, Python, Perl
**Services**: Java, Apache, Jetty, Kafka, ZooKeeper, Oracle Database
**Operating Systems**: Any Linux-based operating system.