Site Reliability Engineer
5 days ago
Site Reliability Engineer-230001K1
**Applicants are required to read, write, and speak the following languages**: English
**Preferred Qualifications**
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence.
Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services.
Design and develop architectures, standards, and methods for large-scale distributed systems.
Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas.
Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
Serve as the authority for end-to-end performance and operability.
Partner with development teams in defining and implementing improvements in service architecture.
Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio.
Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack.
Demonstrate clear understanding of automation and orchestration principles.
Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations.
Understand and explain the effect of product architecture decisions on distributed systems.
Professional curiosity and a desire to develop a deep understanding of services and technologies.
Solid knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security and compliance.
Experience running large-scale, customer-facing web services.
Solid understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies.
Work involves defining and documenting technical architecture of complex and highly scalable products.
A minimum of 2 years of experience.
**Responsibilities**:
As a member of our world-class Site Reliability team, you will bring your expertise and passion for learning to monitoring, backups, infrastructure, and systems architecture, and to reducing the mean time to resolution of any issue that may impact our customer-facing production environment.
The SRE's foremost responsibility is to ensure that the Oracle NetSuite NSGBU Cloud Operations systems are operational, eliminating impact to our customers by using all available resources to identify, resolve, or escalate issues to the appropriate person or team.
You will need to collaborate actively with many other engineering teams in NetSuite Cloud Operations (system engineering, network, infrastructure engineering, security, and maintenance) to support design and implementation, as well as tooling and automation platforms.
On a daily basis, resolve site incidents at various levels of infrastructure, from hardware, network, and OS to application issues
**Work in a Linux terminal**
Work with monitoring and analytic tools like Kibana, Icinga, and Prometheus/Grafana to resolve incidents and identify problems
Cooperate with multiple teams to build systems and services that improve operational efficiency and drive reliability, scalability, resilience, security, and performance across Oracle NSGBU products.
Participate in the NSGBU SRE 24x7 follow-the-sun operational coverage.
**Qualifications and Education Requirements**
As a member of the NSGBU Global SRE team, you will need knowledge and understanding of:
Linux systems internals, monitoring, networking, and core cloud concepts
Standard internet services, such as DNS, TCP/IP, NFS, and Global Load Balancing
Performance troubleshooting/tuning experience
Understanding of web technologies, Apache, HTTPS/SSL, Web sessions
The software development lifecycle process
Knowledge of database environments and requirements for high-availability environments
Solid analytical skills for troubleshooting problems
Architectural patterns for distributed systems, the composition of reliable services from unreliable components, and cloud computing at scale
Excellent communication skills in English
BS in Computer Science or a related field
**Preferred Skills**
Excellent troubleshooting skills
Motivated to work quickly and accurately under pressure in time-critical situations
A self-starter who takes pride in the job