Site Reliability Operations Engineer III

1 month ago


San José, Costa Rica Zuora Full-time

**Company Overview**

At Zuora, we do Modern Business.
We're helping people subscribe to new ways of doing business that are better for people, companies and ultimately the planet.
It's an approach resulting from the shift to the Subscription Economy that puts customers first by building recurring relationships instead of one-time product sales and focuses on sustainable growth.
Through our leading expertise and multi-product suite, we are transforming all industries and working with the world's most innovative companies to monetize new business models, nurture subscriber relationships and optimize their digital experiences.

**THE TEAM**

Responsible For:

- Service operations and restoration of service-impacting issues
- Driving Command Center incident bridges for customer issues through to resolution
- Responding to observability alerts and alarms
- Responding to issues escalated by Customer Support
- Writing and automating runbooks, and reducing alerts, incidents, and service requests through automation
- Acting as the liaison for a service and partnering with the service owner to make the service rock solid and efficient

**WHAT YOU'LL ACHIEVE**

As an SRO, you will be a member of a team that understands the configuration, technical dependencies, and overall behavioral characteristics of production services.
In partnership with developers, you are responsible for ensuring services are designed and delivered with a focus on security, resiliency, scale, and performance.
SROs are the ultimate authority and are accountable for end-to-end performance and operability of the services they own.
Champion service reliability operations and incident prevention
- You will be part of the team whose mission is the shared ownership of a collection of services and technology areas, in partnership with developer teams.
- You are a key escalation point for issues that have been documented as Standard Operating Procedures (SOPs) or issues that need in-depth troubleshooting and analysis.
You will help maintain up-to-date documentation on deployments, processes, and SOP runbooks.
- You are a key escalation point in leading incidents and working with Subject Matter Experts (SMEs) to perform real-time incident handling tasks in support of operations.
You will help develop and implement the incident management process.
- You will have the deep understanding of service topologies and their dependencies required to troubleshoot issues and define mitigations.
Once you have expertly mitigated an incident, you will immediately work with SMEs on how to resolve the issue more quickly next time, with the goal of preventing the problem from recurring.
You will help develop and implement the problem management process.
- You will manage the full lifecycle of infrastructure and change management, including planned maintenance and standard, normal, and emergency changes.
You will help develop and implement change management processes to ensure developers and SROs can easily manage system configurations, deploy new code quickly, and fix incidents faster.
Service design and implementation
- You will partner with development Scrum teams to define and implement improvements to service architecture, both current and future.
You will be an expert at articulating the technical characteristics of services and their dependencies, and will guide development teams to engineer highly reliable and performant services.
- You will frequently partner with developer Scrum teams and actively participate in the tasks required to meet the milestones and deliverables set by the team throughout a release cycle.
Operations Engineering
- You will take part in a shared on-call rotation that won't cripple your life or kill your soul.
Job Involves:

- Resolution of complex and critical issues, and participation in major incidents as an SME
- Acting as a service expert, ensuring that expertise is captured in SOP documentation and shared
- Instrumentation and metrics that clearly describe the service behaviors
- Scaling requirements and patterns
- Resiliency and recoverability, ensuring that backup / restore and disaster recovery capabilities are implemented, tested and maintained
- Driving and escalating gaps in automation, solutions and documentation

**WHAT YOU'LL NEED TO BE SUCCESSFUL**

SROs are a rare mix of sysadmin and development engineer, and as such you can understand and explain how product architecture decisions affect the ability to run services as distributed systems.
You are driven by professional curiosity and a desire to develop a deep understanding of the services and the technologies they depend upon.
You demonstrate competence in shell scripting and high-level programming languages and tools such as Bash, Ansible, Python, and Terraform, as well as low-code / no-code languages and solutions such as Google Apps Script, Jenkins Pipeline Groovy scripts, Jira Automation, and Rundeck.
You are proactive, self-motivated, customer-focused, organized, and a good communicator.
You have over 4 years experience r