Site Reliability Engineer
7 days ago
Sysdig is driving the standard for securing the cloud and containers. We created Falco, the open standard for cloud-native threat detection, and consistently contribute to open source software projects. We are passionate, technical problem-solvers, continually innovating and delivering powerful solutions to secure the cloud from source to run.
We value diversity and open dialog to spur ideas, working closely together to achieve goals. We're an international company that understands how to cultivate a strong culture across a remote team. And we're a great place to work, too: we've been named a **_Bay Area Best Place to Work_** by the **_San Francisco Business Times and the Silicon Valley Business Journal_** for three years now. We were recognized by Deloitte as one of the 500 fastest growing organizations in 2020 and 2021. We are looking for team members who have a passion for container and cloud security and are willing to dig deeper to help our customers. Does this sound like the right place for you?
As a **Site Reliability Engineer**, you will build solutions to enhance the availability, security, and resilience of Sysdig services, including backends and data stores. You will collaborate with the Infrastructure, Engineering, and Customer Success teams to provide the best experience for our high-profile customers.
**What you will do**
- Deploy, upgrade and migrate large-scale Sysdig services on Kubernetes
- Enable customers and Sysdig customer-facing teams to solve common issues in production
- Enhance the observability and reliability of Sysdig services to meet SLAs and SLOs
- Automate manual and repetitive tasks to reduce toil
- Work with the Engineering team on security hardening in highly regulated environments
**What you will bring with you**
- Working experience in deploying and running workloads on Kubernetes in production is a must
- Working experience in monitoring production environments using Prometheus is a must
- Working experience with one of the following data stores is highly preferred: Postgres, Redis, Cassandra, Elasticsearch, Kafka/Zookeeper
- Ability to write and maintain technical documentation is a must
- Strong coding skills in a high-level programming language (Python, Golang, etc.)
- Working experience with Terraform or Helm
- Experience with a well-known CI/CD tool
- Familiarity with common Linux commands
- Knowledge and experience in public cloud are preferred
- Availability for on-call duty every 6 weeks
**Why work at Sysdig?**
- We're a well-funded startup that already has a large enterprise customer base
- We have a pragmatic, transparent culture, from the CEO down
- We have an organizational focus on delivering value to customers
**When you join Sysdig, you can expect**:
- Competitive compensation including equity opportunities
- Flexible hours and additional recharge days
- Mental wellbeing support through Modern Health for you and your family
- Career growth
**_Some of our hiring managers are based internationally, so an up-to-date CV in English would be appreciated._**
-
Site Reliability Engineer
3 days ago
San José, Costa Rica Hitachi Solutions Ltd Full-time
**Company Description** Hitachi Solutions is a global Microsoft solutions integrator passionate about developing and delivering industry-focused solutions that support our clients to deliver on their business transformation goals. Our industry focus, expertise, and intellectual property is what truly sets us apart. We have earned, and continue to maintain,...
-
Site Reliability Engineer
2 weeks ago
San José, Costa Rica Equifax Full-time
Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you. **What you’ll do**: - You will influence and design the infrastructure, architecture, standards, and methods for large-scale systems. - Will...
-
Site Reliability
1 week ago
San José, San José, Costa Rica Canonical - Jobs Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Senior Site Reliability Engineer
1 day ago
San José, Costa Rica Equifax Full-time
Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you. - As a Site Reliability Engineer (SRE) you will combine software and systems engineering for building and running large-scale, distributed,...
-
Site Reliability Security Engineer
1 week ago
San José, San José, Costa Rica Equifax Full-time
Site Reliability Engineering (SRE) is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering, security, and vulnerability management...
-
Site Reliability Engineer
1 week ago
San José, San José, Costa Rica Canonical - Jobs Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers, and...
-
Senior Site Reliability
1 week ago
San José, San José, Costa Rica Canonical - Jobs Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Site Reliability Engineer
2 weeks ago
San José, Costa Rica Micro Focus Full-time
Micro Focus is one of the world’s largest enterprise software providers. We deliver mission-critical technology and supporting services that help thousands of customers worldwide manage core IT elements of their business so they can run and transform at the same time. CyberRes is a Micro Focus line of business. We bring the expertise of one of the...
-
Senior Site Reliability Engineer
1 week ago
San José, San José, Costa Rica Canonical - Jobs Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Quality Reliability Rnd Engineer
2 weeks ago
San José, Costa Rica Intel Full-time
Develops, applies, and maintains quality and reliability standards for processing materials into partially finished or finished product. - Evaluate the materials, process and techniques used in production to meet the requirement of products and production equipment. - Specifies inspection and testing mechanism, conduct quality assessment (up to and including...