Site Reliability and Platform Support
1 day ago
**Introduction**
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.
**Your Role and Responsibilities**
Are you passionate about technology? Do you love building new things? Do you want to develop the future of IBM's Cloud offerings? If you answered YES, then we have the right opportunity for you.
The shift toward the consumption of IT as a service, i.e., the cloud, is one of the most important changes to happen to our industry in decades. At IBM, we are driven to shift our technology to an as-a-service model and to help our clients transform themselves to take full advantage of the cloud. With industry leadership in analytics, security, commerce, and cognitive computing and with unmatched hardware and software design and industrial research capabilities, no other company is as well positioned to address the full opportunity of cloud computing.
We are looking for a dynamic **Site Reliability and Automation Engineer**, responsive to market needs, to join our **Cloud Operations Team** and deliver value to our clients in a fast-changing cloud landscape. The Cloud team is dedicated to ensuring the IBM Cloud is at the forefront of cloud technology, from data center design to network architecture to storage and compute clusters to flexible infrastructure services. We are building and operating IBM's VMware Solutions cloud platform to deliver performance and predictability for our customers' most demanding workloads, at global scale and with leadership efficiency, resiliency and security. It is an exciting time, and as a team we are driven by this incredible opportunity to thrill our clients.
In this **Site Reliability and Automation Engineer** role, you will work closely with the Data Center, the entire **Cloud** development organization and IBM vendors to support, maintain and operationally improve the cloud infrastructure. Your focus will be the following key responsibilities:
- Support and Operate Cloud Service delivery
- Automate health monitoring of the production and test systems
- Automate return to service procedures for Cloud Service delivery
- Support the compliance and security integrity of the environment through your work
- Partner with other teams, functional managers and program managers to deliver mission-critical services to the market
- Support development of new and existing capabilities for our compute, storage and network services
- Integrate automation with operational requirements
- Work with Engineering and Development to:
  - Define operational requirements
  - Automate operational requirements
  - Provide initial assessments and possible workarounds for production issues
  - Troubleshoot and resolve production issues
- Work with Support and Infrastructure to:
  - Identify and resolve complex issues
  - Discuss and plan integration tasks
- Qualifications:
  - Excellent written and verbal communication skills
  - Comfortable operating in a fast-paced environment
The working days for this position are Sunday-Wednesday or Wednesday-Saturday.
**Required Technical and Professional Expertise**
- Minimum 2-3 years’ experience in hands-on production administration of large systems and environments
- Experience establishing and improving procedures within a mission critical environment
- Must be proficient in writing and debugging scripts
- Must be extremely comfortable using and navigating within a Linux environment
- Ability to do low-level debugging and problem analysis by examining logs and running Unix commands
- 2+ years of extensive experience with Monitoring technologies: Zabbix (preferred), Grafana, Nagios, ELK, Splunk, etc.
- 2+ years of experience with one or more virtualization technologies: Citrix Xen Hypervisor (preferred), KVM (also preferred), libvirt, VMware vSphere, etc.
- 2+ years of experience with one or more automation and configuration management tools/solutions: Ansible, Salt, Chef, Python, Bash, Puppet, Rundeck, etc.
- Working knowledge of ServiceNow, JIRA, Confluence, and GitHub
**Preferred Technical and Professional Expertise**
- Experience in maintaining cloud-based solutions with VMware vCloud Director
- Experience with Veeam Backup
- Experience with replication/failover using Zerto Platform, VMware vCloud Availability or Veeam Cloud Connect
- Extensive experience with scripting languages, such as Bash, PowerShell and Python
- Working knowledge of SQL (PostgreSQL, MSSQL) and Cloudant
- Working knowledge of networking, subnetting and storage technologies
- Working knowledge of ServiceNow, JIRA, Confluence, and GitHub
**About Business Unit**
Digitization is accelerating the ongoing evolution of business, and clouds - public, private, and