
Site Reliability Engineer, Fleet – REMOTE
Full time @CISCO Meraki posted 4 days ago in Information Technology (IT)
Job ID: 23999
Experience: 2 Years
Qualifications: Bachelor's degree
Job Description
“RESPONSIBILITIES
Develop and maintain automation code for cloud maintenance processes using Ansible and Ruby.
Efficiently coordinate and execute large-scale maintenance operations, acting as a central point of contact between multiple teams.
Debug and resolve complex failure scenarios across large-scale systems, ensuring high availability and reliability.
Design, implement, and optimize GitLab CI pipelines to streamline deployment and testing workflows.
Collaborate with engineering teams to identify and address performance bottlenecks and scaling challenges.
Proactively troubleshoot issues across the fleet, using a deep understanding of Linux systems and networking.
Contribute to the creation of robust unit tests and infrastructure testing suites with RSpec.
Participate in collaborative projects to improve infrastructure efficiency, scalability, and observability.
Work cross-functionally with teams in different time zones, fostering a culture of shared ownership and reliability.
Develop and maintain automated tools for collecting infrastructure data to support compliance requirements.
Streamline compliance processes by reducing manual overhead through automation.
Be part of an on-call SRE team responding in real time to production incidents.
YOU ARE AN IDEAL CANDIDATE IF YOU HAVE:
Experience in:
Working in Linux environments across multiple machines; comfortable with Bash scripting.
Scripting and programming languages, specifically for automation, ideally Ruby.
CI/CD pipelines, particularly GitLab CI.
Infrastructure automation, ideally Ansible.
Cloud infrastructure providers, ideally AWS.
Demonstrated experience troubleshooting and debugging in complex distributed systems.
Monitoring and alerting (Prometheus, Grafana, etc.).
Experience managing and optimizing fleets of thousands of machines.
Excellent collaboration skills and the ability to work effectively across teams in multiple time zones.
Passion for automation, scalability, and infrastructure as code.”