Service Reliability Engineer
4 hours ago
Do you have a passion for ensuring the reliability, scalability, and performance of critical services? Are you a highly motivated and experienced engineer with a strong understanding of Site Reliability Engineering (SRE) principles and a desire to automate and improve processes? Join Apple's General and Administrative (G&A) Solutions Engineering team and play a vital role in supporting our global production systems.
Description
As a Service Reliability Engineer, you'll be at the forefront of maintaining the health, stability, and efficiency of our services, working with a diverse range of technologies and platforms. You will collaborate with Engineers, Data Engineers, DBAs, and network specialists to proactively identify and resolve potential issues, automate repetitive tasks, and drive continuous improvement initiatives. Your expertise will directly impact the reliability of our systems, enabling Apple to deliver innovative products and services to our customers.
Responsibilities
Proactively monitor service performance, identify potential bottlenecks, and implement solutions to optimize efficiency and resilience.
Lead incident response efforts, driving rapid resolution and conducting detailed root cause analysis (RCA).
Develop and implement automation strategies to streamline operational tasks, improve service resilience, and reduce manual intervention.
Apply SRE principles to maintain highly reliable and scalable service infrastructure.
Collaborate closely with development teams to ensure that new services are built for operational excellence, incorporating guidelines for monitoring, alerting, and scalability.
Contribute to the creation and maintenance of comprehensive documentation, including runbooks and service level objectives (SLOs).
Participate in on-call rotations, providing 24/7 support for critical services and responding to incidents with a sense of urgency.
Find opportunities for process improvement and drive initiatives to enhance the efficiency and effectiveness of the service reliability team.
Foster a culture of continuous learning within the team.
Define and monitor key service level indicators (SLIs) to measure and improve service reliability.
Preferred Qualifications
Familiarity with CI/CD pipelines and DevOps practices.
Experience with database technologies (e.g., MySQL, PostgreSQL, NoSQL databases).
Knowledge of ITIL frameworks and incident management processes.
Experience with vibe coding.
Understanding of Linux/Unix system administration.
Experience with configuration management tools (Ansible, Chef, Puppet).
Strong proficiency in at least one programming language (e.g., Python, Java, Go) and scripting languages (e.g., Bash, PowerShell).
Experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud-native technologies (e.g., Kubernetes, Docker).
Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Splunk, Datadog).
Proven track record of tackling issues in distributed systems.
Minimum Qualifications
4+ years of experience in a Site Reliability Engineering, DevOps, or related role, supporting large-scale, enterprise-level services.
Bachelor's degree in Computer Science or a related field, or equivalent experience.
Experience in RCA of technical issues.
-
Service Reliability Engineer
4 hours ago
Hyderabad, Telangana, India Apple Full time
Do you have a passion for ensuring the reliability, scalability, and performance of critical services? Are you a highly motivated and experienced engineer with a strong understanding of Site Reliability Engineering (SRE) principles and a desire to automate and improve processes? Join Apple's General and Administrative (G&A) Solutions Engineering team and...
-
Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Elios Talent Full time
Key Highlights: Build, automate, and support cloud-native infrastructure powering high-availability platforms. Contribute to automation-first engineering across AWS, Terraform, CI/CD, and observability tooling. Improve reliability, uptime, system health, and performance across production environments. Strengthen DevSecOps...
-
Site Reliability Engineer
8 hours ago
Hyderabad, Telangana, India Apple Full time
Imagine what you could do here. Apple is a place where extraordinary people gather to do their best work. Together we craft products and experiences people once couldn't have imagined, and now can't imagine living without. If you're motivated by the idea of making a real impact, and joining a team where we pride ourselves on being one of the most diverse...
-
Principal Service Reliability Engineer
3 hours ago
Hyderabad, Telangana, India Oracle Full time
Key Responsibilities: End-to-end service ownership: design for telemetry, security, resiliency, scalability, and performance; lead sizing/architecture; drive service health reviews and process simplification. Incident management and prevention: lead postmortems/RCAs, coordinate fixes, define repair items, and implement data-driven prevention and...
-
Principal Site Reliability Engineer
14 seconds ago
Hyderabad, Telangana, India Oracle Full time
Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis and monitoring to provide the stability,...
-
Principal Site Reliability Engineer
2 days ago
Hyderabad, Telangana, India Oracle Full time
Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Mainframe zLinux, DB2, zVM, and AIX. The Site Reliability Engineer is expected to work with multiple service and product development teams, identifying cross-team issues that...
-
Principal Site Reliability Engineer
6 days ago
Hyderabad, Telangana, India Oracle Full time
Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis and monitoring to provide the stability,...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Elios Talent Full time
Key Highlights: Build, automate, and support cloud-native infrastructure powering high-availability platforms. Contribute to automation-first engineering across AWS, Terraform, CI/CD, and observability tooling. Improve reliability, uptime, system health, and performance across production environments. Strengthen DevSecOps...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Cubic Corporation Full time
Business Unit: Cubic Transportation Systems. Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Reliability Engineer
6 days ago
Hyderabad, Telangana, India Apple Full time
Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms handling petabytes of data at scale. We are looking for a passionate and independent Software Engineer specializing in reliability engineering for data platforms, with a strong understanding of data and ML systems. If you thrive in a fast-paced environment,...