Reliability Engineer

2 weeks ago


Hyderabad, Telangana, India Apple Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms handling petabytes of data at scale. We are looking for a passionate and independent Software Engineer specializing in reliability engineering for data platforms, with a strong understanding of data and ML systems. If you thrive in a fast-paced environment, love crafting solutions that don't yet exist, and possess excellent communication skills to collaborate across diverse teams, we invite you to contribute to Apple's high standards in an exciting and dynamic setting.

Description
As part of our team, you will be responsible for developing and operating our big data platform using open source or other solutions to support critical applications such as analytics, reporting, and AI/ML apps. This includes optimizing performance and cost, automating operations, and identifying and resolving production errors and issues to ensure the best data platform experience.

Minimum Qualifications

3+ years of professional software engineering experience with large-scale big data platforms, including strong programming skills in Java, Scala, Python, or Go.
Proven expertise in designing, building, and operating large-scale distributed data processing systems with a strong focus on Apache Spark.
Hands-on experience with table formats and data lake technologies such as Apache Iceberg, ensuring scalability, reliability, and optimized query performance.
Skilled at coding for distributed systems and developing resilient data pipelines.
Strong background in incident management, including troubleshooting, root cause analysis, and performance optimization in complex production environments.
Proficient with Unix/Linux systems and command-line tools for debugging and operational support.

Preferred Qualifications

Expertise in designing, building, and operating critical, large-scale distributed systems with a focus on low latency, fault-tolerance, and high availability.
Experience contributing to open source projects is a plus.
Experience with multiple public cloud infrastructures, managing multi-tenant Kubernetes clusters at scale, and debugging Kubernetes/Spark issues.
Experience with workflow and data pipeline orchestration tools (e.g., Airflow, dbt).
Understanding of data modeling and data warehousing concepts.
Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs).
A learning attitude and a drive to continuously improve oneself, the team, and the organization.
Solid understanding of software engineering best practices, including the full development lifecycle, secure coding, and experience building reusable frameworks or libraries.

  • Hyderabad, Telangana, India Elios Talent Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Site Reliability Engineer. Key Highlights: Build, automate, and support cloud-native infrastructure powering high-availability platforms. Contribute to automation-first engineering across AWS, Terraform, CI/CD, and observability tooling. Improve reliability, uptime, system health, and performance across production environments. Strengthen DevSecOps...


  • Hyderabad, Telangana, India 2a1d0a41-1875-4bbb-b5a8-e4d5620cfd5f Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Role & responsibilities: Coordinates cross-product chaos experimentation to proactively test system resilience and uncover reliability gaps. Maintains the centralized incident response playbook for the subdivision, documenting standards for communication, escalation, and recovery during incidents. Aggregates and reports quantifiable availability data to senior...


  • Hyderabad, Telangana, India Apple Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Do you have a passion for ensuring the reliability, scalability, and performance of critical services? Are you a highly motivated and experienced engineer with a strong understanding of Site Reliability Engineering (SRE) principles and a desire to automate and improve processes? Join Apple's General and Administrative (G&A) Solutions Engineering team and...


  • Hyderabad, Telangana, India Talent Worx Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Site Reliability Engineer (SRE). At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...


  • Hyderabad, Telangana, India Medtronic Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    At Medtronic you can begin a life-long career of exploration and innovation, while helping champion healthcare access and equity for all. You'll lead with purpose, breaking down barriers to innovation in a more connected, compassionate world. A Day in the Life: Experienced individual contributor in Reliability Engineering, working on complex projects....


  • Hyderabad, Telangana, India Apple Full time ₹ 18,00,000 - ₹ 25,00,000 per year

    Do you have a passion for ensuring the reliability, scalability, and performance of critical services? Are you a highly motivated and experienced engineer with a strong understanding of Site Reliability Engineering (SRE) principles and a desire to automate and improve processes? Join Apple's General and Administrative (G&A) Solutions Engineering team and...


  • Hyderabad, Telangana, India Technology Next Full time ₹ 20,00,000 - ₹ 30,00,000 per year

    Urgently hiring for Site Reliability Engineer (SRE) / Chaos Engineer. Location: Hyderabad. Job Type: Full-time, Permanent. Job Description: We are looking for an experienced Site Reliability Engineer (SRE) with strong Python automation skills (Boto3 required) and hands-on experience in chaos engineering to improve system reliability and resilience. The ideal...


  • Hyderabad, Telangana, India Medtronic Full time ₹ 6,00,000 - ₹ 12,00,000 per year

    At Medtronic you can begin a life-long career of exploration and innovation, while helping champion healthcare access and equity for all. You'll lead with purpose, breaking down barriers to innovation in a more connected, compassionate world. A Day in the Life: Experienced individual contributor in Reliability Engineering, working on complex projects....


  • Hyderabad, Telangana, India Assurant Full time ₹ 6,00,000 - ₹ 12,00,000 per year

    Site Reliability Engineer, GCC-Assurant. The Site Reliability Engineer (SRE) will be part of the Assurant Reliability Team, specifically within the Site Reliability Engineering area. This remote position, based in India, focuses on building and maintaining reliable, scalable systems through a combination of software development and network diagnostics. The...


  • Hyderabad, Telangana, India Assurant Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Site Reliability Engineer, GCC-Assurant. The Site Reliability Engineer (SRE) will be part of the Assurant Reliability Team, specifically within the Site Reliability Engineering area. This remote position, based in India, focuses on building and maintaining reliable, scalable systems through a combination of software development and network diagnostics. The...