Reliability Engineer

5 days ago


Hyderabad, Telangana, India Apple Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms handling petabytes of data at scale. We are looking for a passionate and independent Software Engineer specializing in reliability engineering for data platforms, with a strong understanding of data and ML systems. If you thrive in a fast-paced environment, love crafting solutions that don't yet exist, and possess excellent communication skills to collaborate across diverse teams, we invite you to contribute to Apple's high standards in an exciting and dynamic setting.

Description
As part of our team, you will develop and operate our big data platform using open source or other solutions to support critical applications such as analytics, reporting, and AI/ML apps. This includes optimizing performance and cost, automating operations, and identifying and resolving production errors and issues to ensure the best data platform experience.

Minimum Qualifications

3+ years of professional software engineering experience with large-scale big data platforms, including strong programming skills in Java, Scala, Python, or Go.
Proven expertise in designing, building, and operating large-scale distributed data processing systems with a strong focus on Apache Spark.
Hands-on experience with table formats and data lake technologies such as Apache Iceberg, ensuring scalability, reliability, and optimized query performance.
Skilled at coding for distributed systems and developing resilient data pipelines.
Strong background in incident management, including troubleshooting, root cause analysis, and performance optimization in complex production environments.
Proficient with Unix/Linux systems and command-line tools for debugging and operational support.

Preferred Qualifications

Expertise in designing, building, and operating critical, large-scale distributed systems with a focus on low latency, fault-tolerance, and high availability.
Experience contributing to open source projects is a plus.
Experience with multiple public cloud infrastructures, managing multi-tenant Kubernetes clusters at scale, and debugging Kubernetes/Spark issues.
Experience with workflow and data pipeline orchestration tools (e.g., Airflow, dbt).
Understanding of data modeling and data warehousing concepts.
Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs).
A learning mindset to continuously improve yourself, the team, and the organization.
Solid understanding of software engineering best practices, including the full development lifecycle, secure coding, and experience building reusable frameworks or libraries.

  • Hyderabad, Telangana, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Principal Site Reliability Engineer. Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis and...


  • Hyderabad, Telangana, India Medtronic Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    At Medtronic you can begin a life-long career of exploration and innovation, while helping champion healthcare access and equity for all. You'll lead with purpose, breaking down barriers to innovation in a more connected, compassionate world. A Day in the Life: Experienced individual contributor in Reliability Engineering, working on complex projects....


  • Hyderabad, Telangana, India Jigya Software Services Full time ₹ 1,50,000 - ₹ 28,00,000 per year

    Job Title: Senior Site Reliability Engineer (SRE) - AWS/Kubernetes. Location: Hyderabad - Onsite. Job Type: Full-Time. About the Role: We are looking for a highly skilled and motivated Site Reliability Engineer to design, build, and maintain our high-performance, scalable cloud infrastructure. You will play a critical role in ensuring the reliability, performance, and...


  • Hyderabad, Telangana, India Medtronic Full time ₹ 6,00,000 - ₹ 12,00,000 per year

    At Medtronic you can begin a life-long career of exploration and innovation, while helping champion healthcare access and equity for all. You'll lead with purpose, breaking down barriers to innovation in a more connected, compassionate world. A Day in the Life: Experienced individual contributor in Reliability Engineering, working on complex projects....


  • Hyderabad, Telangana, India SMARTWORK IT SERVICES Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    Description: Role: Site Reliability Engineer (SRE). Location: Hyderabad. Experience: 10 to 15 Years. Job Summary: The Site Reliability Engineer (SRE) will play a critical role in ensuring the reliability, scalability, and performance of Citizens Bank's enterprise systems and cloud environments. The ideal candidate brings deep technical...


  • Hyderabad, Telangana, India Synectics APAC Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Our Site Reliability Engineers (SREs) play a crucial role in ensuring our systems are reliable, scalable, and efficient. We are looking for an experienced SRE to join our team and help us maintain and improve our infrastructure. Responsibilities: Monitor and Maintain Systems: Ensure the availability, performance, and reliability of our production environment by...


  • Hyderabad, Telangana, India Goldman Sachs Services Pvt Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Engineering-SRE-Hyderabad-Associate-Software Engineering. What We Do: At Goldman Sachs, our Engineers don't just make things – we make things possible. Change the world by connecting people and capital with ideas. Solve the most challenging and pressing engineering problems for our clients. Join our engineering teams that build massively scalable...


  • Hyderabad, Telangana, India TurboHire Full time ₹ 15,00,000 - ₹ 28,00,000 per year

    Site Reliability Engineer (SRE). Location: Hyderabad (Hybrid). Experience: 3–5 years. About the Role: We are looking for an SRE Engineer to own reliability, deployment, and monitoring of TurboHire's cloud infrastructure. You will ensure our platform is scalable, secure, and highly available. The role balances hands-on coding, automation, and infra operations, freeing...


  • Hyderabad, Telangana, India Evalify-IQ Full time ₹ 6,00,000 - ₹ 18,00,000 per year

    Skills Required: AWS, Azure, Terraform, CloudFormation, Pulumi, CICD, GitHub Actions, GitLab CI, Jenkins, ArgoCD, Prometheus, Splunk, Grafana, Cloudwatch, Datadog, SRE, Site Reliability, Python, Powershell, Shell, Go, Kubernetes, Docker, Performance Tuning, Performance Enhancements, Performance Enhancement, Performance. Experience Range: 2 - 5...


  • Hyderabad, Telangana, India Capgemini Full time ₹ 8,00,000 - ₹ 12,00,000 per year

    We are seeking an experienced and highly motivated Staff Reliability Engineer. The Staff Reliability Engineer will have end-to-end accountability for the reliability of IT services within a defined application portfolio. A prerequisite to the role will be a "build-to-manage", problem-solving and innovative mindset applied to the design, build, test, deploy,...