Reliability Engineer
4 weeks ago
About T-Mobile:
T-Mobile US, Inc. (NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional service experience.

About TMUS Global Solutions:
TMUS Global Solutions is a world-class technology powerhouse accelerating the company’s global digital transformation. With a culture built on growth, inclusivity, and global collaboration, the teams here drive innovation at scale, powered by bold thinking. TMUS India Private Limited operates as TMUS Global Solutions.

About the Role:
As a Site Reliability Engineer (SRE), you will be a key member of the CFL Platform Engineering and Operations team, responsible for building and maintaining large-scale, distributed systems that are observable, scalable, and resilient. This role sits at the intersection of software engineering and infrastructure operations, ensuring high availability and performance of production systems through automation, monitoring, and proactive engineering. You'll work closely with development, DevOps, and cloud platform teams to improve deployment strategies, incident response, and system health insights.

This is a hands-on role for engineers who are passionate about operational excellence, reducing toil, and improving system reliability through code.

What You Will Do:
- Ensure high availability and performance of production platforms through monitoring, alerting, and incident management
- Design and implement resiliency patterns such as circuit breakers, failovers, retries, and health checks
- Develop automation to reduce manual operational work and improve system efficiency
- Support CI/CD workflows and infrastructure automation using tools like Terraform and Helm
- Collaborate with developers to enhance service deployment and rollback mechanisms
- Build and maintain observability tooling including dashboards, logs, and metrics
- Analyze performance data and use it to guide optimizations and issue detection
- Participate in on-call rotations, incident triage, and post-incident analysis
- Write and maintain operational documentation, including runbooks and playbooks
- Support development teams in achieving service-level objectives (SLOs) and operational readiness

What You Will Bring:
- Bachelor’s degree in Computer Science, Engineering, or a related technical field
- 2-5 years of experience in SRE, infrastructure, DevOps, or related engineering roles
- Proficiency in scripting or programming (Python, Go, or Bash preferred)
- Strong experience with Linux systems and cloud environments (Azure preferred; AWS/GCP also relevant)
- Hands-on experience with Kubernetes and containerized services
- Familiarity with observability tools such as Prometheus, Grafana, Splunk, or OpenTelemetry
- Exposure to incident response frameworks, postmortems, and error budgets
- Understanding of core SRE concepts: SLOs, SLIs, and service reliability metrics
- Experience with CI/CD tools (e.g., GitLab CI/CD, Jenkins, Spinnaker)
- Working knowledge of infrastructure tools such as HAProxy, RabbitMQ, or similar
- Strong analytical and troubleshooting skills for distributed systems
- Clear communication skills and ability to work cross-functionally
- A continuous improvement mindset focused on reducing operational toil and enhancing developer experience

Must Have Skills:
- Application & Microservice: Java, Spring Boot, API & Service Design
- Any CI/CD Tools: GitLab Pipeline/Test Automation/GitHub Actions/Jenkins/CircleCI
- App Platform: Docker & Containers (Kubernetes)
- Any Databases: SQL & NoSQL (Cassandra/Oracle/Snowflake/MongoDB)
- Any Messaging: Kafka, RabbitMQ
- Any Observability/Monitoring: Splunk/Grafana/OpenTelemetry/ELK Stack/Datadog/New Relic/Prometheus
- Incident/Change/Problem Management

Nice To Have:
- Define SLIs/SLOs
-
Reliability Engineer
3 hours ago
Hyderabad, India Cyient Full time
Job Description: We are seeking a highly analytical and detail-oriented Reliability Engineer with specialized experience in Weibull analysis and aircraft reliability data. The ideal candidate will play a critical role in enhancing the safety, performance, and cost-effectiveness of our aircraft fleet by analyzing failure data, predicting component life, and...
-
CAD Drafter
2 weeks ago
Hyderabad, India Pinnacle Reliability Full time
We are building a team of trailblazers who embody growth, impact, and excellence. Job Description: We are currently looking for a CAD Drafter to support engineering projects by utilizing skills in AutoCAD as well as the ability to draft in isometric planes. Job Duties: Executes drafting work in AutoCAD to meet high quality standards and efficiency...
-
Site Reliability Engineer
5 days ago
Hyderabad, Telangana, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Principal Site Reliability Engineer: Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis and...
-
Reliability Engineer
5 days ago
Hyderabad, Telangana, India Apple Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms handling petabytes of data at scale. We are looking for a passionate and independent Software Engineer specializing in reliability engineering for data platforms, with a strong understanding of data and ML systems. If you thrive in a fast-paced environment,...
-
Reliability Engineer II
6 days ago
Hyderabad, Telangana, India Medtronic Full time ₹ 15,00,000 - ₹ 25,00,000 per year
At Medtronic you can begin a life-long career of exploration and innovation, while helping champion healthcare access and equity for all. You'll lead with purpose, breaking down barriers to innovation in a more connected, compassionate world. A Day in the Life: Experienced individual contributor in Reliability Engineering, working on complex projects....
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Jigya Software Services Full time ₹ 1,50,000 - ₹ 28,00,000 per year
Job Title: Senior Site Reliability Engineer (SRE) - AWS/Kubernetes. Location: Hyderabad - Onsite. Job Type: Full-Time. About the Role: We are looking for a highly skilled and motivated Site Reliability Engineer to design, build, and maintain our high-performance, scalable cloud infrastructure. You will play a critical role in ensuring the reliability, performance, and...
-
Reliability Engineering Specialist
3 weeks ago
Hyderabad, India ANSR Full time
About American Airlines: To Care for People on Life's Journey®. We have a relentless drive for innovation and excellence. Whether you're engaging with customers at the airport or advancing our IT infrastructure, every team member plays a vital role in shaping the future of travel. At American’s Tech Hubs, we tackle complex challenges and pioneer...
-
Reliability Engineer II
1 week ago
Hyderabad, Telangana, India Medtronic Full time ₹ 6,00,000 - ₹ 12,00,000 per year
At Medtronic you can begin a life-long career of exploration and innovation, while helping champion healthcare access and equity for all. You'll lead with purpose, breaking down barriers to innovation in a more connected, compassionate world. A Day in the Life: Experienced individual contributor in Reliability Engineering, working on complex projects....
-
Lead Reliability Engineer
4 weeks ago
Hyderabad, India ANSR Full time
About T-Mobile: T-Mobile US, Inc. (NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional service experience. About TMUS Global...
-
Site Reliability Engineer
3 weeks ago
Hyderabad, India Sonata Software Full time
Role: Site Reliability Engineer (SRE) III – Data Engineering. Location: Hyderabad. Employment Type: Full Time. Experience: 7–12 years in site reliability, cloud-based data infrastructure, data pipeline observability, automation, and high-availability engineering within EdTech platforms (2U). Primary Skills (Must-Have): AWS, CI/CD, Jenkins, IAAC,...