16/09/2025 Principal Engineer, Site Reliability

Hyderabad, Telangana, India ANSR Full time
ANSR is hiring for one of its clients.

About T-Mobile:

T-Mobile US, Inc. (NASDAQ: TMUS), headquartered in Bellevue, Washington, is America's supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional service experience.

About TMUS Global Solutions:

TMUS Global Solutions is a world-class technology powerhouse accelerating the company's global digital transformation. With a culture built on growth, inclusivity, and global collaboration, the teams here drive innovation at scale, powered by bold thinking.

TMUS India Private Limited is a subsidiary of T-Mobile US, Inc. and operates as TMUS Global Solutions.

About the Role:

The Principal Engineer, Site Reliability (SRE) will play a critical role in ensuring the stability, scalability, and operational excellence of Accounting and Finance platforms. This role is focused on leading the operational health of these platforms, ensuring the delivery of highly reliable financial applications and data services that meet the demanding requirements of accuracy, compliance, and availability to support business operations.

As a Principal SRE, you will build automation, implement monitoring, improve incident response, and champion DevOps practices that enable Finance and Accounting systems to operate consistently and reliably. You will also coach and mentor junior SREs to raise the team's overall operational excellence.

What You'll Do:

- Operational Oversight: Own day-to-day operations for Accounting and Finance applications and data platforms, ensuring they run smoothly and meet business expectations.
- Reliability & Availability: Ensure Accounting and Finance platforms meet defined SLAs and SLOs for performance, reliability, and uptime, measured through agreed SLIs.
- Automation & Efficiency: Build automation for deployments, monitoring, scaling, and self-healing capabilities to reduce manual effort and operational risk.
- Observability & Monitoring: Implement and maintain comprehensive monitoring, alerting, and logging for accounting applications and data pipelines (e.g., Snowflake, dbt workflows, ERP integrations).
- Incident Response: Lead and participate in on-call rotations, perform root cause analysis, and drive improvements to prevent recurrence of production issues.
- Operational Excellence: Establish and enforce best practices for capacity planning, performance tuning, disaster recovery, and compliance controls in financial systems.
- Collaboration with Engineering & Finance: Partner with software engineers, data engineers, and Finance/Accounting teams to ensure operational needs are met from development through production.
- Team Coordination: Manage workload, priorities, and escalations for operations staff and partner teams, ensuring alignment with SLAs and compliance requirements.
- Security & Compliance: Ensure financial applications and data pipelines meet audit, compliance, and security requirements.
- Continuous Improvement: Drive post-incident reviews, implement lessons learned, and proactively identify opportunities to improve system resilience.
- Audit & Compliance Support: Ensure operational practices meet internal controls, audit requirements, and financial compliance standards.

What You'll Bring:

- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent experience).
- 12-15 years of experience in Site Reliability Engineering, DevOps, or Production Engineering, ideally supporting financial or mission-critical applications.
- Strong experience with monitoring/observability tools (Datadog, Prometheus, Grafana, Splunk, or equivalent).
- Hands-on expertise with CI/CD pipelines, automation frameworks, and IaC tools (Terraform, Ansible, GitHub Actions, Azure DevOps, etc.).
- Familiarity with Snowflake, dbt, and financial system integrations from an operational support perspective.
- Strong scripting/programming experience (Python, Bash, Go, or similar) for automation and tooling.
- Proven ability to manage incident response and conduct blameless postmortems.
- Experience ensuring compliance, security, and audit-readiness in enterprise applications.

Must Have Skills:

- SRE
- SQL
- Snowflake OR Databricks
- DevOps OR CI/CD OR GitHub Actions
- Monitoring/observability tools (Datadog, Prometheus, Grafana, Splunk, or equivalent)
- Automation

Nice To Have:

- Experience supporting financial applications (ERP, revenue recognition systems, accounting platforms).
- Exposure to FinOps practices for optimizing cloud spend in finance-related platforms.
- Familiarity with containers and orchestration (Docker, Kubernetes).
- Experience building resilience into data pipelines and ensuring auditability for accounting data.
- Strong communication skills to articulate operational issues and risks to both technical and non-technical stakeholders.
