Principal Engineer, Site Reliability

4 weeks ago

Hyderabad, India ANSR Full time

ANSR is hiring for one of its clients.

About T-Mobile:
T-Mobile US, Inc. (NASDAQ: TMUS), headquartered in Bellevue, Washington, is America’s supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional service experience.

About TMUS Global Solutions:
TMUS Global Solutions is a world-class technology powerhouse accelerating the company’s global digital transformation. With a culture built on growth, inclusivity, and global collaboration, the teams here drive innovation at scale, powered by bold thinking. TMUS India Private Limited operates as TMUS Global Solutions.

About the Role:
The Principal Engineer, Site Reliability (SRE) will play a critical role in ensuring the stability, scalability, and operational excellence of Accounting and Finance platforms. This role is focused on leading the operational health of these platforms, ensuring the delivery of highly reliable financial applications and data services that meet the demanding requirements of accuracy, compliance, and availability to support business operations.

As a Principal SRE, you will build automation, implement monitoring, improve incident response, and champion DevOps practices that enable Finance and Accounting systems to operate with consistency and trustworthiness, while also coaching and mentoring junior SREs to ensure overall operational excellence.

What You’ll Do:
- Operational Oversight: Own day-to-day operations for Accounting and Finance applications and data platforms, ensuring they run smoothly and meet business expectations.
- Reliability & Availability: Ensure Accounting and Finance platforms meet defined SLAs, SLOs, and SLIs for performance, reliability, and uptime.
- Automation & Efficiency: Build automation for deployments, monitoring, scaling, and self-healing capabilities to reduce manual effort and operational risk.
- Observability & Monitoring: Implement and maintain comprehensive monitoring, alerting, and logging for accounting applications and data pipelines (e.g., Snowflake, dbt workflows, ERP integrations).
- Incident Response: Lead and participate in on-call rotations, perform root cause analysis, and drive improvements to prevent recurrence of production issues.
- Operational Excellence: Establish and enforce best practices for capacity planning, performance tuning, disaster recovery, and compliance controls in financial systems.
- Collaboration with Engineering & Finance: Partner with software engineers, data engineers, and Finance/Accounting teams to ensure operational needs are met from development through production.
- Team Coordination: Manage workload, priorities, and escalations for operations staff and partner teams, ensuring alignment with SLAs and compliance requirements.
- Security & Compliance: Ensure financial applications and data pipelines meet audit, compliance, and security requirements.
- Continuous Improvement: Drive post-incident reviews, implement lessons learned, and proactively identify opportunities to improve system resilience.
- Audit & Compliance Support: Ensure operational practices meet internal controls, audit requirements, and financial compliance standards.

What You’ll Bring:
- Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent experience).
- 12-15 years of experience in Site Reliability Engineering, DevOps, or Production Engineering, ideally supporting financial or mission-critical applications.
- Strong experience with monitoring/observability tools (Datadog, Prometheus, Grafana, Splunk, or equivalent).
- Hands-on expertise with CI/CD pipelines, automation frameworks, and IaC tools (Terraform, Ansible, GitHub Actions, Azure DevOps, etc.).
- Familiarity with Snowflake, dbt, and financial system integrations from an operational support perspective.
- Strong scripting/programming experience (Python, Bash, Go, or similar) for automation and tooling.
- Proven ability to manage incident response and conduct blameless postmortems.
- Experience ensuring compliance, security, and audit-readiness in enterprise applications.

Must Have Skills:
- SRE
- SQL
- Snowflake OR Databricks
- DevOps OR CI/CD OR GitHub Actions
- Monitoring/observability tools (Datadog, Prometheus, Grafana, Splunk, or equivalent)
- Automation

Nice To Have:
- Experience supporting financial applications (ERP, revenue recognition systems, accounting platforms).
- Exposure to FinOps practices for optimizing cloud spend in finance-related platforms.
- Familiarity with containers and orchestration (Docker, Kubernetes).
- Experience building resilience into data pipelines and ensuring auditability for accounting data.
- Strong communication skills to articulate operational issues and risks to both technical and non-technical stakeholders.

Principal Site Reliability Engineer

2 hours ago

Hyderabad, Telangana, India Oracle Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Mainframe zLinux, DB2, zVM, and AIX. The Site Reliability Engineer is expected to work with multiple service and product development teams, identifying cross-team issues that...

Site Reliability Engineer

4 days ago

Hyderabad, Telangana, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Principal Site Reliability Engineer. Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis and...

Principal Site Reliability Engineer

6 days ago

Hyderabad, Telangana, India Oracle Full time ₹ 20,00,000 - ₹ 60,00,000 per year

Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis and monitoring to provide the stability,...

Principal Site Reliability Engineer

2 weeks ago

Hyderabad, Telangana, India Cubic Transportation Systems Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Hiring Principal Site Reliability Engineer. Experience: 12+ Years. Location: Hyderabad. Notice: Immediate to 30 Days. We're seeking an experienced Site Reliability Engineer (SRE) to ensure our services are robust, scalable, secure, and maintainable. You will blend software engineering and systems operations to automate processes, monitor performance, lead incident...

Principal Site Reliability Engineer

2 weeks ago

Hyderabad, Telangana, India Cubic Transportation Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Hiring Principal Site Reliability Engineer. Experience: 12 to 18 Years. Location: Hyderabad. Notice Period: Immediate to 30 Days. Key Responsibilities: Design, deploy, and maintain scalable, secure applications and infrastructure in cloud or hybrid environments. Implement and manage robust monitoring, alerting, and observability systems. Automate recurrent operational...

Site Reliability Engineer

2 weeks ago

Hyderabad, Telangana, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Senior Principal Site Reliability Engineer, Fusion SRE. About Oracle Cloud: Oracle Cloud is a comprehensive suite of cloud services, including infrastructure, platform, and applications, designed to help organizations build, deploy, and manage workloads securely at scale. At Oracle, we are building the most intelligent future of cloud computing. Our...

Principal Site Reliability Engineer

6 days ago

Hyderabad, Telangana, India Amgen Inc Full time ₹ 8,00,000 - ₹ 12,00,000 per year

We are looking for a Site Reliability Engineer/Cloud Engineer (SRE) to work on the performance optimization, standardization, and automation of Amgen's critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will work on operational excellence...

Site Reliability Engineering

2 weeks ago

Hyderabad, Telangana, India Acesoft Labs Full time ₹ 20,00,000 - ₹ 25,00,000 per year

Hi, kindly find the JD below. Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: 3 days from office (Hybrid). Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends...

Site Reliability Engineering

2 weeks ago

Hyderabad, Telangana, India TECHBLOCKS Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: 3 days from office (Hybrid). Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends technical leadership with team...
