Site Reliability Engineer/SRE
1 day ago
Job Title: Site Reliability Engineer II
Location: Hyderabad
Job Type: Full-Time, 24/7 Availability
Job Description:
As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and observability of our transaction processing and settlement platforms, while also providing hands-on support for critical business applications. You will work across hybrid environments (on-prem and cloud), contribute to automation and monitoring enhancements, and support incident response and platform stability. This position requires availability to work in a 24/7 support model, including rotational shifts and on-call duties.
Key Responsibilities:
- Monitoring & Observability: Maintain and enhance monitoring systems using Prometheus, Grafana, Splunk, and SolarWinds. Ensure timely detection and resolution of issues through effective alerting and dashboards.
- Application Support: Provide L1/L2 support for business-critical applications, including incident triage, health checks, deployment validation, and coordination with development and product teams.
- Incident Management: Lead response for moderate to complex incidents, perform root cause analysis, and contribute to post-incident reviews and documentation.
- Automation & Scripting: Develop and maintain automation scripts using Python, PowerShell, or Bash to streamline operational tasks and reduce manual effort.
- Infrastructure Support: Monitor and support infrastructure health across on-prem and cloud platforms (GCP, Azure), including performance tuning and capacity planning.
- Kubernetes Operations: Support containerized workloads and microservices running on Kubernetes clusters. Perform health checks, troubleshoot deployments, and optimize resource usage.
- Process Adherence: Participate in ITIL-aligned processes for incident, change, and problem management. Ensure compliance with operational standards and audit requirements.
- Knowledge Sharing: Document SOPs, recurring issues, and resolutions. Mentor junior engineers and contribute to the team knowledge base.
- Collaboration: Work closely with development, QA, and platform teams to support deployments, platform transitions, and reliability improvements.
- Continuous Improvement: Proactively identify areas for improvement in system reliability, alerting, and operational workflows.
- 24/7 Support: Provide on-call support for critical issues.
Qualifications:
- Bachelor's or master's degree in Computer Science, Engineering, or a related field.
- 2-5 years of experience in SRE, infrastructure operations, system administration, or application support.
- Proficiency in monitoring tools (Prometheus, Grafana, Splunk, SolarWinds).
- Strong scripting skills (Python, Bash, PowerShell).
- Experience with cloud platforms (GCP, Azure, AWS).
- Hands-on experience with Kubernetes in production environments.
- Solid understanding of ITIL practices and enterprise support workflows.
- Hands-on experience with automation tools (ActiveBatch or similar).
- Strong analytical, communication, and problem-solving skills.
- Willingness to work in a 24/7 support environment and take ownership of reliability outcomes.
-
Senior Site Reliability Engineer
4 days ago
Hyderabad, Telangana, India Jade Global Full time ₹ 12,00,000 - ₹ 24,00,000 per year
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations, with a minimum of 3+ years of hands-on experience in Datadog
Location: Hyderabad preferred, but open to Pune and remote
Job Summary: We are seeking an...
-
Devops SRE Site Reliability Engineer
1 day ago
Hyderabad, Telangana, India Infoshare Systems, Inc. Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Position: DevOps SRE
Location: Hyderabad - hybrid
Duration: Full-time
Optum
Required Skills & Experience
- 5 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles.
- Strong experience with Linux/Unix systems administration.
- Proficiency with Python, Go, or Bash scripting.
- Solid understanding of networking, firewalls, and security...
-
Senior Site Reliability Engineer
5 days ago
Hyderabad, Telangana, India Jade Global Full time ₹ 1,00,00,000 - ₹ 2,00,00,000 per year
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations, with a minimum of 3+ years of hands-on experience in Datadog
Location: Hyderabad preferred, but open to Pune and remote
Job Summary: We are seeking an experienced Site Reliability Engineer (SRE) to lead end-to-end SRE...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India SID Global Solutions Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Job Role: Site Reliability Engineer (SRE) – GCP
Experience: 3+ years
Location: Hyderabad
About SIDGS:
SIDGS is a premium global systems integrator and global implementation partner of Google corporation, providing Digital Solutions & Services to Fortune 500 companies. Our Digital solutions go across following domains: User Experience, CMS, API Management,...
-
Site Reliability Engineer
1 day ago
Hyderabad, Telangana, India Technology Next Full time ₹ 20,00,000 - ₹ 30,00,000 per year
Urgently hiring for Site Reliability Engineer (SRE) / Chaos Engineer
Location: Hyderabad
Job Type: Full-time, Permanent
Job Description:
We are looking for an experienced Site Reliability Engineer (SRE) with strong Python automation skills (Boto3 required) and hands-on experience in chaos engineering to improve system reliability and resilience. The ideal...
-
Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Assurant Full time ₹ 6,00,000 - ₹ 12,00,000 per year
Site Reliability Engineer, GCC-Assurant
The Site Reliability Engineer (SRE) will be part of the Assurant Reliability Team, specifically within the Site Reliability Engineering area. This remote position, based in India, focuses on building and maintaining reliable, scalable systems through a combination of software development and network diagnostics. The...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Assurant Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Site Reliability Engineer, GCC-Assurant
The Site Reliability Engineer (SRE) will be part of the Assurant Reliability Team, specifically within the Site Reliability Engineering area. This remote position, based in India, focuses on building and maintaining reliable, scalable systems through a combination of software development and network diagnostics. The...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Evalify-IQ Full time ₹ 6,00,000 - ₹ 18,00,000 per year
Skills Required: AWS, Azure, Terraform, CloudFormation, Pulumi, CI/CD, GitHub Actions, GitLab CI, Jenkins, ArgoCD, Prometheus, Splunk, Grafana, CloudWatch, Datadog, SRE, Site Reliability, Python, PowerShell, Shell, Go, Kubernetes, Docker, Performance Tuning, Performance Enhancement
Experience Range: 2 - 5...
-
Sr Site Reliability Engineer
22 hours ago
Hyderabad, Telangana, India GHX Full time ₹ 4,00,000 - ₹ 6,00,000 per year
Site Reliability Engineer (SRE)
Position Summary
The Site Reliability Engineer (SRE) will be a hands-on contributor within the Site Reliability Engineering Center of Excellence (CoE), responsible for building monitoring and observability solutions, troubleshooting production issues, and participating in 24x7 on-call operations. This role focuses on the...
-
Principal Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Amgen Inc Full time ₹ 8,00,000 - ₹ 12,00,000 per year
We are looking for a Site Reliability Engineer/Cloud Engineer (SRE) to work on the performance optimization, standardization, and automation of Amgen's critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will work on operational excellence...