Site Reliability Engineer/SRE
3 days ago
Job Title: Site Reliability Engineer II
Location: Hyderabad
Job Type: Full-Time, 24/7 Availability
Job Description:
As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and observability of our transaction processing and settlement platforms, while also providing hands-on support for critical business applications. You will work across hybrid environments (on-prem and cloud), contribute to automation and monitoring enhancements, and support incident response and platform stability. This position requires availability to work in a 24/7 support model, including rotational shifts and on-call duties.
Key Responsibilities:
- Monitoring & Observability: Maintain and enhance monitoring systems using Prometheus, Grafana, Splunk, and SolarWinds. Ensure timely detection and resolution of issues through effective alerting and dashboards.
- Application Support: Provide L1/L2 support for business-critical applications, including incident triage, health checks, deployment validation, and coordination with development and product teams.
- Incident Management: Lead response for moderate to complex incidents, perform root cause analysis, and contribute to post-incident reviews and documentation.
- Automation & Scripting: Develop and maintain automation scripts using Python, PowerShell, or Bash to streamline operational tasks and reduce manual effort.
- Infrastructure Support: Monitor and support infrastructure health across on-prem and cloud platforms (GCP, Azure), including performance tuning and capacity planning.
- Kubernetes Operations: Support containerized workloads and microservices running on Kubernetes clusters. Perform health checks, troubleshoot deployments, and optimize resource usage.
- Process Adherence: Participate in ITIL-aligned processes for incident, change, and problem management. Ensure compliance with operational standards and audit requirements.
- Knowledge Sharing: Document SOPs, recurring issues, and resolutions. Mentor junior engineers and contribute to the team knowledge base.
- Collaboration: Work closely with development, QA, and platform teams to support deployments, platform transitions, and reliability improvements.
- Continuous Improvement: Proactively identify areas for improvement in system reliability, alerting, and operational workflows.
- 24/7 Support: Provide on-call support for critical issues.
Qualifications:
- Bachelor's or master's degree in Computer Science, Engineering, or a related field.
- 2-5 years of experience in SRE, infrastructure operations, system administration, or application support.
- Proficiency in monitoring tools (Prometheus, Grafana, Splunk, SolarWinds).
- Strong scripting skills (Python, Bash, PowerShell).
- Experience with cloud platforms (GCP, Azure, AWS).
- Hands-on experience with Kubernetes in production environments.
- Solid understanding of ITIL practices and enterprise support workflows.
- Hands-on experience with automation tools (ActiveBatch or similar).
- Strong analytical, communication, and problem-solving skills.
- Willingness to work in a 24/7 support environment and take ownership of reliability outcomes.
-
SRE(Site Reliability Engineer)
1 day ago
Hyderabad, Telangana, India Talent Worx Full time ₹ 12,00,000 - ₹ 36,00,000 per year
SRE (Site Reliability Engineer)
Talent Worx is seeking a talented SRE (Site Reliability Engineer) to enhance our technology team. In this role, you will be pivotal in ensuring the reliability, performance, and availability of our applications and services. Your work will involve both software engineering and systems operations as you strive to improve...
-
Senior Site Reliability Engineer
3 days ago
Hyderabad, Telangana, India Jade Global Software Pvt Ltd Full time ₹ 12,00,000 - ₹ 24,00,000 per year
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog
Location: Hyderabad...
-
Senior Site Reliability Engineer
4 days ago
Hyderabad, Telangana, India Jade Global Full time ₹ 12,00,000 - ₹ 24,00,000 per year
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog
Location: Hyderabad preferable but open for Pune and remote
Job Summary: We are seeking an...
-
Devops SRE Site Reliability Engineer
3 days ago
Hyderabad, Telangana, India Infoshare Systems, Inc. Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Position: DevOps SRE
Location: Hyderabad - hybrid
Duration: Full-time
Optum
Required Skills & Experience:
- 5 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles.
- Strong experience with Linux/Unix systems administration.
- Proficiency with Python, Go, or Bash scripting.
- Solid understanding of networking, firewalls, and security...
-
Senior Site Reliability Engineer
5 days ago
Hyderabad, Telangana, India Jade Global Full time ₹ 1,00,00,000 - ₹ 2,00,00,000 per year
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog
Location: Hyderabad preferable but open for Pune and remote
Job Summary: We are seeking an experienced Site Reliability Engineer (SRE) to lead end-to-end SRE...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India SID Global Solutions Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Job Role: Site Reliability Engineer (SRE) – GCP
Experience: 3+ years
Location: Hyderabad
About SIDGS: SIDGS is a premium global systems integrator and global implementation partner of Google, providing digital solutions and services to Fortune 500 companies. Our digital solutions span the following domains: User Experience, CMS, API Management,...
-
Site Reliability Engineer
16 hours ago
Hyderabad, Telangana, India BYLD Group Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Job Title: Site Reliability Engineer (SRE) - DataDog / AWS Lambda / DynamoDB / Serverless
Location: Bangalore / Pune / Hyderabad
Experience: 5-10 Years
About The Role: We are seeking an experienced Site Reliability Engineer (SRE) with strong expertise in DataDog integration, AWS Lambda, DynamoDB, and Serverless architectures. The ideal candidate will...
-
Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India SMARTWORK IT SERVICES Full time ₹ 12,00,000 - ₹ 24,00,000 per year
Role: Site Reliability Engineer (SRE)
Location: Hyderabad
Experience: 10 to 15 Years
Job Summary: The Site Reliability Engineer (SRE) will play a critical role in ensuring the reliability, scalability, and performance of Citizens Bank's enterprise systems and cloud environments. The ideal candidate brings deep technical...
-
Site Reliability Engineer
3 days ago
Hyderabad, Telangana, India Technology Next Full time ₹ 20,00,000 - ₹ 30,00,000 per year
Urgently hiring for Site Reliability Engineer (SRE) / Chaos Engineer
Location: Hyderabad
Job Type: Full-time, Permanent
Job Description: We are looking for an experienced Site Reliability Engineer (SRE) with strong Python automation skills (Boto3 required) and hands-on experience in chaos engineering to improve system reliability and resilience. The ideal...
-
Site Reliability Engineer
1 day ago
Hyderabad, Telangana, India Talent Worx Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Site Reliability Engineer (SRE)
At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...