SRE / System Monitoring Engineer
20 hours ago
About the Role
We are seeking a Site Reliability Engineer (SRE) focused on system observability, monitoring, and operational support to ensure high availability, performance, and reliability of our production systems. This role is part of our 24x7 support operations team responsible for proactive monitoring, incident management, and first-line troubleshooting across our infrastructure and applications.
You will play a critical role in maintaining end-to-end system health, service uptime, and incident response readiness for our global software platforms.
Key Responsibilities
Monitoring & Observability
- Manage and maintain system and application monitoring tools (e.g., Prometheus, Grafana, Datadog, New Relic, Splunk, ELK, CloudWatch).
- Develop and fine-tune dashboards, alerts, and health checks for critical services.
- Ensure complete observability coverage across services, APIs, databases, and infrastructure.
Incident Response & Resolution
- Provide 24x7 monitoring and first-level response for system alerts and incidents.
- Perform initial triage, escalate issues to the appropriate teams, and track resolution.
- Participate in on-call rotations and support major incident management processes.
- Document and analyze root causes for recurring issues; assist with problem management.
Collaboration & Continuous Improvement
- Work closely with DevOps, Cloud, and Development teams to improve system reliability.
- Contribute to observability best practices and automation initiatives.
- Support SLA/SLO/SLI definitions and reliability reporting.
Required Skills & Experience
- Experience: 2 to 6 years in SRE, DevOps, or IT operations with production support.
Technical Skills:
- Hands-on experience with at least one cloud platform (AWS, GCP, Azure).
- Proficiency in monitoring tools (Grafana, Prometheus, Datadog, New Relic, or equivalent).
- Strong understanding of Linux/Unix systems, networking, and system logs.
- Experience with automation and scripting (Bash, Python, Shell, etc.).
- Familiarity with CI/CD pipelines, containers (Docker, Kubernetes), and alert management systems (PagerDuty, Opsgenie).
-
SRE Monitoring- NOC Engineer
2 weeks ago
Pune, Maharashtra, India Persistent Full time ₹ 6,00,000 - ₹ 18,00,000 per year
About Position: We are looking for a highly skilled professional with hands-on experience in VMware technologies and system automation. The ideal candidate will have a strong background in Linux administration, scripting, and cloud infrastructure, along with excellent communication skills. This role involves working across development, testing, staging, and...
-
SRE Migration Engineer
2 days ago
Pune, Maharashtra, India Procallisto solution Full time ₹ 12,00,000 - ₹ 36,00,000 per year
We are seeking an experienced DevOps Engineer with proven expertise in GitHub to GitLab migration, strong hands-on skills in Python programming, AWS, and Site Reliability Engineering (SRE) practices. The ideal candidate will play a key role in modernizing our CI/CD pipelines, improving cloud infrastructure, and ensuring high system reliability and...
-
SRE Team Lead and Engineer
2 weeks ago
Pune, Maharashtra, India Apex One Full time ₹ 1,04,000 - ₹ 1,30,878 per year
Lead and mentor a team of SRE engineers, fostering a culture of reliability, efficiency, and continuous improvement. Develop and execute SRE strategies to enhance the reliability, availability, and performance of our systems and services. Design and implement observability and monitoring solutions using tools like New Relic, Azure Application Insights, AWS X-Ray,...
-
SRE Migration engineer
6 days ago
Pune, Maharashtra, India procallisto solutions pvt Full time ₹ 20,40,000 per year
We are seeking an experienced DevOps Engineer with proven expertise in GitHub to GitLab migration, strong hands-on skills in Python programming, AWS, and Site Reliability Engineering (SRE) practices. The ideal candidate will play a key role in modernizing our CI/CD pipelines, improving cloud infrastructure, and ensuring high system reliability and...
-
SRE Engineer
2 weeks ago
Pune, Maharashtra, India Techno Facts Solutions Full time ₹ 8,00,000 - ₹ 24,00,000 per year
Role Overview: We are seeking an experienced Site Reliability Engineer (SRE) with a strong background in automation, monitoring, and performance optimization. The ideal candidate will be proficient in scripting (Python, Bash, ), observability tools, and incident response, ensuring reliability and scalability of enterprise applications. Key...
-
sre
5 days ago
Pune, Maharashtra, India Hitachi Solutions Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Company Description: About Hitachi Solutions India Pvt Ltd: Hitachi Solutions, Ltd., headquartered in Tokyo, Japan, is a core member of Information & Telecommunication Systems Company of Hitachi Group and a recognized leader in delivering proven business and IT strategies and solutions to companies across many industries. The company provides value-driven...
-
SRE DevOps
5 days ago
Pune, Maharashtra, India Zensar Technologies Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Role: SRE DevOps Engineer - AWS
Experience: 10 to 16 Years
Location: Pan India
Notice Period: Immediate to 15 Days
Job Description: Qualifications: The ideal candidate will have a strong background in production monitoring, a deep understanding of development and operations, and a proven track record in managing and scaling distributed systems in a public,...
-
SRE support
6 days ago
Pune, Maharashtra, India Virtusa Full time ₹ 15,00,000 - ₹ 25,00,000 per year
We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join our growing engineering team. The SRE will be responsible for ensuring the availability, performance, scalability, and reliability of our production systems. You will work at the intersection of software development and operations, driving best practices in observability,...
-
SRE (Site Reliability Engineer)
5 days ago
Pune, Maharashtra, India Supersourcing Full time ₹ 12,00,000 - ₹ 24,00,000 per year
About the job: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that the services (both our internally critical and our externally-visible systems) have reliability, uptime appropriate to customer's needs and a fast rate of improvement....
-
SRE- IAM
6 days ago
Pune, Maharashtra, India AZGROUPPROD Full time ₹ 10,00,000 - ₹ 25,00,000 per year
The primary objective of the Site Reliability Engineer (SRE) specializing in One Identity Access Management is to ensure the seamless operation, reliability, and scalability of IAM systems within the organization. This role is critical in maintaining system integrity, optimizing performance, and enhancing security protocols to support the organization's...