Site Reliability Engineer
1 week ago
Job Description

Job Overview:
As a Site Reliability Engineer (SRE) working in a 24/7 shift rotation, you will be responsible for ensuring the reliability, availability, and performance of critical systems and services. You will combine strong technical skills with operational excellence to proactively monitor, troubleshoot, and resolve issues. Your expertise in observability will help maintain robust monitoring, alerting, and incident response processes, ensuring seamless operations around the clock. This role requires 24x7 shifts on a monthly rotation.

Main Responsibilities:
- Monitor production systems and services using observability tools (logs, metrics, traces, dashboards).
- Respond to incidents, alerts, and outages in real time, ensuring rapid resolution and minimal impact.
- Participate in a rotating on-call schedule, providing support during nights, weekends, and holidays.
- Design, implement, and maintain observability solutions (e.g., Prometheus, Grafana, ELK, and similar tools).
- Develop and refine dashboards, alerts, and automated health checks for critical infrastructure and applications.
- Analyze system performance and reliability data to identify trends and prevent future incidents.
- Collaborate with development, infrastructure, application, and security teams to ensure system reliability and scalability.
- Automate operational tasks and incident response processes using scripting and configuration management tools.
- Document procedures, runbooks, and incident reports for knowledge sharing and continuous improvement.
- Conduct post-incident reviews and root cause analysis to drive improvements in reliability and response.

Key Requirements:
- Bachelor's degree in Information Technology, Computer Science, Business Administration, or a related field.
- 2-5 years of experience in cloud engineering and operations engineering.
- Proven experience with Azure services; experience with AWS and GCP is an advantage.
- Hands-on experience with Infrastructure-as-Code (IaC) tools such as Terraform.
- Strong scripting skills in Python, Bash, or PowerShell for automation tasks.
- Familiarity with GitLab CI/CD tools and experience integrating them with Azure.
- Proficiency in monitoring and logging tools such as native cloud tools, OpenMetrics, and OpenTelemetry.

Nice to Have:
- Master's degree or relevant certifications.

Other Details:
This position offers the flexibility of a hybrid work environment. Gain valuable experience in cloud and AI technology while being part of a highly motivated team. Enjoy a competitive remuneration package while charting your own course for career advancement.
-
Site Reliability Engineer
6 hours ago
Pune, India UBS Full time
Job Description
Job Reference #: 326131BR
Job Type: Full Time
Your role
We are seeking a highly experienced Site Reliability Engineer (SRE) to join our technology team in a mission-critical financial environment. This role is ideal for someone who has a proven track record of building and operating reliable, scalable systems in regulated industries such as...
-
Site Reliability Engineer
4 days ago
Pune, India NR Consulting Full time
Job Description
About the Company
We are seeking a highly skilled Site Reliability Engineer (SRE) with strong expertise in Google Cloud Platform (GCP) and CI/CD automation to lead cloud infrastructure initiatives. The ideal candidate will design and implement robust CI/CD pipelines, automate deployments, ensure platform reliability, and drive...
-
Site Reliability Engineer
2 weeks ago
Pune, India Talent Worx Full time
Site Reliability Engineer (SRE)
At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...
-
Site Reliability Engineer
1 week ago
Pune, India Talent Worx Full time
Site Reliability Engineer (SRE)
At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...
-
Site Reliability Engineer
2 weeks ago
Pune, Maharashtra, India Relanto Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Job Title: Site Reliability Engineer
Summary
We are looking for a Site Reliability Engineer to join our Digital & Transformation department. The ideal candidate will have 4 years of experience in this field and will be responsible for ensuring the reliability, availability, and performance of our systems and applications.
Roles And Responsibilities
4 years of...
-
Site Reliability Engineer
4 days ago
Pune, India Siemens Digital Industries Software Full time
Job Description
Siemens Digital Industries Software is a leading provider of solutions for the design, simulation, and manufacture of products across many different industries. Formula 1 cars, skyscrapers, ships, space exploration vehicles, and many of the objects we see in our daily lives are being conceived and manufactured using our Product Lifecycle...
-
Site Reliability Engineer
2 weeks ago
India Grootan Technologies Full time
About the Role
We are seeking a skilled Site Reliability Engineer (SRE) with 4–5 years of hands-on experience to join our engineering team. In this role, you will be responsible for building and maintaining reliable, scalable, and secure infrastructure to support our applications. You will leverage your expertise in automation, cloud platforms, and...
-
Site Reliability Engineer
5 days ago
India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – AWS
Experience: 8+ years
Location: Chennai / Mumbai
Work Mode: Hybrid
Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog
Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and...
-
Site Reliability Engineer
1 week ago
Pune, Maharashtra, India Fiserv Full time ₹ 8,00,000 - ₹ 24,00,000 per year
Site Reliability Engineer
Exp. Range: 8 to 14 Years
What does a successful Site Reliability Engineer (SRE) Expert do at Fiserv?
The Site Reliability Engineer blends the principles of software engineering with the discipline of operations to create high-performing and reliable software systems. They are tasked with designing and implementing tools, processes, and...
-
Site Reliability Engineer
3 weeks ago
India Akamai Technologies Full time
Job Description
Do you like collaborating across teams to solve complex problems? Do you enjoy solving large-scale distributed content delivery challenges? Join our highly skilled Compute Site Reliability team! Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We...