
Site Reliability Engineer
24 hours ago
Senior SRE (Engineering & Reliability)

Job Summary:
We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As a Senior SRE, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Experience: 7+ years

Key Responsibilities:

Reliability & Performance:
• Lead efforts to maintain high availability and reliability of critical services.
• Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met.
• Proactively identify and resolve performance bottlenecks and system inefficiencies.

Incident Management & Response:
• Establish and improve incident management processes and on-call rotations.
• Lead incident response and root cause analysis for high-priority outages.
• Drive post-incident reviews and ensure actionable insights are implemented.

Automation & Tooling:
• Develop and implement automated solutions to reduce manual operational tasks.
• Enhance system observability through metrics, logging, and distributed tracing tools (e.g., Prometheus, Grafana, Elastic APM).
• Optimize CI/CD pipelines for seamless deployments.

Collaboration:
• Partner with software engineering teams to improve the reliability of applications and infrastructure.
• Work closely with product and engineering teams to design scalable and robust systems.
• Ensure seamless integration of monitoring and alerting systems across teams.

Leadership & Team Building:
• Manage, mentor, and grow a team of SREs.
• Promote SRE best practices and foster a culture of reliability and performance across the organization.
• Drive performance reviews, skills development, and career progression for team members.

Capacity Planning & Cost Optimization:
• Perform capacity planning and implement autoscaling solutions to handle traffic spikes.
• Optimize infrastructure and cloud costs while maintaining reliability and performance.

Skills & Qualifications:

Required Skills:
• Technical Expertise:
  o Experience with cloud platforms (AWS / Azure / GCP) and Kubernetes.
  o Hands-on knowledge of infrastructure-as-code tools like Terraform / Helm / Ansible.
  o Proficiency in Java.
  o Expertise in distributed systems, databases, and load balancing.
• Monitoring & Observability:
  o Proficient with tools like Prometheus, Grafana, Elastic APM, or New Relic.
  o Understanding of metrics-driven approaches for system monitoring and alerting.
• Automation & CI/CD:
  o Hands-on experience with CI/CD pipelines (e.g., Jenkins, Azure Pipelines).
  o Skilled in automation frameworks and tools for infrastructure and application deployments.
• Incident Management:
  o Proven track record in handling incidents, post-mortems, and implementing solutions to prevent recurrence.

Leadership & Communication Skills:
• Strong people management and leadership skills with the ability to inspire and motivate teams.
• Excellent problem-solving and decision-making skills.
• Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders.

Preferred Qualifications:
• Experience with database optimization, Kafka, or other messaging systems.
• Knowledge of autoscaling techniques.
• Previous experience in an SRE, DevOps, or infrastructure engineering leadership role.
• Understanding of compliance and security best practices in distributed systems.
-
Site Engineer
4 weeks ago
Delhi, India Engineer Department Full time
Company Description: Engineer Department is a company dedicated to providing efficient and effective engineering solutions for public infrastructure and services. Our team is committed to ensuring the highest standards in project management and execution, serving the community with integrity and professionalism. Role Description: This is a full-time...
-
Site Reliability Engineer
3 days ago
New Delhi, India HDFC Limited Full time
Hiring for Lead / Sr Site Reliability Engineer for Mumbai & Bangalore Location. Experience: 8 - 14 Years. Job Purpose: Analysing, troubleshooting, and designing vital services, platforms, and infrastructure on GCP while always thinking about reliability, scalability, resilience, security, and performance. Job Responsibilities: Help build a Site Reliability...
-
Site Reliability Engineer
1 day ago
New Delhi, India Endpoint Clinical Full time
About Us: Endpoint is an interactive response technology (IRT®) systems and solutions provider that supports the life sciences industry. Since 2009, we have been working with a single vision in mind: to help sponsors and pharmaceutical companies achieve clinical trial success. Our solutions, realized through the proprietary PULSE® platform, have proven to...
-
Site Reliability Engineer
3 days ago
New Delhi, India Trantor Full time
Job Title: Site Reliability Engineer. Role: Contract (9 months, extendable). Exp: 5+ years. Loc: Bangalore (Hybrid). Notice: Immediate joiner only. Duties: Responsible for maintaining and scaling production services and servers across multiple data centers for complex and data-intensive cloud services. Improve scalability, service reliability, capacity, and...
-
Site Reliability Engineer
3 days ago
New Delhi, India SID Global Solutions Full time
Job Role: Site Reliability Engineer (SRE) – GCP. Experience: 3+ years. Location: Hyderabad. About SIDGS: SIDGS is a premium global systems integrator and global implementation partner of Google corporation, providing Digital Solutions & Services to Fortune 500 companies. Our Digital solutions span the following domains: User Experience, CMS, API Management,...
-
Site Reliability Engineer
24 hours ago
New Delhi, India IntraEdge Full time
Job Title: Site Reliability Engineer (SRE) – Production Support. Location: Bengaluru. Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in production support, DevOps practices, and cloud infrastructure management. The ideal candidate will be responsible for maintaining the reliability, performance, and scalability...
-
Site Reliability Engineering Manager
1 day ago
New Delhi, India Tata Consultancy Services Full time
Role: Manager, Site Reliability Engineering. Required Technical Skill Set: Manager, Site Reliability Engineering. Desired Experience Range: 12 - 18 yrs. Notice Period: Immediate to 90 days only. Location of Requirement: Bangalore. We are currently planning to do a virtual interview. Job Description: Describe what the person will do in the role - how he/she will...
-
Senior Site Reliability Engineer
1 week ago
New Delhi, India Tata Consultancy Services Full time
Dear Candidates, Greetings from TCS! TCS is looking for a Senior Site Reliability Engineer – AWS. Experience: 8-12 years. Location: Chennai. Must-have skills: - Design, implement, and maintain scalable, secure, and highly available infrastructure on AWS - Develop and improve CI/CD pipelines, Infrastructure as Code (IaC) using Terraform, Harness - Own and implement...
-
Site Reliability Engineer
1 day ago
New Delhi, India Sonata Software Full time
Role: Site Reliability Engineer (SRE) III – Data Engineering
Location: Hyderabad
Employment Type: Full Time
Experience: 7–12 years in site reliability, cloud-based data infrastructure, data pipeline observability, automation, and high-availability engineering within EdTech platforms (2U)
Primary Skills (Must-Have): AWS, CI/CD, Jenkins, IAAC,...
-
Site Reliability Engineer
3 weeks ago
Delhi, India Elgebra Full time
Hiring: Site Reliability Engineer – 7+ Years. Location: Bangalore / Chennai. Payroll: Elgebra. Client: Qincline. Joining: Immediate to 15 Days. Role Overview: We are looking for an experienced Site Reliability Engineer (SRE) with over 6 years of expertise to join our team. The ideal candidate will have strong technical skills, a problem-solving mindset, and...