Site Reliability Engineer/Lead
3 days ago
Key Responsibilities:
- Own the availability, scalability, and performance of production systems and services.
- Design and manage distributed systems and microservices architectures at scale.
- Develop and implement incident response strategies, perform root cause analysis, and create actionable postmortems.
- Drive improvements in infrastructure automation, CI/CD pipelines, and deployment strategies.
- Collaborate with cross-functional teams including engineering, product, and QA to embed SRE best practices.
- Implement observability tools (e.g., Prometheus, Grafana, ELK, Datadog) to monitor system performance and proactively detect issues.
- Manage and optimize cloud infrastructure on AWS, including services such as EC2, ELB, Auto Scaling, S3, CloudFront, and CloudWatch.
- Utilize Infrastructure-as-Code tools such as Terraform, CloudFormation, or Pulumi for provisioning and maintaining infrastructure.
- Apply strong Linux, networking, load balancing, and security principles to ensure platform resilience.
- Leverage Docker and Kubernetes for container orchestration and scalable deployments.
- Build internal tools and automation using Python, Go, or Bash scripting.
- Support event-driven architectures leveraging Kafka or RabbitMQ for high-throughput, real-time systems.
- Proactively contribute to reliability-focused architecture and design discussions.
Required Skills & Experience:
- years of overall experience in backend engineering, infrastructure, DevOps, or SRE roles.
- Minimum 3 years of experience leading SRE, DevOps, or Infrastructure teams.
- Proven track record managing distributed systems and microservices at scale.
- Deep understanding of Linux systems, networking fundamentals, load balancing, and infrastructure security.
- Strong hands-on experience with AWS services: EC2, ELB, Auto Scaling, CloudFront, S3, and CloudWatch.
- Expert-level knowledge of Docker and Kubernetes in production environments.
- Proficient with Infrastructure-as-Code tools: Terraform, CloudFormation, or Pulumi.
- Hands-on experience with monitoring and observability tools: Prometheus, Grafana, ELK Stack, or Datadog.
- Strong scripting or programming skills in Python, Go, Bash, or similar languages.
- Familiarity with Kafka or RabbitMQ for event-driven and messaging architectures.
- Excellent incident management skills, including triage, RCA, and communication.
- Ability to thrive in fast-paced environments and adapt to changing priorities.
Preferred Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Experience in startup or high-growth environments.
- Contributions to open-source DevOps or SRE tools are a plus.
- Certifications in AWS, Kubernetes, or other cloud-native technologies are advantageous.
-
Site Reliability Engineering Lead
2 weeks ago
Mumbai, Maharashtra, India RELX Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Would you like to be part of a team that delivers high-quality software to our customers? Are you a visible champion with a 'can do' attitude and enthusiasm that inspires others? About the Business: LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our Business Services vertical, we offer a multitude of solutions focused on...
-
Senior Lead Site Reliability Engineer
5 days ago
Mumbai, Maharashtra, India JPMorganChase Full time US$ 1,20,000 - US$ 2,00,000 per year
Description: Guide and shape the future of technology at a globally recognized firm, driven by pride in ownership. As a Senior Manager of Site Reliability Engineering at JPMorgan Chase within the Finance technology team which is aligned to Corporate Technology Division, you are the non-functional requirement owner and champion for the applications in your...
-
Site Reliability Engineering Lead
2 weeks ago
Mumbai, Maharashtra, India RELX Group Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Would you like to be part of a team that delivers high-quality software to our customers? Are you a visible champion with a 'can do' attitude and enthusiasm that inspires others? About the Business: LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our Business Services vertical, we offer a multitude of solutions focused on...
-
Site Reliability Engineer
1 week ago
Mumbai, Maharashtra, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Senior Site Reliability Developer. OCI is Oracle's next-generation cloud platform, built for the most demanding enterprise workloads. We deliver high-performance computing, storage, networking, and platform services at global scale. The AI Platform, Services & Solutions organization within OCI is building the foundation for enterprise AI, spanning GPU...
-
Site Reliability Engineer
2 weeks ago
Mumbai, Maharashtra, India Oracle Financial Services Software Ltd Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Site Reliability Developer 3. Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale...
-
Site Reliability Engineer
1 day ago
Mumbai, Maharashtra, India Fynd Full time ₹ 8,00,000 - ₹ 24,00,000 per year
Fynd is India's largest omnichannel platform and a multi-platform tech company specializing in retail technology and products in AI, ML, big data, image editing, and the learning space. It provides a unified platform for businesses to seamlessly manage online and offline sales, store operations, inventory, and customer engagement. Serving over 2,300 brands,...
-
Site Reliability Engineer
1 week ago
Mumbai, Maharashtra, India Talent Leads HR Solutions Pvt Ltd Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Skill, Knowledge & Trainings:
- Site Reliability Engineer will be responsible to develop and implement services that improve Software development Life Cycle.
- Build automations which will help optimize software delivery.
- Improve reliability, quality, and time-to-market of our suite of software solutions.
- Will be responsible for availability,...
-
Site Reliability Engineer
1 week ago
Mumbai, Maharashtra, India Aanseacore Full time ₹ 12,00,000 - ₹ 24,00,000 per year
We are seeking experienced Site Reliability Engineers (SREs) and CDN Specialists with deep expertise in global performance optimization, cloud infrastructure reliability, and edge computing. The ideal candidate will have a strong technical foundation in network performance engineering, Azure cloud operations, and CDN/edge delivery systems, ensuring...
-
Site Reliability Engineer 2
2 weeks ago
Navi Mumbai, Maharashtra, India Uplers Full time ₹ 8,00,000 - ₹ 25,00,000 per year
Experience: 4+ years
Salary: Confidential
Shift: (GMT+05:30) Asia/Kolkata (IST)
Opportunity Type: Office (Mumbai)
Placement Type: Full time Permanent Position
(*Note: This is a requirement for one of Uplers' client--Gofynd)
What do you need for this opportunity?
Must have skills required: and AWS/Google Cloud and MongoDB/CI/CD/Grafana
Job description: Fynd is Indias...
-
Site Reliability Engineer
5 days ago
Mumbai, Maharashtra, India Avant-Garde Corporate Services Private Limited Full time ₹ 15,00,000 - ₹ 25,00,000 per year
We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join the IT Transformation team. The role involves driving automation, reliability, and performance optimization across mission-critical applications and infrastructure within a financial market ecosystem. The successful candidate will manage end-to-end deployment automation, CI/CD...