
Site Reliability Engineering Manager
2 weeks ago
Job Title: Site Reliability Engineering Manager
Job Summary:
We are seeking an experienced and dynamic Site Reliability Engineering (SRE) expert to oversee the reliability, scalability, and performance of our systems. As an SRE Manager, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies.
About The Role:
- Reliability & Performance: Ensure high availability and reliability of critical services by proactively identifying and resolving performance bottlenecks and system inefficiencies.
- Define and monitor Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to ensure business requirements are met.
- Incident Management & Response: Establish and improve incident management processes and on-call rotations to ensure seamless integration of monitoring and alerting systems across teams.
- Lead incident response and root cause analysis for high-priority outages and drive post-incident reviews to ensure actionable insights are implemented.
- Automation & Tooling: Develop automated solutions to reduce manual operational tasks and enhance system observability through metrics, logging, and distributed tracing tools.
- Optimize Continuous Integration/Continuous Deployment (CI/CD) pipelines for seamless deployments and automate infrastructure provisioning using infrastructure-as-code tools.
- Collaboration: Partner with software engineering teams to improve the reliability of applications and infrastructure and work closely with product/engineering teams to design scalable and robust systems.
- Ensure seamless integration of monitoring and alerting systems across teams and promote a culture of reliability and performance within the organization.
- Leadership & Team Building: Manage, mentor, and grow a team of SREs and promote SRE best practices throughout the organization.
- Drive performance reviews, skills development, and career progression for team members and foster a collaborative environment that encourages knowledge sharing and innovation.
- Capacity Planning & Cost Optimization: Perform capacity planning and implement autoscaling solutions to handle traffic spikes and optimize infrastructure and cloud costs while maintaining reliability and performance.
Required Skills & Qualifications:
- Technical Expertise:
- Experience with cloud platforms (AWS / Azure / GCP) and Kubernetes.
- Hands-on knowledge of infrastructure-as-code tools like Terraform/Helm/Ansible.
- Proficiency in Java.
- Expertise in distributed systems, databases, and load balancing.
- Monitoring & Observability: Proficient with tools like Prometheus/Grafana/Elastic APM.
- Understanding of metrics-driven approaches for system monitoring and alerting.
- Automation & CI/CD: Hands-on experience with CI/CD pipelines (e.g., Jenkins/Azure Pipelines).
- Skilled in automation frameworks and tools for infrastructure and application deployments.
- Incident Management: Proven track record in handling incidents, post-mortems, and implementing solutions to prevent recurrence.
- Leadership & Communication Skills:
- Strong people management and leadership skills with the ability to inspire and motivate teams.
- Excellent problem-solving and decision-making skills.
- Clear and concise communication with the ability to translate technical concepts for non-technical stakeholders.
Preferred Skills:
- Experience with database optimization/Kafka/messaging systems.
- Knowledge of autoscaling techniques.
- Previous experience in an SRE/DevOps/infrastructure engineering leadership role.
- Understanding of compliance and security best practices in distributed systems.
Benefits:
- Opportunity to Build & Scale Reliable Systems: Be a key driver in building and scaling reliable systems in a fast-paced environment.
- Cutting-Edge Technologies: Work with cutting-edge technologies and influence the evolution of the infrastructure.
- High-Impact Leadership Role: Lead a high-impact team and foster a culture of reliability and innovation.