SRE Devops Manager
2 weeks ago
We are looking for a Site Reliability Engineering (SRE) DevOps Manager.

Location: Bangalore / Hyderabad / Chennai / Noida / Pune / Visakhapatnam / Gurgaon
Shift timing: Regular
Can join: Immediate to 30 days

Interested candidates, please share your profile and the details below to:
Email ID:
Total experience:
Relevant experience:
Current CTC:
Expected CTC:
Notice period:
If serving notice period, last working day:

Job Summary

We are seeking an experienced Site Reliability Engineering (SRE) Manager to lead and evolve our cloud infrastructure, reliability practices, and automation strategy. This role blends hands-on technical leadership with strategic oversight to ensure scalable, secure, and reliable systems across AWS-based environments.

As an SRE Manager, you will guide a team of DevOps and SRE engineers to design, build, and operate cloud-native platforms leveraging Kubernetes (EKS), Terraform, and AWS DevOps tools. You will drive operational excellence through observability, automation, and AIOps, enhancing reliability, performance, and cost efficiency.

You will collaborate closely with development, product, and security teams to define SLOs, manage error budgets, and continuously improve infrastructure resilience and developer productivity.

Key Responsibilities

Leadership & Strategy
Lead, mentor, and grow a global team of Site Reliability and DevOps Engineers.
Define and drive the reliability roadmap, SLOs, and error budgets across services.
Establish best practices for infrastructure automation, observability, and incident response.
Partner with engineering leadership to shape long-term cloud, Kubernetes, and AIOps strategies.

Infrastructure & Automation
Design, implement, and manage AWS cloud infrastructure using Terraform (advanced modules, remote state management, custom providers).
Build and optimize CI/CD pipelines using AWS CodePipeline, CodeBuild, CodeDeploy, and CodeCommit.
Manage EKS clusters with a focus on scalability, reliability, and cost efficiency, leveraging Helm, ingress controllers, and service mesh (e.g., Istio).
Implement robust security and compliance practices (IAM policies, network segmentation, secrets management).
Automate environment provisioning for dev, staging, and production using Infrastructure as Code (IaC).

Monitoring, Observability & Reliability
Lead observability initiatives using Prometheus, Grafana, CloudWatch, and OpenSearch/ELK.
Improve system visibility and response times by enhancing monitoring, tracing, and alerting mechanisms.
Drive proactive incident management and root cause analysis (RCA) to prevent recurring issues.
Apply chaos engineering principles and reliability testing to ensure resilience under load.

AIOps & Advanced Operations
Integrate AIOps tools to proactively detect, diagnose, and remediate operational issues.
Design and manage scalable deployment strategies for AI/LLM workloads (e.g., Llama, Claude, Cohere).
Monitor model performance and reliability across hybrid Kubernetes and managed AI environments.
Stay current with MLOps and Generative AI infrastructure trends, applying them to production workloads.
Manage 24/7 operations using appropriate alerting tools and a follow-the-sun model.

Cost Optimization & Governance
Analyze and optimize cloud costs through instance right-sizing, auto-scaling, and spot usage.
Implement cost-aware architecture decisions and monitor monthly spend for alignment with budgets.
Establish cloud governance frameworks to enhance cost visibility and accountability across teams.

Collaboration & Process
Partner with developers to streamline deployment workflows and improve developer experience.
Maintain high-quality documentation, runbooks, and postmortem reviews.
Foster a culture of reliability, automation, and continuous improvement across teams.
-
SRE (Devops)
6 days ago
Delhi, India Cozzera Full time
Job Description: Senior SRE / DevOps Engineer. Experience: 6+ Years. Location: Remote. Shift: Night Shift (US East & West Coast Support). Key Skills (Must-Have): Strong hands-on experience in AWS, Kubernetes, and Terraform. Excellent communication & collaboration skills. Responsibilities: Manage and support production infrastructure during night shifts. Ensure system reliability,...
-
DevOps Engineer/SRE
2 weeks ago
Delhi, India SuprSend Full time
About Us: SuprSend is reinventing notification infrastructure for global businesses, powering seamless, reliable distribution of millions of events across channels. Join us as we scale further and raise the bar on uptime, cost-efficiency and automation. Role Snapshot: We’re seeking an experienced DevOps / SRE engineer with deep Kubernetes and cloud-native...
-
Senior DevOps Engineer
2 weeks ago
Delhi, India MightyBot Full time
Title: Senior DevOps Engineer (SRE). Location: Remote. Join our team as a Senior DevOps Engineer, where we're focused on graduating AI from interesting demos to indispensable products. You will build and maintain the robust, scalable infrastructure that makes this possible, ensuring our platform is reliable enough to be trusted with critical business decisions....
-
SRE Devops Manager
2 weeks ago
New Delhi, India Infinite Computer Solutions Full time
We are looking for a Site Reliability Engineering (SRE) DevOps Manager. Location: Bangalore / Hyderabad / Chennai / Noida / Pune / Visakhapatnam / Gurgaon. Shift timing: Regular. Can join: Immediate to 30 days. Interested candidates, please share your profile and the details below to Email ID: Shanmukh.Varma@infinite.com. Total experience: Relevant Experience: Current...
-
SRE Devops Manager
4 days ago
New Delhi, India Infinite Computer Solutions Full time
We are looking for a Site Reliability Engineering (SRE) DevOps Manager. Location: Bangalore / Hyderabad / Chennai / Noida / Pune / Visakhapatnam / Gurgaon. Shift timing: Regular. Can join: Immediate to 30 days. Interested candidates, please share your profile and the details below to Email ID: Total experience: Relevant Experience: Current CTC: Expected CTC: Notice...
-
SRE Devops Manager
1 week ago
New Delhi, India Infinite Computer Solutions Full time
We are looking for a Site Reliability Engineering (SRE) DevOps Manager. Location: Bangalore / Hyderabad / Chennai / Noida / Pune / Visakhapatnam / Gurgaon. Shift timing: Regular. Can join: Immediate to 30 days. Interested candidates, please share your profile and the details below to Email ID: Shanmukh.Varma@infinite.com. Total experience: Relevant Experience: Current CTC:...
-
DevOps Engineer/SRE
2 weeks ago
New Delhi, India SuprSend Full time
About Us: SuprSend is reinventing notification infrastructure for global businesses, powering seamless, reliable distribution of millions of events across channels. Join us as we scale further and raise the bar on uptime, cost-efficiency and automation. Role Snapshot: We’re seeking an experienced DevOps / SRE engineer with deep Kubernetes and cloud-native...
-
Site Reliability Engineer
1 week ago
Delhi, India Stoopa AI Full time
Company Description: Stoopa.AI is building next-generation AI-driven platforms for ports and is focused on reliability, speed, and intelligent automation. As we scale our next-generation smart port product Turi, we are hiring our first dedicated SRE/DevOps Engineer to build, optimize, and own our reliability engineering function from the ground up. This is a...