Site Reliability Engineering Manager

3 weeks ago


Chennai, India Centific Full time
Centific is a Seattle-based tech company pioneering the future of AI one breakthrough at a time. Learn how we’re transforming the world through safe and scalable AI and empowering businesses to unlock the full potential of their data.
SRE Director
Key Responsibilities:
Strategic Leadership & Vision:
Lead and manage the Software Release Management function for all Data and AI products.
Establish a centralized release management framework for AI and data products that scales with the growing product portfolio.
Form and lead a high-performing Site Reliability Engineering (SRE) team to ensure the operational stability and performance of all AI and data-driven applications post-release.
Collaborate with Product, Engineering and Operations teams to align release and SRE strategies with business objectives.
Release Planning & Coordination:
Oversee the full lifecycle of software and AI model releases, from planning and coordination to post-release evaluation.
Develop and maintain a detailed release calendar that aligns with the timelines and priorities of various product teams.
Coordinate release activities with multiple cross-functional teams, ensuring transparent communication of dependencies, risks, and milestones.
Ensure that all releases are integrated seamlessly into production, minimizing downtime and disruptions to end users.
Site Reliability Engineering (SRE) Team Formation:
Hire, build, and lead the SRE team responsible for maintaining the reliability, scalability, and performance of all Data and AI products in production.
Define the roles and responsibilities of the SRE team, ensuring clear alignment with the goals of product engineering and release management.
Develop and implement SRE best practices, including incident response, root cause analysis, and proactive performance monitoring.
Establish SLAs, SLOs, and SLIs (Service Level Agreements/Objectives/Indicators) to track and measure the reliability and performance of all services post-release.
Collaborate with DevOps to ensure that automated CI/CD pipelines integrate seamlessly with SRE processes and monitoring systems.
Process Optimization & Automation:
Lead the automation of software release processes, with an emphasis on CI/CD pipelines for AI models, data pipelines, and cloud-based AI products.
Develop infrastructure-as-code practices to improve the scalability and reliability of AI and data systems across production environments.
Introduce tools for version control, model governance, and monitoring for MLOps and AI model management in production.
Continuously improve operational procedures to reduce the number of incidents and optimize recovery time.
Risk & Quality Management:
Implement comprehensive quality assurance and validation processes to ensure that all AI models, data products, and software releases meet security, performance, and compliance requirements.
Proactively identify and mitigate risks related to releases, AI model performance, and operational stability in production.
Conduct post-release reviews and retrospectives to continuously improve both the release process and the reliability of products.
Collaboration & Stakeholder Management:
Serve as the central point of contact for release management and SRE-related matters, ensuring consistent communication between engineering, product teams, and key stakeholders.
Facilitate cross-functional collaboration to ensure that releases and operational reliability goals are met efficiently and effectively.
Provide regular updates on release progress, system reliability, and any potential risks to executives and product leadership.
Innovation & Continuous Improvement:
Stay up to date with the latest trends in SRE, DevOps, AI/ML, and cloud operations, incorporating new tools and practices to improve the overall reliability and release processes.
Drive the adoption of cutting-edge tools in MLOps, AI model deployment, and automated incident resolution to continuously optimize operations and model lifecycle management.
Foster a culture of continuous improvement by encouraging feedback loops and metrics-driven decision-making across both the release management and SRE teams.
Qualifications:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, AI/ML, or a related field.
10+ years of experience in software release management, with at least 3-5 years in SRE or DevOps environments, preferably in AI or data-driven applications.
Proven experience building and managing both release management and SRE teams in complex, multi-product environments.
Strong knowledge of AI/ML operations (MLOps), data pipeline management, and cloud-based AI product deployments.
Expertise in release management tools (Jenkins, GitLab, Git, Jira) and SRE tools such as Prometheus, Grafana, Datadog, or similar monitoring systems.
Experience with cloud platforms (AWS, GCP, Azure), containerization (Kubernetes, Docker), and infrastructure automation tools (Terraform, Ansible).
Excellent problem-solving, organizational, and leadership skills, with a strong track record of driving continuous improvement in both release and operational reliability processes.
Preferred Qualifications:
Experience deploying and maintaining large-scale AI/ML models in production environments, including monitoring, retraining, and operationalization.
Familiarity with ITIL, MLOps, or DevOps frameworks and best practices.
Knowledge of cloud-based services and tools specifically designed for AI/ML (e.g., AWS SageMaker, TensorFlow, PyTorch).
Demonstrated ability to manage incident response and root cause analysis in complex software ecosystems.
