Site Reliability Engineering Manager

2 days ago


Delhi, India Netcore Cloud Full time
Job Title: Manager of SRE (Site Reliability Engineering) & Application Support

Location: Thane
Reports to: Sr VP Delivery Head
Department: Engineering
Full-Time

About us:

At Netcore, innovation isn’t just a buzzword; it's the core of everything we do. As the pioneering force behind the first and leading AI/ML-powered Customer Engagement and Experience Platform (CEE), we're dedicated to revolutionizing how B2C brands interact with their customers. Our state-of-the-art SaaS products are designed to foster personalized engagement throughout the entire customer journey, creating remarkable digital experiences for businesses of all sizes.

Engineering at Netcore:

Dive into a world where your work directly impacts engagement, conversions, revenue, and customer retention. Our engineering team tackles complex challenges that come with scaling high-performance systems. We thrive on versatility and speed, employing advanced tech stacks such as Kafka, Storm, RabbitMQ, Celery, RedisQ, and GoLang, all hosted robustly on AWS and GCP clouds. At Netcore, you're not just solving technical problems—you're setting industry benchmarks.

Job Summary:

We are seeking a seasoned leader for our SRE & Application Support division, overseeing the reliability, scalability, and efficient operation of our martech tools built on open-source frameworks. This role will play a key part in maintaining the operational stability of our products on Netcore Cloud's infrastructure, ensuring 24/7 availability, and driving incident management.

The ideal candidate will combine strong leadership abilities with a deep understanding of site reliability, automation, performance monitoring, and application support, delivering world-class service to our clients and partners.

Key Responsibilities:

SRE Leadership & Strategy:
- Lead the Site Reliability Engineering (SRE) team to design and implement robust systems ensuring uptime, scalability, and security.
- Develop and maintain strategies for high availability, disaster recovery, and capacity planning of all martech tools.
- Advocate and apply the principles of automation to eliminate repetitive tasks and improve efficiency.
- Establish and refine Service Level Objectives (SLOs) and Service Level Agreements (SLAs) in collaboration with product and engineering teams.

Application Support:
- Oversee and lead the Application Support Team responsible for maintaining the health and performance of customer-facing applications built on the NetcoreCloud platform.
- Develop processes and debugging procedures to ensure quick resolution of technical issues, and serve as an escalation point for critical incidents.
- Ensure all incidents are triaged and handled efficiently, with proper root cause analysis and follow-up post-mortems for critical incidents.
- Manage the implementation of monitoring tools and log management systems to detect, alert, and respond to potential issues proactively.

Collaboration and Cross-Functional Leadership:
- Work closely with Sales, CSM, Customer Support, development, QA, and DevOps teams.
- Collaborate with stakeholders to drive a culture of continuous improvement by identifying and eliminating potential risks and issues in the system.
- Be involved in PI (Program Increment) planning to align with product roadmaps, making sure reliability is factored into new feature development.

Team Management & Development:
- Recruit, mentor, and manage the SRE and Application Support Team, fostering a high-performance and collaborative environment.
- Conduct regular performance reviews, provide feedback, and support professional development within the team.

Innovation and Open-Source Contribution:
- Lead initiatives to improve the open-source frameworks utilized in the martech stack, contributing to the open-source community as needed.
- Stay current with emerging technologies, tools, and best practices in site reliability, automation, and application support.

Requirements:

Experience:
- 8+ years of experience in SRE, DevOps, or Application Support roles, with at least 3 years in a leadership position.
- Proven track record of managing systems on open-source frameworks and cloud platforms such as NetcoreCloud or similar.
- Demonstrated expertise in incident management, post-mortem analysis, and improving mean time to recovery (MTTR).
- Strong experience in monitoring tools (Prometheus, Grafana, or similar), logging frameworks, and automation tools (Terraform, Ansible).

Technical Skills:
- Hands-on experience with Linux/Unix environments and cloud services (AWS, GCP, NetcoreCloud).
- Proficiency in scripting and coding (Python, PHP, Golang, Java, or similar languages) for automation purposes.
- Solid understanding of CI/CD pipelines, version control (Git), and alerting and application monitoring tools.

Leadership & Soft Skills:
- Proven leadership skills, with experience in team building, mentorship, and fostering a culture of accountability.
- Strong interpersonal and communication skills, with the ability to interface effectively with technical and non-technical stakeholders.
- Ability to manage multiple projects simultaneously, prioritize tasks, and work under pressure to meet deadlines.

Preferred Qualifications:
- Experience in the martech or digital marketing domain, or working with large-scale, customer-facing SaaS applications.
- Certification in SRE, DevOps, or cloud platforms (AWS, GCP).
- Good application debugging skills and understanding of product features.

Why Join Us?
- Be a part of an innovative and forward-thinking organization that values technology and continuous improvement.
- Work with cutting-edge open-source frameworks, cloud technologies, and a SaaS product.
- Leadership opportunities with a direct impact on our customers and product success.

Let's start a conversation and make magic happen together.

Website -

  • Delhi, India InstaService Inc Full time

    About Us: At InstaService, we are committed to delivering reliable, high-performance home services to our customers. As a fast-growing on-demand services platform, we are looking for a talented DevOps / Site Reliability Engineer (SRE) to join our dynamic team. This role is crucial to scaling and maintaining our infrastructure, ensuring our platform remains...


  • Delhi, Delhi, India Mrsool Full time

    About the Role: We are seeking an experienced Engineering Manager to lead and grow a team of Site Reliability Engineers (SREs) at Mrsool. As a key member of our engineering organization, you will be responsible for ensuring platform stability and reliability while actively contributing to strategy, prioritization, and mission setting for the SRE team. This role...


  • Delhi, India Tata Consultancy Services Full time

    Dear Candidate, Greetings from TCS! TCS is hiring for SRE; please find the JD below. Experience range: 5+ years. Location: Bangalore, Pune, Hyderabad, Chennai. Skills required: Site Reliability Engineer. Role & Responsibilities: Collaborates with cloud platform engineers and teams to design, develop, test, and implement availability, reliability,...


  • Delhi, India PeopleLogic Full time

    Job Responsibilities: Ensure the 24/7 operations and reliability of data services in our production. Collaborate with the data engineering development team to design, build, and maintain scalable, reliable, and secure data pipelines and systems. Develop and implement monitoring, alerting, and incident response strategies to proactively identify and resolve...


  • New Delhi, India AIVID.AI Full time

    Role Overview: We are seeking proactive and skilled Site Reliability Engineers (SREs) to manage client deployments, provide on-site support, and ensure the seamless functioning of our AI-based camera analytics systems. This hybrid role requires a mix of on-site visits and remote work. The selected candidates will operate from their respective regions: New...


  • Delhi, India Systal Technology Solutions Full time

    Site Reliability Engineer. Competitive Salary & Benefits. Bangalore. Systal is a global managed network and security service and transformation specialist. We consult, deploy, and integrate multi-vendor technologies which help enterprise businesses maximize the security and value of their complex IT infrastructure. Across our 24/7 Network and Security Operations...


  • Delhi, India Apex Systems Full time

    DevOps Engineer. Bengaluru & Chennai Remote. Looking for an immediate joiner.
      - Overall 5+ yrs of experience as Site Reliability Engineer / DevOps Engineer
      - Bachelor’s or master’s degree in software engineering, computer science, or a related technical field
      - Familiarity with Infrastructure as Code (e.g. Terraform & CloudFormation)
      - Has a focus in any...


  • New Delhi, India Mrsool Full time

    Who Are We❓ Welcome to the world of Mrsool! ✨ Where on-demand delivery meets unparalleled user needs to deliver anything you desire. As one of the largest delivery platforms in the Middle East and North Africa (MENA) region, Mrsool has captivated users with its unique and seamless experience, earning it the highest ratings among all major delivery...


  • Delhi, India Netradyne Full time

    About Us: At Netradyne, we are passionate about building software that addresses real-world problems using Edge, AI/ML, and cloud-native technologies. We rely on our Site Reliability Engineers (SREs) to empower our customers with a rich feature set, high availability, and performance to help them achieve their missions. As we scale our business and customer...


  • Delhi, India HCLTech Full time

    Urgent opening for a Cloud Senior Site Reliability Engineer role, Pan India location, with HCL Tech. Interested candidates, kindly share your updated resume to with the subject line "Cloud Senior Site Reliability Engineer Role_ your name & preferred location". Job Description: Ability to learn SRE practices across Red Hat OpenShift, Google Cloud or...