Sr Incident Manager
4 days ago
P-1485

At Databricks, we are passionate about empowering data teams to tackle the world’s most challenging problems, from bringing the next mode of transportation to reality to accelerating the development of medical breakthroughs. We achieve this by building and operating the world’s best data and AI infrastructure platform, enabling our customers to leverage deep data insights and enhance their business. Founded by engineers and customer-obsessed, we leap at every opportunity to tackle technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

As an Incident Manager, you will lead Databricks’ most critical production incidents while providing clear, accurate, and timely communication to customers, executives, and engineers. You’ll serve as both incident commander and reliability engineer, orchestrating multi-team responses, driving real-time status updates, and partnering with engineering to analyze and prevent failures. Your work will ensure Databricks maintains not only technical resilience but also customer and stakeholder confidence during high-impact events.

This role combines operational leadership, technical systems knowledge, and exceptional communication skills. You will be at the intersection of engineering depth and operational clarity, ensuring that every major incident is managed with precision, transparency, and continuous improvement.

The impact you will have here:
- Lead critical incidents: coordinate multi-disciplinary response efforts across Databricks’ cloud-based services to rapidly mitigate impact and restore operations.
- Drive technical root cause analysis and reliability improvements: collaborate with engineering teams to trace and document underlying causes across distributed systems, services, and data stores. Summarize key learnings, clearly communicate action items, and ensure that technical and procedural improvements are followed through.
- Own communications during incidents: deliver frequent, high-quality updates to internal stakeholders (executives, engineering leadership, support) and compose and publish customer-facing notifications that are accurate, timely, and empathetic.
- Mentor and train peers in both incident communication and technical response disciplines to raise the overall quality of Databricks’ incident response.

What are we looking for?
- 5+ years of experience in incident management, site reliability engineering, or production operations supporting large-scale, cloud-native systems.
- Proven ability to lead and coordinate high-severity incidents, including identifying impact, isolating fault domains, and managing multi-team response efforts.
- Strong understanding of cloud infrastructure (AWS, Azure, or GCP), including compute, networking, storage, and observability components.
- Deep expertise in log analysis and debugging: familiarity with log aggregation and search tools (e.g., Datadog, Elasticsearch, Splunk, Cloud Logging, or OpenTelemetry).
- Hands-on experience with observability systems: metrics, logging, and tracing frameworks (Prometheus, Grafana, OpenTelemetry, etc.).
- Proficiency in at least one major programming or scripting language (Python, Go, or Bash) for automating diagnostics, data collection, or analysis.
- Experience developing and maintaining incident playbooks and communication templates to ensure consistent, timely updates.
- Excellent contextual interpretation and writing skills, as well as the ability to effectively summarize and communicate to both technical and business audiences, are required.
- BS, Master's, or other advanced degree in Computer Science, Computer Engineering, or a related engineering field.
-
Incident Engineer
2 days ago
Bengaluru, Karnataka, India Augmented Database Pvt Ltd || Project Implementation || Staff Augmentation Full time ₹ 6,00,000 - ₹ 18,00,000 per year Senior Incident Manager position. Exp: 4 to 7 Yrs. Relevant: 3+ years of experience as Incident Manager/Sr. Incident Engineer. Location: Bangalore. Responsibilities: Responsible for monitoring all major metrics via various monitoring tools and following the major incident management process in restoring the major impacting incidents. Responding to a reported service...
-
Sr Incident Manager
2 weeks ago
Bengaluru, India Databricks Full time Job Description P-1485 At Databricks, we are passionate about empowering data teams to tackle the world's most challenging problems, from bringing the next mode of transportation to reality to accelerating the development of medical breakthroughs. We achieve this by building and operating the world's best data and AI infrastructure platform, enabling our...
-
Incident Manager
2 weeks ago
Bengaluru, India The Nielsen Company Full time At Nielsen, we believe that career growth is a partnership. You ultimately own, fuel and set the journey. By joining our team of nearly 14,000 associates, you will become part of a community that will help you to succeed. We champion you because when you succeed, we do too. Embark on a new initiative, explore a fresh approach, and take license to think big,...
-
Sr. Incident Responder
1 week ago
Bengaluru, India DocuSign Full time Company Overview Docusign brings agreements to life. Over 1.5 million customers and more than a billion people in over 180 countries use Docusign solutions to accelerate the process of doing business and simplify people’s lives. With intelligent agreement management, Docusign unleashes business-critical data that is trapped inside of documents. Until now,...
-
Sr. Security Incident Response Engineer
6 days ago
Bengaluru, India Autodesk Full time Job Requisition ID # 25WD93163 About the Role As a Sr. Security Incident Response Engineer, you will be an essential contributor in our incident response team. In this role, you will harness your strong Splunk expertise to monitor, analyze, and investigate security incidents across multiple data sources. Your role is pivotal in maintaining our security...
-
Incident Manager
4 weeks ago
Bengaluru, India SourceFuse Full time SourceFuse Technologies hiring Incident Manager 4-5 years of experience. Key Responsibilities: - Work closely with other IT and business teams to ensure seamless coordination during incidents. - Participate in on-call rotations and provide support during major incidents and outages. - Contribute to the development and maintenance of the incident management...
-
Incident Manager
3 weeks ago
Bengaluru, India SourceFuse Full time SourceFuse Technologies hiring Incident Manager 4-5 years of experience. Key Responsibilities: Work closely with other IT and business teams to ensure seamless coordination during incidents. Participate in on-call rotations and provide support during major incidents and outages. Contribute to the development and maintenance of the incident management...
-
Incident Manager
2 hours ago
Bengaluru, India SourceFuse Full time SourceFuse Technologies hiring Incident Manager 4-5 years of experience. Preferred - Female Key Responsibilities: - Work closely with other IT and business teams to ensure seamless coordination during incidents. - Participate in on-call rotations and provide support during major incidents and outages. - Contribute to the development and maintenance of the...
-
Incident Manager
3 days ago
Bengaluru, Karnataka, India New Groyp Talentoj Full time ₹ 15,00,000 - ₹ 25,00,000 per year Roles and Responsibilities: Act as the primary point of contact for major incidents and escalations, ensuring rapid response and communication across technical and business teams. Lead and coordinate incident resolution efforts involving multiple support teams and stakeholders to restore service as quickly as possible. Manage the end-to-end incident lifecycle...
-
Incident Manager
4 weeks ago
Bengaluru, India SourceFuse Full time SourceFuse Technologies hiring Incident Manager 4-5 years of experience. Key Responsibilities: Work closely with other IT and business teams to ensure seamless coordination during incidents. Participate in on-call rotations and provide support during major incidents and outages. Contribute to the development and maintenance of the incident management knowledge...