Sr. Site Reliability Engineer
5 hours ago
P-1485
At Databricks, we are passionate about empowering data teams to tackle the world's most challenging problems — from bringing the next mode of transportation to reality to accelerating the development of medical breakthroughs. We achieve this by building and operating the world's best data and AI infrastructure platform, enabling our customers to leverage deep data insights and enhance their business. Founded by engineers — and customer-obsessed — we leap at every opportunity to tackle technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.
As a Sr. SRE, you will use your technical experience and resourcefulness to lead urgent customer situations to resolution. You will be responsible for managing frequent, high-quality updates to all internal and external stakeholders. You will advocate with engineering and leadership on behalf of your customers and will ensure that escalations are handled with the appropriate level of urgency from stakeholders.
This role combines operational leadership, technical systems knowledge, and exceptional communication skills. You will be at the intersection of engineering depth and operational clarity, ensuring that every major incident is managed with precision, transparency, and continuous improvement.
The Impact
- Drive critical customer escalations or widespread outages to conclusion and resolution. Escalate to on-call resources in support and engineering and establish checkpoint calls and action items to ensure that progress is made and status updates are delivered on time.
- Demonstrate cross-functional leadership while establishing ownership of escalations and outages.
- Compile and deliver frequent, high-quality communication to internal and external stakeholders, including executive staff. Candidate should be comfortable creating concise and effective messaging tailored to a technical or executive audience with minimal assistance from others.
- Commence and lead war rooms while establishing other temporary communication channels as warranted for the duration of an outage.
- Manage several incidents and/or projects at once.
- Be the leader who derives product and process improvements from every incident and submits necessary feedback for improvements.
- Participate in on-call rotations.
What are we looking for?
- Minimum 5 years of experience in customer support, support escalation and incident management is required.
- Minimum 5 years of experience in designing, testing, or maintaining Python/Java/Scala-based applications in typical project delivery and consulting environments is required.
- Prior incident management or escalation management experience is required.
- Hands-on experience developing industry use cases at production scale in two or more of the following areas: Big Data, Hadoop, Spark, Machine Learning, Artificial Intelligence, Streaming, Kafka, Data Science, Elasticsearch.
- Hands-on experience in the performance tuning/troubleshooting of Spark-based applications at a production scale.
- Working knowledge of data lakes, preferably including SCD (slowly changing dimension) type use cases at production scale.
- Hands-on experience with SQL-based databases and data warehousing/ETL technologies such as Informatica, DataStage, Oracle, Teradata, SQL Server, and MySQL.
- Linux/Unix administration skills and hands-on experience with AWS, Azure, or GCP are required.
- Proven, real-world experience with JVM and memory management techniques such as garbage collection and heap/thread dump analysis is required.
- Excellent analytical and troubleshooting skills are required. Candidate should be able to demonstrate technical excellence by applying engineering principles to solve complex problems.
- Work with a high degree of integrity, accountability, attention to detail, execution and planning expertise.
- Excellent contextual interpretation and writing skills, with the ability to summarize and communicate effectively to technical and business audiences, are required.
- Strong ability to make timely decisions from both business and technical perspectives.
- Enjoy working under pressure in a fast-paced, high-performance environment.
- Candidate must demonstrate resilience and the capacity to maintain a constructive attitude during high-pressure situations.
- Ability to work holidays and weekends as part of an on-call rotation is required.
- Bachelor's degree in Computer Science or a related field is required.
About Databricks
Databricks is the data and AI company. More than 10,000 organizations worldwide — including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake, and MLflow. To learn more, follow Databricks on Twitter, LinkedIn, and Facebook.
Benefits
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit
Our Commitment to Diversity and Inclusion
At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.
Compliance
If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.