Site Reliability Engineering

2 weeks ago


Bangalore Velankani Tech Park, India Deutsche Bank Full time US$ 1,50,000 - US$ 2,00,000 per year
Job Description:

Job Title: Site Reliability Engineering - AVP

Location: Bangalore, India

Corporate Title: AVP

Role Description

Technology/Service is responsible for delivering the business vision and strategy at a global level, focusing on achieving consistent operational excellence and client/user satisfaction through industrialisation, price/value optionality, and leveraging increased automation and the use of technology. Work includes:

  • Creating a digital vision and strategy for the bank, and ensuring its integration with the organization's overall strategic plans
  • Identifying opportunities for differentiating the bank's digital portfolio, including capabilities and solutions
  • Acting as a change agent in leading the organizational changes that are required to create and maintain the necessary digital portfolio
  • Applying extensive knowledge and understanding of the evolving digital market, and acting as a thought leader on emerging digital trends related to technology and business

What we'll offer you

As part of our flexible scheme, here are just some of the benefits that you'll enjoy:

  • Best in class leave policy
  • Gender-neutral parental leave
  • 100% reimbursement under childcare assistance benefit (gender neutral)
  • Sponsorship for industry-relevant certifications and education
  • Employee Assistance Program for you and your family members
  • Comprehensive Hospitalization Insurance for you and your dependents
  • Accident and Term life Insurance
  • Complimentary health screening for employees aged 35 and above

Your key responsibilities

  • As Senior Site Reliability Engineer you
    • Orchestrate and contribute to SRE activities across API Platforms and Integration services
    • Introduce engineering disciplines that combine software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems
    • Implement the core of DevOps with specific principles and practices, focusing on "what" and "how" to improve reliability
    • Establish and support capacity planning procedures and keep a close eye on SLIs and SLOs, both for production readiness and in the live environment
    • Coordinate with the rest of the division and the teams working on different layers of the application and infrastructure, with full commitment to collaborative problem solving
  • For Infrastructure & Service Management you
    • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
    • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
    • Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity
    • Develop and enforce policies, standards and guidelines for site reliability
    • Automate application and infrastructure deployment activities to production environments
  • For Incident & Problem Management you
    • Perform troubleshooting and emergency response
    • Investigate root causes and suggest solutions
    • Increase productivity by leading blameless post-mortems
  • For Application Maintenance you
    • Collaboratively work with Product Owners and Engineers to run reliable services
    • Configure and maintain applications and monitoring
    • Identify business objects for monitoring
    • Track system performance and capacity, and use your experience to create effective strategies for maintaining and improving system performance and availability
  • For Operational Continuous Improvement you
    • Identify issues and optimization potential and introduce related user stories
    • Provide automation know-how to reduce the risk of bad changes
    • Identify, design, develop, and deploy tools and processes to monitor, maintain, and report site performance and availability
  • For Service Onboarding you
    • Support your Squad and your Chapter population in onboarding & promotions

Your skills and experiences

  • Expert hands-on experience with on-premises environments
  • Expert hands-on experience with cloud ecosystems run on Google Cloud
  • Expert hands-on experience with Docker / Kubernetes operations with GKE or similar technology
  • Expert experience with automated infrastructure provisioning based on Terraform/Terragrunt, Terraform Enterprise, Ansible
  • Advanced hands-on experience with Continuous Integration / Continuous Deployment (GitHub) and patterns for CI/CD pipelines
  • Advanced hands-on experience with monitoring tools like Prometheus, Grafana, Kibana and alerting tools like Opsgenie, New Relic, Datadog, Splunk, Google Cloud operations suite (Stackdriver)
  • Very good knowledge of security capabilities (TLS, OAuth2, KMS, Vault, Admission Controllers, Let's Encrypt or similar technologies)
  • Very good understanding of microservice architectures and experience with API management using Apigee or WSO2
  • Experience in software development in at least one language (Java, JavaScript, Python, Go)
  • Good knowledge of Software Development Life Cycle processes and related tools such as
    • TeamCity, Bitbucket, Artifactory
    • SonarQube, Veracode, Crucible
    • Jira, Confluence, ServiceNow

How we'll support you

  • Training and development to help you excel in your career
  • Coaching and support from experts in your team
  • A culture of continuous learning to aid progression
  • A range of flexible benefits that you can tailor to suit your needs

About us and our teams

Please visit our company website for further information:

We at DWS are committed to creating a diverse and inclusive workplace, one that embraces dialogue and diverse views, and treats everyone fairly to drive a high-performance culture. The value we create for our clients and investors is based on our ability to bring together various perspectives from all over the world and from different backgrounds. It is our experience that teams perform better and deliver improved outcomes when they are able to incorporate a wide range of perspectives. We call this #ConnectingTheDots.


