SRE with AIOps and Dynatrace
4 weeks ago
Primary Location: Bangalore, Karnataka, India
Job Type: Experienced
Primary Skills: CI/CD, AWS - SRE, SRE
Years of Experience: 10
Knowledge & Experience:
Minimum of 6 years of relevant work experience in critical production environments
Experience in enabling observability within applications to extract appropriate telemetry into suitable back ends like Dynatrace
Hands-on experience curating Service Level Objectives, defining Error Budgets and refining the change management lifecycle to accommodate them
Knowledge of and experience with CI/CD pipelines and deployment patterns such as canary releases
Experience analyzing application telemetry and enabling AIOps using Dynatrace Davis or an alternative product, in combination with any other tools for orchestration
Has experience defining an SRE capability charter and roadmap for all dependent teams
Has experience successfully running and providing leadership to DevOps or SRE teams (preferred)
Working knowledge of SQL, including troubleshooting by writing queries, is key
Knowledge of containerized solutions and orchestration tools like Kubernetes
Core Capabilities:
Understand and demonstrate application of SRE principles, particularly toil reduction, blameless postmortems, monitoring distributed systems and release engineering
In-depth knowledge of an observability product such as Dynatrace, Splunk or the ELK stack, covering synthetic monitoring, RUM and APM
Ability to instrument microservices applications via OpenTelemetry to extract traces is beneficial
Experience administering applications and infrastructure services in hyperscaler environments such as AWS, Azure or GCP is key
Hands-on experience writing Python scripts and Ansible templates for application deployment automation or other automation tasks is important
Ability to diagnose and debug systems at the code level (Java preferred) is beneficial
Qualification:
ITIL 4 certification is mandatory. Practitioner or Intermediate level certifications are preferred
SRE Foundation certification via PeopleCert or DevOps Institute is beneficial
AWS Solutions Architect Associate qualification or alternative from another Cloud Service Provider is preferred
Role & Responsibilities:
Formulate the detailed SRE rollout plan and execute a transformation roadmap
Continuously seek to uplift the maturity of SRE implementation and improve SLO, MTTR, MTTD as well as any other relevant KPIs identified
Engage in on-call and critical operations support activities while leading blameless postmortems
Liaise directly with customers, remotely and face-to-face, for stakeholder management
Formulate a plan to eliminate toil by lowering incident volume, eliminating noise from alerts, automating manual processes, and converting workarounds into system features
Work with Development, QA and other squads to design, build and roll out reliability features into the applications being delivered
Lead a team of SREs deployed on the ground while remaining hands-on
-
Senior Principal Consultant
4 weeks ago
Bangalore, India Genpact Full time
Senior Principal Consultant - RunOps (SRE, AIOPS, GenAI) Architect
Location: Bangalore/Hyderabad/Pune
We are looking for candidates with deep knowledge and experience with IT infrastructure (on-premises, cloud, hybrid, and multi-cloud environments) and operations, and more importantly, how to optimize efficiency, effectiveness, service quality, and agility...
-
AIOps Enrichment
4 weeks ago
Bangalore, India Capgemini Full time
Job Description: Implement and manage AIOps solutions for enriching and enhancing monitoring data. Develop and maintain scripts and tools to automate ongoing configuration tasks. Collaborate with cross-functional teams to gather requirements and ensure alignment with business needs. Utilize machine learning and AI techniques to analyze...
-
SRE / Reliability Engineer (Lead)
2 days ago
Bangalore, India Infogain Full time
SRE / Reliability Engineer (Lead) with skills ITSM Principles, AWS - EKS, AWS - CloudFormation, SRE Architecture, AWS-Apps, GCP-Apps, AWS-Infra, SRE Engineering, AWS DBA for location Any Infogain Base Location (Noida, Gurugram, Bangalore, Mumbai, Pune) Posted on: May 14, ROLES &...
-
SRE Engineer
1 week ago
Bangalore, India Australia and New Zealand Banking Group Limited (ANZ) Full time
SRE Engineer
Req ID:
Department: Tech Pacific
Division: Technology
Location: Bengaluru
About the role
At ANZ our purpose is to shape a world where people and communities thrive. We’re making this happen by improving our customers’ financial wellbeing so they can achieve incredible things – be it buying their home, building...
-
Senior Engineer
3 weeks ago
Bangalore, Karnataka, India CME India Technology And Support Services Pvt Ltd Full time
Responsibilities:
- Automate/IaC-first mindset.
- Develop, design and maintain automation tools and orchestration.
- Build and improve tools for users to understand and analyze the health and operations of large-scale data-intensive systems.
- Monitor the performance of our infrastructure and develop automated solutions to address any issues.
- Provide...
-
Senior Engineer
4 weeks ago
Bangalore, India CME India Technology And Support Services Pvt Ltd Full time
Responsibilities:
- Automate/IaC-first mindset.
- Develop, design and maintain automation tools and orchestration.
- Build and improve tools for users to understand and analyze the health and operations of large-scale data-intensive systems.
- Monitor the performance of our infrastructure and develop automated solutions to address any issues.
- Provide...
-
Cochin, Kochi, Trivandrum, Thiruvananthapuram, Bangalore, India litmus7 Full time
Job Description:
Job role: SRE Lead/Manager. Prior experience in supporting Java-based e-commerce applications is MANDATORY. Align and implement SRE principles and best practices based on ongoing issues. Work closely with Client Partners, Account Managers, and SRE Architect to prioritize the workload. Should be able to lead a 30+ members SRE team with...
-
Site Reliability Engineer
14 hours ago
Bangalore, India OptOut Full time
Responsibilities:
- Lead the SRE & Observability teams and execute on the vision of providing an enterprise based common Observability Platform leveraged by a global Engineering, Product and Cloud organization
- Help drive change across the company, working towards a common methodology based around Site Reliability Engineering and Solid System Engineering...
-
DevOps Manager
2 weeks ago
Bangalore, India Randstad India Full time
Lead DevOps Engineer Service Reliability (SRE) will work with key organizational leaders and product owners to identify opportunities and drive technical vision of this SRE team. They will be responsible for technical roadmap and strategies with strong focus to drive operational excellence, improve performance and efficiency of both our application workload...
-
DevOps Manager
4 weeks ago
Bangalore, India Randstad India Full time
Lead DevOps Engineer Service Reliability (SRE) will work with key organizational leaders and product owners to identify opportunities and drive technical vision of this SRE team. They will be responsible for technical roadmap and strategies with strong focus to drive operational excellence, improve performance and efficiency of both our application workload...
-
DevOps Manager
3 weeks ago
Bangalore, Karnataka, India Randstad India Full time
Lead DevOps Engineer Service Reliability (SRE) will work with key organizational leaders and product owners to identify opportunities and drive technical vision of this SRE team. They will be responsible for technical roadmap and strategies with strong focus to drive operational excellence, improve performance and efficiency of both our application workload...
-
App Support
4 weeks ago
Bangalore, India Jobs for Humanity Full time
Job Description
Full time
Experienced (relevant combo of work and education)
Bachelor of Computer Engineering
0% Travel
App Support (Unix, SQL, Openshift Microsoft Services)
Are you curious, motivated, and forward-thinking? At FIS you’ll have the opportunity to work on some of the most challenging and relevant issues in financial services and technology. Our...
-
Site Reliability Engineer
2 days ago
Bangalore, India OptOut Full time
Responsibilities:
- Lead the SRE & Observability teams and execute on the vision of providing an enterprise based common Observability Platform leveraged by a global Engineering, Product and Cloud organization
- Help drive change across the company, working towards a common methodology based around Site Reliability Engineering and Solid System Engineering...
-
Site Reliability Engineer
3 days ago
Bangalore, Karnataka, India OptOut Full time
Responsibilities:
- Lead the SRE & Observability teams and execute on the vision of providing an enterprise based common Observability Platform leveraged by a global Engineering, Product and Cloud organization
- Help drive change across the company, working towards a common methodology based around Site Reliability Engineering and Solid System Engineering...
-
Site Reliability Engineer-Cloud Infrastructure
4 weeks ago
Bangalore, India ByteDance Full time
About Us
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, ByteDance has made it easier and more fun for people to connect with, consume, and create content.
Why Join Us
Creation is the core of ByteDance's purpose. Our products are built to help imagination thrive. This is doubly true of...
-
App Support
4 days ago
Bangalore, India Jobs for Humanity Full time
Job Description
Position Type: Full time
Type Of Hire: Experienced (relevant combo of work and education)
Education Desired: Bachelor of Computer Engineering
Travel Percentage: 0%
App Support (Unix, SQL, Openshift Microsoft Services)
Are you curious, motivated, and forward-thinking? At FIS you’ll have the opportunity to work on some of the...
-
Intraedge Technologies
5 days ago
Bangalore, India Intraedge Technologies Ltd. Full time
About the job: As a Software engineer for performance, Resiliency and Scalability on this team, you will be working on complex systems running on-prem, relational databases and large and complex datasets.
- You will focus on optimizing overall product performance and reliability.
- You will focus on defining and enhancing an automate-able performance...
-
Intraedge Technologies
5 days ago
Bangalore, India Intraedge Technologies Ltd. Full time
About the job: As a Software engineer for performance, Resiliency and Scalability on this team, you will be working on complex systems running on-prem, relational databases and large and complex datasets.
- You will focus on optimizing overall product performance and reliability.
- You will focus on defining and enhancing an automate-able performance...
-
Intraedge Technologies
5 days ago
Bangalore, Karnataka, India Intraedge Technologies Ltd. Full time
About the job: As a Software engineer for performance, Resiliency and Scalability on this team, you will be working on complex systems running on-prem, relational databases and large and complex datasets.
- You will focus on optimizing overall product performance and reliability.
- You will focus on defining and enhancing an automate-able performance...
-
Security Reliability Manager
2 weeks ago
Bangalore, India CME India Technology And Support Services Pvt Ltd Full time
Job Description: Manager will help to manage, create, implement, and subsequently mature and support Cyber Defense solutions for CME's Network and Systems, with a focus on Cloud computing and Automation, within Cyber Defense Engineering - Global Information Security. This position will be responsible for the management of a team of:
- Cyber...