Senior Engineering Manager, SRE

4 weeks ago


Noida, India Sumo Logic Full time

Want to lead a global team responsible for the most important product features: availability, reliability, and security? Sumo’s SRE program focuses on continual, data-driven evolution and improvement of the reliability, security, and efficiency of our global-scale technological presence. We are looking for a great leader with a passion for site reliability, continuous technology improvement, and reducing the operational workload of our own engineers, as well as that of our customers who leverage our products for their own monitoring and reliability.

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Sumo’s services have reliability and uptime appropriate to users' needs, as well as the ability to quickly and continuously deliver value to our customers.

Responsibilities

Reliability Program:

Drive the program that maintains excellent uptime numbers for our services. Manage error budgets and associated policies for key product SLOs. Promote blameless post-mortem culture combined with developer operational accountability. Continuously reduce operational workload for engineers by means of infrastructure improvements and automation.

Cost Efficiency Program:

Carry out projects that actively reduce our AWS spend. Manage AWS resource reservations for our whole infrastructure. Observe our current spend on cloud resources and improve our cost monitoring ecosystem.

Application Security Program:

Help product teams develop secure applications for the Sumo Logic platform. Integrate and implement solutions that improve Sumo Logic’s security posture. Lead security reviews and penetration tests at the design and implementation stages. Partner with the Security Operations Center (SOC) and Compliance team on our security and compliance posture, vulnerability management, and threat modeling of our tech stack. Educate product teams on secure development best practices and Quality Engineering teams on continuous improvement of security testing.

Team Leadership:

Lead and grow a global team of SREs adept at building extremely high-volume, fault-tolerant, efficient, and scalable backend systems.

Technical Vision:

Partner with our technical leadership team to review technology choices on an ongoing basis, in anticipation of increased scale and ever-evolving technology, to meet the demands of a growing business. Leverage technical skills to analyze and improve the efficiency, scalability, and reliability of our backend systems.

Required Qualifications and Skills

B.S. in Computer Science or a related discipline (M.S. or Ph.D. is a plus). 8+ years of industry experience with a proven track record of ownership, delivery, and operational excellence. 3+ years in a management role. Experience being responsible for key SLOs of a cloud-based SaaS: availability, uptime, performance, and security. Experience in multi-threaded programming and distributed systems. Object-oriented programming experience, for example in Java, Scala, or Golang. Experience with high volumes of data using technologies such as Kafka, Kubernetes, and Docker. Agile software development experience (test-driven development, iterative and incremental development). Experience in big data and/or 24x7 commercial service is highly desirable. Hands-on experience with public cloud Infrastructure-as-a-Service and Platform-as-a-Service offerings such as Amazon Web Services and Google Cloud Platform.
