Senior Site Reliability Engineer, Product

1 month ago


Bengaluru, India Sumo Logic Full time

Location

Remote from India.

Summary of role

Own availability, the most important product feature, by continually striving for sustained operational excellence of Sumo’s planet-scale observability and security products. Work alongside your global SRE team, executing projects in your product-area-specific reliability roadmap to optimize operations, increase efficiency in our use of cloud resources and our developers’ time, harden security posture, and increase our developers’ feature velocity. Work closely with multiple teams to optimize the operations of their microservices and improve the lives of the engineers within your product area engineering teams.

Responsibilities

  • Support the engineering teams within your product area by maintaining and executing a reliability roadmap of opportunities to improve reliability, maintainability, security, efficiency, and velocity, and help realize those opportunities.
  • Collaborate with development infrastructure, Global SRE, and your product area engineering teams to establish and continually refine your reliability roadmap.
  • Participate in defining, evolving, and managing SLOs for several teams within your product area.
  • Participate in on-call rotations within your product area to understand operations workload so you can continually work to improve the on-call experience and reduce operational workload for running microservices and related components.
  • Complete projects to optimize and tune the on-call experience for your engineering teams.
  • Continually improve the lifecycle of microservices and architectural components from inception and design, through deployment, operation, and refinement.
  • Write code and automation to reduce operational workload, increase efficiency, improve security posture, eliminate toil, and enable Sumo’s developers to deliver features more rapidly.
  • Work closely with the developer infrastructure teams to expedite adoption of development infrastructure tools that advance your reliability roadmap, identifying needs for your supported engineering teams and contributing back features and bug fixes when needed.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Facilitate blame-free root cause analysis meetings for incidents to learn and drive improvement.
  • Participate in and continually improve our global IRC (incident response coordination) for all products.
  • Drive root cause identification and issue resolution with the teams.
  • Work inside of a fast-paced iterative environment.

Required Qualifications and Skills

  • Cloud-native application development experience leveraging best practices and design patterns.
  • Strong debugging and troubleshooting skills across the entire technology stack.
  • Deep understanding of AWS Networking, Compute, Storage, and managed services.
  • Competency with modern CI/CD tooling like Kubernetes, Terraform, Ansible, and Jenkins.
  • Experience with full life cycle support of services, from creation to production support.
  • Versed in Infrastructure as Code practices using technologies like Terraform or CloudFormation.
  • Ability to author production-ready code in at least one of the following: Java, Scala, or Go.
  • Experience with Linux systems and at home on the command line.
  • Understand and apply modern approaches to cloud-native software security.
  • Experienced with agile frameworks, such as Scrum and Kanban, and how to operate within these frameworks to continually deliver value.
  • Flexible and willing to step into new roles and responsibilities.
  • Willingness to learn and use Sumo Logic products for solving reliability and security issues.
  • Bachelor’s or Master’s Degree in Computer Science, Electrical Engineering, or another scientific or technical discipline.
  • 6+ years of industry experience.

Desirable Skills

  • Experience using Sumo Logic products or other observability products for reliability and security
  • Experienced with planet scale product development
  • Running and operating SaaS products on AWS Cloud with expert level proficiency
  • Experience with streaming technologies like Kafka, Kafka Streams, or KSQL
  • Expert level experience in one or more of: Java, Go, Scala, or Python
  • Expert level experience in one or more of: Terraform, Jenkins, Kubernetes
  • Extensive experience running and tuning JVM workloads at scale

We have other current jobs related to this field that you can find below


  • Bengaluru, Karnataka, India Nilasu consulting Full time

    Job Title : Senior Site Reliability Engineer (SRE)Department : Cloud EngineeringJob Type : Full-timeJob Description:We are seeking a highly skilled Senior Site Reliability Engineer (SRE) with extensive experience in Cloud Engineering, particularly in AWS. The ideal candidate should have hands-on expertise in developing Cloud solutions using Terraform or...


  • Bengaluru, Karnataka, India Oracle Full time

    Title: Senior Site Reliability EngineeringJob Description :Building off our Cloud momentum, Oracle has formed a new organization Oracle Health Applications & Infrastructure. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare. This is a net new...


  • Bengaluru, India First American (India) Full time

    The Role: A SRE Manager is ultimately responsible for system reliability, developer productivity and reducing time to market by striving to reduce technical debt of the services your SRE team supports. We seek managers who are passionate about site reliability to influence and drive the strategic SRE mission. As a Site Reliability Engineering Manager...


  • Bengaluru, India Oracle Full time

    Title: Senior Site Reliability Engineering Job Description :  Building off our Cloud momentum, Oracle has formed a new organization - Oracle Health Applications & Infrastructure. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare. This...


  • Bengaluru, India Ultrabot Innovations Full time

    Position Overview :As a Senior Site Reliability Engineer with 5-8 years of experience, you will play a key role in ensuring the reliability, scalability, and performance of our systems and infrastructure. You will leverage your expertise in Site Reliability Engineering (SRE) to implement best practices and methodologies, effectively troubleshoot complex...


  • Bengaluru, Karnataka, India Squareroot Consulting Pvt Ltd. Full time

    Job Title : Senior Site Reliability Engineer (SRE)Location : Bangalore (Hybrid)Company Overview :We are Hiring for a dynamic and innovative FinTech company committed to delivering cutting-edge solutions to their clients. As part of our growth strategy, we are seeking a talented and experienced Hands-On Site Reliability Engineer (SRE) to join our...


  • Bengaluru, Karnataka, India Cisco Full time

    Team: Site Reliability Engineering : Core Cloud VerticalDuo Security, now a part of Cisco, is the leading provider of Trusted Access security and multi-factor authentication delivered through the cloud.Duo's mission is to make security simple for everyone. We were born from a hacker ethos and a desire to make the Internet a secure place. We empower people to...


  • Bengaluru, India Oracle Full time

    Title: Senior Database Site Reliability EngineerJob Description :Building off our Cloud momentum, Oracle has formed a new organization - Oracle Health Applications & Infrastructure. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare. This is a...


  • Bengaluru, Karnataka, India CareerXperts Consulting Full time

    We are seeking a passionate Senior Site Reliability Engineer (SRE) to join our growing team. You will play a critical role in building, maintaining, and automating the infrastructure that supports our cloud-native platform. You will collaborate closely with development and operations teams to ensure high availability, scalability, and performance of our...


  • Bengaluru, Karnataka, India Oracle Full time

    Title: Senior Database Site Reliability EngineerJob Description :Building off our Cloud momentum, Oracle has formed a new organization Oracle Health Applications & Infrastructure. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare. This is a net...


  • Bengaluru, Karnataka, India JFrog Full time

    Senior Site Reliability Engineer Bangalore, India | Production Share position At JFrog, we're reinventing DevOps to help the world's greatest companies innovate -- and we want you along for the ride. This is a special place with a unique combination of brilliance, spirit and just all-around great people. Here, if you're willing to do more, your career can...


  • Bengaluru, India JFrog Full time

    Senior Site Reliability Engineer Bangalore, India | Production Share position At JFrog, we’re reinventing DevOps to help the world’s greatest companies innovate -- and we want you along for the ride. This is a special place with a unique combination of brilliance, spirit and just all-around great people. Here, if you’re willing to do more, your career...


  • Bengaluru, India First American (India) Full time

    The Role:A SRE Manager is ultimately responsible for system reliability, developer productivity and reducing time to market by striving to reduce technical debt of the services your SRE team supports. We seek managers who are passionate about site reliability to influence and drive the strategic SRE mission.As a Site Reliability Engineering Manager working...


  • Bengaluru, India Oracle Full time

    Title: Senior Database Site Reliability Engineer Job Description :  Building off our Cloud momentum, Oracle has formed a new organization - Oracle Health Applications & Infrastructure. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare....


  • Bengaluru, Karnataka, India Oracle Full time

    Title: Senior Database Site Reliability Engineer Job Description : Building off our Cloud momentum, Oracle has formed a new organization - Oracle Health Applications & Infrastructure. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare....


  • Bengaluru, India Ensono Full time

    About Role Ensono is continuing its growth and building a cloud-native managed service offering for our clients. We are looking for energetic and skilled remote Site Reliability Engineers to join us on this exciting new journey. As a Site Reliability Engineer, you and your team will be responsible for between four and ten of Ensono cloud-native managed...