Senior Site Reliability Engineer

4 days ago


Bangalore, India Nexthink Full time
Job Description

· Manage and maintain our Kubernetes clusters, including deployment, configuration, and upgrades. Ensure the stability and scalability of the clusters to accommodate increasing demands.

· Utilize your hands-on knowledge to automate routine tasks and streamline operations. Implement infrastructure as code (IaC) practices to facilitate rapid and reliable deployments, ensuring efficient resource provisioning and management.

· Participate in an on-call rotation, responding promptly to critical incidents and resolving them. Your commitment to running the cloud infrastructure will be crucial to maintaining high availability.

· Continuously assess the performance of our cloud infrastructure (AWS) and applications. Implement optimizations to enhance system efficiency and reduce response times.

· Stay current with best practices, tools, and market trends. Evaluate and recommend innovative solutions to be applied in the company.

· Participate in incident handling.

· Work closely with cloud architects and the team’s technical lead to validate new system architecture proposals to support new features in the cloud.

· Proactively identify potential issues and troubleshoot system anomalies. Collaborate with other teams to address incidents and implement preventive measures to reduce downtime.

· Set up and maintain comprehensive monitoring and alerting systems to detect anomalies, capacity constraints, and potential performance bottlenecks. Ensure timely responses to alerts and alarms.

· Maintain accurate and up-to-date documentation of processes, procedures, and troubleshooting guides to facilitate knowledge sharing and standardization.


Qualifications

· Bachelor’s degree in Computer Science, Computer Engineering, or a related field, or 6+ years of relevant work experience.

· Strong hands-on experience in managing Kubernetes clusters in a production environment.

· Excellent communication and teamwork skills.

· Knowledge of configuration automation (Ansible), CI/CD (Jenkins), and IaC (Terraform, Crossplane) for infrastructure management, plus proficiency in at least one scripting language (Bash, Python).

· Extensive experience in Linux container technologies (e.g., Docker, LXC)

· Good knowledge of Linux, mainly Debian and CentOS.

· Familiar with source code management solutions (GitHub, Bitbucket) and the Atlassian suite (JIRA, Confluence)

· Experience working in an on-call rotation environment and running operations.

· Proven problem-solving skills and the ability to troubleshoot complex technical issues.

· Deep commitment to maintaining high system reliability and availability.

· Extensive experience with AWS cloud computing platform and related services.

· Intense motivation and curiosity to learn new things and discover new technologies.

· Ability to work autonomously.

· Knowledge of monitoring systems (e.g., ELK, Prometheus, Kibana, New Relic, Datadog, PagerDuty).

· Speak professional-level English.



Additional Information

We are the pioneers and trailblazers of a global IT Market Category (DEX) that is shaping the future of how the world works, giving our customers’ IT Teams total digital visibility across their enterprise. Our innovative solutions integrate real-time analytics, automation, and employee feedback across all endpoints. This enables our IT teams to solve complex technical challenges, create ever more productive workplaces, and deliver happy, satisfied employees in the digital workplace.

With over 1000 employees across 5 continents, Nexthink operates as One Team, connecting, collaborating and innovating to continuously grow. We call our employees ‘Nexthinkers’ and our commitment to diversity, inclusion, and equity is second to none. We currently have over 75 nationalities working with us, from all cultures and backgrounds, speaking many different languages.

If you are looking for a change and like a nice atmosphere, lots of challenges, and having fun while working, this is a great opportunity for you. Check what we offer:

  • Permanent Contract and a competitive compensation package (Stock Options also included).
  • Health insurance through our partnership with ACKO, including OPD coverage for dental, vision, health check-ups, consultations, and pharmacy expenses.
  • Hybrid work model balancing office and remote work, with a structured approach for new hires to foster connections and onboarding.
  • Flexible hours and unlimited vacation (employees have unlimited paid time off on top of the 22 days of holidays we offer). Plus, company-paid bank holidays (12), sick days (10-30), bereavement leave (5), and 3 days per year for volunteering.
  • Free access to professional training platforms to explore your interests and enhance your skills.
  • Stay covered against accidents, bodily injuries, and disabilities with our personal accident insurance policy, providing assurance with coverage up to three times your annual CTC.
  • New mothers are entitled to up to 26 weeks of maternity leave, with the flexibility to use up to 8 weeks before the expected delivery and the remaining 18 weeks after. Birth fathers can take 4 weeks of paternity leave, while adoptive parents are eligible for 26 weeks of leave for mothers and 4 weeks for fathers.
  • Under the Payment of Gratuity Act, receive gratuity at the rate of 15 days of basic pay for every completed year of service, provided you've been employed by the company for a minimum of 5 years. Gratuity is payable at retirement or resignation based on your last drawn basic pay.
  • Bonuses for referring successful hires after three months of continuous employment.

Please note that not all the benefits listed above are available for temporary, contract, and internship roles. To ensure you have the most up-to-date information, we recommend checking with your Recruitment Partner.


