Site Reliability Engineer

1 week ago


India Hydrolix Full time

About the job

At Hydrolix, we are revolutionizing the world of data management and analytics with our innovative cloud data platform, purpose-built for petabyte-scale datasets. Our mission is to help organizations drastically reduce data costs while increasing their data retention.

We are looking for a Site Reliability Engineer (SRE) with 8 to 10+ years of experience to join our dynamic Services team. In this role, you will contribute to the reliability and scalability of our cutting-edge platform, ensuring exceptional solutions tailored to our customers' unique needs. This is a highly technical, hands-on role that requires deep expertise in system reliability and automation.

Key Responsibilities

  • Infrastructure Reliability: Deploy, maintain, and ensure a highly reliable fleet of Kubernetes clusters and Hydrolix deployments across multiple cloud platforms.
  • Service Optimization: Design, implement, and maintain systems and processes to enhance the reliability, availability, and performance of our services.
  • CI/CD Management: Build and optimize CI/CD tools and processes to ensure efficient and reliable deployments.
  • Monitoring and Incident Response: Develop and manage monitoring, alerting, and incident response strategies to minimize downtime and enable rapid recovery.
  • Root Cause Analysis: Conduct comprehensive root cause analyses for system failures, implementing long-term preventive measures.
  • Automation and Efficiency: Automate repetitive tasks and optimize system performance to improve operational efficiency.
  • On-Call Support: Participate in covering weekday business hours and once-monthly weekend shifts.

Collaboration and Customer Engagement

  • Cross-Functional Teamwork: Work closely with software engineering, infrastructure, and product teams to integrate reliability practices into every stage of the development lifecycle.
  • Reliability Advocacy: Champion SRE best practices and foster a culture of operational excellence across the organization.
  • Global Team Collaboration: Collaborate with a distributed team of engineers worldwide to provide round-the-clock support.
  • Customer Support: Interface with customers to address and resolve reported incidents, ensuring a seamless user experience.

Qualifications and Skills

  • SRE Expertise: Proven experience as a Site Reliability Engineer or in a similar role, with a history of supporting complex distributed systems.
  • Observability Tools: Experience with monitoring and debugging tools like Prometheus, Vector, Grafana, Superset, or Kibana.
  • Cloud Platforms: Proficiency in at least one major cloud platform (AWS, GCP, Azure, or Linode).
  • UI Development Experience: Hands-on experience building internal tooling using modern frontend frameworks (e.g., React, Vue, or Angular), enabling improved visibility and operational workflows for engineering teams.
  • Database Knowledge: Experience with SQL databases; familiarity with PostgreSQL is a plus but not required.
  • Programming/Scripting Skills: Proficiency in Unix scripting and programming languages such as Python or Go.
  • Linux Expertise: Strong experience with Linux systems, including performance tuning and system-level troubleshooting.
  • Communication Skills: Excellent written and verbal communication skills, with the ability to convey technical concepts clearly to diverse audiences, including customers and cross-functional teams.

Hydrolix provides equal employment opportunities without regard to an applicant's race, sex, pregnancy, sexual orientation, gender identity or expression, genetic information, national origin, age, physical or mental disability, medical condition, religion, marital status or veteran status.

Applicants with disabilities may be entitled to reasonable accommodation under the terms of the Americans with Disabilities Act and certain state or local laws. A reasonable accommodation is a change in the way things are normally done which will ensure an equal employment opportunity without imposing undue hardship on Hydrolix. Please inform us if you need assistance completing any forms or to otherwise participate in the application process.



  • India Pagos Consultants Full time

    We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...


  • India Datum Technologies Group Full time

    Job Title: Site Reliability Engineer (SRE) – AWS Experience: 8+ years Location: Chennai / Mumbai Work Mode: Hybrid Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and...


  • India Insight Global Full time

    Company: Insight Global Duration: Approved for 1 year 📍 Location: Remote (India) 💼 Type: Contract with Insight Global Client 💰 Compensation: 14 LPA – 20 LPA 🕒 Working Hours: Normal IST hours 🚀 Start Date: Immediate (No notice period) About the Role Join our Site Reliability Engineering (SRE) team as a Fullstack Developer, focused on building...


  • India Insight Global Full time

    Company: Insight Global Duration: Approved for 1 year 📍 Location: Remote (India) 💼 Type: Contract with Insight Global Client 💰 Compensation: 14 LPA – 20 LPA 🕒 Working Hours: Normal IST hours 🚀 Start Date: Immediate (No notice period) About the Role Join our Site Reliability Engineering (SRE) team as a Fullstack Developer, focused on building and...


  • India InOrg Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    About VivaOps: VivaOps is a leading DevSecOps platform company specializing in GitLab, the comprehensive DevOps platform, to transform and secure software development processes. We help organizations streamline their DevSecOps journey by offering a complete range of GitLab services, from advisory to implementation and managed services, to accelerate...


  • India Jobgether Full time ₹ 10,00,000 - ₹ 12,00,000 per year

    This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Site Reliability Engineer (India 3rd Shift) in India. This role is ideal for an engineer who thrives in high-availability, mission-critical environments and enjoys ensuring systems operate reliably at scale. As a Site Reliability Engineer, you will work during...


  • India Akamai Full time ₹ 8,00,000 - ₹ 24,00,000 per year

    Do you like collaborating across teams to solve complex problems? Do you enjoy solving large-scale distributed content delivery challenges? Join our highly skilled Compute Site Reliability team. Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We specialize in creating solutions that...