Site Reliability Engineer

3 days ago


Mumbai, Maharashtra, India | Fynd | Full time | ₹ 8,00,000 - ₹ 24,00,000 per year

Fynd is India's largest omnichannel platform and a multi-platform tech company specializing in retail technology and products in AI, ML, big data, image editing, and the learning space. It provides a unified platform for businesses to seamlessly manage online and offline sales, store operations, inventory, and customer engagement. Serving over 2,300 brands, Fynd is at the forefront of retail technology, transforming customer experiences and business processes across various industries.

Are you passionate about building ultra-reliable systems at scale? Join our team as a Site Reliability Engineer (SRE) and be the driving force behind our site's performance and uptime. Embrace a culture of end-to-end ownership, collaboration, and engineering excellence. In this role, you'll blend software engineering and systems engineering skills to keep our platform massively scalable, fault-tolerant, and lightning-fast, which is exactly what's needed to delight millions of online shoppers. You'll work from our Mumbai headquarters, taking ownership of product reliability from day one and working across teams to keep our services robust and customers happy.

We are looking for an engineer who not only builds performant, scalable applications but also embraces AI as a development multiplier. Your core job will be building and owning web applications end-to-end, but we expect you to use tools like GitHub Copilot, Cursor, ChatGPT, or equivalent to write, debug, test, and document faster and better.

  • Use AI tools (e.g. GitHub Copilot, Cursor, GPT-4, or similar) to accelerate scaffolding, code generation, refactoring, debugging, testing, and documentation.

What will you do at Fynd?

  • Influence technical direction by evaluating change requests and participating in architectural discussions across teams to uphold best practices and decide on appropriate technologies.
  • Lead incident response and root cause analysis to rapidly resolve issues and implement preventive measures, ensuring we never fail for the same reason twice.
  • Identify bottlenecks in current processes and build or improve tools to support incident management.
  • Go on-call, respond to automated alerts, and execute playbooks.
  • Continuously monitor and fine-tune our infrastructure using industry-standard observability tools, ensuring high performance even under heavy load.
  • Conduct rigorous load tests for critical sales events and optimise system capacity to handle peak demand seamlessly.
  • Own availability and performance for key products. Be responsible for ensuring the product's architecture, changes, incident response, and technology choices support its target availability and performance levels.
  • Remove unnecessary noise from our signals to obtain a clearer understanding of our platform and enable more effective debugging.
  • Develop production tooling and services to improve our platform's resilience.

Minimum Qualifications:

  • Bachelor's degree (B.E./B.Tech.) in Computer Science or a related technical field, or equivalent practical experience.
  • 2+ years of experience in an SRE or DevOps role, preferably within the e-commerce sector.
  • 2+ years of experience with programming languages such as Go, Python, or JavaScript, coupled with a solid understanding of data structures and algorithms.
  • Experience with containerisation technologies such as Docker and Kubernetes.
  • Experience with cloud platforms like AWS, GCP, or Azure.
  • Experience with monitoring and alerting tools such as Grafana, Prometheus, Sentry, PagerDuty, New Relic, AWS CloudWatch, etc.
  • Proficiency in Unix/Linux shell environments.

Some specific Requirements:

  • 3+ years of experience in an SRE or DevOps role, preferably within the e-commerce sector.
  • 3+ years of experience managing production infrastructure. Prior experience leading or managing a team is a strong advantage.
  • Experience with message queues like Kafka or RabbitMQ and a strong understanding of event-driven architectures.
  • Experience with orchestration and deployment tools such as Terraform, Pulumi, AWS CloudFormation, etc.
  • Hands-on experience with configuration management systems like Ansible, Chef, Puppet, SaltStack, etc.
  • Understanding of load testing methodologies and tools such as Grafana k6, Gatling, Locust, Apache JMeter, etc.

What do we offer?

Growth

Growth knows no bounds, as we foster an environment that encourages creativity, embraces challenges, and cultivates a culture of continuous expansion. We are looking at new product lines, international markets and brilliant people to grow even further. We teach, groom and nurture our people to become leaders. You get to grow with a company that is growing exponentially.

Flex University: We help you upskill by organising in-house courses on important subjects.

Learning Wallet: You can also take an external course to upskill and grow, and we will reimburse it for you.

Culture

Community and team-building activities.

We host weekly, quarterly, and annual events and parties.

Wellness

Mediclaim policy for you, your parents, spouse, and kids.

Access to an experienced therapist to support better mental health, productivity, and work-life balance.

We work from the office 5 days a week to promote collaboration and teamwork. Join us to make an impact in an engaging, in-person environment.


