Site Reliability Engineer

4 days ago


Mumbai, Maharashtra, India Fynd Full time ₹ 8,00,000 - ₹ 24,00,000 per year

Fynd is India's largest omnichannel platform and a multi-platform tech company specializing in retail technology and products in AI, ML, big data, image editing, and the learning space. It provides a unified platform for businesses to seamlessly manage online and offline sales, store operations, inventory, and customer engagement. Serving over 2,300 brands, Fynd is at the forefront of retail technology, transforming customer experiences and business processes across various industries.

Are you passionate about building ultra-reliable systems at scale? Join our team as a Site Reliability Engineer (SRE) and be the driving force behind our site's performance and uptime. Embrace a culture of end-to-end ownership, collaboration, and engineering excellence. In this role, you'll combine software engineering and systems engineering skills to keep our platform scalable, fault-tolerant, and fast, which is exactly what's needed to delight millions of online shoppers. You'll work from our Mumbai headquarters, taking ownership of product reliability from day one and working across teams to keep our services robust and customers happy.

We are looking for an engineer who not only builds performant, scalable applications but also embraces AI as a development multiplier. Your core job will be building and owning web applications end-to-end, but we expect you to use tools like GitHub Copilot, Cursor, ChatGPT, or equivalent to write, debug, test, and document faster and better.

  • Code with AI: use tools such as GitHub Copilot, Cursor, or GPT-4 to accelerate scaffolding, code generation, refactoring, debugging, testing, and documentation.

What will you do at Fynd?

  • Influence technical direction by evaluating change requests and participating in architectural discussions across teams to uphold best practices and decide on appropriate technologies.
  • Lead incident response and root cause analysis to rapidly resolve issues and implement preventive measures, ensuring we never fail for the same reason twice.
  • Identify bottlenecks in current processes and build or improve tools to support incident management.
  • Go on-call, respond to automated alerts, and execute playbooks.
  • Continuously monitor and fine-tune our infrastructure using industry-standard observability tools, ensuring high performance even under heavy load.
  • Conduct rigorous load tests for critical sales events and optimise system capacity to handle peak demand seamlessly.
  • Own availability and performance for key products. Be responsible for ensuring the product's architecture, changes, incident response, and technology choices support its target availability and performance levels.
  • Remove unnecessary noise from our signals to obtain a clearer understanding of our platform and enable more effective debugging.
  • Develop production tooling and services to improve our platform's resilience.

Minimum Qualifications:

  • Bachelor's degree (B.E./B.Tech.) in Computer Science or a related technical field, or equivalent practical experience.
  • 2+ years of experience in an SRE or DevOps role, preferably within the e-commerce sector.
  • 2+ years of experience in programming languages such as Go, Python, or JavaScript, coupled with a solid understanding of data structures and algorithms.
  • Experience with containerisation technologies such as Docker and Kubernetes.
  • Experience with cloud platforms like AWS, GCP, or Azure.
  • Experience with monitoring and alerting tools such as Grafana, Prometheus, Sentry, PagerDuty, New Relic, AWS CloudWatch, etc.
  • Proficiency in Unix/Linux shell environments.

Some specific Requirements:

  • 3+ years of experience in an SRE or DevOps role, preferably within the e-commerce sector.
  • 3+ years of experience managing production infrastructure. Prior experience leading or managing a team is a strong advantage.
  • Experience with message queues like Kafka or RabbitMQ and a strong understanding of event-driven architectures.
  • Experience with any orchestration and deployment tools such as Terraform, Pulumi, AWS CloudFormation, etc.
  • Hands-on experience with any configuration management systems like Ansible, Chef, Puppet, SaltStack, etc.
  • Understanding of load testing methodologies and tools such as Grafana k6, Gatling, Locust, Apache JMeter, etc.

What do we offer?

Growth

Growth knows no bounds, as we foster an environment that encourages creativity, embraces challenges, and cultivates a culture of continuous expansion. We are looking at new product lines, international markets and brilliant people to grow even further. We teach, groom and nurture our people to become leaders. You get to grow with a company that is growing exponentially.

Flex University: We help you upskill by organising in-house courses on important subjects.

Learning Wallet: You can also take an external course to upskill and grow, and we will reimburse the cost.

Culture

Community and team-building activities

We host weekly, quarterly, and annual events and parties.

Wellness

Mediclaim policy for you + parents + spouse + kids

Access to an experienced therapist to support better mental health, productivity, and work-life balance

We work from the office 5 days a week to promote collaboration and teamwork. Join us to make an impact in an engaging, in-person environment.


