Senior Site Reliability Engineer
7 hours ago
Introduction
Our goal at Pivotree is to help accelerate the future of frictionless commerce. We will help lead this change over the next decade because we believe a future where technology is embedded intimately into all aspects of our everyday lives can benefit everyone and will shape the interactions with the brands we love. We will help shape the future of frictionless commerce by working together with some of the best brands in the world and some of the best people in the industry to leverage converging technologies that will make it possible to accelerate frictionless commerce faster than ever.
Pivotree provides services focused on the design, implementation, management, and maintenance of complex ecommerce solutions for large enterprises. We provide the technical skills necessary to enable the effective use of technologies combined with the business context to leverage a solution to solve our clients' business challenges. We strive to fill the gaps in available technology with our own IP to reduce the barriers to adoption.
We enable inclusive, immersive and highly personalized experiences for our clients and their customers. We build our products with a view to productizing and scaling technology to lower the costs and reduce the risks of implementing and managing our integrated solutions. Each of our solutions starts with reliable and reputable e-commerce and MDM platforms, which run on enterprise-grade infrastructure and are customized to meet a variety of client needs, situations, and budgets. Over the next 10 years we will add new categories and capabilities that will define frictionless commerce ecosystems.
This is a journey of technology acceleration combined with consumer readiness and adoption. We are looking for people capable of adapting relentlessly to the rapidly evolving world around us.
Position Summary
We are currently seeking a Senior Site Reliability Engineer (SRE) to join our team. In
this role you will contribute to the reliability and enhancement of the technology
engine that powers multiple Pivotree solutions. The primary function of this role is
direct responsibility for the availability of platform solutions, focusing on several key
areas, including availability, performance, change management, monitoring, and
emergency response. You will work with other members of the platform, solutions,
operations, and application teams to understand and ultimately address changing
and evolving requirements through extending and exposing capabilities in a simple
and consistent fashion. You will be a member of a team that maintains expertise
in Utility Computing services and will advise management and the organization as
a whole on this mode of computing.
You will
- Be responsible for the availability, performance, and reliability of platform
services.
- Design and manage infrastructure in AWS, especially across multi-account
AWS Organization setups.
- Lead efforts to automate cloud resource provisioning using IaC tools such as
Terraform and AWS CloudFormation.
- Contribute to ensuring pooled and independent utility services are highly
available
- Actively take part in and initiate continuous improvement: measure and reduce
manual tasks and overhead
- Be a subject matter expert for Utility Computing providers and respective
services both existing and emerging - with particular focus on AWS
- Complete systems development, administration, and engineering tasks
including integration, documentation and testing
- Develop and maintain tools, processes, and workflows for automated
infrastructure and application deployment, configuration management, and
maintenance
- Own the responsibility for platform management, supporting services, and all
related tooling and automation
- Design, implement, and maintain monitoring, alerting, and observability
solutions to ensure system reliability, performance, and timely incident
detection.
- Investigate and troubleshoot relevant platform-based issues and incidents
(high availability, performance, security, etc.)
- Collaborate across distributed teams and act as a technical liaison for various
stakeholders.
- Participate in change management, release automation, compliance, and
audit-readiness practices.
- Participate in recurring stand-ups with team members in different
locations and time zones
- Participate in on-call rotation, escalations, and shift work
- Work with other team members to improve processes and advance relevant
and related competencies
- Provide technical mentorship, helping teammates grow through knowledge
sharing and peer coaching.
You are
- Experienced in production-grade AWS Cloud environments with a focus on
scalability, security, and governance.
- Well-versed in Linux (RHEL/Debian) environments and proficient in system
administration.
- Adept at supporting modern development workflows, Agile teams, and CI/CD
tooling
- A strong communicator, with the ability to interface with technical and non-
technical stakeholders across geographies.
- Motivated to mentor others and foster a collaborative engineering culture.
- Capable of operating independently and making strategic infrastructure
decisions.
- Passionate about reliability engineering and operational excellence.
- Experienced at working on large projects with deadlines
- Committed to high quality and attention to detail
- Focused and committed to delivering high quality services
- A strategic thinker who is able to link business and technical objectives
- Someone who can go wide and deep: who works across several disparate
systems and services, ultimately acquires expert knowledge, and navigates
accordingly
You have (MUST HAVE)
- 5+ years of experience in Site Reliability Engineering, Cloud Engineering, or
DevOps roles.
- Minimum one Associate-level Amazon AWS certification.
- 3+ years of mature, production-level experience with infrastructure-as-code
concepts and practices using Terraform, AWS CloudFormation or similar.
- 3+ years of hands-on experience managing Kubernetes clusters (EKS
preferred), including container orchestration and troubleshooting.
- Strong knowledge and practical experience in Linux systems administration
(RHEL and Debian-based distros), networking, storage, and virtualization in
production-grade environments.
- Experience working with API-driven and/or Event-driven architectures at scale
in AWS environments.
- Demonstrated ability to manage and troubleshoot web applications,
middleware, and databases in real-world deployments.
- Expertise with observability stacks and performance monitoring using tools
like Grafana, Prometheus, CloudWatch, and Loki or similar.
- Advanced scripting proficiency in Python, Bash, and basic PowerShell.
- Solid experience with CI/CD pipelines, version control, and automated testing
using Git, Bitbucket, GitHub, Jenkins, or similar.
- Proven track record in implementing security and compliance controls,
particularly in regulated environments (SOC 2, PCI-DSS, or ISO
- Strong understanding of systems security, including identity, permissions,
network policies, and audit tooling.
- Exceptional troubleshooting skills, attention to detail, and a strong drive for
continuous improvement.
- Excellent communication skills with the ability to clearly articulate complex
concepts to both technical and non-technical audiences, and collaborate
effectively across distributed teams.
- Demonstrated ability to mentor junior engineers, share knowledge, and act as
a regional technical leader.
- Ability to work both independently and collaboratively, learn new technologies
quickly, and help set standards and best practices.
Nice to Have
- Experience and/or exposure to the Serverless Framework
- Experience with APM tools such as AppDynamics, New Relic, Grafana,
Dynatrace, or AWS X-Ray
- Experience with the following Amazon AWS services in a production
environment (API Gateway, Cognito, RDS, DynamoDB, ECS, EMR, Lambda)
- AWS Certified Developer
- AWS Certified SysOps Administrator
- AWS Certified Solutions Architect
Pivotree is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive and accessible workplace.