Senior Site Reliability Engineer

3 weeks ago


Gurugram, India Cvent Full time

Site Reliability is about combining development and operations knowledge and skills to help make the organization better. If you have an SRE or development background and experience improving the reliability of your services or products by adding observability to them, Cvent SRE can benefit from your skill set. Ultimately, we are looking for passionate people who love learning, love technology and always want to make things better.

As a Senior SRE on the SRE Observability team, you will be responsible for helping Cvent achieve its reliability goals. We are looking for someone with the drive, ownership and ability to take on challenging problems, both technical and process related, in a dynamic, collaborative and highly distributed, multi-disciplinary team environment. You will use your background as a generalist to work closely with product development teams, Cloud Infrastructure and other SRE teams to ensure effective observability and improve the reliability of our products, SDLC and infrastructure. You must be able to see the big picture and work collaboratively with teams to solve hard multi-disciplinary problems. Technical expertise in topics such as cloud operations, the software development lifecycle, and observability tools will be of great help to you. We use SRE principles such as blameless postmortems and a focus on automation to ensure we're constantly improving our knowledge and maintaining a good quality of life. Overall, we're passionate about continuous improvement, learning and participating in dynamic day-to-day work where success is rewarded with recognition and upward mobility.

What You Will Be Doing


•Enlighten, Enable and Empower a fast-growing set of multi-disciplinary teams, across multiple applications and locations.


•Tackle complex development, automation and business process problems. Champion Cvent standards and best practices.


•Ensure the scalability, performance, and resilience of Cvent products and processes.


•Work with product development teams, Cloud Automation and other SRE teams to ensure a holistic understanding of observability gaps and their effective and efficient identification and resolution.


•Identify recurring problems and anti-patterns in development, operational and security processes and help the respective teams build observability for them.


•Develop build, test and deployment automation that seamlessly targets multiple on-premises environments and AWS regions.


•Give back by working on and contributing to Open-Source projects.

What You Need for this Position

Must have skills:


•Excellent communication skills and track record working in distributed teams


•A passion for and track record in making things better for your peers.


•Experience managing AWS services / operational knowledge of managing applications in AWS – ideally via automation.


•Fluency in at least one scripting language such as TypeScript, JavaScript, Python, Ruby or Bash.


•Experience with SDLC methodologies (preferably Agile).


•Experience with Observability (Logging, Metrics, Tracing) and SLI/SLO


•Experience working with APM, monitoring, and logging tools (Datadog, New Relic, Splunk)


•Good understanding of containerization concepts and tools: Docker, ECS, EKS, Kubernetes.


•Self-motivation and the ability to work under minimal supervision


•Experience troubleshooting and responding to incidents, and setting a standard for others to prevent the same issues in future.

Good to have skills:


•Experience with Infrastructure as Code (IaC) tools such as CloudFormation, CDK (preferred) and Terraform.


•Experience managing three-tier application stacks.


•Understanding of basic networking concepts.


•Experience with server configuration through Chef, Puppet, Ansible or equivalent


•Working experience with NoSQL and relational databases such as MongoDB, Couchbase, Postgres, etc.


•Using APM data to troubleshoot and find performance bottlenecks


