Site Reliability Engineer Specialist

1 week ago


Delhi, India McCain Foods Full time
JOB RESPONSIBILITIES:

  • Work with stakeholders such as product owners and engineering to define service level objectives (SLOs) for system operations.
  • Track performance against SLOs in partnership with monitoring teams or other stakeholders, and ensure systems continue to meet SLOs over time.
  • Create dashboards and reports to communicate key metrics.
  • Create software to improve performance, scalability, and stability of systems.
  • Collaborate with development teams to promote the concept of reliability engineering during all phases of the software development lifecycle to detect and correct performance issues and meet availability goals.
  • Design, code, test, and deliver infrastructure software to automate manual operational work (i.e., “toil”).
  • Participate in operational support and on-call rotation shifts for supported systems and products.
  • Conduct blameless postmortems to troubleshoot priority incidents.
  • Perform analytics on previous incidents to understand root causes and better predict and prevent future issues.
  • Use automation to reduce the probability and/or impact of problem recurrence.
  • Identify, evaluate, and recommend monitoring tools and diagnostic techniques to improve system observability.
  • Participate in system design consulting, platform management, capacity planning, and launch reviews.
  • Collaborate and share lessons learned regarding performance and reliability issues with all stakeholders, including developers, other SREs, operations teams, and project management teams.
  • Participate in communities of practice to share knowledge and foster continuous improvement.
  • Remain current on site reliability engineering methods and trends such as observability-driven development and chaos engineering.
  • Drive continuous improvement in software quality and infrastructure reliability and resilience.
  • Oversee, design, implement, and manage DevOps capabilities using continuous integration/continuous delivery toolsets and automation.

The SRE engineer will focus on Application Performance Monitoring (APM), including design, solution, POC, and profiling and tuning of application compute and data nodes and resources. Some key duties of this role are:

  • Assist in defining SRE and Observability architecture and design.
  • Analyze and implement new features of the SRE and Observability platform.
  • Full-stack monitoring across all layers (Infrastructure/Network/Database/Application/Services/Third Party).
  • Provide hands-on technical leadership in the selection and implementation of commercial and open-source monitoring tools.
  • Implement SRE-driven automated incident detection -> automated engagement -> triage/mitigation -> RCA/postmortems -> problem task remediation.
  • AI-driven correlation, de-duplication, noise reduction, and auto-remediation.
  • Provide weekly monitoring and alert analysis and continuous improvement.
  • Create a model of the run-time environment (discovery).
  • Profile the performance and behavior of user-defined transactions.
  • Establish performance metrics for each of the application's/system's technical components (web server, app server, database, etc.).
  • Application performance management database.
  • APM tool administration and support.
  • Monitoring tool design and implementation.
  • APM setup/usage policies and guidelines.
  • Capacity planning and monitoring.
  • Monitor selected application performance.
  • Report vital statistics of application performance in production.
  • Make recommendations for improvements with Service Desk.
  • Make recommendations for adjustments to runtime resources to improve the overall performance profile.

KEY QUALIFICATIONS & EXPERIENCE:

  • Strong problem-solving and analytical skills.
  • Strong interpersonal, written, and verbal communication skills.
  • Highly adaptable to changing circumstances. Interest in continuously learning new skills and technologies.
  • Experience with programming and scripting languages (e.g. Java, C#, C++, Python, Bash, PowerShell).
  • Experience with incident and response management.
  • Experience with Agile and DevOps development methodologies.
  • Experience with container technologies and supporting tools (e.g. Docker Swarm, Podman, Kubernetes, Mesos).
  • Experience working in cloud ecosystems (Microsoft Azure, AWS, Google Cloud Platform).
  • Experience with monitoring and observability tools (e.g. Splunk, CloudWatch, AppDynamics, New Relic, ELK, Prometheus, OpenTelemetry).
  • Experience with configuration management systems (e.g. Puppet, Ansible, Chef, Salt, Terraform).
  • Experience working with continuous integration/continuous deployment tools (e.g. Git, TeamCity, Jenkins, Artifactory).
  • Experience in GitOps-based automation is a plus.
  • Bachelor’s degree (or equivalent years of experience).
  • 5+ years of relevant work experience. SRE experience preferred.
  • Background in manufacturing or platform/tech companies is preferred.
  • Must have public cloud provider certifications (Azure, GCP, or AWS).
  • A CNCF certification is a plus.



  • Delhi, India TechBlocks Full time

    Seeking a skilled Senior Site Reliability Engineer with expertise in Google Cloud Platform (GCP) to join our dynamic team. As a Senior SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure and applications hosted on GCP. Responsibilities: Design, build, and maintain the core infrastructure used by all...


  • Delhi, India Integra Connect Full time

    About IntegraConnect Integra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • Delhi, India Integra Connect Full time

    About IntegraConnect Integra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • Delhi, India Persistent Systems Full time

    About Position: We are looking for Site Reliability Engineers who are proficient with monitoring tools, preferably New Relic. The person should have experience with Terraform, Docker, Kubernetes, and any cloud. Python coding experience is very much preferred. Role: Site Reliability Engineer. Location: Hyderabad. Experience: 8+ Yrs. Job Type: Full Time...


  • New Delhi, India dentsu Full time

    The purpose of this role is to ensure the availability and stability of production and test platforms. Job Title: Site Reliability Engineer. Job Description: Key responsibilities: Troubleshoots and owns issues in our development, test and production environments, including performance optimisation and continuous tuning. Works alongside the DevOps team in...


  • New Delhi, India Antal International Full time

    Job Description. Summary role description: Hiring for a Site Reliability Engineer for a fast-growing energy technology company. Company description: Our client is one of the fastest-growing energy technology companies in India, founded by some of the leaders in this space. They lead technological innovation for the most effective energy...


  • Delhi, India Sigmaways Inc Full time

    Background: As a developer, you will work with a team of skilled Site Reliability Engineers and help them improve application reliability. You will play a critical role in ensuring the reliability of a massive-scale application that processes billions of events every day. You will collaborate with multiple stakeholders and help the team write...


  • Delhi, India Insight Global Full time

    Required Skills & Experience: Bachelor's degree in Computer Science, Engineering, or a related field. 3+ years of experience in Systems Engineering or Site Reliability Engineering. Strong proficiency in GoLang programming. Experience with Red Hat OpenShift and container technologies (Docker, Kubernetes). Understanding of cloud platforms (AWS, Azure,...


  • Delhi, India noon Full time

    Job Description - Site Reliability Engineer. About noon: noon.com is a technology leader with a simple mission: to be the best place to buy and sell things. In doing this we hope to accelerate the digital economy of the Middle East, empowering regional talent and businesses to meet the full range of consumers' online needs. noon operates without boundaries; we are...


  • Delhi, India noon Full time

    Job Description - Site Reliability Engineer. About noon: noon.com is a technology leader with a simple mission: to be the best place to buy and sell things. In doing this we hope to accelerate the digital economy of the Middle East, empowering regional talent and businesses to meet the full range of consumers' online needs. noon operates without boundaries; we are...


  • New Delhi, India Mrsool Full time

    Who Are We❓ Welcome to the world of Mrsool! ✨ Where on-demand delivery meets unparalleled user needs to deliver anything you desire. As one of the largest delivery platforms in the Middle East and North Africa (MENA) region, Mrsool has captivated users with its unique and seamless experience, earning it the highest ratings among all major delivery...


  • New Delhi, India Mrsool Full time

    Who Are We❓ Welcome to the world of Mrsool! ✨ Where on-demand delivery meets unparalleled user needs to deliver anything you desire. As one of the largest delivery platforms in the Middle East and North Africa (MENA) region, Mrsool has captivated users with its unique and seamless experience, earning it the highest ratings among all major delivery...

  • SDIII Engineer

    4 days ago


    New Delhi, India Mrsool Full time

    Who Are We❓ Welcome to the world of Mrsool! ✨ Where on-demand delivery meets unparalleled user needs to deliver anything you desire. As one of the largest delivery platforms in the Middle East and North Africa (MENA) region, Mrsool has captivated users with its unique and seamless experience, earning it the highest ratings among all major delivery...


  • Delhi, India Vimeo Full time

    We are looking for self-starting, motivated, and extraordinary individuals with strong communication and interpersonal skills to join our Site Reliability Engineering team that supports the database infrastructure, as well as builds and runs a platform that delivers Vimeo products/services to all of its customers around the world. What you’ll do: Gain a deep...


  • Delhi, India Vimeo Full time

    We are looking for self-starting, motivated, and extraordinary individuals with strong communication and interpersonal skills to join our Site Reliability Engineering team that supports the database infrastructure, as well as builds and runs a platform that delivers Vimeo products/services to all of its customers around the world. What you’ll do: Gain a deep...

  • Senior Engineer

    4 weeks ago


    Delhi, India C&R Software Full time

    Job Description Summary: The Cloud Operations team is accountable for the operational excellence of the C&R cloud platform, which hosts several business-critical, client-facing applications. The objective of the SRE within Cloud Operations is to coordinate a timely and focused organisation-wide response to severe/high-impact technical incidents arising from...


  • Delhi, India Airtel International LLP-Airtel Africa Full time

    Job title: Site Reliability Engineer - Airtel Money. Work Location: Gurgaon. Division/Department: Engineering. Why Airtel Africa? At Airtel, we don’t just make things – we make things possible. Airtel Africa is on a mission to change the world by connecting people with ideas. We are building next generation systems to improve the quality of life for millions of...


  • Delhi, India Alp Consulting Ltd. Full time

    Experienced L3 SRE engineer working on a business-critical SaaS application. Capacity to provide L3 support across the full stack, including infra, backend, and front-end, before escalation to the engineering business unit. Capacity to automate SRE tools to provide proactive L3 support, close to our tech monitoring strategy. Capacity to work under business pressure for business-critical...

  • Site Engineer

    1 week ago


    Delhi, India Bare Wall Studio Full time

    Company Description: Bare Wall Studio is a dynamic multi-disciplinary design studio based in Bengaluru, India. Our team of passionate architects and designers is committed to delivering innovative and sustainable design solutions. We believe in leveraging technology to create impactful and lasting designs for the future. Role Description: This is a full-time...


  • Delhi, India Castlight Health Full time

    Job Description: Experience Level: 5 - 7 years. Responsibilities: ● Create reusable solutions using Terraform plans, Chef recipes and cookbooks, and DSL for provisioning, maintaining, and decommissioning the infrastructure ● Provide day-to-day support of multiple environments such as production, staging, and development ● Provide 24x7 support for platform...