
Reliability Systems Engineer
7 days ago
About the Role :
We are seeking an experienced Reliability Systems Engineer to join our high-performance infrastructure team.
Your Key Responsibilities :
Incident & Alert Management :
- Monitor production systems and handle alerts to ensure minimal service disruption.
- Act as the first point of escalation for production incidents and critical system issues.
- Drive rapid resolution of major incidents to restore services as quickly as possible.
- Coordinate with cross-functional teams to resolve outstanding incidents, following defined escalation procedures.
Monitoring & Observability :
- Design, implement, and maintain application monitoring using tools such as OpenSearch, ELK, Grafana, Prometheus, PagerDuty, Pingdom, Datadog, and Splunk.
- Evaluate robust logging, metrics, and distributed tracing practices to provide full observability into system performance.
- Regularly review and refine monitoring configurations to align with evolving system needs.
Automation & Reliability Engineering :
- Collaborate with product engineering teams to develop SOPs for operational excellence.
- Automate deployment, scaling, and operational tasks using tools like Ansible, Kubernetes, and CI/CD frameworks.
- Implement proof-of-concepts for new tools and technologies with the aim of integrating them into production environments.
Root Cause Analysis & Continuous Improvement :
- Perform detailed root cause analysis for service-impacting events.
- Identify trends and recurring issues to proactively improve system stability.
- Contribute to post-incident reviews and recommend measures to prevent similar issues in the future.
Requirements :
- Experience with large-scale, mission-critical production systems.
- Familiarity with Agile methodologies and DevOps practices.
- Prior experience driving POCs for production-scale technology adoption.
Skills and Qualifications :
- Strong problem-solving and analytical abilities.
- Excellent communication and documentation skills.
- Ability to work effectively in high-pressure situations and under tight deadlines.
- Strong organizational skills with the ability to manage multiple priorities.
-
System Reliability Engineer
2 weeks ago
Gurgaon, Haryana, India beBeeReliability Full time ₹ 15,00,000 - ₹ 20,00,000
Job Title: System Reliability Engineer
We are seeking a highly skilled System Reliability Engineer to lead capacity management, operational support, and incident resolution for our platforms. This role requires a professional with a background in both SRE and application support, who can collaborate with development and infrastructure teams to ensure the...
-
Reliable System Architect
7 days ago
Gurgaon, Haryana, India beBeeSystem Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Reliable System Architect
Job Overview: This is a challenging position that requires software engineering and operations expertise to design, build, and maintain systems capable of handling high production traffic efficiently.
Experience with Observability Tools: A solid understanding of Dynatrace, including on-premises and SaaS solutions, is essential for...
-
Site Reliability Engineer
6 days ago
Gurgaon, Haryana, India ElevenX Capital Full time US$ 1,50,000 - US$ 2,00,000 per year
About the Role: We are looking for a skilled Site Reliability Engineer (SRE) to join our team and help us ensure the reliability, scalability, and performance of our critical systems. As an SRE, you will work closely with development and operations teams to build and maintain highly available services, automate operational tasks, and monitor system health.
Key...
-
Reliability Engineer
3 days ago
Gurgaon, Haryana, India beBeeReliability Full time ₹ 1,20,00,000 - ₹ 2,50,00,000
We are seeking a skilled Reliability Engineering Specialist to support our technology infrastructure. Your primary focus will be on ensuring the reliability, performance, and scalability of enterprise applications. You will play a critical role in supporting and optimizing systems like Salesforce, Oracle, Mulesoft, Oracle Integration Cloud, ServiceNow, and...
-
Reliable System Architect
1 week ago
Gurgaon, Haryana, India beBeeEngineering Full time ₹ 1,04,000 - ₹ 1,30,878
System Reliability Role
Ensure continuous system availability by designing and implementing scalable global systems. Develop and maintain robust system architecture, deploying and scaling applications to meet business needs. Lead incident response efforts, analyzing system failures and implementing solutions to prevent recurrence. Implement automation processes...
-
Cloud System Reliability Specialist
7 days ago
Gurgaon, Haryana, India beBeeReliability Full time US$ 1,20,000 - US$ 1,50,000
Job Overview: A Site Reliability Engineer plays a crucial role in ensuring the smooth operation of cloud-based systems. Their primary goal is to guarantee that these systems are running efficiently and providing the expected performance. They are responsible for daily operations tasks, including monitoring, deployment, and incident management as well as...
-
Site Reliability Engineer
3 days ago
Gurgaon, Haryana, India Aerial Telecom Solutions (ATS) Full time ₹ 1,04,000 - ₹ 1,30,878 per year
Position Overview: The SRE Lead will be responsible for managing a team of engineers focused on software deployments and site reliability engineering practices. The role will involve overseeing the deployment process of software applications and services, implementing automation, monitoring, and alerting tools, and ensuring the reliability, availability, and...
-
Reliable Systems Specialist
1 week ago
Gurgaon, Haryana, India beBeesite Full time ₹ 1,20,00,000 - ₹ 1,70,00,000
About the Opportunity
This role is part of our Institutional Services Distribution business, which operates across the UK, EMEA and Asia Pacific. We are a strategic area targeted for growth over the coming years. The Technology Department has been a key enabler for the business in achieving its goals. We are seeking a motivated and skilled Site...
-
Site Reliability Engineer
2 weeks ago
Gurgaon, Haryana, India RBS Full time US$ 1,50,000 - US$ 2,00,000 per year
Join us as a Site Reliability Engineer
In this key role, you'll improve, drive, and embed non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You'll enjoy significant stakeholder interaction, working in...
-
Senior System Reliability Specialist
4 days ago
Gurgaon, Haryana, India beBeeReiability Full time ₹ 13,80,600 - ₹ 2,01,42,000
Job Title: Reliability Engineer
This role is for a highly skilled Reliability Engineer to join our Application Reliability Engineering team. The ideal candidate will be able to provide functional support for GitHub Actions, Azure DevOps, Ansible, and Olam jobs, troubleshoot and resolve issues reported by end-users, manage and prioritize incidents ensuring...