
High Availability System Engineer
2 weeks ago
We are seeking a seasoned system reliability expert to lead the design, implementation, and maintenance of our mission-critical systems. The successful candidate will work closely with cross-functional teams to ensure the scalability, reliability, and performance of our infrastructure.
About This Role
- Infrastructure Management: Develop and maintain monitoring tools and dashboards for system health, performance, and capacity planning to ensure optimal resource utilization.
- Automation: Implement automation scripts for repetitive tasks, including deployments, scaling, and maintenance, to reduce manual intervention and improve efficiency.
- Incident Response: Respond to incidents, lead post-incident analysis, and drive efforts to prevent recurrences through detailed root cause analysis.
- Performance Optimization: Identify and resolve performance bottlenecks in infrastructure, databases, applications, and networking to ensure optimal system performance.
- System Design & Architecture: Collaborate with development teams to design systems that are highly scalable, fault-tolerant, and optimized for performance and cost.
- CI/CD Pipelines: Build and maintain CI/CD pipelines to automate deployments and improve delivery cycles.
- Disaster Recovery: Develop and maintain disaster recovery plans, ensuring that systems are resilient and capable of fast recovery.
- Capacity Planning: Monitor and forecast resource utilization and implement cost-effective scaling strategies.
- Security & Compliance: Ensure that all systems meet or exceed security and compliance requirements.
- Documentation & Knowledge Sharing: Maintain comprehensive documentation of systems and processes. Provide mentorship and share knowledge with team members.
Required Skills and Qualifications
- At least 3 years of experience in a Site Reliability Engineer, DevOps, or Infrastructure role.
- Proficiency in one or more scripting languages (Python, Node.js, Bash, etc.).
- Hands-on experience with the AWS cloud platform.
- Proficiency with tools like Terraform, Ansible, or CloudFormation.
- Exceptional problem-solving skills.
- Familiarity with monitoring and observability tools such as New Relic, Prometheus, Grafana, ELK stack, Datadog, or equivalent.
- Experience with CI/CD tools like Jenkins, AWS CodePipeline, GitLab CI, CircleCI, or equivalent.
- Version control best practices using Git.
- Containerization and orchestration using Docker and Kubernetes.
- Strong understanding of networking concepts such as DNS, load balancing, VPNs, and firewalls.
- Ability to analyze complex systems, diagnose issues, and drive improvements.
- Excellent communication skills and ability to collaborate across teams.
-
High Availability System Engineer
3 days ago
Bengaluru, Karnataka, India beBeeNetwork Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Reliability Engineer Opportunity
We seek an experienced Reliability Engineer to ensure high availability and performance of our critical services.
Key Responsibilities
Monitor and maintain applications on CentOS servers to guarantee high availability and performance. Conduct routine tasks for system and application maintenance following established procedures to...
-
High Availability System Engineer
1 week ago
Bengaluru, Karnataka, India beBeeEngineer Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Senior SRE Position Overview
About the Role:
This senior position involves leading infrastructure management, cloud-native system design, and production operations. The ideal candidate will have extensive experience in managing Kubernetes clusters at scale, data platforms like Kafka and ClickHouse, and a strong programming background.
Key...
-
High Availability Systems Specialist
1 week ago
Bengaluru, Karnataka, India beBeeReliability Full time ₹ 25,00,000 - ₹ 35,00,000
Site Reliability Engineering Role
Our organization seeks a skilled Site Reliability Engineer (SRE) to guarantee the dependability, scalability, and performance of our critical systems. The SRE will collaborate closely with development and operations teams to design and maintain high availability services, automate operational tasks, and monitor system...
-
High Availability System Developer
3 days ago
Bengaluru, Karnataka, India beBeeEngineer Full time ₹ 90,00,000 - ₹ 1,20,00,000
We are seeking experienced engineers who can drive reliability, scalability, and efficiency in building tools, services, and automation to manage and improve production.
Key Responsibilities:
Improve the reliability and performance of distributed systems and containerized deployments using strong engineering principles. Diagnose and troubleshoot complex...
-
High Availability System Administrator
2 weeks ago
Bengaluru, Karnataka, India beBeeLinux Full time ₹ 20,00,000 - ₹ 25,00,000
System Operations Specialist
We are seeking an experienced System Operations Specialist to join our global team. The ideal candidate will have a strong background in server infrastructure, virtualization, and containerization.
Key Responsibilities:
Provide high-level support for Linux Server platforms, including on-call coverage and collaboration with...
-
High Availability Engineer
2 days ago
Bengaluru, Karnataka, India beBeeReliability Full time US$ 2,00,000 - US$ 2,50,000
As a Senior Site Reliability Engineer, you will be responsible for ensuring the availability and efficiency of our SaaS platform on Azure. This involves defining and enforcing reliability standards, implementing error-budget policies, and maintaining service-level objectives (SLOs) and error budgets.
The ideal candidate will have deep operational expertise in...
-
High Availability System Specialist
3 days ago
Bengaluru, Karnataka, India beBeeJob Full time ₹ 12,00,000 - ₹ 18,00,000
Computer Operations Expert
We are seeking an experienced Computer Operations expert to join our team. As a key member of our operations group, you will be responsible for overseeing the day-to-day activities of IBM i (AS/400) systems, ensuring high availability, performance, and reliability of critical business applications.
-
High Availability Specialist
4 days ago
Bengaluru, Karnataka, India beBeeReliability Full time ₹ 1,80,00,000 - ₹ 2,10,00,000
Job Overview
Our organization is seeking a skilled System Reliability Engineer to ensure the reliability, scalability, and performance of our systems and applications. In this role, you will be responsible for ensuring high availability and optimal system performance.
Key Responsibilities:
Performance Monitoring & Optimization
Use Dynatrace and CloudWatch to...
-
High Availability Specialist
2 days ago
Bengaluru, Karnataka, India beBeeResiliency Full time ₹ 15,00,000 - ₹ 25,00,000
Resiliency Engineer
About the Role:
We are seeking a skilled Resiliency Engineer to join our team. The ideal candidate will have a strong background in software testing, QA, or systems engineering with a focus on resiliency testing.
Main Responsibilities:
The successful candidate will be responsible for designing and implementing resiliency-focused testing...
-
High Availability System Architect
3 days ago
Bengaluru, Karnataka, India beBeecloud Full time US$ 12,00,000 - US$ 14,00,000
Sapaad is a leading unified commerce platform dedicated to delivering software solutions. Its flagship product has seen success worldwide.
You will collaborate with engineers and developers across applications and infrastructure, ensuring high availability and performance of our suite of solutions.
Key Responsibilities
Design, deploy, and manage scalable, secure...