
System Reliability Specialist
9 hours ago
We are seeking an experienced reliability engineer to play a critical role in ensuring the scalability and reliability of our internal platform.
As a key member of our engineering team, you will work closely with development teams to ensure they have the tools, practices, and expertise needed to deliver high-quality software in a collaborative culture.
- Design, build, harden, and maintain key parts of our internal platform from CI/CD to developer tools.
- Migrate to industry-leading CI/CD tools like GitHub Actions.
- Automate safe deployment practices using GitHub Actions, ArgoCD, Argo Rollouts, Helm Charts, etc.
- Automate infrastructure provisioning and other engineering processes by building automations on top of an engineering platform written in GitHub Actions.
- Coach and up-skill other engineering team members.
- Solve challenging technical problems and see the immediate impact of your work.
- Develop effective tooling, alerts, and response procedures to identify and address reliability risks.
- Drive protocols on production readiness and operational excellence.
- Partner with product engineering teams to debug production outages and improve reliability.
- Advocate for automated testing, continuous integration, and delivery.
- Plan for growth of our infrastructure.
Our ideal candidate has:
- 5-8 years of experience.
- An understanding of large-scale complex systems from a reliability perspective.
- Experience designing, implementing, and maintaining CI/CD processes and tools.
- A passion for producing clean, standards-compliant, secure code.
- A developer mindset toward infrastructure.
- Experience with Linux/Unix systems.
- Experience with Kubernetes.
- Experience with Infrastructure as Code tools like Terraform and Ansible.
- Experience building software with Java, Kotlin, Scala, or any other JVM-based languages.
- Experience writing scripts for automating tasks with Ruby, Python, Bash, or any other scripting language.
- Experience with relational and non-relational databases.
- Ability to identify time-consuming manual tasks and automate them.
- Ability to identify root causes of instability in distributed systems.
-
System Reliability Specialist
4 days ago
Kanpur, Uttar Pradesh, India beBeeEngineer Full time ₹ 20,00,000 - ₹ 25,00,000 We are looking for a seasoned reliability expert to join our news team. The ideal candidate will be responsible for ensuring the stability, performance, and scalability of our systems. The successful candidate will have 8+ years of experience in Site Reliability Engineering and will possess expertise in Kubernetes, Docker, and container orchestration. Key...
-
Reliability Engineering Specialist
1 week ago
Kanpur, Uttar Pradesh, India beBeeEngineering Full time ₹ 1,50,00,000 - ₹ 2,00,00,000 Job Title: Reliability Engineering Specialist. We are seeking an experienced professional to join our Platform Engineering team as a Reliability Engineering Specialist. The ideal candidate will have a strong background in software engineering and systems operations, with expertise in building infrastructure that powers AI-driven code reviews at scale. Main...
-
Chief System Reliability Architect
4 days ago
Kanpur, Uttar Pradesh, India beBeePerformance Full time ₹ 1,80,00,000 - ₹ 2,20,00,000 Reliability Engineering Lead: As a pivotal figure in building and scaling robust systems, the Reliability Engineering Lead oversees the reliability, scalability, and performance of our critical systems. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems. Key...
-
Reliable Systems Expert
5 days ago
Kanpur, Uttar Pradesh, India beBeeResponsibility Full time ₹ 18,00,000 - ₹ 26,40,000 Job Overview: This is a key position for a skilled Site Reliability Engineer to join our team. The role calls for experience working with microservices on Kubernetes and a strong understanding of observability tools and metrics.
-
Senior System Reliability Engineer
5 days ago
Kanpur, Uttar Pradesh, India beBeeMonitoring Full time ₹ 18,00,000 - ₹ 20,00,000 System Health Monitor: The Insight Global team is hiring a full-time Monitoring Engineer to join the LLM Proxy Team. This role involves monitoring system health via Grafana dashboards, managing incident communications, and ensuring high reliability of globally deployed web applications. Key Responsibilities: Monitor Grafana dashboards and observability tools to...
-
AI/ML System Reliability Engineer
5 days ago
Kanpur, Uttar Pradesh, India beBeeSiteReliability Full time ₹ 13,04,000 - ₹ 26,12,000 Transform Your Career with AI/ML Site Reliability. We seek an experienced professional to ensure the reliability and scalability of cloud-based AI/ML systems. Key Responsibilities: Design, implement, and maintain scalable and reliable Azure infrastructure (storage, networking, security, IAM). Collaborate with cross-functional teams to develop and deploy Databricks...
-
Enterprise Systems Support Specialist
9 hours ago
Kanpur, Uttar Pradesh, India beBeeMiddleware Full time ₹ 15,00,000 - ₹ 18,00,000 Job Overview: We are seeking a skilled and proactive Systems Support Specialist with hands-on experience in supporting enterprise business applications. Key Qualifications: Strong administration expertise across Windows, Linux environments, and automation tools. Proficiency in shell scripting and job management using Control-M. Expertise in middleware...
-
Reliability and Performance Expert
4 days ago
Kanpur, Uttar Pradesh, India beBeeReliability Full time US$ 1,00,000 - US$ 1,50,000 About the Role: We are seeking a skilled reliability and performance expert to help build and maintain highly available services. This role will involve collaborating with development and operations teams to ensure the scalability and reliability of our critical systems. The ideal candidate will have expertise in designing and implementing...
-
Systems Integration Specialist
10 hours ago
Kanpur, Uttar Pradesh, India beBeeElectrical Full time ₹ 9,00,000 - ₹ 12,00,000 Job Title: Systems Integration Specialist. This is a full-time on-site position for an Electrical and Low Voltage (ELV) Systems Professional based in Hyderabad. Primary Responsibilities: Install, test, commission, and maintain extra-low voltage systems to ensure operational efficiency and safety standards. Key Skills and Qualifications: Bachelor's degree in...
-
Reliability Infrastructure Specialist
5 days ago
Kanpur, Uttar Pradesh, India beBeeSite Full time ₹ 1,80,00,000 - ₹ 2,52,00,000 Job Opportunity: We are seeking a skilled Site Reliability Engineer to provide production and operations support and application administration for business and web applications, third-party applications, and related ecosystems. Our ideal candidate will have experience in administering applications in AWS/Azure cloud with a minimum of two years of experience, as...