Senior Site Reliability Engineer - ELK Expert

6 days ago


Delhi, India iVedha Inc. Full time

Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice

Location: India (Remote). Must be available to work in the EST (US/Canada) time zone.

Role Summary:

Are you a Senior Site Reliability Engineer (SRE) with deep ELK expertise, ready to take ownership of large-scale observability infrastructure?

We're looking for an SRE with 7+ years of experience, including 4+ years specializing in the ELK stack (Elasticsearch, Logstash, Kibana), to join our Platform Engineering Practice. In this role, you’ll design, manage, and scale ELK clusters ingesting 2–3+ TB/day, enhance reliability across distributed systems, and drive automation within Azure cloud environments. This is a high-impact engineering opportunity focused on performance, observability, and operational excellence at scale.

Why Join Us

  • Career Growth: Work alongside industry experts on cutting-edge cloud technologies
  • Competitive Compensation and Benefits: We recognize and reward top talent
  • Exciting, Impactful Work: Design and build scalable, resilient cloud environments
  • Strategic Platform Role: Contribute to the foundation of next-gen observability and reliability infrastructure

What You Will Do

  • Design and Optimize Cloud Infrastructure: Architect scalable, fault-tolerant systems on Microsoft Azure
  • Automate Everything: Use Terraform, Ansible, and GitHub Actions to streamline deployment and configuration
  • Ensure Reliability and Performance: Proactively monitor, troubleshoot, and resolve production issues using Prometheus, Grafana, and Azure Monitor
  • Enhance Security and Compliance: Implement security best practices across DevOps workflows
  • Collaborate and Innovate: Work closely with engineering, security, and operations teams to drive automation and efficiency
  • Manage and scale large ELK clusters handling 2–3+ TB/day log volumes, ensuring high availability and performance
  • Optimize ELK architecture: Implement efficient index lifecycle policies, shard strategies, and hot-warm-cold tiered storage
  • Build and tune log pipelines: Scale Logstash and Beats pipelines across distributed environments
  • Support Kibana observability layers: Create dashboards, visualizations, and custom alerting frameworks (e.g., Watcher, ElastAlert)

What You Bring

  • 7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering
  • 4+ years of dedicated, hands-on experience with ELK (Elasticsearch, Logstash, Kibana)
  • Strong experience managing large-scale ELK clusters in production with heavy ingestion (multi-TB/day)
  • Deep knowledge of index tuning, shard allocation, ILM policies, and scaling ELK components
  • Expertise in GitHub Actions, Terraform, Ansible, and Infrastructure as Code (IaC)
  • Proficiency in Python, Go, or Bash for automation and scripting
  • Deep understanding of Kubernetes, Docker, and cloud-native architectures
  • Experience with observability tools such as Prometheus, Grafana, Azure Monitor
  • Ability to work in a fast-paced, collaborative environment and solve complex operational issues

Education

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field

Certifications (Nice to Have)

  • Microsoft Azure certifications: AZ-104, AZ-400



  • Delhi, India iVoyant Full time

    One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) - Mission-Critical SaaS Cloud Products to join their team. Key Responsibilities: Reliability and Performance Management: - Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products. - Develop...


  • Delhi, India iVoyant Full time

    One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) - Mission-Critical SaaS Cloud Products to join their team. Key Responsibilities: Reliability and Performance Management: - Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products. - Develop and...


  • Delhi, India iVoyant Full time

    One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) - Mission-Critical SaaS Cloud Products to join their team. Key Responsibilities: Reliability and Performance Management: - Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products. - Develop and...


  • New Delhi, India iVoyant Full time

    One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) - Mission-Critical SaaS Cloud Products to join their team. Key Responsibilities: Reliability and Performance Management: - Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products. - Develop and...


  • Delhi, NCR, New Delhi, Pune, India Ithena Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Senior Site Reliability Engineer (SRE), Backend Systems. Location: Remote (India), Pune/Delhi/Delhi NCR/Mumbai. Experience: 8+ years. We're looking for a Senior SRE to join our backend team and help scale our real-time, event-driven platform. This role goes beyond traditional DevOps: we're seeking engineers who can write high-quality code, debug complex distributed...


  • New Delhi, India AutoRABIT Full time

    AutoRABIT Profile AutoRABIT is the leader in DevSecOps for SaaS platforms such as Salesforce. Its unique metadata-aware capability makes Release Management, Version Control, and Backup & Recovery complete, reliable, and effective. AutoRABIT’s highly scalable framework covers the entire DevSecOps cycle, which makes it the favourite platform for companies,...


  • New Delhi, India AutoRABIT Full time

    AutoRABIT Profile AutoRABIT is the leader in DevSecOps for SaaS platforms such as Salesforce. Its unique metadata-aware capability makes Release Management, Version Control, and Backup & Recovery complete, reliable, and effective. AutoRABIT’s highly scalable framework covers the entire DevSecOps cycle, which makes it the favourite platform for companies,...


  • Delhi, India People Hire Consulting Full time

    Looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure stability, reliability and performance and rapid deployments of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems; moreover, have...


  • New Delhi, India CodeKarma Full time

    Site Reliability Engineer (Multi-Cloud Deployments). Location: Bangalore / Remote. Experience: 4–10 years. Type: Full-time (6-month probation). About CodeKarma: CodeKarma is redefining how engineering teams understand and evolve complex systems, bringing production context directly into the developer’s workflow. Our platform runs both as SaaS and as sub-account...