Senior Site Reliability Engineer - ELK Expert

4 weeks ago


India iVedha Inc. Full time

Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice

Location: India (Remote) - Must be available to work in the EST (US/Canada) Time Zone.

Role Summary:

Are you a Senior Site Reliability Engineer (SRE) with deep ELK expertise, ready to take ownership of large-scale observability infrastructure?

We're looking for an SRE with 7+ years of experience, including 4+ years specializing in the ELK stack (Elasticsearch, Logstash, Kibana), to join our Platform Engineering Practice. In this role, you’ll design, manage, and scale ELK clusters ingesting 2–3+ TB/day, enhance reliability across distributed systems, and drive automation within Azure cloud environments. This is a high-impact engineering opportunity focused on performance, observability, and operational excellence at scale.

Why Join Us

  • Career Growth: Work alongside industry experts on cutting-edge cloud technologies
  • Competitive Compensation and Benefits: We recognize and reward top talent
  • Exciting, Impactful Work: Design and build scalable, resilient cloud environments
  • Strategic Platform Role: Contribute to the foundation of next-gen observability and reliability infrastructure

What You Will Do

  • Design and Optimize Cloud Infrastructure: Architect scalable, fault-tolerant systems on Microsoft Azure
  • Automate Everything: Use Terraform, Ansible, and GitHub Actions to streamline deployment and configuration
  • Ensure Reliability and Performance: Proactively monitor, troubleshoot, and resolve production issues using Prometheus, Grafana, and Azure Monitor
  • Enhance Security and Compliance: Implement security best practices across DevOps workflows
  • Collaborate and Innovate: Work closely with engineering, security, and operations teams to drive automation and efficiency
  • Manage and scale large ELK clusters handling 2–3+ TB/day log volumes, ensuring high availability and performance
  • Optimize ELK architecture: Implement efficient index lifecycle policies, shard strategies, and hot-warm-cold tiered storage
  • Build and tune log pipelines: Scale Logstash and Beats pipelines across distributed environments
  • Support Kibana observability layers: Create dashboards, visualizations, and custom alerting frameworks (e.g., Watcher, ElastAlert)

What You Bring

  • 7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering
  • 4+ years of dedicated, hands-on experience with ELK (Elasticsearch, Logstash, Kibana)
  • Strong experience managing large-scale ELK clusters in production with heavy ingestion (multi-TB/day)
  • Deep knowledge of index tuning, shard allocation, ILM policies, and scaling ELK components
  • Expertise in GitHub Actions, Terraform, Ansible, and Infrastructure as Code (IaC)
  • Proficiency in Python, Go, or Bash for automation and scripting
  • Deep understanding of Kubernetes, Docker, and cloud-native architectures
  • Experience with observability tools such as Prometheus, Grafana, Azure Monitor
  • Ability to work in a fast-paced, collaborative environment and solve complex operational issues

Education

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field

Certifications (Nice to Have)

  • Microsoft Azure certifications: AZ-104, AZ-400



  • India iVoyant Full time

    One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) - Mission-Critical SaaS Cloud Products to join their team. Key Responsibilities: Reliability and Performance Management: Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products. Develop and...


  • India InOrg Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    About VivaOps: VivaOps is a leading DevSecOps platform company specializing in GitLab, the comprehensive DevOps platform, to transform and secure software development processes. We help organizations streamline their DevSecOps journey by offering a complete range of GitLab services, from advisory to implementation and managed services, to accelerate...

  • ELK Developer

    1 week ago


    Bengaluru, India Dicetek LLC Full time

    Job Description: The ELK Developer is responsible for enhancing the monitoring of the enterprise's business-critical applications and aligning it with the standards and policies defined by the Enterprise Tools and CSI department. An ELK Developer is a specialist who uses data to monitor and improve the performance, reliability, and security of infrastructure and...


  • Hyderabad, India AutoRABIT Full time

    Job Description AutoRABIT Profile AutoRABIT is the leader in DevSecOps for SaaS platforms such as Salesforce. Its unique metadata-aware capability makes Release Management, Version Control, and Backup & Recovery complete, reliable, and effective. AutoRABIT's highly scalable framework covers the entire DevSecOps cycle, which makes it the favourite platform...


  • India CareerUS Solutions Full time

    Job Description Position Overview: The Site Reliability Engineer (SRE) is responsible for ensuring the stability, scalability, performance, and reliability of production systems and services. This role bridges software development and operations, using automation, monitoring, and performance optimization to build resilient systems that can scale efficiently...


  • India Weekday AI Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    This role is for one of Weekday's clients. Min Experience: 4 years. Job Type: full-time. We are looking for an experienced and motivated Site Reliability Engineer (SRE) – Platform Engineering to join our growing technology team. In this role, you will be responsible for designing, building, and maintaining scalable, resilient, and secure infrastructure platforms...


  • India Akamai Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    Description: Do you like collaborating across teams to solve complex problems? Do you enjoy solving large-scale systems problems? Join our Site Reliability Engineering team. The Senior Site Performance and Reliability Engineer ensures optimal performance and uptime of Akamai's portal services and infrastructure. Responsibilities include analyzing system...


  • India Jobgether Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in India. We are seeking an experienced Senior Site Reliability Engineer to ensure the reliability, scalability, and performance of critical security infrastructure. In this role, you will lead initiatives for operational...


  • India techolution Full time

    We are seeking a highly skilled Site Reliability Engineer - AWS to enhance the reliability, scalability, and security of our cloud infrastructure. The ideal candidate will be responsible for designing, implementing, and maintaining high-availability systems, automating processes, and ensuring seamless operations on AWS. This role requires expertise in...