Senior Site Reliability Engineer - ELK Expert

4 days ago


Thrissur, Kerala, India | iVedha Inc. | Full time
Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice

Location: India (Remote). Must be available to work in the EST (US/Canada) time zone.

Role Summary:

Are you a Senior Site Reliability Engineer (SRE) with deep ELK expertise, ready to take ownership of large-scale observability infrastructure?

We're looking for an SRE with 7+ years of experience, including 4+ years specializing in the ELK stack (Elasticsearch, Logstash, Kibana), to join our Platform Engineering Practice. In this role, you'll design, manage, and scale ELK clusters ingesting 2–3+ TB/day, enhance reliability across distributed systems, and drive automation within Azure cloud environments. This is a high-impact engineering opportunity focused on performance, observability, and operational excellence at scale.

Why Join Us
  • Career Growth: Work alongside industry experts on cutting-edge cloud technologies
  • Competitive Compensation and Benefits: We recognize and reward top talent
  • Exciting, Impactful Work: Design and build scalable, resilient cloud environments
  • Strategic Platform Role: Contribute to the foundation of next-gen observability and reliability infrastructure
What You Will Do
  • Design and Optimize Cloud Infrastructure: Architect scalable, fault-tolerant systems on Microsoft Azure
  • Automate Everything: Use Terraform, Ansible, and GitHub Actions to streamline deployment and configuration
  • Ensure Reliability and Performance: Proactively monitor, troubleshoot, and resolve production issues using Prometheus, Grafana, and Azure Monitor
  • Enhance Security and Compliance: Implement security best practices across DevOps workflows
  • Collaborate and Innovate: Work closely with engineering, security, and operations teams to drive automation and efficiency
  • Manage and Scale ELK Clusters: Operate large clusters handling 2–3+ TB/day of log volume, ensuring high availability and performance
  • Optimize ELK Architecture: Implement efficient index lifecycle policies, shard strategies, and hot-warm-cold tiered storage
  • Build and Tune Log Pipelines: Scale Logstash and Beats pipelines across distributed environments
  • Support Kibana Observability Layers: Create dashboards, visualizations, and custom alerting frameworks (e.g., Watcher, ElastAlert)
What You Bring
  • 7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering
  • 4+ years of dedicated, hands-on experience with ELK (Elasticsearch, Logstash, Kibana)
  • Strong experience managing large-scale ELK clusters in production with heavy ingestion (multi-TB/day)
  • Deep knowledge of index tuning, shard allocation, ILM policies, and scaling ELK components
  • Expertise in GitHub Actions, Terraform, Ansible, and Infrastructure as Code (IaC)
  • Proficiency in Python, Go, or Bash for automation and scripting
  • Deep understanding of Kubernetes, Docker, and cloud-native architectures
  • Experience with observability tools such as Prometheus, Grafana, and Azure Monitor
  • Ability to work in a fast-paced, collaborative environment and solve complex operational issues
Education
  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
Certifications (Nice to Have)
  • Microsoft Azure certifications: AZ-104, AZ-400

