Current jobs related to Cloud-Scale Reliability Engineering Leader - Hyderabad, Telangana - Splunk Inc


  • Hyderabad, Telangana, India Zyoin Full time

    **Job Overview** Zyoin is seeking a highly experienced Cloud-Native Reliability Engineer Leader to join our team in Hyderabad, India. The successful candidate will lead the development of scalable, secure, and efficient cloud-native applications. About Zyoin: Zyoin is a fast-growing technology company based in Denver, Colorado, founded in 2014. We specialize in...


  • Hyderabad, Telangana, India FedEx ACC Full time

    About FedEx ACC India: We are a strategic technology division for FedEx, focusing on developing innovative solutions to enhance productivity and minimize expenses globally. Our mission is to provide outstanding customer experiences. A Cloud Reliability Engineer (CRE) combines software engineering and cloud capabilities to ensure the scalability, performance,...


  • Hyderabad, Telangana, India Celigo Full time

    **Job Description:** Cloud Infrastructure Leader. Join Celigo as a Cloud Infrastructure Leader and take charge of designing and implementing scalable, secure, and efficient infrastructure systems. As a key member of our engineering team, you will be responsible for driving the adoption of cloud-native technologies and ensuring high availability and scalability...


  • Hyderabad, Telangana, India Truetech Full time

    About the Role: We are seeking a highly experienced Site Reliability Engineering (SRE) leader to manage our team of SREs. The successful candidate will be responsible for providing mentorship, guidance, and support to ensure the team's success. Key Responsibilities: Develop and implement strategies for improving system reliability, scalability, and...


  • Hyderabad, Telangana, India Talent500 Full time

    About Talent500: We're a company that drives innovation and excellence in the travel industry. Our teams tackle complex challenges, pioneer cutting-edge technologies, and redefine the travel experience. We offer unique opportunities for engineers to solve real-world problems on a grand scale. Join our dynamic tech-driven environment where creativity and...


  • Hyderabad, Telangana, India FedEx ACC Full time

    About FedEx ACC: We are a leading company in the logistics industry, known for our reliability and efficiency. Salary Range: $120,000 - $180,000 per year. Job Description: A Cloud Systems Reliability Specialist is responsible for ensuring the scalability, performance, and reliability of large-scale cloud-based applications. They combine software engineering...


  • Hyderabad, Telangana, India ANSR Full time

    Job Overview: We are seeking a skilled Cloud Engineering Leader to spearhead the development of scalable solutions at ANSR. As a key member of our team, you will be responsible for designing, building, and maintaining complex cloud-based systems that meet the highest standards of quality and reliability. Estimated Salary: The estimated salary range for this...


  • Hyderabad, Telangana, India Talent500 Full time

    We are seeking a seasoned Cloud Engineering Senior Leader to join Talent500's Core Tech Engg & Solutions (CTES) team. This role requires strong technology leadership with hands-on digital experience. About the Role: The ideal candidate will be responsible for enriching a Cloud Practice within CTES, accelerating our Cloud journey for claim platforms by...


  • Hyderabad, Telangana, India Talent500 Full time

    About Talent500: Talent500 is a leading provider of innovative technology solutions, with a relentless drive for excellence and innovation. Our team of experts shapes the future of travel by tackling complex challenges and pioneering cutting-edge technologies. As a Digital Reliability Engineer, you will play a vital role in shaping the future of travel by...


  • Hyderabad, Telangana, India Manuh Technologies Full time

    Job Description. Key Responsibilities: Lead the design and implementation of large-scale data processing systems on AWS. Collaborate with development teams to develop, test, and deploy cloud-based data platforms. Develop and maintain CI/CD pipelines using CloudFormation and Jenkins. Analyze data using SQL stored procedures and implement data ingestion pipelines in...


  • Hyderabad, Telangana, India Talent500 Full time

    About the Role: We are seeking an experienced Cloud Reliability Engineering Specialist to join our team at FedEx ACC. As a Cloud Reliability Engineer, you will play a critical role in ensuring the scalability, performance, and reliability of our cloud-based applications.

  • Cloud Engineer

    3 weeks ago


    Hyderabad, Telangana, India Tanla Platforms Limited Full time

    Company Overview: Tanla Platforms Limited is a rapidly growing company in the telecom and CPaaS space, with a mission to safeguard its assets, data, and reputation in the industry. Our team is passionate about delivering innovative solutions that make a real impact. Salary: ₹1,200,000 - ₹2,500,000 per annum, depending on experience. Job Description: We are...


  • Hyderabad, Telangana, India raptorX Full time

    We are seeking a highly skilled Cloud-Scale Software Engineer to join our team at RaptorX. As a key member of our engineering team, you will design and optimize server-side systems for rapid data ingestion, fraud detection logic, and real-time scoring. You will work on building APIs and distributed systems that support real-time and batch processing of...


  • Hyderabad, Telangana, India FedEx ACC Full time

    About FedEx ACC India: As a strategic technology division, we develop innovative solutions for customers and team members worldwide. Our goal is to enhance productivity, minimize expenses, and update our technology infrastructure to deliver exceptional customer experiences. A Site Reliability Engineer (SRE) combines software engineering and Cloud capabilities...


  • Hyderabad, Telangana, India Oracle Full time

    Job Description: We are seeking a skilled Cloud Engineering Leader to join our team at Oracle. As a key member of our SRE division, you will play a crucial role in defining and evolving standard practices and procedures. Key Responsibilities: Define and develop software for tasks associated with developing, designing, and debugging software applications or...


  • Hyderabad, Telangana, India WS Audiology Full time

    About the Role: We are seeking an experienced Site Reliability Engineer to join our team, with a focus on monitoring, alerting, and infrastructure stability. This role primarily involves maintaining the reliability and performance of our systems hosted in Azure Cloud. As a core member of our Site Reliability Engineering (SRE) team, you will use tools such as...

  • Cloud Engineer Leader

    2 weeks ago


    Hyderabad, Telangana, India RGS Outsourcing Full time

    **Your Opportunity** We have an exciting opportunity for a Cloud Engineer Leader to join our team at RGS Outsourcing in Hyderabad. As a Principal Engineer, you will be responsible for leading the design and implementation of complex features and architectural improvements to our products.

  • Data Engineer Leader

    1 month ago


    Hyderabad, Telangana, India Talent500 Full time

    **About the Role** Talent500 is seeking an experienced Data Engineer Leader to join our team. As a key member of our analytics solutions group, you will design, implement, and support data engineering solutions that provide insights for better decision-making. We are looking for a technical leader with expertise in IBM Cloud products and services, including...


  • Hyderabad, Telangana, India ValueLabs LLP Full time

    About the Job: We are looking for an experienced Cloud-Scale .NET Developer to join our team in Hyderabad. The successful candidate will be responsible for designing, developing, and deploying scalable microservices architecture using .NET Core and containerizing applications with Docker and Kubernetes. Additionally, the developer will ensure cloud computing...


  • Hyderabad, Telangana, India Grid Dynamics Full time

    About the Role: Grid Dynamics is seeking a highly skilled Cloud-Scale Java Software Architect to join our team. In this role, you will be responsible for designing and building technical solutions that are built for quality, scale, and performance. Key Responsibilities: Collaborate with business stakeholders, product management, and PMO on product roadmaps and...

Cloud-Scale Reliability Engineering Leader

1 month ago


Hyderabad, Telangana, India Splunk Inc Full time

We are committed to our work, our customers, having fun, and, most significantly, to each other's success.

About the Role

As a Cloud-Scale Reliability Engineering Leader at Splunk Inc., you will help us run one of the largest and most sophisticated cloud-scale, big data, and microservices platforms in the world.

Key Responsibilities
  • Set technical direction and get consensus from internal and external partners.
  • Develop new processes to make the team more efficient and effective.
  • Collaborate with other team leaders to orchestrate large system changes.
  • Spend a significant amount of time on technical leadership activities in addition to hands-on technical work.
  • Design new services, tools, and monitoring to be implemented by the entire team.
  • Analyze the tradeoffs of the proposed design and make recommendations based on these tradeoffs.
  • Mentor new engineers to achieve more than they thought possible.
Work on reliability projects, including:
  • HA, Business Continuity Planning, disaster recovery, backup/restore, RTO, RPO.
  • Chaos engineering.
  • Application uptime and performance.
  • Capacity management & planning.
  • SLIs, SLOs, error budgets, and monitoring dashboards.
  • Deployment and operations of large-scale distributed data stores and streaming services.
  • Establishing design patterns for monitoring and benchmarking.
  • Establishing and documenting production run books and guidelines for developers.
  • Tooling, toil reduction, runbooks & automation to handle production environments.
  • Incident management and improving MTTD/MTTR for services.
  • Cloud cost optimization.
Requirements
  • 10+ years of SRE experience in handling large-scale cloud-native microservices platforms.
  • 4+ years of strong hands-on experience deploying, handling, and monitoring large-scale Kubernetes clusters in the public cloud, specifically AWS or GCP.
  • Experience with infrastructure automation and scripting using Python and/or bash scripting.
  • Strong hands-on experience with monitoring tools such as Splunk, Prometheus, Grafana, and the ELK stack to build observability for large-scale microservices deployments.
  • Experience with deployment, operations, and performance management of one or more large-scale clusters such as Cassandra, Kafka, Elasticsearch, MongoDB, ZooKeeper, or Redis.
  • Experience leading large-scale technical initiatives across multiple teams.
  • Excellent problem-solving, triaging, and debugging skills in large-scale distributed systems.
Preferred Qualifications
  • AWS Solutions Architect certification.
  • Confluent Certified Administrator for Apache Kafka and/or Apache Cassandra Administrator Associate certification.
  • Experience with Infrastructure-as-Code using Terraform, CloudFormation, Google Deployment Manager, Pulumi, Packer, ARM, etc.
  • Experience with CI/CD frameworks and Pipeline-as-Code such as Jenkins, Spinnaker, GitLab, Argo, Artifactory, etc.
  • Experience with one or more security/compliance frameworks such as SOC2, PCI, and/or FedRAMP.
  • Proven skills to effectively work across teams and functions to influence the design, operations, and deployment of highly available software.

Bachelor's/Master's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.

The estimated salary for this position is $180,000 - $250,000 per year, depending on location and experience.