Lead Site Reliability Engineer

1 month ago


Bangalore, Karnataka, India GetHyr Full time

Job Description :


- Maintain services once they are live by measuring and monitoring availability, latency, and overall system reliability.

- Work closely with team members to ensure best practices and strategic goals are incorporated into development work.

- Collaborate with other engineering teams to identify and anticipate changing requirements and opportunities to improve the development environment.

- Monitor at scale with VictoriaMetrics and similar tools.

- Orchestrate and manage services with Kubernetes (K8s) and similar platforms.

- Implement best practices, challenge the status quo, and keep tabs on industry and technical trends, changes, and developments so the team is always striving for best-in-class work.

- Manage capacity, build security into every layer, and reduce cost.

- Implement secure networking, key management, user management, access management, process management, and image management.

- Lead and manage the team's short- and long-term deliverables, including project planning, coaching, quarterly reviews, participation in the selection process for new hires, and technical and non-technical guidance to the team.

Requirements :

- Proven experience handling large infrastructure and distributed systems such as YARN, Kubernetes, Elasticsearch, and Kafka.

- Familiarity with Python-related technologies and frameworks like Falcon, Django, or Pyramid.

- Experience with Unix/Linux operating system internals and administration (e.g., filesystems, inodes, system calls) or networking (e.g., TCP/IP, routing, network topologies and hardware, SDN).

- Familiarity with cloud computing infrastructure, preferably Azure.

- Familiarity with task queue frameworks or messaging clients like Celery or Pika is a plus.

- Experience with source code management and implementing security best practices.

- Deep understanding of modern software architectures, including load balancing, queueing, caching, microservices, big data technologies, and the general failure modes of distributed systems.

- Know-how in gathering metrics across distributed systems (instances/containers) and generating automated notifications and reports.

- Prowess in analyzing application bottlenecks and performance degradation, and in implementing automated processes and tools to detect such anomalies.

- Good understanding of, and implementation experience with, twelve-factor app principles.

Mandatory Skills :

- 8-10 years of experience on the AWS/Azure platform.

- Excellent programming (Python, Go, Ruby, or preferred scripting languages) and automation skills.

- Deep understanding of container orchestration technologies, specifically Kubernetes.

- Should have had prior experience in migrating high throughput services to Kubernetes.

- Expertise in CI/CD tools, including build, artifact, packaging, and service discovery management tools. GitOps preferred.

- Expertise in centralized logging systems, metrics, and tooling frameworks such as ELK, Prometheus/VictoriaMetrics, and Grafana.

- Great communication, interpersonal, and teamwork skills.

- Experience with AWS/Azure cost explorer, billing analysis, and various cost optimization techniques.

- Awareness of Cloud Security concepts.

- Awareness of Information Security Concepts and Best Practices.

Good to have :

- AWS/Azure cloud certification preferred.

- Certified Kubernetes Administrator (CKA) certification.

- Certified Kubernetes Application Developer (CKAD) certification.

- Experience with configuration management tools and strong code analysis skills in Python.

- Experience in working with APM-based tools like New Relic.

(ref:hirist.tech)

  • Bangalore, Karnataka, India Cyitechsearch Full time

    We are hiring for Site Reliability Engineer. Skills: - Develop and provide operational support for full-stack software applications. - Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation. - Five years' experience as a site reliability engineer or similar role. - Collaborate with development operations staff to...


  • Bangalore, Karnataka, India Ultrabot Innovations Full time

    Position Overview: As a Senior Site Reliability Engineer with 5-8 years of experience, you will play a key role in ensuring the reliability, scalability, and performance of our systems and infrastructure. You will leverage your expertise in Site Reliability Engineering (SRE) to implement best practices and methodologies, effectively troubleshoot complex...


  • Bangalore, Karnataka, India The HRBPs Full time

    Lead Site Reliability Engineer - Bangalore. Exp - 8 to 12 years. Responsibilities: - Collaborating with customer success managers and solutions engineers to bring deep technical expertise in implementing intelligent automation solutions for customers. - Providing customers and solution engineers with ongoing technical support for complex issues and support...


  • Bangalore, Karnataka, India TERRAGIG LLP Full time

    Role: Site Reliability Engineer. Experience: 5+ years. Work Model: Remote / Contract 3 years. Skills: - Develop and provide operational support for full-stack software applications. - Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation. - Five years' experience as a site reliability engineer or similar role. -...

  • Engineering Director

    2 months ago


    Bangalore, Karnataka, India CareerNet Technologies Full time

    Job Description: Site Reliability Engineer (SRE) at Coupang is a mission-critical role that combines software and system engineering to build, run, and scale our complex, large-scale ecommerce systems. As part of the Site Reliability Engineering team, you will be responsible for ensuring all our customer-facing services are healthy, monitored, automated,...


  • Bangalore, Karnataka, India Protoporos Staffing Services Pvt Ltd Full time

    Opportunity with a leading B2B SaaS product client specializing in cutting-edge data integration solutions. Position Overview: We are seeking a highly skilled and experienced Staff Site Reliability Engineer to join our team. As a Staff SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our data integration...


  • Bangalore, Karnataka, India AQUASoft Full time

    AQUASoft is a software development company that specializes in creating custom-made products and software solutions for various clients, including Fortune 500 giants and medium-sized businesses. Our team of highly skilled and experienced software engineers across two continents utilize the latest frameworks and state-of-the-art technologies to build robust,...


  • Bangalore, Karnataka, India SWAI TECHNOLOGIES PRIVATE LIMITED Full time

    Role: Senior Site Reliability Engineer. Exp: 5 to 10 years of experience. Remote opportunity. Company Description: Tech recruitment is broken. Companies say there is a shortage of talent and it's hard to find good developers, while developers find it hard to find companies that value the skill, experience and passion they bring to the table. Quite the...


  • Bangalore, Karnataka, India Prudential Manpower Pvt.lTD Full time

    Position: Site Reliability Engineer. Location: Bangalore. Notice Period: Immediate to 30 days. Minimum Requirements: - 4 years of experience as a Site Reliability Engineer. - Experience with one or more of the following: C++, Java, Python, Go, Perl and/or Ruby etc. - Experience with Unix/Linux operating systems internals and administration or networking. -...


  • Bangalore, Karnataka, India Protoporos Staffing Services Pvt Ltd Full time

    About: Opportunity for the role of Engineering Manager with an Enterprise B2B SaaS product firm providing services/products to Fortune 100 organizations. The ideal candidate must be from a B2B SaaS product organization only. Title: Staff Platform Engineer/Site Reliability Engineer. Mandatory Skills: B2B SaaS Product Development, Java, AWS, 2 years of...


  • Bangalore, Karnataka, India Cyitechsearch Full time

    About the job: We are hiring for Site Reliability Engineer. Experience: 5+ years. Work Model: Remote / Contract 3 years. Skills: - Develop and provide operational support for full-stack software applications. - Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation. - Five years' experience as a site reliability...

