Linux Site Reliability Architect

3 weeks ago


Hyderabad, Telangana, India Pythian Full time

Job Overview

We are seeking a talented Site Reliability Engineering Leader to join our team at Pythian. As a Site Reliability Engineering Leader, you will be responsible for designing, implementing, and maintaining scalable and reliable systems to meet the needs of our clients.

Key Responsibilities

  • Operate, maintain, and administer solutions contributing to customer infrastructure's operational efficiency, availability, and visibility.
  • Plan and execute maintenance activities, design documentation, and standard procedures.
  • Provide Root Cause Analysis reports for outages/incidents (ITIL - Problem Management).
  • Observe and provide feedback on the current state of the client's infrastructure, and identify opportunities to improve resiliency, reduce incident occurrence, and automate repetitive administrative and operational tasks.
  • Contribute to, improve, and maintain team documentation about client systems and infrastructure, procedures, policies, and schedules.
  • Gather and document information about client environments through audit activities, and analyze the information to identify opportunities for improvement and application of best practices.
  • Work collaboratively with teammates to contribute to the continuous improvement of our working culture.
  • Act as a technology leader for clients, as well as drive client discussions on technology road maps.
  • Participate in an on-call rotation in an escalation capacity.

Requirements

  • Experience working with Google Cloud and AWS (including infrastructure-as-code deployment with CloudFormation, Terraform, OpsWorks, etc.)
  • Scripting and automation of administrative tasks using Python and Scala is a must
  • Solid understanding of microservices architecture and container technologies (Kubernetes is a must; Docker, LXC, etc.)
  • Clear understanding of software development lifecycles and best practices from an infrastructure point of view (PRs, merge, rebase, etc.)
  • Understanding of the end-to-end operation of a 'Business System' as opposed to its individual components
  • Comprehensive systems hardware and network troubleshooting experience
  • Common Linux distribution platform installation, configuration, performance tuning, and cloud migration.
  • TCP/IP networking, NIC bonding, and network services configuration (DNS, NTP, DHCP, SMTP, etc.)
  • Operation and administration of virtual infrastructure, including experience with at least one hypervisor (VMware, Hyper-V, KVM, etc.)
  • Ability to describe IaaS, PaaS, SaaS, pros and cons of each, use cases for virtualization and cloud
  • Administration of web servers and supporting technologies, including network load balancers
  • Experience with the design, development, and deployment of Puppet
  • System and application error investigation, troubleshooting of access/availability issues including deep multi-system root cause analysis
  • Experience managing networking devices, such as switches and firewalls from a variety of vendors
  • Solid understanding of DevOps tools, processes, and culture
  • Ability to pick up new technologies quickly
  • Ability to provide accurate work scheduling and task estimations for work delivery

What We Offer

  • A competitive total rewards package, including a generous salary and benefits
  • The opportunity to work with a talented team of professionals and contribute to the growth and success of the company
  • Professional development opportunities, including training and certification programs
  • Flexible work arrangements, including remote work options
  • A collaborative and dynamic work environment

How to Apply

If you are a motivated and talented individual who is passionate about site reliability engineering, please submit your application through our website. We thank all applicants for their interest; however, only those selected for an interview will be contacted.



  • Hyderabad, Telangana, India GeekBull Consulting Full time

    We are seeking a highly skilled Senior Site Reliability Engineer to join our team at GeekBull Consulting in Hyderabad. This is a Contract-to-Hire (C2H) opportunity with a duration of 6 months. About the Role: As a Senior Site Reliability Engineer, you will be responsible for designing, developing, and maintaining infrastructure through popular Infrastructure as...


  • Hyderabad, Telangana, India Pythian Full time

    Job Summary: We are seeking a highly skilled Linux Site Reliability Engineer to join our team at Pythian. As a key member of our Site Reliability Engineering team, you will be responsible for designing, implementing, and maintaining scalable and highly available cloud infrastructure solutions. Key Responsibilities: Operate, maintain, and administer solutions...


  • Hyderabad, Telangana, India Thomson Reuters Full time

    About the Role: In this opportunity as Site Reliability Engineer, you will be responsible for overseeing the operational aspects of cloud-based systems, ensuring their efficiency, reliability, and scalability. Key responsibilities include managing change and problem management, application and configuration management, and production support of strategic...


  • Hyderabad, Telangana, India NTT Full time

    About the Role: The Site Reliability Engineer (SRE) is a seasoned subject matter expert responsible for ensuring the reliability, availability, and performance of company systems and infrastructure. This SRE works closely with development teams, operations teams, and other stakeholders to enhance system resiliency, automate processes, and improve overall system...


  • Hyderabad, Telangana, India Virtusa Full time

    Job Summary: We are seeking a skilled Site Reliability Engineer to join our team at Virtusa. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Responsibilities: Troubleshoot recurring failures and participate in incident triages to minimize downtime and ensure system...


  • Hyderabad, Telangana, India Thomson Reuters Full time

    Site Reliability Engineer Opportunity: Thomson Reuters is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our web-based applications on Windows, Linux, hosted, and cloud platforms. About the Role: This is an exciting opportunity for a talented...


  • Hyderabad, Telangana, India Thomson Reuters Full time

    About the Role: As an AWS Site Reliability Engineer, you will work with application teams to manage and support applications into production. This includes continuous improvement to an ongoing support model, including release and change management for maintaining strategic environments. You will provide well-written documentation and technical presentations on...


  • Hyderabad, Telangana, India RiskInsight Consulting Pvt Ltd Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at RiskInsight Consulting Pvt Ltd. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our banking applications and infrastructure. Key Responsibilities: Manage a 24/7 production support team in the Banking...


  • Hyderabad, Telangana, India Zenoti Full time

    Zenoti is seeking a seasoned Site Reliability Engineering Manager to join our team. As a key member of our engineering organization, you will be responsible for leading the adoption of DevOps practices and architecture across various services in the company. The ideal candidate will be a self-starter with a zeal to own things from start to end with little...


  • Hyderabad, Telangana, India NTT DATA Full time

    About the Role: The Site Reliability Engineer (SRE) position at NTT DATA is a high-level technical role that requires a subject matter expert to lead efforts in ensuring the reliability, scalability, and performance of the company's systems and infrastructure. The ideal candidate will have advanced technical expertise in Linux/Unix systems, networking, and...


  • Hyderabad, Telangana, India Thomson Reuters Full time

    Site Reliability Engineer: Thomson Reuters is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our cloud-based applications on Windows and Linux platforms. About the Role: Work with application teams to manage and support applications into...


  • Hyderabad, Telangana, India Unison Consulting Pte Ltd Full time

    Job Title: Site Reliability Engineer - Cloud Expert. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Unison Consulting Pte Ltd. As a Site Reliability Engineer, you will be responsible for ensuring the high availability and performance of our cloud-based applications. Key Responsibilities: Support Java (J2EE/Spring...


  • Hyderabad, Telangana, India Tata Consultancy Services Full time

    Job Role: Linux Systems Architect. Key Responsibilities: Design and implement Linux-based solutions to meet business goals and technical requirements. Oversee the architecture of Linux systems to ensure security standards and scalability. Develop and maintain robust, scalable, and secure Linux infrastructures or applications. Requirements: Strong experience with...


  • Hyderabad, Telangana, India Tata Consultancy Services Full time

    Job Summary: We are seeking an experienced Linux Systems Architect to join our team at Tata Consultancy Services. As a key member of our infrastructure team, you will be responsible for designing, implementing, and overseeing Linux-based solutions that align with our business goals, security standards, and technical requirements. Key Responsibilities: Design...


  • Hyderabad, Telangana, India RealPage, Inc. Full time

    Job Overview: We are seeking a highly skilled Site Reliability Engineer to join our global team at RealPage, Inc. This is a critical role that requires expertise in provisioning, de-provisioning, and support of multiple open-source application environments. About the Role: This Site Reliability Engineer will report directly to the Sr. Manager - SRE & Systems and...


  • Hyderabad, Telangana, India Conviction HR Full time

    Job Title: Site Reliability Engineer (SRE) - : Hyderabad (5 Days WFO) (Hyderabad Candidates are only preferred.) IMMEDIATE : Conviction HR. Type: Contract-to-Hire (C2H). Job Description: ConvictionHR is seeking a talented Site Reliability Engineer (SRE) to join our growing team. This Contract-to-Hire position is perfect for an individual who is passionate...


  • Hyderabad, Telangana, India RealPage, Inc. Full time

    Job Summary: RealPage, Inc. is seeking a highly skilled Site Reliability Engineer to join our SRE & Systems team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our multiple open-source application environments. Key Responsibilities: Provision, de-provision, and support multiple open-source application...


  • Hyderabad, Telangana, India Experian Full time

    Job Title: Site Reliability Engineer. Job Summary: Experian is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and scalability of our AWS platform. Key Responsibilities: Optimize microservice and serverless processes on robust distributed...


  • Hyderabad, Telangana, India 5100 Kyndryl Solutions Private Limited Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at 5100 Kyndryl Solutions Private Limited. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, resiliency, and innovation of our information systems and ecosystems. Key Responsibilities: Develop and maintain key reliability metrics...

  • DevOps Engineer

    1 month ago


    Hyderabad, Telangana, India DBS Bank Full time

    Job Summary: DBS Bank is seeking a skilled DevOps Engineer to join our team. As an Enterprise Architect Site Reliability Engineering Associate, you will be responsible for improving system performance through environment upgrades and improvements. You will work with engineering and application development teams to deploy, support and monitor existing and new...