Site Reliability Engineer

4 weeks ago


Hyderabad, Telangana, India NTT Full time

About the Role

The Site Reliability Engineer (SRE) is a seasoned subject matter expert responsible for ensuring the reliability, availability, and performance of company systems and infrastructure.

This SRE works closely with development teams, operations teams, and other stakeholders to enhance system resiliency, automate processes, and improve overall system reliability.

Key Responsibilities

  • Designs and architects resilient and scalable systems, ensuring high availability, fault tolerance, and efficient resource utilization.
  • Establishes and maintains robust Observability solutions to proactively detect system issues, performance bottlenecks, and security vulnerabilities.
  • Continuously analyses system performance, identifies bottlenecks, and implements optimizations to improve system scalability, responsiveness, and resource efficiency.
  • Leads capacity planning efforts, analyses system resource utilization, and forecasts future needs to ensure adequate scalability and optimal resource allocation.
  • Delivers implementations or custom-scoped technical solutions to LogicMonitor customers in line with customer requirements and signed SOWs.
  • As Solutions Architect, is responsible for architecting and successfully delivering LogicMonitor-based Full Stack Observability (FSO) solutions.
  • Duties range from crafting advanced LogicMonitor configurations and leading discovery, design, and deployment working sessions with customers to relaying product features and improvements to customers' CIO/CTO teams.
  • Is regarded as a subject matter expert on all LogicMonitor-based FSO solutions.
  • Acts as the subject matter expert for CMDB integrations.
  • Assists the Solution Architect where required in scoping CMDB integration projects.
  • Guides customers on best practices and on how to leverage CMDB integrations in efficient, scalable solutions.
  • Attends remote working sessions with customers to drive successful FSO adoption through discovery, design, and deployment of the NTT Managed Services Platform.
  • Identifies gaps, feature requests, or issues with ongoing solutions and escalates them to the LogicMonitor product and development teams.
  • Assists in developing customer-specific scripted solutions using LogicMonitor product features (Websites, LogicModules, NetScans) and, externally, the REST API.
  • Occasionally assists Monitoring Engineering with LogicModule development.
  • Provides technical leadership and mentorship to junior team members.
  • Fosters a collaborative and inclusive work environment, drives cross-functional initiatives, and facilitates knowledge sharing and continuous learning across the organization.
  • Stays updated with industry trends, emerging technologies, and best practices to drive innovation and improve overall system performance.

Requirements

  • Advanced technical expertise in Linux/Unix systems, networking, and system administration.
  • Advanced proficiency in scripting or programming languages, such as Python, Go, Java, or Ruby.
  • Advanced knowledge of cloud platforms (such as AWS, Azure, or Google Cloud) and associated services.
  • Proven advanced expertise in performance monitoring, optimization, and troubleshooting using tools such as Prometheus, Grafana, or New Relic.
  • Advanced expertise in incident management, root cause analysis, and post-incident reviews.
  • Excellent problem-solving and analytical skills, with keen attention to detail.
  • Excellent communication, collaboration, and leadership skills.
  • Advanced ability to optimize system performance, scalability, and reliability. Experience with performance monitoring and tuning tools (for example, Prometheus, Grafana, or New Relic) to identify bottlenecks, analyse performance data, and implement optimization strategies.
  • Advanced understanding of security principles, best practices, and compliance requirements. Experience in designing and implementing security controls, performing security assessments, and ensuring compliance with industry standards.
  • Willingness to travel (20-25%).

About NTT DATA

NTT DATA is a $30+ billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. We invest over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure, and connectivity. We are also one of the leading providers of digital and AI infrastructure in the world. NTT DATA is part of NTT Group and headquartered in Tokyo.

Equal Opportunity Employer

NTT DATA is proud to be an Equal Opportunity Employer with a global culture that embraces diversity. We are committed to providing an environment free of unfair discrimination and harassment. We do not discriminate based on age, race, colour, gender, sexual orientation, religion, nationality, disability, pregnancy, marital status, veteran status, or any other protected category. Join our growing global team and accelerate your career with us. Apply today.



  • Hyderabad, Telangana, India SID Global Solutions Full time

    Site Reliability Engineer. At SID Global Solutions, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our cloud-based systems. Key Responsibilities: Design, implement, and maintain scalable and highly available cloud...


  • Hyderabad, Telangana, India Virtusa Full time

    Job Title: SRE DevOps AWS. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Virtusa. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining reliable and scalable infrastructure solutions to support our applications and services. Key Responsibilities: Design and implement robust...


  • Hyderabad, Telangana, India SINGLE POINT TECHNOLOGIES PRIVATE LIMITED Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our team at Single Point Technologies Private Limited. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and security of our cloud-based product suite. Key Responsibilities: Design and implement...


  • Hyderabad, Telangana, India Crox Consulting Inc Full time

    Site Reliability Engineer. Job Summary: Crox Consulting Inc is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based SaaS environment. Key Responsibilities: Design and implement automation and software solutions...


  • Hyderabad, Telangana, India Tata Consultancy Services Full time

    Job Title: Site Reliability Engineer. Tata Consultancy Services is a global leader in the technology arena, and we're looking for a skilled Site Reliability Engineer to join our team. Key Responsibilities: Design, develop, and test Java applications using standard frameworks and tools. Analyze and resolve application issues in collaboration with team...


  • Hyderabad, Telangana, India SID Global Solutions Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at SID Global Solutions. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using GCP, AWS/Azure, and Kubernetes. Develop and maintain CI/CD pipelines using Jenkins, GitLab CI, and Docker. Collaborate with...


  • Hyderabad, Telangana, India RealPage, Inc. Full time

    Job Summary: RealPage, Inc. is seeking a highly skilled Site Reliability Engineer to join our SRE & Systems team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our multiple open-source application environments. Key Responsibilities: Provision, de-provision, and support multiple open-source application...


  • Hyderabad, Telangana, India Quest Diagnostics Full time

    Job Title: Site Reliability Engineering Manager. We are seeking a highly skilled Site Reliability Engineering Manager to join our team at Quest Diagnostics. As a Site Reliability Engineering Manager, you will be responsible for leading a team of Site Reliability Engineers in designing, implementing, and maintaining scalable and reliable systems. Key...


  • Hyderabad, Telangana, India Experian Full time

    Job Title: Site Reliability Engineer. Job Summary: Experian is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and scalability of our AWS platform. Key Responsibilities: Optimize microservice and serverless processes on robust distributed...


  • Hyderabad, Telangana, India Zelis Full time

    Job Title: Site Reliability Engineer. Zelis is seeking a highly skilled Site Reliability Engineer to join our Engineering team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Gather and analyze metrics from operating systems and...


  • Hyderabad, Telangana, India Quest Diagnostics Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineering Manager to join our team at Quest Diagnostics. As a Site Reliability Engineering Manager, you will be responsible for leading a team of Site Reliability Engineers in designing, implementing, and maintaining reliable and scalable systems. Key Responsibilities: Lead and manage a team of Site...


  • Hyderabad, Telangana, India Live Connections Full time

    We are looking for a Manager Site Reliability Engineer in Hyderabad. Roles and Responsibilities: The position will manage 5 to 10 engineers, both directly and indirectly. The engineers will include Site Reliability Engineers, Observability Engineers, Performance Engineers, DevSecOps Engineers, and others. These individuals will vary from entry level to senior...


  • Hyderabad, Telangana, India Quest Diagnostics Full time

    Job Title: Site Reliability Engineering Manager. Quest Diagnostics is seeking a highly skilled Site Reliability Engineering Manager to lead our team of engineers in delivering high-quality, reliable, and scalable systems. Key Responsibilities: Lead and manage a team of Site Reliability Engineers, providing mentorship, guidance, and support to ensure the team's...


  • Hyderabad, Telangana, India FactSet Full time

    Job Title: Lead Site Reliability Engineer. At FactSet, we're seeking a highly skilled Lead Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing, implementing, and maintaining highly available and scalable architectures for our applications and infrastructure. Key...


  • Hyderabad, Telangana, India FactSet Full time

    Job Summary: We are seeking a skilled Site Reliability Engineer to join our team at FactSet. The ideal candidate will have a strong background in designing, implementing, and maintaining highly available and scalable architectures for our applications and infrastructure. Key Responsibilities: Collaborate with cross-functional teams to define, review, and...


  • Hyderabad, Telangana, India Virtusa Full time

    Job Summary: Virtusa is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining reliable and scalable infrastructure solutions to support our applications and services. Key Responsibilities: Design and implement robust monitoring and alerting systems to...


  • Hyderabad, Telangana, India RiskInsight Consulting Pvt Ltd Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at RiskInsight Consulting Pvt Ltd. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our banking applications and infrastructure. Key Responsibilities: Manage a 24/7 production support team in the Banking...


  • Hyderabad, Telangana, India Tata Consultancy Services Full time

    About the Role: Tata Consultancy Services is a global leader in the technology arena, and we're looking for talented individuals to join our team. As a Site Reliability Engineer, you'll play a crucial role in ensuring the stability and performance of our applications. Key Responsibilities: Design, develop, and test Java applications using standard frameworks and...