Site Reliability Engineer
4 weeks ago
About the Role
The Site Reliability Engineer (SRE) is a seasoned subject matter expert responsible for ensuring the reliability, availability, and performance of company systems and infrastructure.
This SRE works closely with development teams, operations teams, and other stakeholders to enhance system resiliency, automate processes, and improve overall system reliability.
Key Responsibilities
- Designs and architects resilient and scalable systems, ensuring high availability, fault tolerance, and efficient resource utilization.
- Establishes and maintains robust Observability solutions to proactively detect system issues, performance bottlenecks, and security vulnerabilities.
- Continuously analyses system performance, identifies bottlenecks, and implements optimizations to improve system scalability, responsiveness, and resource efficiency.
- Leads capacity planning efforts, analyses system resource utilization, and forecasts future needs to ensure adequate scalability and optimal resource allocation.
- Delivers implementations or custom-scoped technical solutions to LogicMonitor customers in line with customer requirements and signed SOWs.
- As Solutions Architect, is responsible for architecting and successfully delivering LogicMonitor-based Full Stack Observability (FSO) solutions.
- Duties range from crafting advanced LogicMonitor configurations and leading discovery, design, and deployment working sessions with customers to relaying product features and improvements to customers' CIO/CTO teams.
- Is considered a subject matter expert on all LogicMonitor-based FSO solutions.
- Act as the subject matter expert for CMDB integrations.
- Assist the Solution Architect where required in scoping CMDB integration projects.
- Guide customers on best practices and on how to leverage CMDB integrations in efficient, scalable solutions.
- Attend remote working sessions with customers to drive successful FSO adoption through discovery, design, and deployment of the NTT Managed Services Platform.
- Identify gaps, feature requests, or issues with ongoing solutions and escalate them to the LM product and development teams.
- Assist in developing customer-specific scripted solutions using LogicMonitor product features (Websites, LogicModules, NetScans) and externally using the REST API.
- Occasionally assist Monitoring Engineering with LogicModule development.
- Provides technical leadership and mentorship to junior team members.
- Fosters a collaborative and inclusive work environment, drives cross-functional initiatives, and facilitates knowledge sharing and continuous learning across the organization.
- Stays updated with industry trends, emerging technologies, and best practices to drive innovation and improve overall system performance.
Requirements
- Advanced technical expertise in Linux/Unix systems, networking, and system administration.
- Advanced proficiency in scripting or programming languages, such as Python, Go, Java, or Ruby.
- Advanced knowledge of cloud platforms (such as AWS, Azure, or Google Cloud) and associated services.
- Proven advanced expertise in performance monitoring, optimization, and troubleshooting using tools such as Prometheus, Grafana, or New Relic.
- Advanced expertise in incident management, root cause analysis, and post-incident reviews.
- Excellent problem-solving and analytical skills, with keen attention to detail.
- Excellent communication, collaboration, and leadership skills.
- Advanced ability to optimize system performance, scalability, and reliability. Experience with performance monitoring and tuning tools (for example, Prometheus, Grafana, or New Relic) to identify bottlenecks, analyse performance data, and implement optimization strategies.
- Advanced understanding of security principles, best practices, and compliance requirements. Experience in designing and implementing security controls, performing security assessments, and ensuring compliance with industry standards.
- Willingness to travel (20-25%).
About NTT DATA
NTT DATA is a $30+ billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. We invest over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure, and connectivity. We are also one of the leading providers of digital and AI infrastructure in the world. NTT DATA is part of NTT Group and headquartered in Tokyo.
Equal Opportunity Employer
NTT DATA is proud to be an Equal Opportunity Employer with a global culture that embraces diversity. We are committed to providing an environment free of unfair discrimination and harassment. We do not discriminate based on age, race, colour, gender, sexual orientation, religion, nationality, disability, pregnancy, marital status, veteran status, or any other protected category. Join our growing global team and accelerate your career with us. Apply today.
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India SID Global Solutions Full time. Site Reliability Engineer. At SID Global Solutions, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our cloud-based systems. Key Responsibilities: Design, implement, and maintain scalable and highly available cloud...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India Virtusa Full time. Job Title: SRE DevOps AWS. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Virtusa. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining reliable and scalable infrastructure solutions to support our applications and services. Key Responsibilities: Design and implement robust...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India SINGLE POINT TECHNOLOGIES PRIVATE LIMITED Full time. Job Title: Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our team at Single Point Technologies Private Limited. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and security of our cloud-based product suite. Key Responsibilities: Design and implement...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India Crox Consulting Inc Full time. Site Reliability Engineer. Job Summary: Crox Consulting Inc is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based SaaS environment. Key Responsibilities: Design and implement automation and software solutions...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India Tata Consultancy Services Full time. Job Title: Site Reliability Engineer. Tata Consultancy Services is a global leader in the technology arena, and we're looking for a skilled Site Reliability Engineer to join our team. Key Responsibilities: Design, develop, and test Java applications using standard frameworks and tools. Analyze and resolve application issues in collaboration with team...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India SID Global Solutions Full time. Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at SID Global Solutions. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using GCP, AWS/Azure, and Kubernetes. Develop and maintain CI/CD pipelines using Jenkins, GitLab CI, and Docker. Collaborate with...
-
Site Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India RealPage, Inc. Full time. Job Summary: RealPage, Inc. is seeking a highly skilled Site Reliability Engineer to join our SRE & Systems team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our multiple open-source application environments. Key Responsibilities: Provision, de-provision, and support multiple open-source application...
-
Site Reliability Engineering Manager
4 weeks ago
Hyderabad, Telangana, India Quest Diagnostics Full time. Job Title: Site Reliability Engineering Manager. We are seeking a highly skilled Site Reliability Engineering Manager to join our team at Quest Diagnostics. As a Site Reliability Engineering Manager, you will be responsible for leading a team of Site Reliability Engineers in designing, implementing, and maintaining scalable and reliable systems. Key...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India Experian Full time. Job Title: Site Reliability Engineer. Job Summary: Experian is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and scalability of our AWS platform. Key Responsibilities: Optimize microservice and serverless processes on robust distributed...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India Zelis Full time. Job Title: Site Reliability Engineer. Zelis is seeking a highly skilled Site Reliability Engineer to join our Engineering team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Gather and analyze metrics from operating systems and...
-
Site Reliability Engineering Manager
3 weeks ago
Hyderabad, Telangana, India Quest Diagnostics Full time. Job Summary: We are seeking a highly skilled Site Reliability Engineering Manager to join our team at Quest Diagnostics. As a Site Reliability Engineering Manager, you will be responsible for leading a team of Site Reliability Engineers in designing, implementing, and maintaining reliable and scalable systems. Key Responsibilities: Lead and manage a team of Site...
-
Manager - Site Reliability Engineering
4 weeks ago
Hyderabad, Telangana, India Live Connections Full time. We are looking for a Manager - Site Reliability Engineering in Hyderabad. Roles and Responsibilities: The position will manage 5 to 10 engineers, both directly and indirectly. The engineers will include Site Reliability Engineers, Observability Engineers, Performance Engineers, DevSecOps Engineers, and others. These individuals will vary from entry level to senior...
-
Site Reliability Engineering Manager
4 weeks ago
Hyderabad, Telangana, India Quest Diagnostics Full time. Job Title: Site Reliability Engineering Manager. Quest Diagnostics is seeking a highly skilled Site Reliability Engineering Manager to lead our team of engineers in delivering high-quality, reliable, and scalable systems. Key Responsibilities: Lead and manage a team of Site Reliability Engineers, providing mentorship, guidance, and support to ensure the team's...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India FactSet Full time. Job Title: Lead Site Reliability Engineer. At FactSet, we're seeking a highly skilled Lead Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing, implementing, and maintaining highly available and scalable architectures for our applications and infrastructure. Key...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India FactSet Full time. Job Summary: We are seeking a skilled Site Reliability Engineer to join our team at FactSet. The ideal candidate will have a strong background in designing, implementing, and maintaining highly available and scalable architectures for our applications and infrastructure. Key Responsibilities: Collaborate with cross-functional teams to define, review, and...
-
Site Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India Virtusa Full time. Job Summary: Virtusa is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining reliable and scalable infrastructure solutions to support our applications and services. Key Responsibilities: Design and implement robust monitoring and alerting systems to...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India RiskInsight Consulting Pvt Ltd Full time. Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at RiskInsight Consulting Pvt Ltd. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our banking applications and infrastructure. Key Responsibilities: Manage a 24/7 production support team in the Banking...
-
Site Reliability Engineer
4 weeks ago
Hyderabad, Telangana, India Tata Consultancy Services Full time. About the Role: Tata Consultancy Services is a global leader in the technology arena, and we're looking for talented individuals to join our team. As a Site Reliability Engineer, you'll play a crucial role in ensuring the stability and performance of our applications. Key Responsibilities: Design, develop, and test Java applications using standard frameworks and...