Reliable Systems Expert

3 days ago


Pune, Maharashtra, India LTIMindtree Full time
About the Job

We are seeking a highly skilled Reliable Systems Expert to join our team at LTIMindtree. As a key member of our organization, you will be responsible for ensuring the high availability and performance of our mission-critical services.

Key Responsibilities:

  • Engage in the entire lifecycle of services—from inception and design through deployment, operation, and refinement.
  • Improve the end-to-end availability and performance of mission-critical services, and build automation to prevent problem recurrence.
  • Partner with business and technical product owners to set Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets to manage reliability of infrastructure and applications.
  • Scale and optimize existing infrastructure and services sustainably through mechanisms, including automation, and evolve them by improving reliability and efficiency.
  • Maintain infrastructure (infrastructure as code) and services by measuring and monitoring system metrics to proactively identify operational efficiencies, potential outages, and security threats in Development, UAT, Staging, and Production environments.
  • Practice sustainable incident response and blameless postmortems.
  • Build infrastructure and drive projects that deliberately break things, with the aim of improving the robustness of production systems.
  • Preserve operational visibility and response capabilities—fixing and improving our dashboards, alerts, and automation.
  • Maintain operational uptime and reliability by participating in triage and issue support calls for mission-critical systems.
Requirements

To succeed in this role, you must possess the following qualifications:

  • Bachelor's degree in computer science or a related technical field.
  • Strong debugging, troubleshooting, and problem-solving skills.
  • Proficient in Node.js, with familiarity with other scripting languages and tools such as JavaScript, Python, Bash, Maven, and Ansible.
  • Experience with monitoring and alerting systems like Dynatrace, Prometheus, and Grafana.
  • Experience with log and metrics analytics platforms like Sumo Logic and Splunk.
  • Experience setting SLOs/SLIs/error budgets and managing reliability for infrastructure and applications using Kubernetes, AWS Native components, CloudWatch, Dynatrace.
  • Experience handling large numbers of diverse systems with configuration management systems like Puppet, Chef, Ansible.
  • Proven history of leveraging automation.
  • Experience using tools like PagerDuty for managing incidents.
  • Understanding of standard networking protocols and components such as HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI Model, Subnetting, and Load Balancing strategies.
  • Experience in Serverless Application Framework.
  • Experience in containerized workloads and management platforms such as Docker or Kubernetes.
  • Familiarity with distributed systems, including Microservices.
  • Experience in Infrastructure automation tools such as CDK.
  • Understanding of CI/CD processes and experience with deployment automation tools such as AWS CodePipeline, AWS CodeDeploy, Jenkins, and Bamboo.
  • Effective communication, collaboration, and negotiation skills with the ability to interface with various business units and vendors.
  • Experience liaising with developers, operations engineers, and third-party resources.
  • Experience consuming APIs.
Salary Range

The estimated salary range for this position is ₹1,200,000 - ₹2,500,000 per annum, based on your location and qualifications.



  • Pune, Maharashtra, India Neerinfo Solutions Full time

    Job Title: Performance Optimization Expert. We are looking for a highly skilled Performance Optimization Expert to join our team at Neerinfo Solutions. The successful candidate will have extensive experience in application support, microservices architecture, and automation, with expertise in monitoring and dashboarding tools like Splunk and AppDynamics. The...


  • Pune, Maharashtra, India Collabera Full time

    Job Title: Infrastructure Reliability Expert. Salary: $120,000 - $180,000 per year (dependent on experience). We are looking for an experienced Infrastructure Reliability Expert to join our team at Collabera in Pune, India. As a key member of our infrastructure team, you will be responsible for designing and implementing highly available, scalable, and...

  • Reliability Expert

    4 weeks ago


    Pune, Maharashtra, India PubMatic Full time

    Job Description: We are seeking a highly skilled Reliability Expert to join our team at PubMatic. As an SRE Engineer, you will be responsible for ensuring the seamless operation and optimal performance of large-scale distributed software applications. Your role will encompass maintaining a robust and high-performing environment, contributing to the reliability...

  • Reliability Expert

    2 weeks ago


    Pune, Maharashtra, India Synechron Full time

    Job Responsibilities: As a Reliability Expert, you will thrive working in incident response environments, performing post-mortem analysis, and designing and implementing secured solutions. You will also take ownership of initiatives and assets and provide the highest quality customer service. Your expertise in container technology, Docker, and Kubernetes/EKS,...


  • Pune, Maharashtra, India Tata Consultancy Services Full time

    We are a global leader in the technology arena, and we're looking for exceptional talent to join our team. As a Site Reliability Engineer, you will play a crucial role in ensuring the smooth operation of our systems and applications. Job Description: As a Site Reliability Engineer at Tata Consultancy Services, you will be responsible for designing,...


  • Pune, Maharashtra, India Emerson Full time

    About the Role: We are seeking a highly skilled Maintenance and Reliability Expert to join our team at Emerson. As a key member of our maintenance department, you will be responsible for ensuring the smooth operation of our equipment and facilities.


  • Pune, Maharashtra, India SwiftWin Technologies LLP Full time

    Job Title: Site Reliability Automation Expert. About Us: Skyrocket your career with SwiftWin Technologies LLP, a leader in innovative solutions. We offer competitive salaries and benefits to talented professionals like you. Job Overview: We seek a highly skilled Azure DevOps Infrastructure Architect to join our team. As an SRE, you will be responsible for...


  • Pune, Maharashtra, India Virtusa Consulting Services Private Limited Full time

    Company Overview: Virtusa Consulting Services Private Limited is a leading IT consulting firm that delivers cutting-edge technology solutions to its clients. Salary: The estimated salary for this position is $120,000 - $180,000 per year, depending on location and experience. Job Description: We are seeking an experienced Cloud Reliability Operations Expert to join...


  • Pune, Maharashtra, India Growel Softech Pvt. Ltd. Full time

    Job Summary: We are seeking a skilled IT System Reliability Engineer to contribute to the technical analysis, coding, and support of back-office settlement systems within a Cash Equities Settlement domain. The ideal candidate will play a pivotal role in supporting integration applications while ensuring system reliability and efficiency. Key...


  • Pune, Maharashtra, India Fulcrum Digital Full time

    About Fulcrum Digital: We are a dynamic company seeking an experienced Senior Reliability Engineer to join our team. As a key contributor, you will play a pivotal role in ensuring the reliability, scalability, and performance of our critical systems. Our company culture emphasizes collaboration and innovation. You will work closely with development,...


  • Pune, Maharashtra, India One2N Full time

    Job Overview: We're seeking a highly skilled Production Systems Expert to join our team at One2N. This role is ideal for individuals with 2+ years of DevOps/SRE experience who are passionate about building and running reliable software systems in production. About the Role: Primary responsibility will be working with our Startup and mid-size clients to deliver...


  • Pune, Maharashtra, India One2N Full time

    We are seeking a highly skilled Site Reliability Engineer who can help us build and maintain reliable software systems in production. If you have a strong passion for solving complex technical problems, working with our clients to deliver scalable and efficient solutions will be a great fit. About the Role: We are looking for an experienced engineer with 2+...


  • Pune, Maharashtra, India Fulcrum Digital Full time

    About the Role: We are seeking a seasoned Performance Engineering Lead to join Fulcrum Digital. This key contributor will play a pivotal role in ensuring the reliability, scalability, and performance of our critical systems. As a key member of the team, you will work closely with development, operations, and infrastructure teams to identify and resolve issues,...


  • Pune, Maharashtra, India Tata Consultancy Services Full time

    Greetings from Tata Consultancy Services. TCS Walk-in Drive for Infrastructure Experts. We are looking for highly skilled professionals to join our Site Reliability Engineering team. As a key member, you will be responsible for maintaining the stability and performance of our Payment Gateway Services application. Key Responsibilities: Maintain production...


  • Pune, Maharashtra, India Fulcrum Digital Full time

    About the Role: We are seeking an experienced Enterprise System Reliability Engineer to join our team at Fulcrum Digital. In this role, you will be responsible for designing, implementing, and maintaining scalable and reliable systems that meet the needs of our customers. With a strong understanding of system architecture and design principles, you will work...


  • Pune, Maharashtra, India Hansen Technologies Full time

    About the Role: Key Responsibilities. Operation Support: Triage incidents, service restoration, initial resolution, and permanent resolution. End-to-end Incident Management: Collaborate with 3rd party teams to ensure resolution within SLA. Perform root cause analysis. System Reliability: Proactively identify and address system issues to ensure high availability,...


  • Pune, Maharashtra, India One2N Full time

    We're seeking a meticulous engineer to oversee the stability and scalability of our software systems. The ideal candidate will primarily collaborate with clients on One-to-N kind problems, focusing on Proof of Concept development, system maintainability, and reliability. About You: At least 2 years of experience in DevOps/SRE roles. Familiarity with Linux systems...


  • Pune, Maharashtra, India Futran Solutions Full time

    Job Description: Futran Solutions is seeking a highly skilled Senior System Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing, implementing, and maintaining scalable and reliable systems that meet the needs of our customers. Key Responsibilities: Design and implement scalable and reliable...


  • Pune, Maharashtra, India Collabera Full time

    Job Title: Reliability Engineer: Scalable System Architect. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our team at Collabera. In this role, you will focus on designing and building highly reliable, scalable, and efficient systems. Main Responsibilities: Implement SRE best practices to ensure system reliability,...


  • Pune, Maharashtra, India Futran Solutions Full time

    About Futran Solutions: We are a cutting-edge technology company providing innovative solutions to complex problems. Our team of experts is dedicated to delivering high-quality services and products that meet the evolving needs of our clients.