Reliable Systems Expert
3 days ago
We are seeking a highly skilled Reliable Systems Expert to join our team at LTIMindtree. As a key member of our organization, you will be responsible for ensuring the high availability and performance of our mission-critical services.
Key Responsibilities:
- Engage in the entire lifecycle of services—from inception and design through deployment, operation, and refinement.
- Responsible for improving the end-to-end availability and performance of mission-critical services and building automation to prevent problem recurrence.
- Partner with business and technical product owners to set Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets to manage reliability of infrastructure and applications.
- Scale and optimize existing infrastructure and services sustainably through mechanisms such as automation, and evolve them by improving reliability and efficiency.
- Maintain end-to-end availability and performance of mission-critical services and build automation to prevent problem recurrence.
- Maintain infrastructure (infrastructure as code) and services by measuring and monitoring system metrics to proactively identify operational efficiencies, potential outages, and security threats in Development, UAT, Staging, and Production environments.
- Practice sustainable incident response and blameless postmortems.
- Build infrastructure and drive projects that break things with the aim of improving the robustness of production systems.
- Preserve operational visibility and response capabilities—fixing and improving our dashboards, alerts, and automation.
- Maintain operational uptime and reliability by participating in triage and issue support calls for mission-critical systems.
To succeed in this role, you must possess the following qualifications:
- Bachelor's degree in computer science or a related technical field.
- Strong debugging, troubleshooting, and problem-solving skills.
- Proficient in Node.js, with familiarity with other scripting languages and tools such as JavaScript, Python, Bash, Maven, and Ansible.
- Experience with monitoring and alerting systems like Dynatrace, Prometheus, and Grafana.
- Experience with log and metrics analytics platforms like Sumo Logic and Splunk.
- Experience setting SLOs/SLIs/error budgets and managing reliability for infrastructure and applications using Kubernetes, AWS Native components, CloudWatch, Dynatrace.
- Experience handling large numbers of diverse systems with configuration management systems like Puppet, Chef, Ansible.
- Proven history of leveraging automation.
- Experience using tools like PagerDuty for managing incidents.
- Understanding of standard networking protocols and components such as HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI Model, Subnetting, and Load Balancing strategies.
- Experience in Serverless Application Framework.
- Experience in containerized workloads and management platforms such as Docker or Kubernetes.
- Familiarity with distributed systems, including Microservices.
- Experience in Infrastructure automation tools such as CDK.
- Understanding of CI/CD processes and experience with deployment automation tools such as CodePipeline, CodeDeploy, Jenkins, and Bamboo.
- Effective communication, collaboration, and negotiation skills with the ability to interface with various business units and vendors.
- Experience liaising with developers, operations engineers, and third-party resources.
- Experience consuming APIs.
The estimated salary range for this position is ₹1,200,000 - ₹2,500,000 per annum, based on your location and qualifications.
-
Reliable Systems Engineer
2 weeks ago
Pune, Maharashtra, India Neerinfo Solutions Full time
Job Title: Performance Optimization Expert
We are looking for a highly skilled Performance Optimization Expert to join our team at Neerinfo Solutions. The successful candidate will have extensive experience in application support, microservices architecture, and automation, with expertise in monitoring and dashboarding tools like Splunk and AppDynamics. The...
-
Infrastructure Reliability Expert
2 weeks ago
Pune, Maharashtra, India Collabera Full time
**Job Title:** Infrastructure Reliability Expert
**Salary:** $120,000 - $180,000 per year (dependent on experience)
We are looking for an experienced Infrastructure Reliability Expert to join our team at Collabera in Pune, India. As a key member of our infrastructure team, you will be responsible for designing and implementing highly available, scalable, and...
-
Reliability Expert
4 weeks ago
Pune, Maharashtra, India PubMatic Full time
Job Description
We are seeking a highly skilled Reliability Expert to join our team at PubMatic. As an SRE Engineer, you will be responsible for ensuring the seamless operation and optimal performance of large-scale distributed software applications. Your role will encompass maintaining a robust and high-performing environment, contributing to the reliability...
-
Reliability Expert
2 weeks ago
Pune, Maharashtra, India Synechron Full time
**Job Responsibilities**
As a Reliability Expert, you will thrive working in incident response environments, performing post-mortem analysis, and designing and implementing secured solutions. You will also take ownership of initiatives and assets and provide the highest-quality customer service. Your expertise in container technology, Docker, and Kubernetes/EKS,...
-
Reliability Engineering Expert
3 weeks ago
Pune, Maharashtra, India Tata Consultancy Services Full time
We are a global leader in the technology arena, and we're looking for exceptional talent to join our team. As a Site Reliability Engineer, you will play a crucial role in ensuring the smooth operation of our systems and applications.
Job Description
As a Site Reliability Engineer at Tata Consultancy Services, you will be responsible for designing,...
-
Maintenance and Reliability Expert
3 weeks ago
Pune, Maharashtra, India Emerson Full time
About the Role
We are seeking a highly skilled Maintenance and Reliability Expert to join our team at Emerson. As a key member of our maintenance department, you will be responsible for ensuring the smooth operation of our equipment and facilities.
-
Site Reliability Automation Expert
2 weeks ago
Pune, Maharashtra, India SwiftWin Technologies LLP Full time
Job Title: Site Reliability Automation Expert
About Us:
Skyrocket your career with SwiftWin Technologies LLP, a leader in innovative solutions. We offer competitive salaries and benefits to talented professionals like you.
Job Overview:
We seek a highly skilled Azure DevOps Infrastructure Architect to join our team. As an SRE, you will be responsible for...
-
Cloud Reliability Operations Expert
3 weeks ago
Pune, Maharashtra, India Virtusa Consulting Services Private Limited Full time
Company Overview
Virtusa Consulting Services Private Limited is a leading IT consulting firm that delivers cutting-edge technology solutions to its clients.
Salary
The estimated salary for this position is $120,000 - $180,000 per year, depending on location and experience.
Job Description
We are seeking an experienced Cloud Reliability Operations Expert to join...
-
IT System Reliability Engineer
1 month ago
Pune, Maharashtra, India Growel Softech Pvt. Ltd. Full time
Job Summary:
We are seeking a skilled IT System Reliability Engineer to contribute to the technical analysis, coding, and support of back-office settlement systems within a Cash Equities Settlement domain. The ideal candidate will play a pivotal role in supporting integration applications while ensuring system reliability and efficiency.
Key...
-
Reliability Systems Engineer
1 month ago
Pune, Maharashtra, India Fulcrum Digital Full time
About Fulcrum Digital: We are a dynamic company seeking an experienced Senior Reliability Engineer to join our team. As a key contributor, you will play a pivotal role in ensuring the reliability, scalability, and performance of our critical systems. Our company culture emphasizes collaboration and innovation. You will work closely with development,...
-
Production Systems Expert
4 weeks ago
Pune, Maharashtra, India One2N Full time
Job Overview
We're seeking a highly skilled Production Systems Expert to join our team at One2N. This role is ideal for individuals with 2+ years of DevOps/SRE experience who are passionate about building and running reliable software systems in production.
About the Role
Primary responsibility will be working with our startup and mid-size clients to deliver...
-
Reliable Systems Architect
3 weeks ago
Pune, Maharashtra, India One2N Full time
We are seeking a highly skilled Site Reliability Engineer who can help us build and maintain reliable software systems in production. If you have a strong passion for solving complex technical problems, working with our clients to deliver scalable and efficient solutions will be a great fit.
About the Role:
We are looking for an experienced engineer with 2+...
-
Reliability Systems Architect
2 weeks ago
Pune, Maharashtra, India Fulcrum Digital Full time
About the Role:
We are seeking a seasoned Performance Engineering Lead to join Fulcrum Digital. This key contributor will play a pivotal role in ensuring the reliability, scalability, and performance of our critical systems. As a key member of the team, you will work closely with development, operations, and infrastructure teams to identify and resolve issues,...
-
Reliable Systems Engineer
3 weeks ago
Pune, Maharashtra, India Tata Consultancy Services Full time
Greetings from Tata Consultancy Services
TCS Walk-in Drive for Infrastructure Experts
We are looking for highly skilled professionals to join our Site Reliability Engineering team. As a key member, you will be responsible for maintaining the stability and performance of our Payment Gateway Services application.
Key Responsibilities:
Maintain production...
-
Enterprise System Reliability Engineer
2 weeks ago
Pune, Maharashtra, India Fulcrum Digital Full time
About the Role
We are seeking an experienced Enterprise System Reliability Engineer to join our team at Fulcrum Digital. In this role, you will be responsible for designing, implementing, and maintaining scalable and reliable systems that meet the needs of our customers. With a strong understanding of system architecture and design principles, you will work...
-
Reliable Systems Engineer
3 weeks ago
Pune, Maharashtra, India Hansen Technologies Full time
About the Role:
Key Responsibilities
- Operation Support: Triage incidents, service restoration, initial resolution, and permanent resolution.
- End-to-end Incident Management: Collaborate with 3rd party teams to ensure resolution within SLA. Perform root cause analysis.
- System Reliability: Proactively identify and address system issues to ensure high availability,...
-
Reliable Systems Engineer
2 months ago
Pune, Maharashtra, India One2N Full time
We're seeking a meticulous engineer to oversee the stability and scalability of our software systems. The ideal candidate will primarily collaborate with clients on One-to-N kind problems, focusing on Proof of Concept development, system maintainability, and reliability.
About You
- At least 2 years of experience in DevOps/SRE roles
- Familiarity with Linux systems...
-
System Reliability Engineer Lead
2 weeks ago
Pune, Maharashtra, India Futran Solutions Full time
Job Description
Futran Solutions is seeking a highly skilled Senior System Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing, implementing, and maintaining scalable and reliable systems that meet the needs of our customers.
Key Responsibilities
Design and implement scalable and reliable...
-
Reliability Engineer: Scalable System Architect
3 weeks ago
Pune, Maharashtra, India Collabera Full time
Job Title: Reliability Engineer: Scalable System Architect
About the Role:
We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our team at Collabera. In this role, you will focus on designing and building highly reliable, scalable, and efficient systems.
Main Responsibilities:
Implement SRE best practices to ensure system reliability,...
-
Reliable Systems Engineer
1 month ago
Pune, Maharashtra, India Futran Solutions Full time
About Futran Solutions
We are a cutting-edge technology company providing innovative solutions to complex problems. Our team of experts is dedicated to delivering high-quality services and products that meet the evolving needs of our clients.