
Site Reliability Engineer

5 months ago


Chennai, India Talent500 Full time
Position Title: Senior Engineer, Site Reliability Engineering

ROLE DESCRIPTION AND SCOPE

Role: As a Senior Site Reliability Engineer at Ford Motor Company, you will play a pivotal role in elevating the performance and dependability of our eCommerce platforms and applications. In this essential position, your responsibilities will include closely collaborating with diverse teams across the organization to fortify our online systems, ensuring they are not only robust and scalable but also equipped to efficiently manage the complexities of a global customer base. Your expertise in site reliability will be crucial in driving ongoing enhancements to our technology landscape. This continuous improvement effort is vital to maintaining Ford’s leadership in innovation within the automotive industry, helping us set standards in digital commerce and customer satisfaction. Your contributions will directly impact the smooth operation and evolutionary growth of our eCommerce capabilities, aligning with Ford's commitment to excellence and innovation.

KEY RESPONSIBILITIES / DELIVERABLES:

As a Site Reliability Engineer, your responsibilities will include:

  • Participating in 24x7 on-call production support rotations and handling incident response to minimize disruptions.
  • Continuously monitoring the availability, reliability, and performance of systems, platforms, and applications, maintaining a holistic view of system health.
  • Regularly reviewing key site technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, and capacity and resource utilization.
  • Providing primary operational and engineering support for multiple large, distributed software applications.
  • Proactively identifying stability risks and working with engineering leadership to establish appropriate mitigation plans.
  • Using automation tools, scripts, and processes to reduce or eliminate repetitive tasks, thereby improving the support provided by Site Reliability Engineering.
  • Creating or modifying Terraform files according to Ford formats to develop new monitoring dashboards and alert policies.
  • Collaborating with engineering and architecture teams to evaluate and identify optimal cloud solutions, focusing on scalability, high performance, and security.
  • Gathering and analyzing metrics from operating systems and applications to assist in performance tuning and fault finding.
  • Measuring and optimizing system performance continuously to exceed customer needs and advance capabilities.
  • Troubleshooting and resolving issues related to full-stack websites, cloud platforms, and infrastructure.
  • Working closely with developers, testers, and business stakeholders to ensure the delivery of high-quality solutions, balancing feature development speed and reliability with well-defined service-level objectives.
  • Ensuring compliance with security and regulatory standards, and implementing and maintaining disaster recovery processes.
  • Providing technical guidance and mentorship to other team members.

These responsibilities ensure the stability, efficiency, and continuous improvement of Ford Motor Company’s eCommerce solutions, aligning with the organization's high standards and innovative approach.

EXPERIENCES / COMPETENCIES:

Education Qualification: Bachelor’s or Equivalent

Number of Years of Experience: 4+ years SRE experience

Leadership Skills and Personality Traits:

  • Ability to work effectively in a remote/virtual work setting with other global team members.
  • Ability to work effectively with cross-functional teams across the organization, both inside and outside the technology and software organization.
  • Ability to dissect problems and explore them from different angles to find the most efficient solutions.
  • Staying composed under pressure and bouncing back from setbacks quickly, maintaining focus on achieving system reliability.
  • Keen attention to specifics to catch and address small issues before they escalate into larger problems.
  • A strong desire to understand how things work and a willingness to explore and implement new technologies and methodologies.
  • Flexibility in handling unexpected challenges and changes in technology or project directions.
  • Taking initiative to prevent problems before they occur and continuously seeking improvements in system performance.
  • Confidence and ability to make quick decisions during critical situations to prevent or minimize disruptions.
  • Understanding and considering team members’ perspectives and challenges, fostering a supportive and inclusive environment.
  • Clear and effective communication skills, capable of conveying complex information in a straightforward manner and engaging with both technical and non-technical stakeholders.
  • Taking responsibility for the systems and the team, ensuring reliability, and being accountable for the outcomes.
  • Commitment to the development of team members, providing guidance and feedback to help them grow in their professional capacities.
  • Encouraging a collaborative team environment where ideas and solutions are shared openly and where each member’s contribution is valued.
  • Motivating the team to strive for excellence, pushing the boundaries of what is possible, and inspiring innovation through leadership.

Functional/Technical Skills:

  • 5 - 6 years’ experience with Java, J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/Kubernetes in the maintenance and development of multi-tier applications.
  • Understanding of RESTful APIs and microservices platforms.
  • 4 - 5 years of experience with any APM or other monitoring tools such as Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, DataDog, PagerDuty.
  • Strong experience working with product and development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime.
  • Experience in solving complex architecture/design and business problems, working to simplify, optimize, remove bottlenecks, etc.
  • Experience architecting, designing, and developing automation to reduce toil and improve the recoverability, availability, latency, and scalability of supported applications, with an understanding of MTTD (Mean Time to Detection) and MTTR (Mean Time to Resolution).
  • Ability to quickly diagnose and resolve issues in high-pressure situations.
  • Strong verbal and written communication skills to effectively collaborate with cross-functional teams and articulate technical concepts to non-technical stakeholders.
  • Experience in leading teams, mentoring junior staff, and promoting a culture of continuous improvement and learning.
  • Ability to analyze complex data to improve system performance and predict future challenges.
  • Experience in handling outages and the ability to lead incident response efforts, minimizing impact on services.
  • Understanding of network architecture, protocols, and security practices to ensure robust and secure systems.
  • Skills in and understanding of performance tuning and optimization of systems and applications.
  • Knowledge of database administration and management, particularly in configuring, managing, and scaling databases.
  • Experience in planning and executing disaster recovery strategies to ensure data integrity and availability.

Travel: As needed and flexible

Other Preferred: N/A