Site Reliability Engineer

1 month ago


Chennai, India Ford Business Solutions Full time

Short Description:
A site reliability engineer (SRE) combines software engineering and systems engineering to ensure that a software system is available, scalable, and maintainable 24x7x365. This role supports the "Always ON" operation of Ford's e-Commerce Platform.
Description for Internal Candidates
Strong background in software development and systems administration, as well as excellent problem-solving and communication skills.
Improve reliability, quality, and time-to-market of our suite of software solutions
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
Identify and reduce or eliminate toil via automation to maximize the time spent on engineering and innovation
Perform root cause analysis of production incidents and implement preventive measures
Responsibilities for Internal Candidates
Run the production environment by monitoring availability and taking a holistic view of system health.
Develop, improve, and operate the deployment and orchestration of a complex distributed system
Improve reliability, quality, and time-to-market of our suite of software solutions
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
Provide primary operational and engineering support for multiple large, distributed software applications
Identify and reduce or eliminate toil via automation to maximize the time spent on engineering and innovation
Collaborate with development teams to design, build, and operate scalable and resilient software systems
Automate deployment, monitoring, and incident response processes
Perform root cause analysis of production incidents and implement preventive measures
Conduct performance analysis and optimization of the system
Ensure compliance with security and regulatory standards
Implement and maintain disaster recovery processes
Provide technical guidance and mentorship to other team members
Participate in an on-call rotation for incident response and support.
Qualifications:
Four-year college degree in Computer Science or equivalent.
2-5 years of experience with Java, J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/Kubernetes in the maintenance and development of multi-tier applications.
Understanding of RESTful APIs and microservices platforms
2-5 years of experience with APM and other monitoring tools such as Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, Datadog, PagerDuty.
Strong experience working with product and development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and to drive effective use of the budget to maximize domain availability/uptime.
Regularly review key site technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, and capacity and resource utilization.
Proactively identify stability risks and work with engineering leadership to establish appropriate mitigation plans
Experience in solving complex architecture, design, and business problems, with a focus on simplifying, optimizing, and removing bottlenecks.
Architect, design, and develop automation to reduce toil and improve the recoverability, availability, latency, and scalability of supported applications, with an understanding of MTTD (Mean Time to Detection) and MTTR (Mean Time to Resolution)
Maintain a knowledge repository that includes standard operating procedures, release checklists, and runbooks for incident recovery
Same Posting Description for Internal and External Candidates
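
For candidates less familiar with the terms in the qualifications above, the following is a minimal Python sketch of how an error budget follows from an availability SLO, and how MTTD and MTTR can be averaged from incident timestamps. It is illustrative only and does not describe Ford's tooling: the 99.9% SLO, the 30-day window, the incident timestamps, and the function names are all hypothetical, and MTTR is assumed here to be measured from incident start to resolution (some teams measure from detection instead).

```python
from datetime import datetime, timedelta


def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction between 0 and 1, e.g. 0.999")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def mean_minutes(durations: list[timedelta]) -> float:
    """Arithmetic mean of a list of durations, expressed in minutes."""
    if not durations:
        raise ValueError("need at least one duration")
    return sum(d.total_seconds() for d in durations) / len(durations) / 60


# Hypothetical incidents as (started, detected, resolved) timestamps.
incidents = [
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 4), datetime(2024, 1, 3, 10, 24)),
    (datetime(2024, 1, 17, 22, 0), datetime(2024, 1, 17, 22, 6), datetime(2024, 1, 17, 22, 16)),
]

# Error budget for an assumed 99.9% availability SLO over 30 days.
budget = error_budget_minutes(0.999, window_days=30)

# MTTD: average time from incident start to detection.
mttd = mean_minutes([detected - started for started, detected, _ in incidents])

# MTTR: average time from incident start to resolution (assumption, see above).
mttr = mean_minutes([resolved - started for started, _, resolved in incidents])

# Budget consumed so far, treating each incident as full downtime.
downtime = sum((resolved - started).total_seconds() for started, _, resolved in incidents) / 60
remaining = budget - downtime

print(f"error budget: {budget:.1f} min, used: {downtime:.1f} min, remaining: {remaining:.1f} min")
print(f"MTTD: {mttd:.1f} min, MTTR: {mttr:.1f} min")
```

A negative remaining budget would indicate that the SLO has already been missed for the window, which is typically the signal to prioritize reliability work over new releases.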



  • Chennai, India iLink Digital Full time

    7 years of experience as a Site Reliability Engineer, DevOps Engineer, or similar role. Strong expertise in Azure cloud services and solutions. Proficiency in scripting and automation using PowerShell, Azure CLI, or similar tools. Experience with infrastructure as code (IaC) tools such as ARM templates, Terraform, or Ansible. Familiarity with CI/CD pipelines...




  • Chennai, India TERRAGIG LLP Full time

    Role: Site Reliability Engineer. Experience: 5+ years. Work model: Remote / Contract 3 years. Skills: - Develop and provide operational support for full-stack software applications. - Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation. - Five years' experience as a site reliability engineer or similar role. -...


  • Chennai, India ZF Group Full time

    Req ID SDC Chennai, India. Your Tasks: 7-9 years of experience as a cloud-native production environment Site Reliability Engineer (preferably AWS) supporting high-availability, large-scale web-based applications. Experience with infrastructure & service monitoring and alerting. Experience with application observability. Experience with Kafka, Terraform, CI/CD...






  • Chennai, India Corpxcel Consulting Full time

    Job Description: - For the SRE coach role, we need someone with 10+ years of experience (female candidates requirement) - Extensive experience as an agile coach with good knowledge of SRE - Excellent communication skills - Document the SRE manual and training material - Able to lead and coach the SRE team - Establish the processes, and work with multiple teams evangelising the...




  • Chennai, India Corpxcel Consulting Full time

    For SRE: - Experience in automation - Operational knowledge of any of the CI/CD tooling technologies - Understanding of cloud deployments and SRE - 5-8 years of solid, diverse work experience in Java development and DevOps platform engineering, with development disciplines in a fast-paced production environment - At least 3 years of experience with...




  • Chennai, India Anicalls (Pty) Ltd Full time

    The Role: Mentor teammates on SRE best practices and guide technical direction. Work closely with the product engineering team to rapidly deliver capabilities. Automate and optimize developer pipelines. Build monitoring to assess system and pipeline health. Qualifications: Proficiency in Python, Go, Ruby, or Java is a plus. Expertise in Linux administration,...


  • Chennai, India Encora Inc. Full time

    Important Information. Experience: 6 to 8 years. Job Location: Chennai. Position Type: Full time. Work Mode: Hybrid (3 days in office). Principal Site Reliability Engineer. About the Opportunity: The Principal Site Reliability Engineer is vital in our Site Reliability Engineering team. As the technical leader at the Center for Operational Excellence, you will guide our...

