Site Reliability Engineer
3 weeks ago
Experience : 8+ Years
Location : Mumbai, Chennai (other cities: remote)
Notice period : Immediate to 30 days max
Responsibilities of Senior SRE :
- The Site Reliability Engineering (SRE) team is responsible for the reliability, scalability, stability and performance of systems and services.
- They work with cross-functional teams to design, build and maintain systems and they troubleshoot issues when they arise. They bridge the gap between development and operations teams.
- They work closely with business teams to define Service Level Objectives (SLOs) and Service Level Agreements (SLAs) for critical systems. They also monitor and maintain the uptime of these systems in line with the defined SLOs and SLAs.
- They deploy and manage monitoring tools to gain insights into system health and performance.
- They analyze performance, identify bottlenecks and implement solutions to improve a system's scalability and latency.
- They develop scripts and implement tools and automation frameworks to reduce the manual effort involved in deployment, monitoring and scaling.
- They work with development teams to design and develop observability practices such as logging, metrics and tracing, with the aim of diagnosing and troubleshooting issues proactively.
- They create actionable alerts on monitoring systems to ensure a rapid response to potential production incidents.
- They forecast resource needs and provision adequately for current and future demand.
- They design and execute "chaos experiments" to test a system's resilience to failure.
- They own, define and implement the Disaster Recovery (DR) processes for systems. They also conduct planned and unplanned mock DR drills to test for response preparedness during production incidents.
- They ensure that security best practices are followed and implemented during the design and operation of systems.
- They also own and maintain documentation of processes, playbooks, and systems.
- They publish KPI reports and other system health updates on a regular basis to the business.
Requirements :
- Must-have - Bachelor's degree, preferably in CS or a related field, or equivalent experience
- Must-have - 12+ years of overall IT experience
- Must-have - 7+ years of proven work experience as a Senior Site Reliability Engineer or a similar position.
- Must-have - 5+ years of AWS Cloud experience, with an AWS certification such as DevOps Engineer, SysOps or Security.
- Must-have - AWS experience - 3+ years of experience using a broad range of AWS technologies (e.g. EC2, RDS, ELB, S3, VPC, CloudWatch and monitoring tools) to develop and maintain an AWS-based cloud solution, with an emphasis on best-practice cloud security.
- Must-have - 2+ years of experience in CDN and/or Cache systems like Fastly, Akamai, CloudFront, etc.
- Proven understanding of and strong experience with cloud deployments (AWS / Docker / Kubernetes).
- Knowledge of IaC provisioning tools and scripting languages such as Terraform, Chef, Ansible, shell, Groovy, Python, etc.
- Experience with monitoring systems such as CloudWatch, New Relic, Datadog/Splunk and the ELK stack.
- Experience managing cloud network resources (AWS preferred) such as CloudWatch, VPC, URL proxies, PrivateLink, DNS, ACLs, firewalls, and C2S access points.
- Platform or application engineering and operational knowledge of CI/CD tooling such as GitHub Actions, Jenkins, etc.
- Experience with other tooling such as Jira, Bitbucket, Jenkins, Fortify, SonarQube, Nexus and Nexus IQ.
- Experience with configuration automation tools like Puppet/Ansible/Chef/Salt
- Scripting Skills : Strong scripting (e.g. Bash & Python) and automation skills.
- Operating Systems : Windows and Linux system administration.
- Problem Solving : Ability to analyze and resolve complex infrastructure resource and application deployment issues
- Strong attention to detail. Excellent verbal and written communication skills. Strong documentation skills.
Good To Have :
- Experience with Terraform/Ansible/Chef/Puppet
- Experience with GitHub Actions
- Experience with CloudFront, Fastly