Site Reliability Engineer

4 weeks ago


Ahmedabad, India Datum Technologies Group Full time

Job Title: Site Reliability Engineer (SRE) – AWS
Experience: 8+ years
Location: Chennai / Mumbai
Work Mode: Hybrid
Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog

Job Summary:
We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and large-scale distributed systems.

Responsibilities:
• Manage and optimize cloud infrastructure using AWS IaaS.
• Implement SRE practices to enhance reliability, performance, and SDLC efficiency.
• Build and maintain CI/CD pipelines (Jenkins, GitLab, Terraform).
• Work with containers and orchestration (Docker, ECS, Kubernetes).
• Troubleshoot performance, networking, and distributed system issues.
• Drive DevOps and QA best practices across teams.
• Implement observability: SLI/SLO, Error Budgets, monitoring, logging, tracing, alerting.
• Lead incident resolution and perform RCA.
• Automate tasks using Python/Bash/PowerShell.
• Collaborate effectively with cross-functional teams with minimal supervision.

Qualifications:
• Strong AWS cloud experience
• Proven DevOps & SRE implementation skills
• Good understanding of Linux, networking, and distributed systems
• Hands-on experience with observability tools
• Strong scripting and automation expertise
• Excellent communication and teamwork skills



  • Ahmedabad, India Proglite Full time

    We have the following requirements for the Site Reliability Engineer role.
    Skill Set:
    AWS: EC2, Networking, Storage, Autoscaling, CloudWatch, SSM, management (patching/upgrades/security) of OS (Windows/Linux) in EC2
    GCP: GKE/Compute, Networking, Storage, Cloud Monitoring, management (patching/upgrades/security) of OS (Windows/Linux) in Compute
    SRE Practices:...


  • Ahmedabad, India ACL Digital Full time

    Job Description:
    - Continuously monitor system performance and identify potential issues before they impact users.
    - Experience working with industry-leading monitoring tools.
    - Respond to incidents related to monitoring systems, troubleshoot Level 1 issues, and resolve issues promptly.
    - Analyze monitoring data to identify trends, anomalies, to...


  • Ahmedabad, Gujarat, India ACL Digital Full time

    Job Description:
    - Continuously monitor system performance and identify potential issues before they impact users.
    - Experience working with industry-leading monitoring tools.
    - Respond to incidents related to monitoring systems, troubleshoot Level 1 issues, and resolve issues promptly.
    - Analyze monitoring data to identify trends, anomalies, to identify...


  • Ahmedabad, Hyderabad, India S&P Global Market Intelligence Full time

    Role Overview:
    As a Site Reliability Engineer at ChartIQ, you'll play a critical role not only in building, maintaining, and scaling the infrastructure that supports our Development and QA needs, but also in driving new, exciting cloud-based solutions that will add to our offerings. Your work will ensure that the platforms used by our team...


  • Thaltej, Ahmedabad, Gujarat, India Artem HealthTech Private Limited Full time

    About the Role
    We are looking for a Senior Site Reliability Engineer (SRE) to lead the reliability strategy of our mission-critical HealthTech SaaS platform. This role is designed for a hands-on engineer who can architect and operate large-scale, high-availability systems, establish a 24×7 SRE practice, and enforce reliability standards through SLAs, SLOs,...