Site Reliability Engineer
2 weeks ago
Job Details:
Job Title: Site Reliability Engineer (SRE) with Azure & AI
Duration: Contract Position (On the Payroll of Datum Technology Group)
Location: Chennai / Mumbai / Gurugram
Interview Process: Virtual (2 Rounds) + 1 Technical Screening

Job Description:
We are seeking a skilled and collaborative Site Reliability Engineer (SRE) with deep expertise in Azure cloud hosting, AI infrastructure, and automation. The ideal candidate will have hands-on experience managing cloud environments through the GitHub/Azure DevOps lifecycle and a strong understanding of AI model deployment and scaling. You will work closely with a team of engineers to ensure reliable, secure, and scalable infrastructure for AI workloads and enterprise applications.

Key Responsibilities:
Design, build, and maintain scalable cloud infrastructure on Microsoft Azure.
Automate infrastructure provisioning and deployment using Terraform, Argo, and Helm.
Manage and optimize Azure Kubernetes Service (AKS) clusters for AI and microservices workloads.
Support hosting of AI models using frameworks like Hugging Face Transformers, vLLM, or Llama.cpp on Azure OpenAI, VMs, or GPUs.
Implement CI/CD pipelines using GitHub Actions and integrate with JFrog Artifactory.
Monitor system performance and reliability using Grafana and proactively address issues.
Collaborate with software engineers to ensure infrastructure supports application needs.
Ensure compliance with networking and information security best practices.
Manage caching and data layer performance using Redis.

Required Skills & Technologies (Core to Role):
Azure Cloud Services (including Azure OpenAI)
AI Model Hosting & Infrastructure Knowledge
GitHub (CI/CD, workflows)
Azure Kubernetes Service (AKS)
Argo, Helm
Terraform
Docker
JFrog
Grafana
Networking & Security
Redis

Qualifications:
Bachelor's or master's degree in Computer Science, Engineering, or a related field.
6+ years of experience in SRE, DevOps, or Cloud Infrastructure roles.
Proven experience with AI infrastructure and model deployment.
Strong communication and teamwork skills.
-
Site Reliability Engineer
2 weeks ago
Delhi, India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – Azure & AI
Experience: 7+ years
Work Mode: Hybrid
Work Location: Chennai/Mumbai/Gurgaon
Job Summary: We are looking for an experienced Site Reliability Engineer (SRE) with strong expertise in Microsoft Azure, AI infrastructure, and automation. The ideal candidate will have a solid background in managing cloud...
-
Site Reliability Engineer
6 days ago
Delhi, India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – AWS
Experience: 8+ years
Location: Chennai / Mumbai
Work Mode: Hybrid
Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog
Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and...
-
Site Reliability Engineer
5 days ago
Delhi, India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – AWS
Experience: 8+ years
Location: Chennai / Mumbai
Work Mode: Hybrid
Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog
Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and...
-
Site Reliability Engineer
2 weeks ago
Delhi, India Grootan Technologies Full time
About the Role: We are seeking a skilled Site Reliability Engineer (SRE) with 4–5 years of hands-on experience to join our engineering team. In this role, you will be responsible for building and maintaining reliable, scalable, and secure infrastructure to support our applications. You will leverage your expertise in automation, cloud platforms, and...
-
Site Reliability Engineer
1 week ago
Delhi, India VXI Global Solutions Full time
We are looking for a Site Reliability Engineer with 3+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus and Grafana, along with exposure to SolarWinds. You should be comfortable working with metrics, logs, and...
-
Site Reliability Engineer
3 weeks ago
New Delhi, India Tata Consultancy Services Full time
Role: Site Reliability Engineer
Experience: 4 to 7 Years
Locations: Chennai/Pune/Kolkata