
Site Reliability Engineer
1 day ago
We have an open requirement for an "SRE Lead Engineer".
Client: MNC
Product-based US company
Role & responsibilities
- Architect, design, and deploy end-to-end infrastructure solutions for a multi-tenant
microservices-based SaaS application with a focus on AI/ML model integration.
- Ensure system reliability, scalability, performance, and security, specifically enhancing
AI/ML processing pipelines and workflows.
- Utilize Terraform scripting for on-demand environment provisioning within the AWS
cloud, optimized for AI/ML workloads.
- Implement and refine monitoring and alerting systems across application, network, and
OS layers to support AI model operations and data processing.
- Diagnose, support, and resolve production issues and alerts, participating in a 24/7
on-call rotation to maintain seamless AI/ML service operations.
Qualifications:
- 8+ years of experience in Site Reliability Engineering (SRE) and DevOps roles with a
track record of managing large-scale enterprise SaaS services in production, including
1+ year in AI/ML infrastructure.
- Demonstrated expertise with AWS public cloud technologies, including extensive
experience in deploying and managing large-scale container clusters using AWS EKS.
Skilled in Infrastructure as Code (IaC) using Terraform, and container technologies such
as Docker and Kubernetes.
- Proficient in scripting and programming for automation (Python, Bash, etc.), with strong
Linux OS and networking fundamentals relevant to AI/ML workloads.
Job Description:
- Experience in establishing monitoring systems to ensure high availability, performance,
and security integrity, using tools like ELK Stack, CloudWatch, and others tailored for
AI/ML monitoring.
- Hands-on experience managing microservices architecture SaaS products, enabling
RESTful web services, SSO integration (Okta, Auth0), and utilizing cloud databases like
EC2-RDS, MySQL, and Elasticsearch, especially in AI/ML deployments.
- Proficient in backup and disaster recovery strategies specific to AI/ML data resources
like RDS and Elasticsearch.
- AWS Certified Solutions Architect is strongly preferred.
- Self-driven, proactive, and adaptable to thrive in an early-stage startup environment, with
a keen interest in integrating AI/ML technologies into modern SaaS solutions.
Preferred candidate profile
Interested candidates, please share your profiles to .AI
Notice period: Immediate to 30 days
Location: Hyderabad
-
Site Reliability Engineer
6 days ago
Hyderabad, Telangana, India Apple Full time ₹ 15,00,000 - ₹ 25,00,000 per year
Imagine what you could do here. Apple is a place where extraordinary people gather to do their best work. Together we craft products and experiences people once couldn't have imagined, and now can't imagine living without. If you're motivated by the idea of making a real impact, and joining a team where we pride ourselves in being one of the most diverse...
-
SRE(Site Reliability Engineer)
4 days ago
Hyderabad, Telangana, India Talent Worx Full time ₹ 20,00,000 - ₹ 25,00,000 per year
SRE (Site Reliability Engineer)
Talent Worx is seeking a talented SRE (Site Reliability Engineer) to enhance our technology team. In this role, you will be pivotal in ensuring the reliability, performance, and availability of our applications and services. Your work will involve both software engineering and systems operations as you strive to improve...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India TurboHire Full time ₹ 15,00,000 - ₹ 28,00,000 per year
Site Reliability Engineer (SRE)
Location: Hyderabad (Hybrid)
Experience: 3–5 years
About the Role
We are looking for an SRE Engineer to own reliability, deployment, and monitoring of TurboHire's cloud infrastructure. You will ensure our platform is scalable, secure, and highly available. The role balances hands-on coding, automation, and infra operations, freeing...
-
Site Reliability Engineer
4 days ago
Hyderabad, Telangana, India LivePerson Full time ₹ 8,00,000 - ₹ 15,00,000 per year
LivePerson (NASDAQ: LPSN) is a leading customer engagement company, creating digital experiences powered by Curiously Human AI. Every person is unique, and our technology makes it possible for companies, including leading brands like HSBC, Orange, and GM Financial, to treat their audiences that way at scale. Nearly a billion conversational interactions are...
-
Site Reliability Engineer III
3 days ago
Hyderabad, Telangana, India Chase- Candidate Experience page Full time ₹ 1,04,000 - ₹ 1,30,878 per year
There's nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Chief Technology Office team, you will solve complex and broad business problems...
-
Site Reliability Engineering
1 day ago
Hyderabad, Telangana, India Acesoft Labs Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Hi, kindly find the below JD:
Job Title: Site Reliability Engineering (SRE) Manager
Location: Hyderabad
Employment Type: Full-Time
Work Model: 3 days from office (Hybrid)
Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends...
-
Lead Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India EPAM Systems Full time ₹ 15,00,000 - ₹ 25,00,000 per year
We are seeking a skilled Lead Site Reliability Engineer to drive the stability, scalability, and reliability of our systems while improving efficiency through automation and best practices. This role calls for deep expertise in DevOps methodologies, Infrastructure as Code (IaC), and collaboration across teams to ensure optimal system...
-
Site Reliability Support Engineer
20 hours ago
Hyderabad, Telangana, India Innovatz Global Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Company Description
Innovatz Global is a leading Management Consulting, Technology Services, and Business Process Outsourcing company headquartered in Kuala Lumpur, Malaysia. With over 500 skilled professionals, we have a significant presence across America, China, India, Australia, and several other countries. We have a proven track record of delivering...
-
Principal Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Amgen Inc Full time ₹ 8,00,000 - ₹ 12,00,000 per year
We are looking for a Site Reliability Engineer/Cloud Engineer (SRE) to work on the performance optimization, standardization, and automation of Amgen's critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will work on operational excellence...
-
Lead Site Reliability Engineer
5 days ago
Hyderabad, Telangana, India JPMorgan Chase Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge across multiple...