Site Reliability Engineer
5 days ago
Role : Google Cloud SRE Engineer
    We are seeking an exceptional Google Cloud SRE Engineer to join our engineering team.
    This role requires a highly skilled professional with deep expertise in Google Cloud Platform (GCP), Kubernetes, Infrastructure as Code, and CI/CD automation.
    The ideal candidate thrives in high-pressure production environments, excels at automation, and continuously drives improvements in system reliability, scalability, and operational efficiency.
Title : Google Cloud SRE Engineer
Location : Remote
Employment Type : Full Time
No. of Openings : 2
Timings : 24/7 (rotational shifts)
Key Responsibilities :
    - Ensure the reliability, availability, and performance of production systems hosted on GCP.
    - Lead incident response and troubleshooting efforts for critical production issues.
    - Perform root cause analysis and implement long-term fixes to prevent recurrence.
    - Champion monitoring, alerting, and observability practices to enhance system resilience.
Programming & Automation
    - Develop and maintain automation tools, scripts, and services using Python, Go, and Bash.
    - Identify repetitive operational tasks and convert them into automated workflows.
    - Build scalable, robust solutions to reduce operational toil and improve reliability.
Google Cloud Platform (GCP)
    - Architect, deploy, and optimize production-grade workloads on GCP.
    - Ensure adherence to GCP best practices, cost optimization strategies, and security compliance.
    - Continuously evaluate and adopt emerging GCP services to enhance cloud operations.
Kubernetes (GKE)
    - Manage and optimize large-scale GKE clusters.
    - Implement deployment strategies, resource management, and cluster security.
    - Troubleshoot complex issues in containerized workloads and cluster environments.
CI/CD & Infrastructure as Code
    - Design, implement, and maintain CI/CD pipelines using Jenkins, GitLab CI, or GitHub Actions.
    - Define and manage cloud infrastructure using Terraform, including reusable and modular configurations.
    - Collaborate with developers to ensure seamless integration and automated testing.
Required Skills & Experience :
    - Programming/Scripting : Expert in Python, Go, and Bash with a proven automation portfolio.
    - GCP : 2 years of hands-on GCP experience with a deep understanding of its services and architecture.
    - Kubernetes (GKE) : Advanced experience in managing production clusters, deployments, and troubleshooting.
    - CI/CD : Strong expertise with Jenkins, GitLab CI, or GitHub Actions; proven history of building enterprise-grade pipelines.
    - Terraform : Proficiency in Infrastructure as Code with Terraform, including reusable and modular configurations.
    - Incident Response : Demonstrated excellence in handling critical production incidents and performing RCA.
    - Automation-First Mindset : Consistent track record of converting manual tasks into automated workflows.
    - AI Integration : Awareness of, and experience in applying, AI/ML tools in DevOps practices is a strong plus.
Preferred Qualifications
    - GCP Professional Cloud DevOps Engineer or Architect certification.
    - Experience with monitoring/observability tools (Prometheus, Grafana, ELK, Stackdriver).
    - Exposure to service mesh technologies (Istio, Linkerd).
    - Familiarity with security practices such as IAM, workload identity, and secrets management.