Site Reliability Engineer
4 weeks ago
Role: Site Reliability Engineer (SRE) III – Data Engineering
Location: Hyderabad
Employment Type: Full Time
Experience: 7–12 years in site reliability, cloud-based data infrastructure, data pipeline observability, automation, and high-availability engineering within EdTech platforms (2U)
Primary Skills (Must-Have): AWS, CI/CD, Jenkins, IaC, Terraform, Kubernetes
Secondary Skills (Good-to-Have): AWS systems; Dataiku data, Platform updates and patching

Tools & Platforms:
- Data Warehousing & Processing: Snowflake, Redshift, Apache Airflow, dbt
- CI/CD & Deployment: Jenkins, GitHub Actions, AWS CodePipeline, Terraform
- Cloud & Event Processing: AWS Lambda, API Gateway, SNS/SQS, Kafka, Step Functions
- Monitoring & Logging: DataDog, AWS CloudWatch, Prometheus, Splunk
- Incident Management: PagerDuty, Opsgenie, AWS Health Dashboard
- Collaboration & Code Review: GitHub, Jira, Confluence

Key Responsibilities

Data Pipeline Reliability & Observability:
- Maintain and optimize highly available, fault-tolerant infrastructure for data pipelines, ETL jobs, and real-time data processing
- Implement end-to-end monitoring of Airflow DAGs, Snowflake queries, and AWS-based data workflows
- Automate data pipeline health checks, error handling, and auto-remediation strategies

Infrastructure & Cloud Automation:
- Deploy and manage AWS-based data infrastructure using Terraform and CloudFormation
- Optimize Kubernetes (EKS) clusters for processing large-scale datasets and real-time analytics
- Ensure high availability and cost-efficient scaling for Redshift, Snowflake, and data storage solutions

Performance, Monitoring & Incident Response:
- Implement real-time monitoring, logging, and alerting using DataDog, AWS CloudWatch, and Prometheus
- Define and track SLOs, SLIs, and error budgets to improve data reliability and uptime
- Conduct Root Cause Analysis (RCA), security audits, and post-mortems for incidents

Security & Compliance:
- Ensure GDPR, CCPA, and SOC 2 compliance for data storage, access controls, and retention policies
- Implement AWS security best practices (IAM, KMS, Shield, WAF) to secure data access and encryption
- Secure API gateways, authentication mechanisms, and data lake permissions to prevent unauthorized access

Collaboration & Leadership:
- Work closely with data engineers, analytics teams, and DevOps engineers to enhance data platform reliability
- Participate in incident response drills, disaster recovery planning, and security compliance reviews
- Advocate for best practices in automation, cost optimization, and cloud-native data solutions
-
Site Reliability Engineer
3 weeks ago
New Delhi, India IntraEdge Full time Job Title: Site Reliability Engineer (SRE) – Production Support Location: Bengaluru Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in production support, DevOps practices, and cloud infrastructure management. The ideal candidate will be responsible for maintaining the reliability, performance, and scalability...
-
Site Reliability Engineer
3 weeks ago
New Delhi, India WhiteLotus Talent Partners Full time We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...
-
Site Reliability Engineer
3 weeks ago
New Delhi, India WhiteLotus Talent Partners Full time We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system availability,...
-
Site Reliability Engineer
4 days ago
New Delhi, India Tata Consultancy Services Full time Role: Site Reliability Engineer Experience: 4 to 7 Years Locations: Chennai/Pune/Kolkata
-
Site Reliability Engineer
4 weeks ago
New Delhi, India Endpoint Clinical Full time About Us: Endpoint is an interactive response technology (IRT®) systems and solutions provider that supports the life sciences industry. Since 2009, we have been working with a single vision in mind, to help sponsors and pharmaceutical companies achieve clinical trial success. Our solutions, realized through the proprietary PULSE® platform, have proven to...
-
Site Reliability Engineer
3 weeks ago
New Delhi, India SID Global Solutions Full time Job Role: Site Reliability Engineer (SRE) – GCP Experience: 3+ years Location: Hyderabad About SIDGS: SIDGS is a premium global systems integrator and global implementation partner of Google, providing Digital Solutions & Services to Fortune 500 companies. Our digital solutions span the following domains: User Experience, CMS, API Management,...
-
Site Reliability Engineer
4 weeks ago
New Delhi, India IntraEdge Full time Job Title: Site Reliability Engineer (SRE) – Production Support Location: Bengaluru Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in production support, DevOps practices, and cloud infrastructure management. The ideal candidate will be responsible for maintaining the reliability, performance, and scalability...
-
Site Reliability Engineer
4 days ago
New Delhi, India JRD Systems Full time Site Reliability Engineer (Windows / Cloud / Automation) Job Summary: We are seeking an experienced Site Reliability Engineer with a strong background in managing Windows infrastructure and cloud environments. The ideal candidate will be responsible for designing, implementing, automating, and maintaining scalable infrastructure solutions across AWS, Azure,...
-
Site Reliability Engineering Manager
4 weeks ago
New Delhi, India Tata Consultancy Services Full time Role: Manager, Site Reliability Engineering Required Technical Skill Set: Manager, Site Reliability Engineering Desired Experience Range: 12 - 18 yrs Notice Period: Immediate to 90 Days only Location of Requirement: Bangalore We are currently planning to do a Virtual Interview Job Description: Describe what the person will do in the role - how he/she will...