
Site Reliability Engineer
2 days ago
Role: Site Reliability Engineer
Location: Hyderabad
Notice Period: Immediate to 20 Days
Employment Type: Full Time
Experience
- 7–12 years in site reliability, cloud-based data infrastructure, data pipeline observability, automation, and high-availability engineering within EdTech platforms (2U)
Primary Skills (Must-Have)
- AWS, CI/CD, Jenkins, IaC, Terraform, Kubernetes
Secondary Skills (Good-to-Have)
- AWS systems; Dataiku data, Platform updates and patching
Tools & Platforms
- Data Warehousing & Processing: Snowflake, Redshift, Apache Airflow, dbt
- CI/CD & Deployment: Jenkins, GitHub Actions, AWS CodePipeline, Terraform
- Cloud & Event Processing: AWS Lambda, API Gateway, SNS/SQS, Kafka, Step Functions
- Monitoring & Logging: Datadog, AWS CloudWatch, Prometheus, Splunk
- Incident Management: PagerDuty, Opsgenie, AWS Health Dashboard
- Collaboration & Code Review: GitHub, Jira, Confluence
Key Responsibilities
Data Pipeline Reliability & Observability:
- Maintain and optimize highly available, fault-tolerant infrastructure for data pipelines, ETL jobs, and real-time data processing
- Implement end-to-end monitoring of Airflow DAGs, Snowflake queries, and AWS-based data workflows
- Automate data pipeline health checks, error handling, and auto-remediation strategies
Infrastructure & Cloud Automation:
- Deploy and manage AWS-based data infrastructure using Terraform and CloudFormation
- Optimize Kubernetes (EKS) clusters for processing large-scale datasets and real-time analytics
- Ensure high availability and cost-efficient scaling for Redshift, Snowflake, and data storage solutions
Performance, Monitoring & Incident Response:
- Implement real-time monitoring, logging, and alerting using Datadog, AWS CloudWatch, and Prometheus
- Define and track SLOs, SLIs, and error budgets to improve data reliability and uptime
- Conduct Root Cause Analysis (RCA), security audits, and post-mortems for incidents
Security & Compliance:
- Ensure GDPR, CCPA, and SOC 2 compliance for data storage, access controls, and retention policies
- Implement AWS security best practices (IAM, KMS, Shield, WAF) to secure data access and encryption
- Secure API gateways, authentication mechanisms, and data lake permissions to prevent unauthorized access
Collaboration & Leadership:
- Work closely with data engineers, analytics teams, and DevOps engineers to enhance data platform reliability
- Participate in incident response drills, disaster recovery planning, and security compliance reviews
- Advocate for best practices in automation, cost optimization, and cloud-native data solutions