Cloud Engineering Ops Lead
1 day ago
We are seeking a Cloud Engineering Ops Lead responsible for ensuring the stability, observability, security, and cost-efficiency of our AWS environments and customer-facing applications. This role is critical in maintaining production operations that are reliable, predictable, and optimized for performance and resilience.
Key Responsibilities:
1. AWS Platform Operations
- Manage and maintain AWS core services including EC2, EKS, RDS, ALB/CloudFront, IAM/OIDC, VPC, Transit Gateways, and Security Groups.
- Ensure system hygiene, patching, and infrastructure health.
- Automate operational workflows using Terraform, Ansible, or Python.
2. Application Support
- Ensure production readiness through runbooks, pre-deployment validations, performance baselines, and rollback mechanisms.
- Support releases with deployment assistance, smoke testing, and incident troubleshooting.
- Drive continuous improvement in application stability and availability.
3. Observability & Monitoring
- Build and maintain dashboards, logs, metrics, traces, and synthetic monitoring.
- Ensure alert accuracy: eliminate noise and ensure targeted notifications.
- Track SLOs, error budgets, and system performance.
- Lead incident response and RCA, and implement corrective actions.
4. Backup & Disaster Recovery
- Define and manage backup and restore operations with schedules, retention rules, replication, and validation.
- Conduct regular DR drills to ensure RPO/RTO targets are consistently met.
- Maintain up-to-date documentation on disaster recovery processes.
5. Cost Optimization
- Enforce cost governance through tagging, right-sizing, reservation planning, and lifecycle management (EBS, EIP, AMIs).
- Generate cost analysis reports with actionable recommendations to improve efficiency.
6. Team Leadership & Enablement
- Lead high-severity incident bridges (Sev-1/Sev-2) with clear communication.
- Mentor team members in operational excellence and preventive practices.
- Develop reusable runbooks and automation to eliminate repetitive tasks.
- Promote a culture of reliability, transparency, and proactive improvement.
Success Metrics:
- Visibility: Dashboards and alerts are reliable, actionable, and service-specific.
- Backup Health: 100% backup success rate with monthly restore testing.
- Reliability: Reduced MTTR, increased deployment success rate, and runbook-driven resolutions.
- Change Management: Stable release cycles with tested rollback strategies.
- Cost Control: Optimized AWS expenditure with over 95% tagging compliance.
Required Skills & Experience:
- 10+ years in cloud and application operations with deep expertise in AWS.
- Proven leadership in managing production incidents and driving operational excellence.
- Strong knowledge of observability tools: CloudWatch, Prometheus, Grafana, Datadog, etc.
- Hands-on experience with Terraform, Ansible, and/or Python for automation (IaC).
- Expertise in backup strategies and disaster recovery practices with real-world restore testing.
- Solid understanding of AWS cloud networking including VPCs, routing, security groups, and transit gateways.
- Excellent communication, mentoring ability, and problem-solving mindset.
-
Software Engineer – Cloud Ops
6 days ago
Hyderabad, Telangana, India Amgen Technology Private Limited Full time ₹ 1,80,000 - ₹ 2,40,000 per year Specialist Software Engineer – Cloud Ops Career Category: Information Systems Job Description: Join Amgen's Mission of Serving Patients. At Amgen, if you feel like you're part of something bigger, it's because you are. Our shared mission, to serve patients living with serious illnesses, drives all that we do. Since 1980, we've helped pioneer the...
-
LLM Ops Engineer
1 week ago
Hyderabad, Telangana, India Apple Full time ₹ 12,00,000 - ₹ 36,00,000 per year We work on Apple-scale opportunities and challenges. We are engineers at heart. We like solving technical problems. We believe a good engineer has the curiosity to dig into the inner workings of technology and is always experimenting, reading, and in constant learning mode. If you are a software engineer with a passion to code and dig deeper into any technology,...
-
DevOps Engineer
1 week ago
Hyderabad, Telangana, India WaferWire Cloud Technologies Full time ₹ 9,00,000 - ₹ 12,00,000 per year Job Title: DevOps Engineer Job Location: Hyderabad, India Worksite: Onsite (100%) About WCT: WaferWire Technology Solutions (WCT) specializes in delivering comprehensive Cloud, Data and AI solutions through Microsoft's technology stack. Our services include Strategic Consulting, Data/AI Estate Modernization, and Cloud Adoption Strategy. We excel in Solution...
-
Cloud Security Governance
1 week ago
Hyderabad, Telangana, India Magellanic Cloud Full time ₹ 15,00,000 - ₹ 25,00,000 per year Total Experience: Years Relevant Experience: 8+ Years Location: Hyderabad Work Mode: Hybrid (3 Days WFO) Shift Timing: Night Shift (8 AM - 5 PM EST) Primary Skills: Financial management and cloud cost optimization; experience with FinOps tools: CoreStack, CloudCheckr, OpsCompass; deep understanding of AWS, Azure, GCP pricing models; financial analysis and reporting (Excel,...
-
ML Ops Engineer
1 week ago
Hyderabad, Telangana, India Tensorgo Technologies Full time ₹ 15,00,000 - ₹ 25,00,000 per year Profile: We are looking for an experienced and high-energy ML Ops Engineer. The primary function of this role is to design enterprise architecture. Envision and drive solution architecture after hearing the product's vision and user stories, with the ability to envision and drive a proactive architectural roadmap for an existing product, keeping in mind the future...
-
Support Engineer
1 week ago
Hyderabad, Telangana, India WaferWire Cloud Technologies Full time ₹ 2,00,000 - ₹ 6,00,000 per year Job Description Job Title: Support Engineer (MS Teams) Job Location: Hyderabad, India Worksite: Work From Office About WCT: WaferWire Technology Solutions (WCT) specializes in delivering comprehensive Cloud, Data and AI solutions through Microsoft's technology stack. Our services include Strategic Consulting, Data/AI Estate Modernization, and Cloud Adoption...
-
Support Engineer
20 hours ago
Hyderabad, Telangana, India WaferWire Cloud Technologies Full time ₹ 2,00,000 - ₹ 6,00,000 per year Job Description Job Title: Support Engineer (MS Teams) Job Location: Hyderabad, India Worksite: Work From Office About WCT: WaferWire Technology Solutions (WCT) specializes in delivering comprehensive Cloud, Data and AI solutions through Microsoft's technology stack. Our services include Strategic Consulting, Data/AI Estate Modernization, and Cloud Adoption...
-
Support Engineer
2 weeks ago
Hyderabad, Telangana, India, Telangana WaferWire Cloud Technologies Full time Job Description Job Title: Support Engineer (MS Teams) Job Location: Hyderabad, India Worksite: Work From Office About WCT: WaferWire Technology Solutions (WCT) specializes in delivering comprehensive Cloud, Data and AI solutions through Microsoft's technology stack. Our services include Strategic Consulting, Data/AI Estate Modernization, and Cloud Adoption...
-
Magellanic Cloud
1 week ago
Hyderabad, Telangana, India Magellanic-Cloud Full time ₹ 15,00,000 - ₹ 25,00,000 per year Job Description Key Responsibilities: Design, build, and deploy machine learning models to solve real-world business problems. Perform data preprocessing, cleansing, transformation, and feature engineering on both structured and unstructured data. Train, evaluate, and optimize ML models for performance, scalability, and accuracy. Integrate ML models into...
-
IT Ops Leader
3 days ago
Hyderabad, Telangana, India Genzeon Full time ₹ 20,00,000 - ₹ 25,00,000 per year Position: IT Ops Leader Location: Hyderabad, India About Genzeon: Genzeon is a leading provider of digital engineering, intelligent automation, security, compliance, cloud, and managed services. We empower our clients to adapt and be agile in an ever-evolving digital landscape. Summary: We are seeking an experienced IT Operations to oversee Genzeon's IT systems...