Cloud Reliability Engineer

3 weeks ago


Bengaluru, Karnataka, India Awign Expert Full time
Job Description

This is a remote position.

About Awign Expert:

Awign Expert is an enterprise-focused platform that helps businesses Hire, Assess and Manage highly skilled resources for Gig Based Projects. We provide our Experts a gateway to work for and build a freelance/consulting career with large-scale Enterprises. We are a newly launched business division of Awign, which is one of the pioneers and currently the largest player in the Gig Economy in India. Here at Awign, we are changing how the world works with a vision to uplift millions of Careers.

About the client:

This company is a leading enterprise mobile app development firm, specializing in delivering highly efficient, secure, and scalable applications to a global audience. They offer end-to-end design and development services, collaborating closely with clients to build scalable, user-centric, and innovative solutions. Their skilled designers and developers create engaging user experiences while leveraging cutting-edge technologies to ensure seamless functionality.

Job Title: Cloud Reliability Engineer (CRE)   

Location: Offshore 

Job Description: 

We are seeking Cloud Reliability Engineers (CREs) to support Carnival Cruise Line's cloud infrastructure. The ideal candidates will focus on automating cloud operations, improving system reliability, and ensuring seamless observability and monitoring across the Carnival Cruise Line environment.

The CRE team will be responsible for designing, implementing, and maintaining automation frameworks, monitoring systems, and log-mining solutions to enhance cloud operations. The role will also involve provisioning, fault management (FM), and optimization of cloud infrastructure for high availability and performance.

Key Responsibilities: 

  • Automation & Cloud Operations: Develop and implement automation scripts and tools to streamline cloud operations and provisioning. 

  • Observability & Monitoring: Design and enhance observability frameworks, including real-time monitoring, log mining, and alerting systems for proactive issue detection. 

  • Infrastructure Reliability: Improve cloud infrastructure reliability through performance tuning, capacity planning, and automated remediation strategies. 

  • Fault Management (FM): Implement fault management processes to detect, diagnose, and resolve cloud infrastructure issues efficiently. 

  • Data Farms & Log Analysis: Leverage data analytics and log mining techniques to gain insights into system performance and troubleshoot anomalies. 

  • Provisioning & Deployment: Automate cloud provisioning using infrastructure-as-code (IaC) practices for efficient deployment across Carnival Cruise Line's brands.

  • Collaboration & Best Practices: Work closely with development, security, and operations teams to enforce best practices for cloud reliability and scalability. 

Required Skills & Experience: 

  • Experience in Cloud Operations & Automation (AWS, Azure and GCP) 

  • Proficiency in Infrastructure as Code (IaC) (Terraform, AWS CloudFormation, Ansible, Chef, Puppet, Azure Resource Manager)

  • Strong expertise in observability tools (Prometheus, Grafana, ELK Stack, Splunk, or Datadog) 

  • Log Mining & Data Analytics (Kibana, Splunk, or BigQuery) 

  • Fault Management & Incident Response experience in cloud environments 

  • Experience with containerized environments (Docker, Kubernetes) 

  • Proficiency in scripting & automation (Python, Bash, PowerShell) 

  • Understanding of cloud security, networking, and cost optimization 

Preferred Qualifications: 

  • Certifications in Cloud Technologies (AWS Certified DevOps Engineer - Professional, Microsoft Certified: DevOps Engineer Expert, Google Cloud Professional Cloud DevOps Engineer)

  • Experience in hybrid cloud environments (on-prem & cloud integration) 

  • Hands-on experience with Site Reliability Engineering (SRE) practices 

  • Experience in managing large-scale cloud infrastructure for enterprises 

Please provide the following details with each candidate profile:
  • Candidate Name
  • Total Experience
  • Experience in CRE
  • Experience in AWS
  • Experience in Azure
  • Experience in GCP
  • Experience in Terraform
  • Experience in Splunk
  • Experience in Docker
  • Experience in Kubernetes
  • Experience in Python


Requirements
CRE, AWS, Azure, GCP, Terraform, Splunk, Docker, Kubernetes, Python
