Senior DevOps Engineer for Cloud Native Architecture

6 days ago


Ahmedabad, Gujarat, India Infraveo Full time

We are seeking a highly skilled Senior DevOps Engineer with deep expertise in Kubernetes, complemented by significant experience in MLOps (Machine Learning Operations) and LLMOps (Large Language Model Operations). This role is ideal for someone who has a strong background in architecting and managing SaaS applications in Kubernetes and is passionate about building and optimizing infrastructure to support machine learning and AI-driven applications.

Responsibilities:
  • The Senior DevOps Engineer will play a critical role in ensuring that our systems are highly available, reliable, and scalable. You will architect, build, and monitor cloud-native architectures with Kubernetes and related technologies, particularly in the context of machine learning and AI workloads.
  • You should have a deep understanding of the Software Development Life Cycle, including Continuous Integration and Continuous Deployment (CI/CD) pipeline architecture, particularly as it relates to deploying ML models and AI services in Kubernetes environments.
  • You will assist in the design and operation of critical cloud infrastructure on AWS, with a focus on supporting the unique requirements of machine learning and AI-driven applications, such as model training, deployment, and scaling, all of which leverage AWS SageMaker.
  • Collaborate closely with data scientists and ML engineers to create a streamlined, automated build and deployment process for ML models and LLMs in Kubernetes.
  • Implement and manage the infrastructure necessary for the continuous integration, delivery, and monitoring of ML models and AI services, ensuring they are seamlessly integrated into our SaaS applications.
  • Ensure the availability and performance of production systems that run ML-driven services, proactively identifying and resolving issues that may impact model performance or availability.
  • Optimize infrastructure for the efficient training, deployment, and scaling of ML models and LLMs, leveraging Kubernetes GPU clusters and cloud-native tools, including AWS SageMaker.
  • Develop and maintain monitoring and alerting solutions tailored to ML and AI workloads, ensuring that both the infrastructure and deployed models are performing as expected.
  • Troubleshoot and resolve production incidents, ensuring minimal downtime and quick recovery.
  • Participate in on-call rotation as necessary.
  • Ensure the security and compliance of our production systems and data, with a particular focus on protecting sensitive AI and ML data.
  • Mentor and coach junior DevOps engineers.
Requirements:
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • A minimum of 7 years of experience in maintaining optimal performance of online production environments, utilizing bare metal, cloud, and container technologies.
  • At least 4 years of experience managing production Kubernetes infrastructure, with exposure to cloud vendor Kubernetes solutions, such as EKS, AKS, and GKE.
  • Strong experience with Docker for containerization, including creating and managing Docker images and containers.
  • Strong experience in architecting and managing SaaS applications in Kubernetes, with specific experience in MLOps and LLMOps.
  • Deep understanding of the machine learning lifecycle, including model training, deployment, monitoring, and scaling, particularly using AWS SageMaker.
  • Experience with MLOps tools and frameworks, such as Kubeflow, MLflow, or similar, and their integration into Kubernetes environments.
  • Familiarity with LLMOps, including the deployment and management of LLMs in production environments.
  • Solid experience in scripting languages, such as Python.
  • Experience with infrastructure deployment and automation tools, such as Terraform, CloudFormation, etc.
  • Working knowledge of industry-standard build tooling and CI/CD using GitHub and GitHub Actions.
  • Expertise in monitoring and logging solutions, such as Prometheus and Grafana.
  • Good understanding of networking and security concepts.
  • Strong knowledge of Linux systems and shell scripting.
  • Strong communication and collaboration skills, with experience working closely with data scientists and ML engineers.
  • Experience working in an agile environment and understanding of agile methodologies.
  • Certifications, such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer), are a plus.
Nice to Haves:
  • Experience with workflow orchestration tools, like Apache Airflow, particularly for managing complex data pipelines and ML workflows.
  • Experience with GitOps tools, such as ArgoCD, for managing Kubernetes deployments through version-controlled repositories.
  • Familiarity with GPU acceleration technologies and their integration with Kubernetes for optimizing ML model training and inference.
  • Knowledge of data versioning tools and frameworks, like DVC (Data Version Control), in the context of MLOps.
  • Experience with cloud cost optimization strategies, particularly in environments running intensive ML and AI workloads.
Technologies We Use:
  • We use numerous AWS services and are expanding into Azure.
  • AWS SageMaker is central to machine learning model training, deployment, and management processes.
  • Terraform, CloudFormation, Ansible, and Kubernetes are leveraged for our infrastructure deployment and automation.
  • Industry-standard build tooling and CI/CD using GitHub and ArgoCD.
  • A mix of open-source and proprietary technologies tailored to the problems at hand.
Benefits:
  • Work from home.
  • Five-day work week.




  • Ahmedabad, Gujarat, India Encora Inc. Full time

    Senior Infrastructure Systems Engineer - Cloud Architect. We are seeking a highly skilled Senior Infrastructure Systems Engineer - Cloud Architect to join our team at Encora Inc. This role will be responsible for designing, implementing, and maintaining our cloud infrastructure, ensuring it meets the highest standards of security, scalability, and...


  • Ahmedabad, Gujarat, India skyheaven Full time

    We are seeking a skilled Senior Software Engineer to join our team. The ideal candidate will have a strong background in cloud architecture and DevOps, with expertise in software development and cloud computing. The successful candidate will be responsible for designing, implementing, and maintaining cloud-based systems, as well as ensuring the smooth...


  • Ahmedabad, Gujarat, India TEKSUN Full time

    Job Title: Senior DevOps Engineer. Job Summary: We are seeking a highly skilled Senior DevOps Engineer to join our team at TEKSUN. As a key member of our Engineering team, you will be responsible for designing, implementing, and maintaining our cloud infrastructure and ensuring the smooth operation of our services. Key Responsibilities: Establish and maintain...


  • Ahmedabad, Gujarat, India TEKSUN Full time

    Job Title: Senior DevOps Engineer. Job Summary: We are seeking a highly skilled Senior DevOps Engineer to join our team at TEKSUN. As a key member of our Engineering team, you will be responsible for designing, implementing, and maintaining our cloud infrastructure and ensuring the smooth operation of our services. Key Responsibilities: Design and implement...


  • Ahmedabad, Gujarat, India Encora Inc. Full time

    Senior Infrastructure Systems Engineer - Cloud Architect. We are seeking a highly skilled Senior Infrastructure Systems Engineer - Cloud Architect to join our team at Encora Inc. This role will be responsible for designing, implementing, and maintaining our cloud infrastructure, ensuring it is secure, scalable, and efficient. Key Responsibilities: Design and...

  • DevOps Engineer

    4 weeks ago


    Ahmedabad, Gujarat, India Nlineaxis Full time

    Role: AWS DevOps Engineer. Location: Ahmedabad. Job Description: Cloud Architecture & Deployment: Design hybrid cloud architectures and target state architectures. Experience with private/public cloud (AWS, Azure, GCP), microservices, and hybrid cloud integration. Design and deploy IaaS/SaaS/PaaS solutions to meet client needs, leading teams through...


  • Ahmedabad, Gujarat, India Vyana Consultancy Full time

    Job Title: Senior DevOps Engineer. Job Summary: Vyana Consultancy is seeking a highly skilled Senior DevOps Engineer to lead our cloud infrastructure initiatives. As a key member of our team, you will be responsible for designing, implementing, and maintaining our DevOps and CloudOps processes and tools. Key Responsibilities: Lead the design and...


  • Ahmedabad, Gujarat, India Xoriant Full time

    About this role: We are seeking a highly skilled Senior Mobile React Native Developer to join our dynamic team. As a senior member of our engineering team, you will be responsible for designing, developing, and maintaining our mobile platform infrastructure. You will work closely with our frontend, backend, and DevOps engineers to ensure seamless integration...


  • Ahmedabad, Gujarat, India Azilen Technologies Full time

    Job Title: Senior DevOps Engineer. At Azilen Technologies, we are seeking a highly skilled Senior DevOps Engineer to join our team. As a key member of our engineering team, you will be responsible for designing and implementing technical solutions using the latest technologies and tools. Key Responsibilities: Design and implement scalable, secure cloud...


  • Ahmedabad, Gujarat, India Encora Inc. Full time

    Senior Cloud Infrastructure Automation Engineer. Main Responsibilities: Collaborate with our team to identify and prioritize automation opportunities to significantly improve infrastructure efficiency. Design, develop, and maintain scalable and secure cloud-native services and architecture using GCP, Azure, and other cloud platforms. Develop and implement...


  • Ahmedabad, Gujarat, India Azilen Technologies Full time

    Job Title: Senior DevOps Engineer. We are seeking a highly skilled Senior DevOps Engineer to join our team at Azilen Technologies. As a key member of our engineering team, you will be responsible for designing and implementing technical solutions using the latest technologies and tools. Key Responsibilities: Design and implement scalable, secure cloud...

  • Cloud Engineer

    2 weeks ago


    Ahmedabad, Gujarat, India Cygnet Infotech Full time

    Job Summary: Cygnet Infotech is seeking a highly skilled Cloud Engineer - DevOps to join our team. The ideal candidate will have experience setting up clusters on the Azure cloud platform, working with Kubernetes, and implementing security aspects. Key Responsibilities: Design and implement scalable cloud architectures on Azure. Develop and maintain Kubernetes...

  • Cloud Engineer

    2 weeks ago


    Ahmedabad, Gujarat, India Cygnet Infotech Full time

    About the Role: We are seeking a skilled Cloud Engineer - DevOps Expert to join our team at Cygnet Infotech. The ideal candidate will have experience with GCP DevOps, Kubernetes, and microservice architecture. Key Responsibilities: Design and implement scalable cloud infrastructure using GCP. Develop and maintain DevOps processes and tools. Collaborate with...


  • Ahmedabad, Gujarat, India Vyana Consultancy Full time

    Job Description: Vyana Consultancy is seeking a highly skilled Sr DevOpsCloudOps professional to join our team. Key Responsibilities: Design and Implement DevOps Processes: Develop and maintain efficient DevOps processes and tools to ensure seamless integration and operation of our cloud-based systems. Automation and Orchestration: Utilize automation and...


  • Ahmedabad, Gujarat, India Azilen Technologies Full time

    Job Requirements: We are seeking a highly skilled Senior DevOps Engineer to join our team at Azilen Technologies. As a key member of our infrastructure team, you will be responsible for designing and implementing best-engineered technical solutions using the latest technologies and tools. Key Responsibilities: Design and implement cloud infrastructure using...


  • Ahmedabad, Gujarat, India Infraveo Full time

    This is a remote position that offers flexibility and autonomy. As a Senior Engineer at Infraveo, you will be working on cloud native development, digital architecture, and integration automation. Key Responsibilities: Designing Solutions: Apply technical knowledge to drive outcomes for HR Business Partners. You will work on initiatives that align to the...


  • Ahmedabad, Gujarat, India Infraveo Full time

    Job Title: Senior Cloud Engineer - DevOps and Release. We are seeking an experienced Senior Cloud Engineer to join our DevOps and Release Engineering team. As a key member of our team, you will be responsible for leading the combined efforts of DevOps and Release Engineering to streamline our development processes, enhance deployment efficiency, and ensure the...


  • Ahmedabad, Gujarat, India Xoriant Full time

    About this Opportunity: We are seeking a senior mobile React Native developer to join our dynamic team. As a senior mobile React Native developer, you will be responsible for developing and maintaining the platform infrastructure that powers our platform. Key Responsibilities: Design and build our platform, working closely with our Frontend Engineers, DevOps...

  • DevOps Engineer

    4 weeks ago


    Ahmedabad, Gujarat, India TEKSUN Full time

    Job Title: Sr. DevOps Engineer. About the Role: We are seeking a highly skilled Sr. DevOps Engineer to join our team at TEKSUN. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining our cloud infrastructure and CI/CD pipelines. Key Responsibilities: Design and implement continuous integration and deployment...


  • Ahmedabad, Gujarat, India Infraveo Full time

    Job Description: We are seeking a highly skilled Senior Software Engineer to join our team as a Cloud Architecture specialist. The ideal candidate will have a strong background in cloud computing and experience with designing and implementing scalable cloud-based systems. Key Responsibilities: Design and implement cloud-based architectures for our...