GCP Infrastructure Engineer

5 days ago


Chennai, India UPS Full time

Job Description

Explore your next opportunity at an organization that ranks among the 500 largest companies in the world. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you grow every day. We know what it takes to lead UPS into the future: passionate people with a unique combination of skills. If you have the qualities, motivation, and autonomy, or the leadership to manage teams, there are roles suited to your aspirations and your skills, today and tomorrow.

Job Summary

We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle, from design and provisioning to automation, monitoring, and optimization, while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.

Key Responsibilities

Cloud Infrastructure & Platform Engineering

- Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
- Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
- Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
- Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
- Ensure business continuity through backup, disaster recovery, and multi-region deployments.

Automation & Reliability

- Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
- Adopt GitOps practices (Flux) for infrastructure lifecycle management.
- Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
- Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.

Security, Governance & Compliance

- Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
- Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
- Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.

Monitoring, Observability & Cost Optimization

- Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infrastructure health and ML-specific metrics (model drift, data anomalies).
- Define KPIs to monitor system health, performance, and adoption across AI workloads.
- Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.

Collaboration & Enablement

- Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
- Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
- Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.

Required Qualifications

Education

Bachelor's or master's degree in Computer Science, Software Engineering, or a related field.

Experience

- 5+ years of experience in cloud infrastructure engineering, DevOps, or platform engineering.
- Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
- Strong hands-on expertise with Google Cloud Platform (GCP), especially Vertex AI.
- Experience with IBM Watsonx for AI application deployment and management.
- Proven skills in Docker, Kubernetes (GKE), and container orchestration at scale.
- Proficiency in Python, Bash, or other relevant scripting languages.
- Strong understanding of cloud networking, IAM, and security best practices.
- Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
- Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
- Excellent problem-solving, debugging, and communication skills.

Preferred Experience

- Experience in MLOps practices for model deployment, monitoring, and retraining.
- Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
- Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
- Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
- Contributions to open-source projects in infrastructure, MLOps, or GenAI.
- Experience managing infrastructure in regulated industries.

Preferred Certifications

- Google Cloud Certified - Professional Cloud Architect
- Google Cloud Certified - Professional Machine Learning Engineer
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
- IBM Certified Watsonx Generative AI Engineer Associate
- IBM Certified Solution Architect - Cloud Pak for Data
- Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.

Contract Type

Permanent (CDI)

At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed.



  • Chennai, India MNR Solutions Full time

    Position: GCP Specialist. Location: Chennai. Work Mode: Hybrid. Experience: 6-9 Years. Responsibility: - Design and manage GCP infrastructure (Compute Engine, VPC, GKE, Cloud SQL, IAM, etc.) - Automate infra using Terraform or Deployment Manager - Monitor performance and optimize cloud cost - Ensure security and compliance best practices - Handle migrations,...

  • GCP Cloud Engineer

    2 weeks ago


    Hyderabad, India HCLTech Full time

    Job Description: Hiring for GCP Cloud Engineer at HCLTech. If interested, share your resume to [Confidential Information]. Experience: 7 to 15 years. Location: Bangalore, Chennai, Noida, Pune, Hyderabad. Please include: Total EXP, Rel EXP, CTC, Expected CTC, Notice Period. JD: Cloud Infrastructure Design & Implementation: - Design, deploy, and manage scalable, highly...


  • Chennai, Tamil Nadu, India NucleusTeq Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Description: 6+ years of direct experience working in IT Infrastructure. Experience with Relational Databases, Big Data technologies: Spark, Hadoop is preferred. Experience in understanding a complex customer's existing software workload and the ability to define a technical migration roadmap to the Cloud. Experience in Managing large ...


  • Chennai, Tamil Nadu, India NucleusTeq Full time ₹ 1,04,000 - ₹ 1,30,878 per year

    Description: 6+ years of direct experience working in IT Infrastructure. Experience with Relational Databases, Big Data technologies: Spark, Hadoop is preferred. Experience in understanding a complex customer's existing software workload and the ability to define a technical migration roadmap to the Cloud. Experience in Managing large scale Windows/Linux...


  • Chennai, India NucleusTeq Full time

    Description: 6+ years of direct experience working in IT Infrastructure. Experience with Relational Databases, Big Data technologies: Spark, Hadoop is preferred. Experience in understanding a complex customer's existing software workload and the ability to define a technical...


  • GCP DevOps Engineer

    6 days ago


    Chennai, Tamil Nadu, India Ford Motor Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    GCP DevOps Engineer: You will play a pivotal role in automating processes, optimizing cloud resources, and collaborating closely with development and operations teams to streamline application deployments and foster a culture of continuous improvement. Key Responsibilities: GCP Infrastructure Management: Design, implement, and manage GCP resources,...

  • GCP DevOps Engineer

    2 weeks ago


    Chennai, Tamil Nadu, India Raah Techservices Full time ₹ 6,00,000 - ₹ 18,00,000 per year

    About the Role: We are seeking an experienced GCP DevOps Engineer to join our team. The ideal candidate will have strong expertise in Google Cloud Platform (GCP) and DevOps practices to design, implement, and manage scalable, reliable, and secure cloud-based solutions. Key Responsibilities: Design, build, and maintain CI/CD pipelines using GCP-native tools and...

  • GCP DevOps Engineer

    4 weeks ago


    Chennai, Tamil Nadu, India Ford Motor Company Full time

    Job Description: You will play a pivotal role in automating processes, optimizing cloud resources, and collaborating closely with development and operations teams to streamline application deployments and foster a culture of continuous improvement. Key Responsibilities: GCP Infrastructure Management: - Design, implement, and manage...

  • GCP DevOps Engineer

    3 weeks ago


    Chennai, India Raah Techservices Full time

    We are seeking an experienced GCP DevOps Engineer to join our team in Chennai. The ideal candidate will have strong expertise in Google Cloud Platform (GCP) and DevOps practices to design, implement, and manage scalable, reliable, and secure cloud-based solutions. Key Responsibilities: Design, build, and maintain CI/CD pipelines using GCP-native tools and...