Lead Solutions Architect AI Infrastructure

3 days ago


Hyderabad, Telangana, India Algoleap Technologies Full time ₹ 40,00,000 - ₹ 80,00,000 per year

SUMMARY

PS- Global Competency Center

Hewlett Packard Enterprise

Job Title: Lead Solutions Architect, AI Infrastructure & Private Cloud

Job Description:

We are seeking an experienced Lead Solutions Architect with deep expertise in AI/ML infrastructure, High Performance Computing (HPC), and container platforms to join our dynamic team focused on delivering HPE Private Cloud AI and Enterprise AI Factory solutions. This role is instrumental in architecting, deploying, and optimizing private cloud environments that leverage HPE's co-developed solutions with NVIDIA, as well as validated HPE reference architectures, to support enterprise-grade AI workloads at scale.

The ideal candidate will bring strong technical expertise in AI infrastructure, container orchestration platforms, and hybrid cloud environments, and will play a key role in delivering scalable, secure, and high-performance AI platform solutions powered by HPE GreenLake and NVIDIA AI Enterprise technologies.

Key Responsibilities:

  1. Leadership and Strategy:

Provide delivery assurance and serve as the lead design authority to ensure seamless execution of enterprise-grade container platforms (including Red Hat OpenShift and SUSE Rancher), HPE Private Cloud AI, and HPC/AI solutions, fully aligned with customer AI/ML strategies and business objectives.

Align solution architecture with NVIDIA Enterprise AI Factory design principles, including modular scalability, GPU optimization, and hybrid cloud orchestration.

Oversee planning, risk management, and stakeholder alignment throughout the project lifecycle to ensure successful outcomes.

  2. Solution Planning and Design:

Architect and optimize end-to-end solutions across container orchestration and HPC workload management domains, leveraging platforms such as Red Hat OpenShift, SUSE Rancher, and/or workload schedulers like Slurm and Altair PBS Pro.

Ensure seamless integration of container and AI platforms with the broader software ecosystem, including NVIDIA AI Enterprise, as well as open-source DevOps, AI/ML tools, and frameworks.

  3. Opportunity Assessment:

Lead technical responses to RFPs, RFIs, and customer inquiries, ensuring alignment with business and technical requirements.

Conduct proof-of-concept (PoC) engagements to validate solution feasibility, performance, and integration within customer environments.

Assess customer infrastructure and workloads to recommend optimal configurations using validated reference architectures from HPE and strategic partners such as Red Hat, NVIDIA, and SUSE, along with components from the open-source ecosystem.

  4. Innovation and Research:

Stay current with emerging technologies, industry trends, and best practices across HPC, Kubernetes, container platforms, hybrid cloud, and security to inform solution design and innovation.

  5. Customer-Centric Mindset:

Act as a trusted advisor to enterprise customers, ensuring alignment of AI solutions with business goals.

Translate complex technical concepts into value propositions for stakeholders.

  6. Team Collaboration:

Collaborate with cross-functional teams, including subject matter experts in infrastructure components such as HPE servers, storage, and networking, as well as data science teams, to ensure cohesive and integrated solution delivery.

Mentor technical consultants and contribute to internal knowledge sharing through tech talks and innovation forums.

Required Skills:

  1. HPC & AI Infrastructure

Extensive knowledge of HPC technologies and workload schedulers such as Slurm and/or Altair PBS Pro.

Proficient in and experienced with HPC cluster management tools, including HPE Performance Cluster Manager (HPCM) and/or NVIDIA Base Command Manager.

Solid understanding of high-speed networking stacks (InfiniBand, Ethernet, Mellanox) and performance tuning of HPC components.

  2. Containerization & Orchestration

Extensive hands-on experience with containerization technologies such as Docker, Podman, and Singularity.

Proficiency with at least two container orchestration platforms: CNCF Kubernetes, Red Hat OpenShift, SUSE Rancher (RKE/K3s), Canonical Charmed Kubernetes.

Strong understanding of GPU technologies, including the NVIDIA GPU Operator for Kubernetes-based environments and DCGM (Data Center GPU Manager) for GPU health and performance monitoring.

  3. Operating Systems & Virtualization

Extensive experience in Linux system administration, including package management, boot process troubleshooting, performance tuning, and network configuration.

Proficient with multiple Linux distributions, with hands-on expertise in at least two of the following: RHEL, SLES, and Ubuntu.

Experience with virtualization technologies, including KVM and OpenShift Virtualization, for deploying and managing virtualized workloads in hybrid cloud environments.

  4. Cloud, DevOps & MLOps

Solid understanding of hybrid cloud architectures and experience working with major cloud platforms in conjunction with on-premises infrastructure.

Familiarity with DevOps practices, including CI/CD pipelines, infrastructure as code (IaC), and microservices-based application delivery.

Experience integrating and operationalizing open-source AI/ML tools and frameworks, supporting the full model lifecycle from development to deployment.

Good understanding of cloud-native security, observability, and compliance frameworks, ensuring secure and reliable AI/ML operations at scale.

  5. Networking & Protocols

Strong understanding of core networking principles, including DNS, TCP/IP, routing, and load balancing, essential for designing resilient and scalable infrastructure.

Working knowledge of key storage and file access protocols, such as S3, NFS, and SMB/CIFS, for data access, transfer, and integration across hybrid environments.

  6. Programming & Automation

Proficiency in scripting or programming languages such as Python and Bash.

Experience automating infrastructure and AI workflows.

  7. Soft Skills & Leadership

Excellent problem-solving, analytical thinking, and communication skills for engaging both technical and non-technical stakeholders.

Proven ability to lead complex technical projects from requirements gathering through architecture, design, and delivery.

Strong business acumen with the ability to align technical solutions with client challenges and objectives.

Qualifications:

  • Bachelor's or master's degree in Computer Science, Information Technology, or a related field.
  • Professional certifications in AI infrastructure, containers, and Kubernetes are highly desirable, such as RHCSA, RHCE, CNCF certifications (CKA, CKAD, CKS), and NVIDIA-Certified Associate - AI Infrastructure and Operations.
  • Typically 8-10 years of hands-on experience in architecting and implementing HPC, AI/ML, and container platform solutions within hybrid or private cloud environments, with a strong focus on scalability, performance, and enterprise integration.

