
Highly Skilled Infrastructure Architect
3 days ago
Job Summary
We are seeking a highly skilled and passionate Platform Engineer to join our dynamic team as an Infrastructure Architect. As a key member of our team, you will be instrumental in designing, developing, and maintaining cutting-edge build and test environments for critical GenAI workloads running on foundational cloud infrastructure.
You will partner with architects to design and implement highly robust and scalable systems, while also providing crucial development support to SRE/Operations teams as they tackle complex distributed systems challenges at scale. We're seeking an engineer who champions Quantiphi's dedication to Cloud-Native development, with a particular emphasis on Kubernetes.
Key Responsibilities:
- Designing and Implementing State-of-the-Art GPU Compute Clusters: You will design and implement state-of-the-art GPU compute clusters to support critical workloads.
- Automated Testing Strategies and Frameworks: You will design comprehensive automated testing strategies and frameworks across unit, integration, API, and end-to-end levels for critical commerce flows.
- Performance Testing Frameworks: You will develop robust performance testing frameworks to validate platform scalability and resilience and to identify optimization opportunities.
- Comprehensive Monitoring Solutions: You will plan comprehensive monitoring solutions with alerting systems to track platform health and ensure SLA compliance.
- Specialized Test Frameworks: You will design specialized test frameworks for security controls and ensure compliance validation across payment and personal data.
- Scalable Automation Infrastructure: You will architect a scalable automation infrastructure that supports growing platform capabilities with consistent test environments.
- Troubleshooting and Root Cause Analysis: You will troubleshoot, diagnose, and perform root cause analysis of system failures, isolating components and failure scenarios in collaboration with internal and external partners.
- Cluster Operations Optimization: You will optimize cluster operations for maximum reliability, efficiency, and performance.
Required Skills and Qualifications
- 6-8 Years of Experience in ML Infrastructure: 6-8 years of experience developing ML infrastructure.
- Large-Scale Experience with Kubernetes: Over 3 years of direct, hands-on experience building and deploying production-ready services on Kubernetes at large scale.
- Open-Source Contributions: A proven history of engaging with and contributing to open-source projects.
- Collaborative Spirit: A collaborative spirit, demonstrated by prior work developing scalable software solutions for cloud services.
- Effective Communication: The ability to effectively communicate complex technical designs and quality approaches across various mediums.
- GPU Computing and AI Infrastructure: A deep understanding of GPU computing and AI infrastructure.
- Technical Challenges and System Performance: A strong passion for solving complex technical challenges and optimizing system performance.
- Container Technologies and Programming Languages: Working knowledge of cluster configuration management tools such as BCM or Ansible, and infrastructure-level applications including Kubernetes, Terraform, and MySQL. Proficiency in programming with Python and Bash scripting.
Benefits and Advantages
- Sophisticated Infrastructure Tooling: Significant experience with sophisticated infrastructure tooling, including Kubernetes Cluster API, Terraform, Helm, and Operator Framework.
- Major Cloud Platforms: Practical, production-level experience across major cloud platforms: Azure, Google Cloud Platform (GCP), or Amazon Web Services (AWS).
- Adaptability to New Technologies: Ability to adapt to new technologies and frameworks in the ML/GenAI landscape.
- Software Refactoring and Optimization: A strong track record of successfully refactoring and optimizing software for deployment within Kubernetes environments.
- Kubernetes Concepts: Comfort discussing and working with core Kubernetes concepts like CSI, CNI, and CRI.
- CNCF Landscape and Associated Tooling: Comprehensive understanding of the CNCF landscape and its associated tooling.
- Complex Problem Decomposition: The ability to decompose complex problems into simpler sub-problems and leverage existing solutions for efficient implementation, along with designing simple, self-sustaining systems.
- AI/ML Incident Detection and Resolution: Experience leveraging AI/ML to proactively detect and resolve incidents, automate alert triaging, perform log analysis, and streamline repetitive workflows.
-
Cloud Infrastructure Architect
2 days ago
Ghaziabad, Uttar Pradesh, India beBeegcp Full time US$ 1,00,000 - US$ 1,20,000
Job Title: Cloud Infrastructure Architect
-
Highly Skilled Cloud Architect Wanted
4 days ago
Ghaziabad, Uttar Pradesh, India beBeeExpert Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Cloud Engineer Position Overview: We are seeking a highly skilled Cloud Engineer with strong AWS Serverless expertise to join our team.
Key Responsibilities and Requirements:
Design and implement automated monitoring and logging pipelines using CloudWatch, Dynatrace, and Grafana.
Develop self-healing workflows using Python, Node.js/TypeScript, Lambda,...
-
Highly Skilled Cloud Security Expert
4 days ago
Ghaziabad, Uttar Pradesh, India beBeeCloud Full time US$ 2,00,000 - US$ 2,50,000
Job Title: Senior Cloud DevSecOps Engineer
Key Responsibilities
Collaboration and Teamwork: Work closely with cross-functional teams to ensure secure, reliable, and high-performing software delivery.
Pipeline Development: Design, implement, and maintain end-to-end CI/CD pipelines using tools like Jenkins, GitHub Actions, GitLab CI, Azure DevOps, and...
-
Highly Skilled Cloud Solutions Architect
4 days ago
Ghaziabad, Uttar Pradesh, India beBeeCloudEngineer Full time ₹ 15,00,000 - ₹ 22,50,000
About the Opportunity: We are seeking a seasoned Cloud Engineer to craft, deploy and maintain scalable cloud infrastructure for our clients. The ideal candidate will have a proven track record with Azure and/or AWS cloud platforms, encompassing DevOps pipelines, Infrastructure as Code, and containerization technologies. Design, develop and maintain continuous...
-
CAD Infrastructure Specialist
4 days ago
Ghaziabad, Uttar Pradesh, India beBeeLSFArchitect Full time ₹ 30,00,000 - ₹ 45,00,000
Key Role: LSF Architect
We are seeking a skilled and experienced LSF Architect to lead the development of CAD/EDA infrastructure for our global design teams. The ideal candidate will have in-depth knowledge of LSF architecture, advanced features such as license fair share and preemption, and experience with automation dashboards for resource utilization and...
-
Ghaziabad, Uttar Pradesh, India beBeeNetwork Full time ₹ 8,00,000 - ₹ 15,00,000
Network Support Professional Wanted
We are seeking a skilled Network Support Engineer to join our IT department. The ideal candidate will be responsible for providing timely and effective support to ensure the smooth operation of our network infrastructure.
Operating System Installation and Configuration
Hardware Troubleshooting
Software Support
Network...
-
Senior Infrastructure Architect
5 days ago
Ghaziabad, Uttar Pradesh, India beBeeInfrastructure Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Description: We are seeking a visionary Platform Engineer to spearhead the development of our infrastructure from the ground up, creating a greenfield opportunity for innovative platform architecture design. This rare chance allows you to make pivotal technology choices that will shape our engineering culture and have a lasting impact on the company. You...
-
Highly Skilled Messaging Solutions Architect
5 days ago
Ghaziabad, Uttar Pradesh, India beBeeMessaging Full time ₹ 18,00,000 - ₹ 27,00,000
Solace Messaging Engineer Job Description
The role of Solace Messaging Engineer involves troubleshooting messaging systems, providing technical support, and managing Microsoft Exchange infrastructure.
-
Ghaziabad, Uttar Pradesh, India beBeeInfrastructure Full time ₹ 15,00,000 - ₹ 25,00,000
About the Role
We are seeking a skilled Senior System Engineer to provide day-to-day support of client networks, servers, storage, and overall infrastructure. The ideal candidate will maintain our clients' IT infrastructure remotely and at customer site(s) when needed. Additionally, they will help architect and implement a wide range of IT projects, such as...
-
Highly Skilled Data Systems Developer
6 days ago
Ghaziabad, Uttar Pradesh, India beBeeDataEngineer Full time ₹ 20,00,000 - ₹ 25,00,000
Job Description
We are seeking a highly skilled professional to design and develop large-scale data processing systems that enable advanced analytics and business intelligence. The successful candidate will be responsible for designing, constructing, and optimizing scalable and robust data pipelines that collect, process, and store large volumes of structured...