
AI Systems Reliability Engineer
2 weeks ago
We are seeking a highly skilled and proactive AI Solutions SRE Lead to oversee the maintenance, optimization, and ongoing performance of deployed AI/ML systems and solutions.
In this role, you'll act as the bridge between innovation and operations, ensuring our AI solutions consistently deliver value and operate seamlessly in real-world environments. You will lead efforts to monitor deployments, troubleshoot issues, and define best practices for sustaining AI systems throughout their lifecycle.
Key Responsibilities
Maintenance & Optimization:
- Lead the post-deployment lifecycle of AI solutions, ensuring continued functionality, reliability, and scalability.
- Establish monitoring frameworks to oversee system performance, usage, and metrics for AI/ML models and APIs.
- Detect anomalies in AI systems, troubleshoot operational issues, and initiate timely corrective actions.
Performance Enhancement:
- Continuously assess and optimize the performance of AI models to maintain efficiency and accuracy in production environments.
- Collaborate with data scientists and engineers to refine algorithms, retrain models, and update solutions as needed.
- Implement automation where possible to streamline maintenance processes.
Stakeholder Engagement:
- Work with cross-functional teams (engineering, product, operations, etc.) to ensure alignment of AI sustainment activities with business goals.
- Communicate effectively with stakeholders to provide updates on system health, risks, and improvements.
Governance & Best Practices:
- Define and implement best practices for sustaining AI solutions, including documentation, testing protocols, and version control.
- Ensure compliance with ethical AI standards, regulatory guidelines, and established governance frameworks.
- Manage and mitigate risks associated with model drift, data shifts, and system vulnerabilities.
Incident Response:
- Lead responses to critical incidents involving AI systems by performing root cause analysis and deploying solutions for quick resolution.
- Advocate for proactive risk prevention and early detection strategies.
- Mentor and develop junior team members, fostering their skills in AI observability and domain-specific knowledge in ML, Computer Vision, and Generative AI.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, Data Science, or related field; advanced degree preferred.
- 9+ years of experience in machine learning, data science, or software engineering roles, with significant exposure to Computer Vision and Generative AI projects.
- 4+ years of experience focused specifically on developing and sustaining AI/ML applications and solutions.
- Strong programming skills in languages such as Python, Java, or Go.
- Extensive experience with AI/ML frameworks (e.g., TensorFlow, PyTorch, scikit-learn) and cloud platforms (e.g., AWS, Azure, GCP).
- Proficiency in data visualization tools and techniques (e.g., Grafana, Tableau, D3.js).
- Deep understanding of AI/ML concepts, including model training, evaluation, and deployment, with specific knowledge of Computer Vision and Generative AI techniques.
- Experience with monitoring and observability tools such as Prometheus, ELK stack, or similar systems.
- Excellent problem-solving skills and ability to troubleshoot complex AI systems across various domains.
- Proven track record of mentoring and developing junior team members in AI-related roles.
- Experience with MLOps practices and tools, particularly for large-scale AI systems.
- Familiarity with AI ethics and responsible AI principles, especially as they relate to Generative AI.
- Knowledge of relevant AI regulations and compliance requirements, including those specific to Computer Vision applications.
- Experience with distributed systems and large-scale data processing for AI applications.
- Contributions to open-source projects or research publications on production-scale AI solutions.
- Previous experience with large-scale AI/ML solutions in production environments.
- Knowledge of DevOps principles and CI/CD pipelines specific to AI/ML systems.
- Strong analytical and critical thinking skills
- Excellent communication and collaboration abilities
- Proactive and self-motivated work ethic
- Ability to explain complex technical concepts to both technical and non-technical audiences
- Adaptability and willingness to learn in a rapidly evolving field
- Strong mentorship and leadership skills
- Deep curiosity and passion for AI, particularly in ML, Computer Vision, and Generative AI domains