DevOps & ML Ops Engineer
3 weeks ago
The DevOps & ML Ops Engineer will be responsible for developing and maintaining scalable, stable services that deliver machine learning models to end users with guaranteed uptime. The primary focus will be on the infrastructure, deployment, and continuous integration/continuous delivery (CI/CD) processes for our ML services.
RESPONSIBILITIES:
- Manage resource allocation and workload scheduling for multiple ML services, ensuring efficient utilization of CPU/GPU resources and creating reliable queues based on service priorities.
- Maintain VM environments, manage OS updates, and keep the VM inventory up to date.
- Work alongside the Dev and QA teams to detect hot spots in our applications and put preventative measures in place before they become live issues.
- Troubleshoot system configurations and provide solutions.
- Plan, execute, and test disaster recovery.
- Monitor and examine all application, performance, event, and system logs to assist in troubleshooting.
- File all IT/colocation tickets, ensure requests are fulfilled, and escalate to the right person if necessary.
- Design, develop, and maintain the infrastructure required for deploying and scaling machine learning services.
- Implement and manage the CI/CD pipelines to ensure seamless and efficient deployment of ML models.
- Collaborate with data scientists, ML researchers, and language experts to understand the requirements for deploying ML models and provide necessary infrastructure support.
- Automate and streamline the build, test, and deployment processes to enhance efficiency and reduce time-to-market.
- Monitor and optimize the performance, availability, and scalability of production ML systems.
- Develop and maintain robust monitoring, logging, and alerting systems to proactively identify and address issues.
- Implement security best practices to protect sensitive data and ensure compliance with relevant regulations.
- Stay up-to-date with industry trends and emerging technologies related to ML Ops and DevOps, and propose innovative solutions to improve our ML service delivery.
REQUIRED SKILLS, EXPERIENCE, AND QUALIFICATIONS:
- Strong knowledge of cloud platforms (such as AWS, Azure, or GCP) and local cluster deployments, and experience in deploying and managing ML services on these platforms.
- Knowledge of distributed computing frameworks (e.g., Spark) and big data technologies (e.g., Hadoop, Kafka).
- Proficiency in Python, Shell, Ruby, Golang, or C++ and experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Hands-on experience with containerization technologies (e.g., Docker) and orchestration frameworks (e.g., Kubernetes).
- Familiarity with CI/CD tools (e.g., Jenkins, GitLab CI/CD) and version control systems (e.g., Git).
- Solid understanding of networking, security, and system administration concepts.
- Strong problem-solving and troubleshooting skills, with the ability to quickly analyze and resolve issues in complex ML systems.
- Excellent communication and collaboration skills, with the ability to work effectively in a team-oriented environment.
- Bachelor's or higher degree in Computer Science, Engineering, or a related field.
- Proven experience as an ML Ops Engineer or DevOps Engineer, or in a similar role, with a focus on deploying and maintaining machine learning models in production environments.
DESIRED SKILLS AND EXPERIENCE:
- Experience with machine learning frameworks and libraries, such as TensorFlow, PyTorch, or scikit-learn.
- Familiarity with serverless computing and event-driven architectures.
- Experience with logging and monitoring tools (e.g., ELK Stack, Prometheus, Grafana).
- Understanding of software development methodologies and agile practices.