Restored Cloud | Machine Learning Engineer

1 day ago


India Restored Cloud Full time

Machine Learning Engineer - Infrastructure

Job Description:

As a Machine Learning Engineer specializing in infrastructure at Restored Cloud, you will design and build the tools, frameworks, and systems that enable efficient training, deployment, and scaling of machine learning models. You will work on cutting-edge challenges in model optimization, infrastructure automation, and distributed computing to support high-performance AI/ML workflows. Your work will directly impact how engineers train and deploy large-scale models seamlessly and reliably.


Responsibilities:


  • Develop and maintain ML infrastructure for distributed model training and inference.
  • Implement tools for model versioning, experiment tracking, and automated deployments.
  • Optimize ML pipelines to improve training and inference efficiency at scale.
  • Collaborate with data scientists and engineers to integrate ML workflows with existing systems.
  • Monitor and ensure the reliability, security, and performance of the ML infrastructure.
  • Adapt to new technologies and take on new responsibilities and roles in a fast-paced, growing company.




Qualifications:


  • Experience with ML frameworks like TensorFlow, PyTorch, or JAX.
  • Knowledge of MLOps tools such as MLflow, Kubeflow, or Airflow.
  • Proficiency in containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Strong programming skills in Python and familiarity with CI/CD pipelines.
  • Understanding of distributed training methods and hardware acceleration (e.g., GPUs, TPUs).
  • Experience working with LLMs and models of more than 10B parameters.
  • 5+ years of experience in Machine Learning, Systems Engineering, or a related field.



