Senior Applied Scientist

3 days ago


Bengaluru, India Microsoft Full time

Job Description

Overview

Deep learning and generative models are reshaping how people discover and engage with ads. Microsoft Ads runs large-scale recommender systems that serve billions of requests under tight latency, cost, and reliability constraints. Delivering relevant ads efficiently at this scale requires innovation across the full stack: models, kernels, serving systems, and GPU/accelerator infrastructure.

We are looking for a Senior Applied Scientist with strong foundations in systems and machine learning, and with experience in one or more of the following areas:

- Large-scale inference and serving architectures
- Ads retrieval, ranking, and recommendation models optimized for online performance
- GPU/accelerator programming and kernel optimization

The primary success metrics are latency, cost, and revenue in production. The role also helps shape the technical direction of Ads recommendation and inference.

Responsibilities

- Design and optimize end-to-end Ads inference models and workflows for retrieval and ranking that meet strict p99 latency and throughput goals.
- Invent and implement efficiency techniques such as dynamic batching, routing, scheduling, caching, sequence packing, quantization, and speculative decoding to improve utilization and tail latency.
- Develop and tune GPU kernels and operators, e.g., kernel fusion, memory-aware layouts, and sparsity.
- Use profiling and diagnostic tools to analyze GPU utilization, memory bandwidth, and kernel performance.
- Design and evolve serving architectures for multi-tenant workloads, including policies for placement, parallelism, autoscaling, and safe rollout under real-world SLOs.
- Build and optimize caching layers and KV-cache management (feature/result caches, request deduplication, paging/offload) to improve both latency and efficiency.
- Co-design model architectures that are inference-friendly while preserving or improving quality metrics.

Qualifications

- Bachelor's or Master's degree in Computer Science, Electrical/Computer Engineering, or a related field, with 6+ years of related experience.
- Strong programming skills in C++ or Python (both are a plus; at least one is required).
- Hands-on experience in one or more of:
  - Implementing and deploying deep learning models for online inference
  - Building and operating latency-sensitive online services at scale
  - GPU/accelerator programming and performance optimization
- Experience with deep learning frameworks such as PyTorch, TensorFlow, or JAX.
- Ability to design experiments, analyze results, and make data-driven decisions in complex systems.
- Strong communication and collaboration skills, with experience working across ML, systems, and product or business stakeholders.

Preferred Qualifications

- 3+ years of experience in kernel programming and inference optimization.
- Experience with inference serving frameworks (for example: vLLM, Triton Inference Server, TensorRT-LLM, or similar).
- Deep understanding of inference efficiency techniques for LLM/SLM (paged KV cache, continuous batching/sequence packing, speculative decoding, quantization, adapters/LoRA, sparsity).
- Familiarity with compiler and auto-tuning techniques, automated kernel/code generation, or ML-based performance optimization.
- Background in cost/performance modeling, capacity planning, and autoscaling for large fleets of GPUs or accelerators.
- Experience in Ads, search, recommendations, or similar large-scale ranking systems where latency, cost, and relevance are jointly optimized (strong plus, but not required).
- Track record of impact via research publications, patents, or shipping large-scale systems in ML, systems, or recommendation domains.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.



  • Bengaluru, India Amazon Full time

    Are you passionate about building data-driven applied science solutions to drive the profitability of the business? Are you excited about solving complex real world problems? Do you have proven analytical capabilities, exceptional communication, project management skills, and the ability to multi-task and thrive in a fast-paced...


  • Bengaluru, Karnataka, India Amazon Full time

    Are you passionate about building data-driven applied science solutions to drive the profitability of the business? Are you excited about solving complex real world problems? Do you have proven analytical capabilities, exceptional communication, project management skills, and the ability to multi-task and thrive in a fast-paced environment? Join us a Senior...


  • Bengaluru, Karnataka, India Oracle Full time

    Oracle Cloud Infrastructure blends the speed of a startup with the scale of an enterprise leader. Our Generative AI and AI Solutions Engineering team builds advanced AI solutions that run on powerful cloud infrastructure tackling real-world, global challenges. In this role, you will: Design, build, and deploy cutting-edge machine learning and generative AI...


  • Bengaluru, India Amazon Development Centre (India) Private Limited - S55 Full time

    Are you passionate about building data-driven applied science solutions to drive the profitability of the business? Are you excited about solving complex real world problems? Do you have proven analytical capabilities, exceptional communication, project management skills, and the ability to multi-task and thrive in a fast-paced environment? Join us a Senior...


  • Bengaluru, India Amazon Full time

    Are you passionate about building data-driven applied science solutions to drive the profitability of the business? Are you excited about solving complex real world problems? Do you have proven analytical capabilities, exceptional communication, project management skills, and the ability to multi-task and thrive in a fast-paced environment? Join us a Senior...


  • Bengaluru, India Amazon Full time

    This job is with Amazon, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly. DESCRIPTION: Are you passionate about building data-driven applied science solutions to drive the profitability of the business? Are you excited about solving complex real world...


  • Bengaluru, India Microsoft Full time

    With the rapid acceleration of AI and the need to deliver trustworthy, high-performing models, Microsoft's Customer Experience (CXP) Data Science team is driving innovation at scale! Our mission is to advance AI capabilities through rigorous evaluations, fine-tuning, and large-scale experimentation to create intelligent,...


  • Bengaluru, Karnataka, India Microsoft Full time ₹ 15,00,000 - ₹ 30,00,000 per year

    The Microsoft Word Writing Assistance and Language Intelligence team, located in Redmond, Washington, builds large-scale natural language understanding and natural language generation models using cutting-edge Natural Language Processing technologies and advanced linguistic knowledge. The team has a 25-year track record of delivering innovative AI-powered...


  • Bengaluru, Karnataka, India Microsoft Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Microsoft Advertising empowers businesses to reach global audiences with precision and creativity. We build cutting-edge technology that connects advertisers and consumers across search, display, video, and emerging media platforms. We are seeking a Senior Applied Scientist to drive innovation in training and deploying SLM/LRM for advertising. In this role,...


  • Bengaluru, India Amazon Full time

    This job is with Amazon, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly. DESCRIPTION: Are you passionate about building data-driven applied science solutions to drive the profitability of the business? Are you excited about solving complex real world...