Microsoft's mission is to empower every person and every organization on the planet to achieve more. Microsoft believes that AI will play a critical role in accomplishing that mission.
We are seeking a Senior Applied Scientist to contribute to the ongoing development of Microsoft 365 Copilot in Excel, with an emphasis on optimizing LLM and agentic workflows. This position is part of the Microsoft 365 Excel team, dedicated to pioneering advancements in generative AI. The successful candidate will possess substantial expertise in large language models (LLMs), information retrieval, and machine learning. Responsibilities include close collaboration with engineering and product teams to innovate, design, and assess comprehensive AI solutions for millions of enterprise users. This role will influence technical direction, inform product development, and foster cross-team collaboration to deliver impactful AI-driven experiences that empower users to achieve more.
You’ll work as part of an Applied Science team on high-impact, technically ambitious AI projects that directly shape the future of Microsoft 365 Copilot in Excel, including:
- You will design, fine-tune, and deliver models and agentic flows for integration with Excel Agent and on-canvas experiences.
- You will leverage state-of-the-art LLM fine-tuning and retrieval methods, with robust evaluation metrics and A/B testing to ensure data-driven progress.
- You will gather and curate relevant benchmarks, build a comprehensive evaluation framework, and develop GPT-based evaluators (LLM-as-a-Judge). You will run controlled experiments to compare performance, efficiency, and scalability using data-driven metrics and A/B testing, focusing on reproducible and impactful results.
- You will continuously study emerging literature, share insights with leadership and peers during research reviews and deep dives, adapt quickly to new findings, integrate them into experiments, and, when applicable, share them with the broader research community.
M.Sc. / Ph.D. in Computer Science, Information Systems, or Data Science (Ph.D. strongly preferred). Candidates with a master’s degree and proven industry experience or a strong publication record in the areas of LLMs, Information Retrieval, Machine Learning, Natural Language Processing, and Deep Learning are considered as well.
We require strong hands-on experience (at least 3 years) in building and deploying Machine Learning products. Key areas of expertise include Natural Language Processing and Large Language Models, along with an understanding of concepts such as Privacy and Responsible AI. Candidates are expected to demonstrate a strong history of successfully translating applied research into production-ready solutions, along with a proven track record of delivering projects within large-scale production environments.
We are seeking candidates with proven expertise in the LLM domain and comprehensive knowledge of its core concepts. Ideal applicants should be proficient in areas such as LLM post-training (including CPT, SFT, and RL), LLM benchmarking, agentic flows, and model alignment.
Candidates should have outstanding proficiency in problem-solving and data analysis, with substantial expertise in applied statistics, as well as notable experience in evaluating the performance of large language models (LLMs) and developing benchmarks tailored to practical scenarios.
Preferred Qualifications:
Ph.D. in Computer Science, Information Systems, or Data Science.
Proven track record in training and post-training large language models, using reinforcement learning or similar techniques.
First-hand experience building LLM flows and agentic AI models.
Customer obsession and a passion for making real-world product impact.
Excellent verbal and written communication skills, with the ability to simplify and explain complex ideas.
Effective collaboration skills within a globally distributed organization.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.