Junlin Yang

Hi, I'm Junlin Yang, a third-year student in the Department of Computer Science and Technology at Tsinghua University. In summer 2025, I'll be fortunate to visit the Siebel School of Computing and Data Science at the University of Illinois at Urbana-Champaign as a research intern, where I'll be advised by Prof. Hao Peng. I've also had the privilege of being a research intern at the XLANG Lab at The University of Hong Kong, advised by Prof. Tao Yu.

Feel free to reach out if you're interested in my research, looking for collaboration, or just want to chat!

Email  /  Twitter  /  Bluesky  /  Github  /  Google Scholar


Research

I am particularly interested in Machine Learning, with a focus on NLP, Reinforcement Learning, and Multimodal Learning. My research has focused on building embodied agents, especially computer agents, that excel at solving human tasks and collaborate effectively with people. I remain curious and open to exploring various research questions in ML.

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Tianbao Xie*, Jiaqi Deng*, Xiaochuan Li*, Junlin Yang*, Haoyuan Wu, Jixuan Chen, Wenjing Hu, Xinyuan Wang, Yuhui Xu, Zekun Wang, Yiheng Xu, Junli Wang, Doyen Sahoo, Tao Yu†, Caiming Xiong†
arXiv, 2025
project page / arXiv

TL;DR: GUI grounding is essential for computer-use agents, but current benchmarks oversimplify the task. We introduce OSWorld-G, a detailed benchmark, and Jedi, the largest GUI grounding dataset with 4M examples. Models trained on Jedi outperform existing methods and boost AI agents' task success on OSWorld from 5% to 27%. Our studies show that specialized, diverse data improves generalization to new interfaces.

Projects

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

2024.8 - present Maintainer

TL;DR: OSWorld is a first-of-its-kind scalable, real computer environment for multimodal agents, supporting task setup, execution-based evaluation, and interactive learning across operating systems. It can serve as a unified environment for evaluating open-ended computer tasks that involve arbitrary apps (e.g., the task examples in the figure above). We also created a benchmark of 369 real-world computer tasks in OSWorld with reliable, reproducible setup and evaluation scripts.

MartialArtsLM: Pretraining and Fine-tuning a Model Capable of Answering Questions about Martial Arts Novels

2023.8 - 2023.9 Individual Project

Pretrained a language model (LM) on preprocessed text from novels by Louis Cha, then applied supervised fine-tuning (SFT) to the LM with an estimated 400,000 pieces of synthesized Q&A data.

Selected Awards and Honors

  • Overall Excellence Scholarship, Tsinghua University, 2024
  • Overall Excellence Scholarship, Tsinghua University, 2023
  • Freshman Scholarship, Tsinghua University, 2022
  • Outstanding Student Cadre, Tsinghua University, 2023

Languages

  • Mandarin (Native)
  • English (Fluent)
  • TOEFL: 110 (R:29, L:29, S:25, W:27)

Service and Leadership

I founded the first alumni mutual-help platform at my high school. The platform aims to bridge the information gap for high school students from diverse economic, familial, and cognitive backgrounds by providing them with comprehensive insights into university academic life. By sharing experiences, offering guidance, and fostering a supportive community, the platform has become a vital resource for students who might otherwise lack access to such information. To date, it has received over 50k reads and attracted more than 2.5k followers.

Miscellanea

Senior Mentors

At Tsinghua, I was fortunate to meet some incredibly kind, talented, and supportive seniors, including Yuxuan Li and Zirui Cheng. During my internship at HKU, I was lucky to work closely with Tianbao Xie and Yiheng Xu. I'm also grateful to have collaborated with Xinyuan Wang and Bowen Wang on projects like AgentNet.

Hobbies

  • Athletics: Tennis, badminton, jogging
  • Arts: Music, film, reading

The source code is inspired by Jon Barron's website. Thanks to him for sharing it! 🙏