I am Junfeng JIANG (江俊锋，こう　しゅんほう). Currently, I am a project researcher at the Research and Development Center for Large Language Models (LLMC) of the National Institute of Informatics (NII). Previously, I was a PhD student at the University of Tokyo under the supervision of Prof. Akiko Aizawa, supported by the SPRING GX Program.

My research focuses on developing biomedical large language models. My previous research direction was the Document-grounded Dialogue System (DGDS). Any interesting discussion is welcome, and my full list of publications can be accessed from Google Scholar.

Currently, I am working on research related to

Large Language Model (not so large, but too large to fit on a single A100 80GB GPU)
Biomedical Language Model
Chain-of-thought Finetuning

Any collaboration or discussion is welcome! I am also recruiting Research Assistants (RAs) at NII LLMC. If you are interested in this position, please feel free to send your CV/resume to jiang (at) nii.ac.jp. :)

🔥 News

2026.04: 🎉 A paper was accepted to the ACL 2026 main conference.
2026.02: 🎉 A paper was accepted by LREC 2026.
2025.09: 🎓 Finally obtained my PhD degree.
2025.09: 🎉 A paper was accepted as a full paper at NLPIR 2025.
2025.08: 🎉 A paper was accepted to EMNLP 2025 Findings. See you in Suzhou!

📖 Education

2022.04 - 2025.09, Ph.D. in Computer Science, The University of Tokyo.
2019.09 - 2022.04, M.S. in Computer Science (2.88/3.0), The University of Tokyo.
2015.08 - 2019.07, B.S. in Mathematics and Applied Mathematics (3.7/5.0), Sun Yat-Sen University.

🐂 Work Experience

2025.04 - Now, Project Researcher, National Institute of Informatics, Tokyo, Japan.

👨‍💻 Internships

2022.12 - 2025.04, Research Assistant, National Institute of Informatics, Tokyo, Japan.
2022.05 - 2022.09, NLP Research Intern, Alibaba DAMO Academy, Beijing, China.
2020.12 - 2021.05, NLP Research Intern, Baidu Inc., Shenzhen, China.
2020.08 - 2020.12, NLP R&D Intern, Tencent Inc., Shenzhen, China.
2019.10 - 2020.08, NLP Research Intern, Didi Chuxing AI Labs, Beijing, China.
2018.07 - 2019.01, AI Research Intern, Likelihood Lab, Guangzhou, China.

📝 Publications

Journals

An Wang, Huidong Jiang, Youmi Ma, Junfeng Jiang, Ao Liu, & Naoaki Okazaki. (2025). Improving Implicit Sentiments Analysis via Explanations of Multiple Perspectives, in IEEE Access, doi: 10.1109/ACCESS.2025.3556762.
Detai Xin*, Junfeng Jiang*, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, & Hiroshi Saruwatari. (2024). JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions, in IEEE Access, vol. 12, pp. 19752-19764, 2024, doi: 10.1109/ACCESS.2024.3360885.

Conference Papers

Fan Gao, Sherry T. Tong, Jiwoong Sohn, Jiahao Huang, Junfeng Jiang, Ding Xia, Piyalitt Ittichaiwong, Kanyakorn Veerakanjana, Hyunjae Kim, Qingyu Chen, Edison Marrese-Taylor, Kazuma Kobayashi, Akiko Aizawa, Irene Li. MED-COREASONER: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning. In the Association for Computational Linguistics: ACL 2026, to appear, San Diego, California, United States. Association for Computational Linguistics. (ACL 2026)
Akiko Aizawa, Yuki Arase, Fei Cheng, Jiahao Huang, Zhiyi Huang, Junfeng Jiang, Teruhito Kanazawa, Daisuke Kawahara, Kazuma Kobayashi, Takashi Kodama, Sadao Kurohashi, Yusuke Oda, Yuma Tsuta, Zhen Wan, Zhishen Yang and Rio Yokota. (to appear). Building Effective Japanese Medical LLMs with an Open Recipe for Domain Adaptation through Continued Pre-training. In Proceedings of the Fifteenth International Conference on Language Resources and Evaluation (LREC 2026).
Jiahao Huang, Fei Cheng, Junfeng Jiang, Kazuma Kobayashi, Akiko Aizawa. (2025, December). Language Bias in Multilingual RAG: A Case Study in the Japanese Medical Domain. In Proceedings of the 2025 9th International Conference on Natural Language Processing and Information Retrieval, December, Fukuoka, Japan. (NLPIR 2025).
Kazuma Kobayashi, Zhen Wan, Fei Cheng, Yuma Tsuta, Xin Zhao, Junfeng Jiang, Jiahao Huang, Zhiyi Huang, Yusuke Oda, Rio Yokota, Yuki Arase, Daisuke Kawahara, Akiko Aizawa, and Sadao Kurohashi. 2025. Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 11469–11488, Suzhou, China. Association for Computational Linguistics. (EMNLP 2025, findings).
Chengzhi Zhong, Qianying Liu, Fei Cheng, Junfeng Jiang, Zhen Wan, Chenhui Chu, Yugo Murawaki, and Sadao Kurohashi. 2025. What Language Do Non-English-Centric Large Language Models Think in?. In Findings of the Association for Computational Linguistics: ACL 2025, pages 26333–26346, Vienna, Austria. Association for Computational Linguistics. (ACL 2025, findings) [paper].
Junfeng Jiang, Jiahao Huang, and Akiko Aizawa. 2025. JMedBench: A Benchmark for Evaluating Japanese Biomedical Large Language Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5918–5935, Abu Dhabi, UAE. Association for Computational Linguistics. (COLING 2025). [paper]; [data]; [code].
Junfeng Jiang, Fei Cheng, and Akiko Aizawa. 2024. Improving Referring Ability for Biomedical Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 6444–6457, Miami, Florida, USA. Association for Computational Linguistics. (EMNLP 2024, findings). [paper]; [code].
Davide Baldelli, Junfeng Jiang, Akiko Aizawa and Paolo Torroni. TWOLAR: a TWO-step LLM-Augmented distillation method for passage Reranking. In European Conference on Information Retrieval, vol 14608, pages 470-485, 2024, Springer. doi: 10.1007/978-3-031-56027-9_29. (ECIR 2024). [paper]; [code].
Junfeng Jiang, Chengzhang Dong, Sadao Kurohashi, Akiko Aizawa. SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 4086–4101, Singapore. Association for Computational Linguistics. (EMNLP 2023). [paper]; [code].
An Wang, Junfeng Jiang, Youmi Ma, Ao Liu, and Naoaki Okazaki. 2023. Generative Data Augmentation for Aspect Sentiment Quad Prediction. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 128–140, Toronto, Canada. Association for Computational Linguistics. (*SEM 2023). [paper]; [code].
Che Liu, Rui Wang, Junfeng Jiang, Yongbin Li, Fei Huang. Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7272–7282, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. (EMNLP 2022). [paper]; [code].
Junfeng Jiang*, An Wang*, and Akiko Aizawa. Attention-based Relational Graph Convolutional Network for Target-Oriented Opinion Words Extraction. The 16th Conference of the European Chapter of the Association for Computational Linguistics, pp.1986–1997. Online, April 19–23, 2021. (EACL 2021). [paper]; [code].
Che Liu, Junfeng Jiang, Chao Xiong, Yi Yang, Jieping Ye. Towards Building an Intelligent Chatbot for Customer Service: Learning to Respond at the Appropriate Time. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3377-3385). (KDD 2020). [paper].

Preprints

Xanh Ho*, Anh Khoa Duong Nguyen*, An Tuan Dao*, Junfeng Jiang*, Yuki Chida*, Kaito Sugimoto*, Huy Quoc To, Florian Boudin, Akiko Aizawa. A Survey of Pre-trained Language Models for Processing Scientific Text. arXiv preprint arXiv:2401.17824. [paper].
Chao Xiong, Che Liu, Zijun Xu, Junfeng Jiang, Jieping Ye. Sequential Sentence Matching Network for Multi-turn Response Selection in Retrieval-based Chatbots. arXiv preprint arXiv:2005.07923. [paper].
Junfeng Jiang, Jiahao Li. Constructing financial sentimental factors in Chinese market using natural language processing. arXiv preprint arXiv:1809.08390. [paper]; [code].

💻 Projects

med-eval is an evaluation library for medical tasks.
BioMed-LLaMA is a project on continued pre-training of biomedical LLMs.
pytoflow is an unofficial PyTorch implementation of TOFlow: Video Enhancement with Task-Oriented Flow. [demo].

🎖 Honors and Awards

Best Presentation Award (最優秀賞) at the 13th AAMT for Young Translation Research Group: What language do Japanese-specialized large language models think in?

💰 Funding

2022.12 - 2023.03, Self-directed and integrated project research, SPRING GX, JST. (~500K JPY)
2022.04 - 2025.04, SPRING GX, JST. (~1M JPY)

👓 Committees

Invited Reviewer for conferences: COLING 2025; CIKM 2024; LREC-COLING 2024; ACL 2023, 2025; EACL 2023; EMNLP 2021, 2022, 2023.
Secondary Reviewer for conferences: ICML 2025; IJCNLP-AACL 2023.