Publications

Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks

under review at Neural Information Processing Systems, 2025

This work identifies a critical oversight in existing techniques: they predominantly compare responses while neglecting the latent signals embedded in prompt inputs, and they consider preference disparities only at the intra-sample level, ignoring the inter-sample preference differences across the preference data. To exploit these neglected signals, it proposes a Multi-level Aware Preference Learning (MAPL) framework that enhances multi-instruction capabilities.

Recommended citation: Sun Ruopei, Cai Jianfeng. (2025). "Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks." arXiv preprint arXiv:2505.12845, 2025. https://arxiv.org/abs/2505.12845

Bias Fitting to Mitigate Length Bias of Reward Model in RLHF

under review at Neural Information Processing Systems, 2025

To model the intricate nature of length bias accurately and mitigate it more effectively, this paper proposes FiMi-RM (Bias Fitting to Mitigate Length Bias of Reward Model in RLHF), a framework that autonomously learns and corrects the underlying bias patterns.

Recommended citation: Zhao Kangwen, Cai Jianfeng. (2025). "Bias Fitting to Mitigate Length Bias of Reward Model in RLHF." arXiv preprint arXiv:2505.12843, 2025. https://arxiv.org/abs/2505.12843

Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering

under review at Neural Information Processing Systems, 2025

This work is the first to systematically investigate the effectiveness and underlying mechanisms of activation engineering for mitigating hallucinations in VideoLLMs. It also proposes a temporal-aware activation engineering framework for VideoLLMs that adaptively identifies and manipulates hallucination-sensitive modules based on temporal variation characteristics, substantially mitigating hallucinations without additional LLM fine-tuning.

Recommended citation: Cai Jianfeng. (2025). "Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering." arXiv preprint arXiv:2505.12826, 2025. https://arxiv.org/abs/2505.12826

Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling

under review at Empirical Methods in Natural Language Processing, 2025

This paper introduces a Response-conditioned Bradley-Terry (Rc-BT) model that, trained on an augmented dataset, improves the model’s ability both to mitigate length bias and to follow length instructions. It further proposes the Rc-RM and Rc-DPO algorithms, which apply the Rc-BT model to reward modeling and direct preference optimization (DPO) of LLMs.

Recommended citation: Cai Jianfeng. (2025). "Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling." arXiv preprint arXiv:2502.00814, 2025. http://arxiv.org/abs/2502.00814

Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification

in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023

This paper proposes a Heterogeneous Network based on Contrastive Learning (HCLNet). HCLNet learns high-level representations from unlabeled PolSAR data, drawing on multiple features, for few-shot classification.

Recommended citation: Cai, Jianfeng, et al. "Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2024). https://ieeexplore.ieee.org/abstract/document/10601228

Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach

reviewed in Multidisciplinary Digital Publishing Institute Electronics, 2023

This paper proposes the Flowmind2Digital method and the hdFlowmind dataset to address the conversion of hand-drawn flowcharts and mind maps.

Recommended citation: Liu Huanyu*, Cai Jianfeng*. (2024). "Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach." arXiv preprint arXiv:2401.03742, 2024. http://arxiv.org/abs/2401.03742

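Image Recognition and Intelligent Conversion Method, System, and Computer-Readable Medium for Hand-Drawn Scenarios

under review at the China National Intellectual Property Administration, 2023

This invention belongs to the technical field of image recognition and human-computer interaction. It relates to the recognition of hand-drawn flow sketches and the generation of standard, computer-editable formats from them, and specifically to an image recognition and intelligent conversion method, system, and computer-readable medium for hand-drawn scenarios.

Recommended citation: Cai Jianfeng. (2023). "Image Recognition and Intelligent Conversion Method, System, and Computer-Readable Medium for Hand-Drawn Scenarios." China National Intellectual Property Administration.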