Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
This is a page not in the main menu.
Stay tuned 🤪
This blog post walks through the concrete details of RLHF training frameworks (DeepSpeedChat, OpenRLHF, verl): each framework's overall architecture and the details of every part within it (both the logic and the code). (It is recommended to first read my earlier RLHF post, The Basic Knowledge of RLHF (Reinforce Learning with Human Feedback).)
This blog post explains the full PyTorch model-training pipeline in detail: how the computation graph is built during the forward pass, how gradients are computed and stored during backpropagation, and how the optimizer updates the model parameters from those gradients. (It is recommended to first read my earlier post on torch.autograd, The Basic Knowledge of PyTorch Autograd.)
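As a miniature of that pipeline, here is a pure-Python sketch of one forward/backward/update cycle for the toy model y = w * x with squared loss (all names are illustrative; PyTorch records the forward ops in a computation graph and derives the backward pass automatically):

```python
# One forward/backward/update cycle, written out by hand for the toy
# model y = w * x with squared loss.

def train_step(w, x, t, lr):
    # Forward pass: prediction and loss.
    y = w * x
    loss = (y - t) ** 2
    # Backward pass: chain rule, d(loss)/dw = d(loss)/dy * dy/dw.
    dloss_dy = 2 * (y - t)
    dy_dw = x
    grad_w = dloss_dy * dy_dw      # what w.grad would hold
    # Optimizer step: plain SGD, w <- w - lr * grad.
    w = w - lr * grad_w
    return w, loss

w = 0.0
for _ in range(100):
    w, loss = train_step(w, x=2.0, t=6.0, lr=0.05)
```

After enough steps w converges to 3.0, the value that makes the loss zero; an autograd engine does exactly this bookkeeping for arbitrarily deep graphs.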
This blog post covers the basic knowledge of RLHF and a concrete (simplified) code implementation for training LLMs.
This blog post explains how to install a macOS virtual machine in VMware Workstation Pro.
This blog post explains the principle of combining multiple models with a Mixture of Experts (MoE).
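To make the combination idea concrete, here is a minimal, self-contained sketch (toy experts and gate, not the post's code): a softmax gate scores each expert from the input, and the output is the gate-weighted sum of the expert outputs.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two toy scalar "experts".
experts = [lambda x: 2.0 * x, lambda x: x + 1.0]

def moe_forward(x, gate_w, gate_b):
    # Gating network: one linear score per expert, then softmax.
    gates = softmax([w * x + b for w, b in zip(gate_w, gate_b)])
    # Output: gate-weighted combination of the expert outputs.
    y = sum(g * expert(x) for g, expert in zip(gates, experts))
    return y, gates

y, gates = moe_forward(2.0, gate_w=[1.0, -1.0], gate_b=[0.0, 0.0])
```

Because the gates sum to one, the output always lies in the convex hull of the expert outputs; sparse MoE layers in LLMs additionally keep only the top-k gates to save compute.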
This blog post explains the principle and concrete implementation of using TorchScript to convert Python model code into other languages (such as C++).
This blog post explains the principle and concrete implementation of using automatic mixed precision (AMP) to reduce a model's memory footprint.
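One piece of the AMP story can be shown with the standard library alone: values below float16's smallest subnormal (about 6e-8) underflow to zero, which is why AMP-style training scales the loss before the half-precision backward pass and unscales the gradients in full precision afterwards. A stdlib-only sketch (the scale value is illustrative):

```python
import struct

def to_half(x):
    # Round a Python float through IEEE-754 half precision ('e' format).
    return struct.unpack('e', struct.pack('e', x))[0]

tiny_grad = 1e-8           # a gradient too small for float16
scale = 1024.0             # GradScaler-style loss scale

underflowed = to_half(tiny_grad)        # becomes 0.0 in half precision
scaled = to_half(tiny_grad * scale)     # survives as a half-precision value
recovered = scaled / scale              # unscaled back in full precision
```

Without scaling the gradient is lost entirely; with it, the unscaled value recovers the original gradient to within half-precision rounding error.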
This blog post explains the mathematical principle and concrete implementation of using a gradient penalty as a regularization term to aid model learning.
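As a toy illustration of the idea (not the post's code): for a linear critic D(x) = sum_i w_i * x_i the gradient with respect to the input is w itself, so the WGAN-GP-style penalty lam * (||grad_x D||_2 - 1)^2 has a closed form. In practice the gradient is taken by autograd at points interpolated between real and generated samples.

```python
import math

def gradient_penalty(w, lam=10.0):
    # For a linear critic, grad_x D(x) = w for every x, so the penalty
    # depends only on the weight vector's L2 norm.
    grad_norm = math.sqrt(sum(wi * wi for wi in w))  # ||grad_x D||_2
    return lam * (grad_norm - 1.0) ** 2

zero_pen = gradient_penalty([0.6, 0.8])   # unit gradient norm: no penalty
big_pen = gradient_penalty([3.0, 4.0])    # norm 5: heavily penalized
```

The penalty is zero exactly when the critic has unit gradient norm, which is how it softly enforces the 1-Lipschitz constraint.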
This blog post introduces PyTorch's autograd mechanism and how it is implemented.
This blog post introduces distributed parallel training methods for LLMs, focusing on how to implement DDP in PyTorch code.
torch.backends.cudnn.deterministic: makes cuDNN return a deterministic convolution algorithm (the default one) on every call, instead of a potentially different algorithm each run.
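A common reproducibility recipe built around this flag looks like the following (assuming PyTorch; a sketch of the usual seeding steps, not an exhaustive guarantee of determinism):

```python
import random

import numpy as np
import torch

# Seed every RNG in play (Python, NumPy, CPU, and all CUDA devices).
seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

# Make cuDNN return the same (deterministic) convolution algorithm on
# every run, and disable the benchmark autotuner, which could otherwise
# pick different algorithms depending on runtime conditions.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```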
This blog post introduces the basics of computer hardware. (P.S. I strongly recommend the hardware-science videos by 硬件茶谈 on Bilibili; they are excellent 🙂, although he runs a lot of sponsored content these days 😥.)
This blog post introduces the basics of Large Language Models, including common LLMs, fine-tuning methods, and more.
Paper title: Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
This blog post, drawing on 通俗理解EM算法, derives the Expectation Maximization (EM) algorithm in detail.
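The E-step/M-step alternation can be sketched in a few lines with the classic two-coins example (illustrative data and initialization; which coin produced each session is the latent variable):

```python
# Minimal EM for a mixture of two biased coins. Each entry in `heads`
# is the number of heads observed in n tosses of one of two coins with
# unknown biases theta_a and theta_b.

def em_two_coins(heads, n, theta_a, theta_b, iters=50):
    for _ in range(iters):
        ha = ta = hb = tb = 0.0
        for h in heads:
            # E-step: responsibility of coin A via the likelihood ratio
            # (the binomial coefficient cancels out).
            la = theta_a ** h * (1 - theta_a) ** (n - h)
            lb = theta_b ** h * (1 - theta_b) ** (n - h)
            ra = la / (la + lb)
            rb = 1.0 - ra
            # Accumulate expected head/tail counts for each coin.
            ha += ra * h
            ta += ra * (n - h)
            hb += rb * h
            tb += rb * (n - h)
        # M-step: re-estimate each bias from the expected counts.
        theta_a = ha / (ha + ta)
        theta_b = hb / (hb + tb)
    return theta_a, theta_b

theta_a, theta_b = em_two_coins([9, 8, 2, 1, 9], n=10,
                                theta_a=0.6, theta_b=0.4)
```

With this data the estimates separate cleanly: the high-heads sessions are attributed to one coin and the low-heads sessions to the other, and each M-step is just a weighted maximum-likelihood estimate.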
This blog post introduces the basics of NLP tasks, including evaluation metrics and tokenization algorithms.
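As one concrete metric example (a common token-level F1 variant, not necessarily the post's exact definition, with whitespace tokenization for simplicity): precision and recall are computed over the multiset of tokens shared by prediction and reference.

```python
from collections import Counter

def token_f1(prediction, reference):
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    # Overlap = size of the multiset intersection of the two token bags.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = token_f1("the cat sat", "the cat sat down")
```

Here precision is 1.0 (every predicted token appears in the reference) but recall is 0.75, so the harmonic mean lands at 6/7.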
This blog post, following What are Diffusion Models?, continues with a detailed treatment of the mathematical principles, derivations, and implementation of improvements to the recently popular DM models. (P.S. For DM basics, see The Basic Knowledge of Diffusion Model (DM).)
This post reviews three recent Meta papers on visual processing (the Emu series), in order of release: first, the SOTA text-to-image generation model Emu; next, using Emu as a baseline, the image-editing follow-up Emu Edit, a unified image-editing model that essentially sweeps all the mainstream image-domain tasks; and finally Emu Video, which builds on Emu to improve text-to-video generation and also achieves SOTA. (P.S. My guess is that video editing is next 🙂.)
This blog post, drawing on Generative Modeling by Estimating Gradients of the Data Distribution, gives a detailed account of another way to understand and derive the recently popular Diffusion Model: the mathematics and implementation of Score-based Generative Models. (P.S. I recommend reading the Generative Modeling by Estimating Gradients of the Data Distribution blog post first; although it is entirely in English, it is very detailed, easy to follow, and genuinely excellent.)
Paper title: Consistency Models
Paper title: Improving Sign Language Translation with Monolingual Data by Sign Back-Translation
This blog post, drawing on a DDPM tutorial (DDPM讲解), details the mathematical derivation and implementation of the recently popular DM models. (P.S. I strongly recommend Lil's blog; it is excellent 🙂.)
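For orientation, the core DDPM equations such a derivation covers can be summarized as follows (standard DDPM notation with noise schedule $\beta_t$, $\alpha_t = 1-\beta_t$, and $\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s$; a summary, not the post's own text):

```latex
% Forward process: add Gaussian noise at each step t
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\right)

% Closed form for sampling x_t directly from x_0
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1-\bar{\alpha}_t)\, \mathbf{I}\right)

% Simplified training objective: predict the injected noise \epsilon
L_{\text{simple}} = \mathbb{E}_{t,\, x_0,\, \epsilon}\!\left[\left\| \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right) \right\|^2\right]
```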
Paper title: Is Context all you Need? Scaling Neural Sign Language Translation to Large Domains of Discourse
This blog post explains a simple way to expand storage on a laptop that has only one SSD slot, without reinstalling the operating system.
Paper title: SLTUNET: A Simple Unified Model for Sign Language Translation
Paper title: Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation
Paper title: Transcribing Natural Languages for the Deaf via Neural Editing Programs
This blog post continues The Basic Knowledge Of Latex, extending coverage of basic LaTeX usage.
This blog post records the problems I have run into while using LaTeX and how I solved them. (Note: for some problems I may not know the underlying cause myself, but every solution has been personally verified to work.)
This blog post records basic LaTeX knowledge and usage, aimed at complete beginners who have just heard of the LaTeX typesetting tool.
Paper title: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Paper title: iTransformer: Inverted Transformers Are Effective for Time Series Forecasting
Paper title: Multipattern Mining Using Pattern-Level Contrastive Learning and Multipattern Activation Map
in Arxiv, 2023
This paper proposes the Flowmind2digital method and the hdFlowmind dataset to address the conversion of hand-drawn flowcharts and mindmaps.
Download here
in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023
This paper proposes a Heterogeneous Network based on Contrastive Learning (HCLNet). HCLNet aims to learn high-level representation from unlabeled PolSAR data for few-shot classification according to multi-features.
Download here
in Neural Information Processing Systems, 2025
This work is the first to systematically investigate the effectiveness and underlying mechanisms of activation engineering for mitigating hallucinations in VideoLLMs. It also proposes a temporal-aware activation engineering framework for VideoLLMs that adaptively identifies and manipulates hallucination-sensitive modules based on temporal variation characteristics, substantially mitigating hallucinations without additional LLM fine-tuning.
Download here
under review in Neural Information Processing Systems, 2025
To accurately model the intricate nature of length bias and facilitate more effective bias mitigation, it proposes FiMi-RM (Bias Fitting to Mitigate Length Bias of Reward Model in RLHF), a framework that autonomously learns and corrects underlying bias patterns.
Download here
under review in Neural Information Processing Systems, 2025
The research identifies a critical oversight in existing techniques: they predominantly focus on comparing responses while neglecting valuable latent signals embedded in prompt inputs, and they consider preference disparities only at the intra-sample level, neglecting the inter-sample preference differentials that exist among preference data. To leverage these previously neglected indicators, it proposes a novel Multi-level Aware Preference Learning (MAPL) framework capable of enhancing multi-instruction capabilities.
Download here
under review in Association for Computational Linguistics, 2026
The rise of reasoning models necessitates large-scale verifiable data, for which programming tasks serve as an ideal source. To address this, we propose a Feedback-Driven Iterative Framework for comprehensive test case construction and release CodeContests-O.
Download here
in International Conference on Learning Representations, 2026
It introduces a Response-conditioned Bradley-Terry (Rc-BT) model that, through training on the augmented dataset, improves the model's ability to mitigate length bias and follow length instructions. Furthermore, it proposes the Rc-RM and Rc-DPO algorithms, which leverage the Rc-BT model for reward modeling and direct policy optimization (DPO) of LLMs.
Download here
Undergraduate, Artificial Intelligence Turing Class, School of Artificial Intelligence, Xidian University, 2020
I completed my undergraduate studies at Xidian University from 2020 to 2024.