ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search

A novel framework for knowledge-augmented reasoning in large language models

Yize Zhang1,2,3*    Tianshu Wang4,5,7*    Sirui Chen1,6
Kun Wang4    Xingyu Zeng4    Hongyu Lin5
Xianpei Han5    Le Sun5    Chaochao Lu1,2†
1Shanghai AI Laboratory    2Shanghai Innovation Institute    3Shanghai Jiao Tong University
4SenseTime    5Institute of Software, Chinese Academy of Sciences    6Tongji University
7Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences
ez220523@sjtu.edu.cn, tianshu2020@iscas.ac.cn, luchaochao@pjlab.org.cn
*Equal contribution. †Corresponding author.

🧠 Abstract

Large language models (LLMs) have demonstrated impressive capabilities, and scaling test-time compute to enhance their reasoning has received increasing attention. However, their application in open-ended, knowledge-intensive, complex reasoning scenarios remains limited.

Reasoning-oriented methods struggle to generalize to open-ended scenarios because they implicitly assume complete world knowledge. Meanwhile, knowledge-augmented reasoning (KAR) methods fail to address two core challenges:

  1. Error propagation: where errors in early steps cascade through the chain
  2. Verification bottleneck: where the explore–exploit trade-off arises in multi-branch decision processes

To overcome these limitations, we introduce ARise, a novel framework that integrates risk assessment of intermediate reasoning states with dynamic retrieval-augmented generation (RAG) within a Monte Carlo tree search paradigm. This approach enables effective construction and optimization of reasoning plans across multiple maintained hypothesis branches.

Experimental results show that ARise significantly outperforms state-of-the-art KAR methods by up to 23.10%, and the latest RAG-equipped large reasoning models by up to 25.37%.

πŸš€ Key Features

πŸ” Method: ARise Pipeline

Figure 1: ARise Pipeline Overview

ARise iteratively refines reasoning steps through decomposition followed by retrieval-then-reasoning, providing fine-grained knowledge to LLMs. MCTS treats each step as a node in the search tree, expanding linear reasoning into a tree to mitigate error propagation: it enables exploration of alternative reasoning paths and allows backtracking when necessary. Risk assessment leverages Bayesian risk minimization to evaluate the quality of each reasoning state, dynamically optimizing the action strategy to steer the search toward promising directions.
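To make the pipeline concrete, here is a minimal Python sketch of such a risk-adaptive search loop. It is an illustration under stated assumptions, not the authors' released implementation: `llm.decompose`, `llm.reason`, `retriever`, and `risk` are hypothetical callables standing in for the decomposition model, the step-wise reasoner, the RAG retriever, and the Bayesian risk estimator, and `consistency_risk` is only a toy stand-in for the paper's risk-minimization objective.

```python
import math

class Node:
    """One intermediate reasoning state in the search tree."""
    def __init__(self, state, parent=None):
        self.state = state      # list of reasoning steps accumulated so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0        # running reward estimate (low risk => high reward)

    def uct(self, c=1.4):
        """Upper-confidence bound balancing exploitation and exploration."""
        if self.visits == 0:
            return float("inf")
        exploit = self.value / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def consistency_risk(votes):
    """Toy stand-in for Bayesian risk: score a state by the disagreement
    rate among sampled answers for it (higher disagreement = higher risk)."""
    if not votes:
        return 1.0
    top = max(votes.count(v) for v in set(votes))
    return 1.0 - top / len(votes)

def arise_search(question, llm, retriever, risk, iters=32, width=3):
    """Risk-adaptive MCTS over retrieval-augmented reasoning steps."""
    root = Node(state=[question])
    for _ in range(iters):
        # 1. Selection: descend by UCT until reaching a leaf state.
        node = root
        while node.children:
            node = max(node.children, key=lambda n: n.uct())
        # 2. Expansion: decompose into sub-questions, retrieve evidence,
        #    then reason one step per sub-question (retrieval-then-reasoning).
        for sub_q in llm.decompose(node.state, n=width):
            docs = retriever(sub_q)
            step = llm.reason(node.state, sub_q, docs)
            node.children.append(Node(node.state + [step], parent=node))
        # 3. Evaluation + 4. Backpropagation: convert risk into reward and
        #    push it up the path; shared statistics enable backtracking.
        for child in node.children:
            reward = 1.0 - risk(child.state)
            n = child
            while n is not None:
                n.visits += 1
                n.value += reward
                n = n.parent
    # Return the partial trajectory under the most-visited first step.
    return max(root.children, key=lambda n: n.visits).state
```

Using risk as the negative of the backpropagated reward keeps the standard UCT machinery intact while biasing exploration toward low-risk branches, which is the intuition behind the risk-adaptive action strategy described above.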

πŸ“Š Experimental Results

Comparison with Baseline Methods

Figure 2: Comparison with Baseline Methods

ARise demonstrates superior performance. Specifically, on the Qwen2.5-14B-Instruct model, ARise outperforms all baselines across all benchmarks, achieving absolute EM improvements of 19.83% over the vanilla RAG method, 13.29% over prompt-based baselines, and 15.5% over search-based baselines.

ARise maintains robust performance on the Qwen2.5-7B-Instruct model, with an absolute EM improvement of 13.67% over the vanilla RAG method, and surpasses the various baselines overall. We observe that ARise performs slightly worse on Llama models; nevertheless, it retains a notable F1 advantage on Llama, indicating its effectiveness at selecting more promising paths.

Comparison with Large Reasoning Models (LRMs)

Figure 3: Comparison with Large Reasoning Models

Learning-based LRMs such as the DeepSeek-R1 distilled models have not yet reached the point where they can match, let alone replace, search-based reasoning methods in performance.

Our empirical comparison between base models equipped with ARise and the DeepSeek-R1 distilled models reveals key insights into the effectiveness of test-time search. These learning-based LRMs distill a similar reasoning pattern from DeepSeek-R1. ARise exhibits a performance advantage over the LRMs, especially on the Qwen model series: on average, ARise achieves a relative improvement of 4.03%, underscoring the benefit of our search-based method.

πŸ“„ Citation

@article{zhang2025arise,
  title   = {ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search},
  author  = {Yize Zhang and Tianshu Wang and Sirui Chen and Kun Wang and Xingyu Zeng and Hongyu Lin and Xianpei Han and Le Sun and Chaochao Lu},
  year    = {2025},
  journal = {arXiv preprint arXiv:2504.10893}
}