Maxwell Ashford. Facilitating Cross-Domain Reasoning Generalization through Conservative Offline Reinforcement Learning Leveraging Pre-trained Large Language Model Representations. CIS [Internet]. 2026 May 19 [cited 2026 Jul. 12];4(1). Available from: https://scivexus.org/index.php/CIS/article/view/196