[1] Maxwell Ashford, “Facilitating Cross-Domain Reasoning Generalization through Conservative Offline Reinforcement Learning Leveraging Pre-trained Large Language Model Representations”, CIS, vol. 4, no. 1, May 2026.