Return to Article Details Facilitating Cross-Domain Reasoning Generalization through Conservative Offline Reinforcement Learning Leveraging Pre-trained Large Language Model Representations Download Download PDF