[10/2024] New!
One paper on representation modeling for the head direction system was accepted at the
Workshop on Symmetry and Geometry in Neural Representations (NeurReps) at NeurIPS 2024!
We introduce Latent-Thought Language Models (LTMs), a novel language model family that incorporates
explicit latent thought vectors. LTMs leverage dual-rate optimization, rapidly updating local latent
vectors while gradually refining global decoder parameters. This approach unlocks new scaling dimensions,
achieving better efficiency, lower perplexity, and stronger zero-shot performance than traditional models.
LTMs also exhibit emergent few-shot reasoning, highlighting their potential for advanced language tasks.
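A minimal sketch of the dual-rate idea in PyTorch; the toy decoder, learning rates, and number of inner steps below are illustrative assumptions of ours, not the configuration used in the paper.

# Minimal sketch of dual-rate optimization (illustrative, not the paper's code):
# a fast inner loop adapts per-sequence latent thought vectors with a large step size,
# while a slow outer loop refines the shared decoder parameters with a small step size.
import torch
import torch.nn as nn

class TinyLatentDecoder(nn.Module):
    def __init__(self, vocab=100, d=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.proj_z = nn.Linear(d, d)   # injects the latent thought vector at every position
        self.out = nn.Linear(d, vocab)

    def forward(self, tokens, z):
        h = self.embed(tokens) + self.proj_z(z).unsqueeze(1)
        return self.out(h)              # next-token logits, shape (batch, seq, vocab)

decoder = TinyLatentDecoder()
slow_opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)     # slow global update
tokens = torch.randint(0, 100, (8, 16))                        # toy batch of token ids
targets = torch.randint(0, 100, (8, 16))

for outer_step in range(10):
    # Fast local inference: re-initialize and rapidly adapt z for this batch.
    z = torch.zeros(8, 32, requires_grad=True)
    fast_opt = torch.optim.SGD([z], lr=1e-1)                   # fast local update
    for _ in range(16):
        loss = nn.functional.cross_entropy(decoder(tokens, z).transpose(1, 2), targets)
        fast_opt.zero_grad(); loss.backward(); fast_opt.step()
    # Slow global learning: one decoder update given the inferred latent vectors.
    loss = nn.functional.cross_entropy(decoder(tokens, z.detach()).transpose(1, 2), targets)
    slow_opt.zero_grad(); loss.backward(); slow_opt.step()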
This paper explores the conformal isometry hypothesis as a unifying explanation for the hexagonal
firing patterns of grid cells. It posits that an animal’s 2D location is encoded as a high-dimensional
neural vector lying on a 2D manifold, where local distances in physical space are preserved up to a
scaling factor. This is the first work to give a theoretical proof that conformal isometry leads to the
emergence of grid cell hexagonality, and our numerical experiments further confirm that such local
distance preservation naturally produces the observed hexagonal grid.
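A minimal sketch of how such a conformal isometry penalty could be written in PyTorch; the toy MLP embedding, learnable scale, and loss form are illustrative assumptions, not the paper's exact formulation.

# Sketch of a conformal isometry penalty for a 2D -> high-dimensional position embedding.
# For small displacements dx we ask ||v(x + dx) - v(x)|| ≈ s * ||dx||, i.e. local distances
# in physical space are preserved in neural space up to a scale s (illustrative form).
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(2, 128), nn.Tanh(), nn.Linear(128, 256))   # toy v(x)
log_s = torch.zeros(1, requires_grad=True)                      # learnable scaling factor
opt = torch.optim.Adam(list(embed.parameters()) + [log_s], lr=1e-3)

for step in range(200):
    x = torch.rand(512, 2)                                      # positions in the unit square
    dx = 0.05 * torch.randn(512, 2)                             # small local displacements
    d_neural = (embed(x + dx) - embed(x)).norm(dim=1)           # distance in neural space
    d_physical = dx.norm(dim=1)                                 # distance in physical space
    loss = ((d_neural - torch.exp(log_s) * d_physical) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()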
Decision-making via sequence modeling can be viewed as return-conditioned autoregressive
behavior cloning. Unaware of their own future behaviors, such models were thought to be
susceptible to drifting errors. Decision Transformer alleviates this issue by additionally
predicting the return-to-go labels. We propose an unsupervised solution, where a latent variable
is first inferred from a target return and then guides the policy throughout the episode,
functioning as a plan. Our model discovers improved decisions from suboptimal trajectories.
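A toy sketch of the plan-as-latent idea, assuming simple MLP components and a hypothetical env_step interface; shapes and names are illustrative, not the model's actual architecture.

# Toy sketch of planning as latent-variable inference (illustrative, not the paper's code):
# a latent "plan" z is inferred once from the target return and then conditions the policy
# at every step of the episode, instead of per-step return-to-go labels.
import torch
import torch.nn as nn

state_dim, act_dim, z_dim = 4, 2, 16
return_to_latent = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, z_dim))
policy = nn.Sequential(nn.Linear(state_dim + z_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))

def rollout(env_step, init_state, target_return, horizon=100):
    """env_step(state, action) -> next_state is a hypothetical environment interface."""
    with torch.no_grad():
        # Infer the plan from the desired return, then keep it fixed for the whole episode.
        z = return_to_latent(torch.tensor([[target_return]], dtype=torch.float32))
        state = init_state                                       # expected shape (1, state_dim)
        for _ in range(horizon):
            logits = policy(torch.cat([state, z], dim=-1))
            action = logits.argmax(dim=-1)                       # greedy action for the sketch
            state = env_step(state, action)
    return state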
We propose a method to enhance LLM alignment through fine-grained token-level supervision.
Specifically, we ask annotators to minimally edit less preferred responses within the standard
reward modeling dataset to make them more favorable, ensuring changes are made only where
necessary while retaining most of the original content. The refined dataset is used to train a
token-level reward model, which is then used for training our fine-grained token-level Proximal
Policy Optimization (PPO) model.
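A minimal sketch of what a token-level reward head could look like in PyTorch; the backbone interface and dimensions are assumptions for illustration, not the trained model described above.

# Sketch of a token-level reward model (illustrative): a per-token reward head on top of a
# language model's hidden states, trained so tokens retained from the minimal edit score
# higher than the spans annotators replaced; dimensions and interfaces are assumptions.
import torch
import torch.nn as nn

hidden_dim = 64
reward_head = nn.Linear(hidden_dim, 1)          # maps each token's hidden state to a scalar

def token_rewards(hidden_states):
    """hidden_states: (batch, seq_len, hidden_dim) from any backbone LM (assumed interface)."""
    return reward_head(hidden_states).squeeze(-1)               # (batch, seq_len) dense rewards

# In PPO, these dense per-token rewards replace a single sequence-level reward assigned to
# the final token, so every generated token receives its own credit signal.
h = torch.randn(2, 10, hidden_dim)              # toy hidden states
print(token_rewards(h).shape)                   # torch.Size([2, 10])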
In this paper, we present an end-to-end learning framework, termed Sequential Posterior
Inference (SPI), capable of selecting knowledge and generating dialogues by approximately
sampling from the posterior distribution. Unlike other methods, SPI requires neither a separate
inference network nor a simplifying assumption about the geometry of the posterior distribution.
Its straightforward and intuitive inference procedure directly queries the response generation
model, allowing for accurate knowledge selection and the generation of faithful responses.
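A rough sketch of the posterior-querying idea, assuming a hypothetical log_likelihood_fn interface to the response generation model; this illustrates the principle rather than the paper's exact sequential procedure.

# Rough sketch (illustrative): score each knowledge candidate by how likely the generation
# model makes the response given that candidate, treat the scores as an unnormalized
# posterior over candidates, and sample from it.
import torch

def sample_knowledge(log_likelihood_fn, context, response, candidates, log_prior=None):
    """log_likelihood_fn(context, knowledge, response) -> scalar tensor log p(response | context, k).
    Both log_likelihood_fn and log_prior are assumed interfaces, not a real library API."""
    scores = torch.stack([log_likelihood_fn(context, k, response) for k in candidates])
    if log_prior is not None:
        scores = scores + log_prior                  # optional prior p(k | context) over candidates
    probs = torch.softmax(scores, dim=0)             # approximate posterior p(k | context, response)
    idx = torch.multinomial(probs, 1).item()
    return candidates[idx], probs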
Cite Scalable Language Models with Posterior Inference of Latent Thought Vectors
@article{kong2025scalable,
title={Scalable Language Models with Posterior Inference of Latent Thought Vectors},
author={Kong, Deqian and Zhao, Minglu and Xu, Dehong and Pang, Bo and Wang, Shu and Honig, Edouardo and Si, Zhangzhang and Li, Chuan and Xie, Jianwen and Xie, Sirui and others},
journal={arXiv preprint arXiv:2502.01567},
year={2025}
}
Cite On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding
@inproceedings{xuconformal,
title={On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding},
author={Xu, Dehong and Gao, Ruiqi and Zhang, Wenhao and Wei, Xue-Xin and Wu, Ying Nian},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025}
}
Cite Latent Plan Transformer: Planning as Latent Variable Inference
@article{kong2024latent,
title={Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference},
author={Kong, Deqian and Xu, Dehong and Zhao, Minglu and Pang, Bo and Xie, Jianwen and Lizarraga, Andrew and Huang, Yuhao and Xie, Sirui and Wu, Ying Nian},
journal={Advances in Neural Information Processing Systems},
year={2024}
}
Cite Aligning Large Language Models via Fine-grained Supervision
@article{xu2024aligning,
title={Aligning Large Language Models via Fine-grained Supervision},
author={Xu, Dehong and Qiu, Liang and Kim, Minseok and Ladhak, Faisal and Do, Jaeyoung},
journal={arXiv preprint arXiv:2406.02756},
year={2024}
}
Cite Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
@inproceedings{xu2023diverse,
title={Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference},
author={Xu, Yan and Kong, Deqian and Xu, Dehong and Ji, Ziwei and Pang, Bo and Fung, Pascale and Wu, Ying Nian},
booktitle={International Conference on Machine Learning},
pages={38518--38534},
year={2023},
organization={PMLR}
}
Improving Instruction-following Capability of Multi-modal Embedding Models (Preprint)
Developed a multi-modal, decoder-only framework for learning representations with instruction-following capabilities.
Designed and implemented a two-stage training approach: a pre-training phase for modality alignment, followed by instruction fine-tuning.
Our method achieved SoTA performance on multi-modal information retrieval benchmarks.
Applied Scientist Intern, Amazon Inc. - Alexa AGI Team & Rufus Team, 2023.06 - 2023.10
Aligning Large Language Models via Fine-grained Supervision and Token-level RLHF (Paper published in ACL 2024)
Developed a fine-grained data collection method for reward training via minimal editing, which pinpoints the exact output segments that affect user choices.
Proposed token-level RLHF by training a token-level reward model with fine-grained supervision and incorporating it into PPO training.
Our method outperformed LLaMA2-chat-7B and achieved SoTA performance on AlpacaFarm.