We present a model of the head direction (HD) system that captures essential HD cell properties through a high-dimensional U(1) representation. The learned representation exhibits Gaussian-like tuning and a 2D circular geometry, and it accurately supports path integration in both the fully connected and convolutional forms of the model.
Cite A minimalistic representation model for head direction system
@article{zhao2024head,
title={A minimalistic representation model for head direction system},
author={Zhao, Minglu and Xu, Dehong and Kong, Deqian and Zhang, Wen-Hao and Wu, Ying Nian},
journal={NeurIPS 2024 Workshop on Symmetry and Geometry in Neural Representations (NeurReps)},
year={2024}
}
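For intuition, here is a minimal illustrative sketch of the kind of U(1) population code described above; it is not the paper's learned model, and the cell count, tuning width, and function names are assumptions.

# Illustrative sketch only: a hand-crafted U(1) population code with
# Gaussian-like tuning on the circle, not the learned model from the paper.
import numpy as np

N = 100                                                    # number of model HD cells (assumed)
preferred = np.linspace(0, 2 * np.pi, N, endpoint=False)   # preferred directions on the circle
kappa = 10.0                                               # tuning sharpness (assumed)

def encode(theta):
    """High-dimensional representation of heading theta with Gaussian-like (von Mises) tuning."""
    return np.exp(kappa * (np.cos(theta - preferred) - 1.0))

def path_integrate(theta, angular_velocity, dt=0.1):
    """Toy path integration: advancing the heading by w*dt shifts the population bump."""
    return (theta + angular_velocity * dt) % (2 * np.pi)

theta = 0.3
v = encode(theta)
v_next = encode(path_integrate(theta, angular_velocity=2.0))
print(np.argmax(v), np.argmax(v_next))                     # the activity bump shifts along the ring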
Decision-making via sequence modeling can be viewed as return-conditioned autoregressive behavior cloning. Because such models are unaware of their own future behaviors, they are thought to be susceptible to drifting errors. Decision Transformer alleviates this issue by additionally predicting return-to-go labels. We propose an unsupervised alternative, in which a latent variable is first inferred from a target return and then guides the policy throughout the episode, functioning as a plan. Our model discovers improved decisions from suboptimal trajectories.
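A minimal sketch of the plan-as-latent-variable idea, under assumed shapes, names, and architecture (this is not the released implementation): a latent z is inferred from a target return and then conditions the per-step policy.

# Minimal sketch with assumed architecture and dimensions, not the paper's code.
import torch
import torch.nn as nn

class LatentPlanPolicy(nn.Module):
    def __init__(self, state_dim=4, action_dim=2, z_dim=16):
        super().__init__()
        self.return_head = nn.Linear(z_dim, 1)                # predicts return from the latent plan z
        self.policy = nn.Sequential(                          # pi(a | s, z)
            nn.Linear(state_dim + z_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim))

    def infer_plan(self, target_return, steps=100, lr=0.1):
        """Infer z by pushing the predicted return toward the target (MAP-style
        inference with a standard-normal prior); z then guides the whole episode."""
        z = torch.zeros(1, self.return_head.in_features, requires_grad=True)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            loss = (self.return_head(z) - target_return).pow(2).mean() + 1e-3 * z.pow(2).sum()
            opt.zero_grad(); loss.backward(); opt.step()
        return z.detach()

    def act(self, state, z):
        return self.policy(torch.cat([state, z], dim=-1))

model = LatentPlanPolicy()
z = model.infer_plan(target_return=torch.tensor([[1.0]]))     # plan once per episode
action_logits = model.act(torch.zeros(1, 4), z)               # condition every step on z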
We propose a method to enhance LLM alignment through fine-grained, token-level supervision. Specifically, we ask annotators to minimally edit the less preferred responses in a standard reward-modeling dataset to make them more favorable, ensuring changes are made only where necessary while retaining most of the original content. The refined dataset is used to train a token-level reward model, which is then used to train our fine-grained, token-level Proximal Policy Optimization (PPO) model.
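A sketch of how minimal edits can yield token-level supervision, using Python's difflib; the data format and function names are assumptions, not the paper's code. The resulting per-token labels are the kind of signal a token-level reward model can be trained on before PPO.

# Illustrative sketch: derive token-level preference labels from a minimally
# edited response pair by marking the tokens the annotator changed.
from difflib import SequenceMatcher

def token_level_labels(original_tokens, edited_tokens):
    """Return one label per edited token: 1.0 where the annotator changed the
    text (insert/replace), 0.0 where the original content was kept."""
    labels = [0.0] * len(edited_tokens)
    matcher = SequenceMatcher(a=original_tokens, b=edited_tokens)
    for op, _, _, j1, j2 in matcher.get_opcodes():
        if op in ("replace", "insert"):
            for j in range(j1, j2):
                labels[j] = 1.0
    return labels

original = "the movie was fine overall".split()
edited = "the movie was excellent overall".split()
print(list(zip(edited, token_level_labels(original, edited))))
# [('the', 0.0), ('movie', 0.0), ('was', 0.0), ('excellent', 1.0), ('overall', 0.0)]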
In this paper, we present an end-to-end learning framework, termed Sequential Posterior Inference (SPI), capable of selecting knowledge and generating dialogues by approximately sampling from the posterior distribution. Unlike other methods, SPI requires neither a separate inference network nor a simple assumed geometry for the posterior distribution. SPI's straightforward and intuitive inference procedure directly queries the response generation model, allowing for accurate knowledge selection and the generation of faithful responses.
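A minimal sketch of posterior knowledge selection by querying the generator, under assumptions: p(k | context, response) ∝ p(response | context, k) p(k), with the likelihood scored by the response generation model. The function response_loglik below is a hypothetical placeholder, not an API from the paper.

# Illustrative sketch: approximate posterior sampling of the knowledge snippet
# by re-weighting a prior with the generator's response likelihood.
import math, random

def response_loglik(context, knowledge, response):
    # Placeholder: in practice, the generation model's log-likelihood of
    # `response` given `context` and the candidate `knowledge`.
    return -float(len(set(response.split()) - set((context + " " + knowledge).split())))

def sample_knowledge(context, response, candidates, prior=None):
    prior = prior or [1.0 / len(candidates)] * len(candidates)
    scores = [response_loglik(context, k, response) + math.log(p)
              for k, p in zip(candidates, prior)]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]          # softmax-normalized posterior
    probs = [w / sum(weights) for w in weights]
    return random.choices(candidates, weights=probs, k=1)[0], probs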
Cite Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference
@article{kong2024latent,
title={Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference},
author={Kong, Deqian and Xu, Dehong and Zhao, Minglu and Pang, Bo and Xie, Jianwen and Lizarraga, Andrew and Huang, Yuhao and Xie, Sirui and Wu, Ying Nian},
journal={Advances in Neural Information Processing Systems},
year={2024}
}
Cite Aligning Large Language Models via Fine-grained Supervision
@article{xu2024aligning,
title={Aligning Large Language Models via Fine-grained Supervision},
author={Xu, Dehong and Qiu, Liang and Kim, Minseok and Ladhak, Faisal and Do, Jaeyoung},
journal={arXiv preprint arXiv:2406.02756},
year={2024}
}
Cite Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
@inproceedings{xu2023diverse,
title={Diverse and faithful knowledge-grounded dialogue generation via sequential posterior inference},
author={Xu, Yan and Kong, Deqian and Xu, Dehong and Ji, Ziwei and Pang, Bo and Fung, Pascale and Wu, Ying Nian},
booktitle={International Conference on Machine Learning},
pages={38518--38534},
year={2023},
organization={PMLR}
}
Improving Instruction-following Capability of Multi-modal Embedding Models (in submission to CVPR 2025)
Developed a multi-modal, decoder-only framework for learning representations with instruction-following capabilities.
Designed and implemented a two-stage training approach: a pre-training phase for modality alignment, followed by instruction fine-tuning (a toy sketch of the alignment stage is given below).
Our method achieved SoTA performance on multi-modal information retrieval benchmarks.
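A toy sketch of the modality-alignment idea, with assumed shapes, model sizes, and names; this is not the submitted system's code. A decoder-only backbone reads (instruction + content) tokens and uses the final token's hidden state as the embedding, and an InfoNCE loss aligns paired modalities during pre-training.

# Illustrative sketch only: decoder-only embedder with last-token pooling and
# an InfoNCE loss for the modality-alignment pre-training stage.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderOnlyEmbedder(nn.Module):
    def __init__(self, vocab=1000, dim=128, heads=4, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, layers)    # made causal via the mask below
        self.out = nn.Linear(dim, dim)

    def forward(self, token_ids):
        T = token_ids.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.backbone(self.embed(token_ids), mask=causal)
        return F.normalize(self.out(h[:, -1]), dim=-1)          # last-token embedding

def info_nce(text_emb, image_emb, temperature=0.07):
    """Contrastive loss aligning paired text/image embeddings within a batch."""
    logits = text_emb @ image_emb.t() / temperature
    targets = torch.arange(text_emb.size(0))
    return F.cross_entropy(logits, targets)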
Applied Scientist Intern, Amazon Inc. - Alexa AGI Team & Rufus Team, 2023.06 - 2023.10
Aligning Large Language Models via Fine-grained Supervision and Token-level RLHF (paper published in ACL 2024)
Developed a fine-grained data collection method for reward training via minimal editing, which pinpoints the exact output segments that affect user choices.
Proposed token-level RLHF by training a token-level reward model with fine-grained supervision and incorporating it into PPO training.
Our method outperformed LLaMA2-chat-7B and achieved SoTA performance on AlpacaFarm.
Professional Service
Conference Reviewer: NeurIPS, ICLR, ICML, IJCAI, AISTATS, ACM MM
Journal Reviewer: TMLR, IEEE TNNLS, IEEE TIP, Stat
Teaching
STATS 100A Introduction to Probability
STATS 102C Introduction to Monte Carlo Methods
STATS 202C Monte Carlo Methods for Optimization
STATS 231A Pattern Recognition and Machine Learning