Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning

Gu, Jian; Aleti, Aldeida; Chen, Chunyang; Zhang, Hongyu

arXiv:2401.16184 (cs)

[Submitted on 29 Jan 2024 (v1), last revised 14 Oct 2024 (this version, v6)]

Abstract: In-context learning enables language models (LMs) to adapt to downstream data or tasks by incorporating a few samples as demonstrations within the prompt. It offers strong performance without the expense of fine-tuning. However, its performance can be unstable, depending on the quality, format, or order of the demonstrations, which in turn makes it harder to optimize. Prior work, such as kNN Prompting, indexes samples by the similarity of their logits on the output side, in addition to the regular retrieval operation on the input side. Such methods improve in-context learning by leveraging the core ability of next-token prediction, rather than relying solely on the emergent capacity to make analogies. Despite this, in-context learning remains hard to optimize. In our view, the difficulty stems from the process of selecting demonstrations. To address this, we propose complementing in-context learning with an additional clustering operation, in a novel approach we call "vocabulary-defined semantics". Grounded in the LM vocabulary, which is the label space of model outputs, the approach computes semantically equivalent latent representations for output labels. Then, taking these representations as centroids, a clustering operation aligns the semantic properties of the language model with those of the downstream data or tasks. In extensive experiments across diverse text understanding datasets and multiple models, our approach outperforms the state of the art in both effectiveness and efficiency. On average, it achieves improvements of $3\%$ to $49\%$ while requiring only half of the computation time.
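
The two steps named in the abstract (deriving a latent representation for each output label from the vocabulary, then using those representations as centroids) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the LM head is a plain matrix `lm_head` of shape (vocab, hidden) with logits = lm_head @ h, takes each centroid to be the least-squares pre-image of a one-hot target under that matrix (via the pseudo-inverse), and substitutes a simple nearest-centroid assignment by cosine similarity for whatever clustering procedure the paper uses. The function names `label_centroids` and `assign_labels` and the random data are hypothetical.

```python
import numpy as np


def label_centroids(lm_head, label_token_ids):
    """Assumed construction: one latent vector per label token.

    Each vector is the least-squares solution h of lm_head @ h = one_hot(token),
    i.e. the matching column of the pseudo-inverse of lm_head.
    """
    pinv = np.linalg.pinv(lm_head)        # shape: (hidden, vocab)
    return pinv[:, label_token_ids].T     # shape: (num_labels, hidden)


def assign_labels(hidden_states, centroids):
    """Nearest-centroid assignment by cosine similarity (a stand-in for clustering)."""
    h = hidden_states / np.linalg.norm(hidden_states, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return np.argmax(h @ c.T, axis=1)     # index into label_token_ids per sample


# Toy usage with random stand-ins for a real LM head and last-layer hidden states.
rng = np.random.default_rng(0)
toy_lm_head = rng.normal(size=(50, 16))   # hypothetical: vocab=50, hidden=16
toy_label_ids = [3, 7, 11]                # hypothetical token ids of three labels
toy_centroids = label_centroids(toy_lm_head, toy_label_ids)
toy_hidden = rng.normal(size=(5, 16))     # hypothetical sample representations
toy_predictions = assign_labels(toy_hidden, toy_centroids)
```

In a real setting `lm_head` would come from the model's output projection and `hidden_states` from its final layer. The sketch fixes the centroids and only assigns samples to them, so it omits any learned alignment between the data and the model.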

Comments:	under peer-review
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2401.16184 [cs.CL]
	(or arXiv:2401.16184v6 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.16184

Submission history

From: Jian Gu
[v1] Mon, 29 Jan 2024 14:29:48 UTC (4,343 KB)
[v2] Fri, 2 Feb 2024 04:14:26 UTC (4,344 KB)
[v3] Mon, 12 Feb 2024 11:30:05 UTC (4,344 KB)
[v4] Mon, 8 Apr 2024 07:08:48 UTC (4,344 KB)
[v5] Sun, 26 May 2024 13:12:35 UTC (18,862 KB)
[v6] Mon, 14 Oct 2024 04:19:06 UTC (6,060 KB)
