Pointer-Guided Pre-training: Infusing Large Language Models with Paragraph-Level Contextual Awareness
We introduce “pointer-guided segment ordering” (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model’s ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.
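The segment ordering (SO) objective described above lends itself to a brief sketch. The PyTorch snippet below illustrates one way a self-attention-driven pointer head could score shuffled paragraph embeddings against target positions; the class name, dimensions, and layer choices are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class SegmentOrderingHead(nn.Module):
    """Illustrative pointer-style head for segment ordering (SO).

    Given encoder embeddings of shuffled paragraph segments, it scores,
    for each output position, which input segment most likely belongs
    there. All hyperparameters here are assumptions for illustration.
    """

    def __init__(self, hidden_size: int = 768, num_heads: int = 8):
        super().__init__()
        # Self-attention over the shuffled segment representations.
        self.self_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.query_proj = nn.Linear(hidden_size, hidden_size)
        self.key_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, segment_embeddings: torch.Tensor) -> torch.Tensor:
        # segment_embeddings: (batch, num_segments, hidden_size), e.g.
        # segment-level outputs of a transformer encoder.
        attended, _ = self.self_attn(segment_embeddings, segment_embeddings, segment_embeddings)
        queries = self.query_proj(attended)        # one query per target position
        keys = self.key_proj(segment_embeddings)   # one key per shuffled segment
        # Pointer logits: position i "points" at the segment that should come i-th.
        logits = torch.matmul(queries, keys.transpose(1, 2)) / keys.size(-1) ** 0.5
        return logits  # (batch, num_segments, num_segments)


if __name__ == "__main__":
    head = SegmentOrderingHead()
    shuffled = torch.randn(2, 5, 768)  # 2 documents, 5 shuffled segments each
    logits = head(shuffled)
    # Training would minimize cross-entropy between the pointer logits and
    # the permutation that restores the original segment order.
    target_order = torch.tensor([[2, 0, 4, 1, 3], [1, 3, 0, 2, 4]])
    loss = nn.functional.cross_entropy(logits.reshape(-1, 5), target_order.reshape(-1))
    print(loss.item())
```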
- Published in: Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD
- Type: Inproceedings
- Authors: Hillebrand, Lars; Pradhan, Prabhupad; Bauckhage, Christian; Sifa, Rafet
- Year: 2024
- Source: https://link.springer.com/chapter/10.1007/978-3-031-70359-1_23
Citation information
Hillebrand, Lars; Pradhan, Prabhupad; Bauckhage, Christian; Sifa, Rafet: Pointer-Guided Pre-training: Infusing Large Language Models with Paragraph-Level Contextual Awareness. In: Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD, 2024. https://link.springer.com/chapter/10.1007/978-3-031-70359-1_23
@Inproceedings{Hillebrand.etal.2024a,
author={Hillebrand, Lars and Pradhan, Prabhupad and Bauckhage, Christian and Sifa, Rafet},
title={Pointer-Guided Pre-training: Infusing Large Language Models with Paragraph-Level Contextual Awareness},
booktitle={Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD},
url={https://link.springer.com/chapter/10.1007/978-3-031-70359-1_23},
year={2024},
abstract={We introduce “pointer-guided segment ordering” (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model’s ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.}}