ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Natural Language Processing

Volume 8 - 2025 | doi: 10.3389/frai.2025.1604034

This article is part of the Research Topic "The Fusion of Fuzzy Logic and Natural Language Processing (NLP) in Next-generation Artificial Intelligence (AI) Systems".

Divide and Summarize: improving SLM text summarization

Provisionally accepted
  • Seenovate, Paris, France

The final, formatted version of the article will be published soon.

Text summarization is a longstanding challenge in natural language processing, with recent advancements driven by the adoption of Large Language Models (LLMs) and Small Language Models (SLMs). Despite these developments, issues such as the "Lost in the Middle" problem, where LLMs tend to overlook information in the middle of lengthy prompts, persist. Traditional summarization, often termed the "Stuff" method, processes an entire text in a single pass. In contrast, the "Map" method divides the text into segments, summarizes each independently, and then synthesizes these partial summaries into a final output, potentially mitigating the "Lost in the Middle" issue. This study investigates whether the Map method outperforms the Stuff method for texts that fit within the context window of SLMs and assesses its effectiveness in addressing the "Lost in the Middle" problem. We conducted a two-part investigation: first, a simulation study using generated texts, paired with an automated fact-retrieval evaluation to eliminate the need for human assessment; second, a practical study summarizing scientific papers. Results from both studies demonstrate that the Map method produces summaries that are at least as accurate as those from the Stuff method. Notably, the Map method excels at retaining key facts from the beginning and middle of texts, unlike the Stuff method, suggesting its superiority for SLM-based summarization of smaller texts. Additionally, SLMs using the Map method achieved performance comparable to LLMs using the Stuff method, highlighting its practical utility.
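For illustration, a minimal sketch of the two strategies described in the abstract is given below, assuming a generic `llm` callable that sends a prompt to a language model and returns text; the prompts and the character-based chunking are illustrative placeholders, not the exact pipeline used in the study.

```python
from typing import Callable, List


def stuff_summarize(text: str, llm: Callable[[str], str]) -> str:
    """'Stuff' method: pass the entire text to the model in a single prompt."""
    return llm(f"Summarize the following text:\n\n{text}")


def map_summarize(text: str, llm: Callable[[str], str], chunk_size: int = 2000) -> str:
    """'Map' method: split the text, summarize each segment, then synthesize."""
    # Split into fixed-size character chunks (a real pipeline would likely
    # split on sentence or section boundaries instead).
    chunks: List[str] = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    # Map step: summarize each segment independently.
    partial_summaries = [llm(f"Summarize the following passage:\n\n{c}") for c in chunks]

    # Reduce step: synthesize the partial summaries into one final summary.
    combined = "\n".join(partial_summaries)
    return llm(f"Combine these partial summaries into a single coherent summary:\n\n{combined}")
```

Because each segment is summarized in its own prompt, facts from the beginning and middle of the document are never buried deep in a long context, which is the intuition behind the Map method's resilience to the "Lost in the Middle" effect.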

Keywords: Small Language Models, Text summarization, Lost in the Middle, Text generation, automatic evaluation

Received: 01 Apr 2025; Accepted: 14 Jul 2025.

Copyright: © 2025 Bailly, Saubin, Kocevar and Bodin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Alexandre Bailly, Seenovate, Paris, France

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.