ORIGINAL RESEARCH article

Front. Artif. Intell., 06 August 2025

Sec. Language and Computation

Volume 8 - 2025 | https://doi.org/10.3389/frai.2025.1592013

Large language models for closed-library multi-document query, test generation, and evaluation

  • 1Department of the Air Force, Artificial Intelligence Accelerator, Cambridge, MA, United States
  • 2AI Technology, MIT Lincoln Laboratory, Lexington, MA, United States

Introduction: Learning complex, detailed, and evolving knowledge is a challenge in multiple technical professions. Relevant source knowledge is contained within many large documents and information sources with frequent updates to these documents. Knowledge tests need to be generated on new material and existing tests revised, tracking knowledge base updates. Large Language Models (LLMs) provide a framework for artificial intelligence-assisted knowledge acquisition and continued learning. Retrieval-Augmented Generation (RAG) provides a framework to leverage available, trained LLMs combined with technical area-specific knowledge bases.

Methods: Herein, two methods are introduced (DaaDy: document as a dictionary and SQAD: structured question answer dictionary), which together enable effective implementation of LLM-RAG question-answering on large documents. Additionally, the Artificial Intelligence for Knowledge Intensive Tasks (AIKIT) solution is presented for working with numerous documents for training and continuing education. AIKIT is provided as a containerized open-source solution that deploys on standalone, high-performance, and cloud systems. AIKIT includes LLM and RAG components, vector stores, a relational database, and a Ruby on Rails web interface.

Results: Coverage of source documents by LLM-RAG generated questions decreases as document length increases. Segmenting source documents improves coverage of the generated questions. The AIKIT solution enabled easy use of multiple LLM models with multimodal RAG source documents; AIKIT retains LLM-RAG responses for queries against one or multiple LLM models.

Discussion: AIKIT provides an easy-to-use set of tools to enable users to work with complex information using LLM-RAG capabilities. AIKIT enables easy use of multiple LLM models with retention of LLM-RAG responses.

1 Introduction

Some highly technical professions require learning and retention of complex, detailed, and evolving knowledge from multiple relevant documents and information sources. Adding more complexity, these documents are updated with new and changing information on a frequent basis, which makes keeping up-to-date on the most current information a challenging task for these highly technical professionals. In professions with a specified instructor corps, generating and maintaining instructional material on such a dynamic and vast corpus can be overwhelming and time-consuming for instructors. Knowledge tests can assist learners in encoding and retaining new knowledge, but can demand a considerable amount of time and personnel to generate and maintain. Learners are repeatedly exposed to outdated or incorrect information when existing knowledge tests become outdated as source information is modified or removed. In high-risk professions, such as medicine or aviation, it is imperative that learners have access to the most up-to-date corpus of documents and study materials.

Recent development of Large Language Models (LLMs) combined with Retrieval-Augmented Generation (RAG) over documents and information not included in the LLM training data provides a framework of technology solutions to address aspects of these education challenges. Multiple documents can be embedded into one or more embedding databases (vector stores). LLM RAG can be used to query the knowledge base for specific questions; this enables rapid lookup of information across multiple large documents (Figure 1). LLM RAG implementations perform very well on question-answering (QA), fact verification, and attribution tasks while hallucinating less than other methods (Wu et al., 2024; Lewis et al., 2020). However, current LLM RAG capabilities fall short of fully utilizing the context of a document; LLM RAG is susceptible to the lost-in-the-middle challenge, where the LLM struggles to fully utilize information hidden within a long context (Liu et al., 2023; Xu et al., 2023). If implemented for knowledge-intensive professions with current methods, critical information may be lost or overlooked.

Figure 1. Large language models (LLM) and Retrieval-Augmented Generation (RAG) overview.

To evaluate LLM RAG for enhancing and facilitating education on complex, jargon-dense, closed-library documents, the Artificial Intelligence for Knowledge Intensive Tasks (AIKIT) system was developed. To provide portability, AIKIT has been containerized in both Singularity (Kurtzer et al., 2017) and Docker (Merkel, 2024) containers and a Conda environment. AIKIT includes a Ruby on Rails web user interface. AIKIT is being released as open source at https://github.com/mit-ll/AIKIT.

2 Materials and methods

2.1 Document as a Dictionary—DaaDy

To solve the problem of incomplete text utilization for LLM RAG on large documents, the Document as a Dictionary (DaaDy) framework was developed. DaaDy allows LLM RAG to be run systematically on each section, subsection, or sentence of a document. The method takes structured documents (documents with headings, sections, and/or subsections), parses them, and stores the entire document as a series of nested dictionaries in which the highest-level key is a heading, section, or subsection title and the lowest-level value is an individual sentence from the document. This is implemented with two Python tools, one for parsing a document into a DaaDy (afman_parser.py) and another to consolidate multiple dictionaries (daady_consolidator.py). Storing metadata in this dictionary framework enables added functionality for source attribution of LLM RAG responses. The DaaDy framework allows a prompt to be queried against all sections of a document by loading each section, subsection, or sentence into the retriever individually; context length remains short enough to achieve full utilization in LLM RAG.

A dataset of regulatory and procedural documents from the United States Air Force was utilized in this study, including documents containing various types of flying rules and regulations. In this dataset, all documents have a standard format for the title, header/footer, table of contents, and paragraph headings. All Air Force Instructions (AFIs) and Manuals (AFMANs) use the same numerical paragraph heading structure: top-level headings begin at 1.1, second-level headings at 1.1.1, and so on. The DaaDy tool cleans and consolidates sentences from all paragraph levels and produces two DaaDys, a Section DaaDy and a Sentence DaaDy. The Section DaaDy stores cleaned text in all applicable sections; for example, if a subsection started with the header "1.1.3," the text within that subsection would be cleaned and stored under both the "1.1" section and the "1.1.3" subsection, effectively turning a structured document into groups of contextually similar paragraphs of varying lengths. The Sentence DaaDy stores each sentence individually in the lowest-level dictionary. The parser used in this study uses regular expressions to recursively parse the document. It was designed specifically for AFI and AFMAN formats and is programmed to parse expected headings from the table of contents and clean footers from the text. With small updates to the regular expressions (for the table of contents, footer, and paragraph headers), afman_parser.py could be easily tailored to any document with sequential paragraph headers.
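For illustration, a minimal Python sketch of the DaaDy idea follows. This is not the released afman_parser.py; the heading regular expression, the naive sentence splitter, and the roll-up of subsection sentences into their parent section are simplifying assumptions.

import re
from collections import defaultdict

HEADING_RE = re.compile(r"^(\d+(?:\.\d+)+)\.?\s+", re.MULTILINE)  # e.g., "1.1", "1.1.3"

def build_daady(text: str) -> dict:
    """Return {heading: {"sentences": [...], "metadata": {...}}} for a structured document."""
    daady = defaultdict(lambda: {"sentences": [], "metadata": {}})
    headings = list(HEADING_RE.finditer(text))
    for i, match in enumerate(headings):
        heading = match.group(1)
        end = headings[i + 1].start() if i + 1 < len(headings) else len(text)
        body = text[match.end():end].replace("\n", " ")
        # Naive sentence split; the real parser also strips footers and table-of-contents entries.
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]
        daady[heading]["sentences"] = sentences
        daady[heading]["metadata"] = {"heading": heading, "n_sentences": len(sentences)}
        # Roll subsection text up into its parent section, e.g., "1.1.3" -> "1.1".
        parent = heading.rsplit(".", 1)[0]
        if "." in parent:
            daady[parent]["sentences"].extend(sentences)
    return dict(daady)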

2.2 Structured Question Answer Dictionary—SQAD

To combine LLM RAG with DaaDy, the method called Structured Question Answer Dictionary (SQAD) was developed. SQAD generates new material for knowledge tests, with each item made up of a question (Q), an answer (A), and a section or paragraph reference (R), henceforth referred to as a QAR. SQAD can also be used to locate context and assess the validity of existing QARs in knowledge tests after document revisions in the knowledge base. The expedient LLM RAG assessment of current QARs and generation of new QARs upon updates to the knowledge base can benefit instructors and learners in knowledge-intensive professions.
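A minimal sketch of the SQAD loop is shown below, assuming a generate_qar(sentence) helper that wraps the LLM RAG prompt and returns a question and answer; the helper name, the five-word filter, and the dictionary fields are illustrative.

def build_sqad(daady: dict, generate_qar) -> list:
    """Create one QAR per sentence, keeping the section heading as the reference (R)."""
    qars = []
    for heading, section in daady.items():
        for sentence in section["sentences"]:
            if len(sentence.split()) <= 5:  # skip very short sentences
                continue
            question, answer = generate_qar(sentence)  # assumed LLM RAG call
            qars.append({"Q": question, "A": answer, "R": heading})
    return qars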

2.3 Containerized AI for knowledge intensive tasks (AIKIT)

To easily enable hosting on multiple platforms, AIKIT was packaged into Singularity (Kurtzer et al., 2017) and Docker (Merkel, 2024) containers (Figure 2). AIKIT is also packaged in a Conda environment (Figure 3).

Figure 2. Docker and Singularity containerized AIKIT.

Figure 3. AIKIT command line and web interfaces. Three deployment paths use the shell command line, a Jupyter notebook web UI, and the AIKIT web UI with Singularity and Docker for LLAMA2, Mistral, and Falcon models; a fourth uses the command line and Jupyter with Conda for Mixtral, Mistral, and LLAMA3 models.

2.4 Large language models and retrieval-augmented generation

AIKIT is not dependent upon any specific LLM. The Mistral-7B-instruct-v0.2 (2023) and Mixtral-8x7B-Instruct-v0.1 (2024) models from Mistral AI, among others, have been used with AIKIT. LLM RAG was implemented in Python (v3) (Python programming language, 1991) with LangChain (2022), the vector stores (embedding databases) FAISS (2017) and Chroma (2022), and the HuggingFace sentence-transformers embedding model (all-mpnet-base-v2, 2021). The LangChain PyPDFLoader (2021) was used for parsing Adobe Portable Document Format (PDF) documents. Paired Python tools were developed to create vector stores (docs_to_vs.py) and run LLM RAG queries (llm_rag_query.py). Both tools accept JavaScript Object Notation (JSON) parameter files as input.
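The listing below is a minimal sketch of the described embedding stack, not the released docs_to_vs.py; import paths differ across LangChain versions, and the document path and vector store directory are hypothetical.

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

docs = PyPDFLoader("example_afman.pdf").load_and_split()  # parse and chunk the PDF
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector_store = FAISS.from_documents(docs, embeddings)      # embed chunks into a FAISS index
vector_store.save_local("vector_store/example_afman")      # saved store is queried later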

2.5 Web interface

The AIKIT user interface was developed in RubyOnRails (2004) (v7.0.1) and Ruby (1995) (v3.0.3). The SQLite3 database was used for development, but AIKIT will work with any Rails-supported database. The AIKIT user interface invokes the Python tools docs_to_vs.py and llm_rag_query.py to create vector stores and query LLM RAG targets, respectively.

2.6 Multi-GPU enabled systems

The Singularity container and its nvccli options were utilized to parallelize across all available GPUs on the hosting platform.

When running with --nvccli, by default SingularityCE will expose all GPUs on the host inside the container. This mirrors the functionality of the legacy GPU support for the most common use-case. Setting the SINGULARITY_CUDA_VISIBLE_DEVICES environment variable before running a container is still supported, to control which GPUs are used by CUDA programs that honor CUDA_VISIBLE_DEVICES.

However, more powerful GPU isolation is possible using the --contain flag and the NVIDIA_VISIBLE_DEVICES environment variable, which controls which GPU devices are bound into the /dev tree in the container. For example, to pass only the first GPU into a container running on a system with multiple GPUs, one would export the following variables:

export NVIDIA_VISIBLE_DEVICES=0

export SINGULARITY_CUDA_VISIBLE_DEVICES=0

The Singularity --contain and --nvccli options were used with GNU Parallel (GNU, 1983). A master shell script and a text file containing the commands to run were created for each GPU.
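The sketch below illustrates one way such per-GPU command files could be generated; the container image name, parameter files, and GNU Parallel invocation are assumptions, not the exact scripts used in this study.

n_gpus = 2
param_files = [f"params/query_{i}.json" for i in range(8)]  # hypothetical parameter files

for gpu in range(n_gpus):
    with open(f"gpu{gpu}_commands.txt", "w") as fh:
        for params in param_files[gpu::n_gpus]:  # round-robin assignment of work to GPUs
            fh.write(
                f"NVIDIA_VISIBLE_DEVICES={gpu} SINGULARITY_CUDA_VISIBLE_DEVICES={gpu} "
                f"singularity exec --contain --nvccli aikit.sif "
                f"python llm_rag_query.py {params}\n")
# Each file can then be run on its GPU, e.g.: parallel -j 1 < gpu0_commands.txt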

2.7 Prototyping environment

AIKIT development and prototyping efforts were performed on both x86 and ARM-based architectures. The x86 system had two Intel Xeon Gold 6258R CPUs, 256 GB RAM, and an NVIDIA RTX A6000 GPU. The ARM-based system had an Apple M2 system on a chip (providing both the CPU and GPU), 8 GB RAM, and a 256 GB solid-state drive.

2.8 HPC system implementation (2-NVIDIA-V100)

The MIT Lincoln Laboratory TX-Green system (2-NVIDIA-V100) (TX-Green, 2024) was used as the high performance computing system for our pipeline prototype development. The GPU nodes have an Intel Xeon Phi 7210 64C 2.5 GHz CPU with 40 cores, 377 GB RAM, an Intel Omni-Path interconnect, and 2 NVIDIA Tesla V100 GPUs. LLMapReduce was used to submit jobs to the SLURM queue (Reuther et al., 2018).

3 Results

3.1 Document as a Dictionary—DaaDy

Figure 4 shows that, while the specific oscillations differ between documents and individual runs, a strong trend of decreasing context utilization is consistent across all cases over 300 attempts. In no case did the LLM RAG utilize more than 25 percent of the context when the document was longer than 18,000 characters. On average, across all six context bases, less than 20 percent of the context was utilized when documents were longer than 10,000 characters and less than 10 percent when documents were longer than 20,000 characters; our data suggest a full-utilization maximum of between 1,000 and 2,000 characters. While research seeking to decrease the magnitude of this effect continues, instructors and learners who intend to use LLM RAG to generate training material currently lack the capability to do so effectively on long documents without losing critical information. The DaaDy framework was developed to enable QAR generation coverage of document sections individually and ensure all desired content is utilized.

Figure 4. Document coverage by LLM RAG generated questions. Coverage decreases steeply from 100% at roughly 1,000 characters, leveling off around 10% beyond 20,000 characters; error bars indicate variability.

3.2 Test questions generation

Question, Answer, Reference (QAR) groups were generated on selected documents with LLM RAG. The goal was to comprehensively utilize the material in the selected documents, from which a subset of useful, accurate, and well-phrased questions could be selected. A prompt was given for the LLM to generate a QAR for each sentence in the document longer than five words (see Appendix A for the final prompts used in this research). Initially, this prompt was applied to the document in its entirety, and a significant amount of context was unrepresented in the generated questions. Very high content coverage was observed for documents less than 1,000 characters in length, measured as the number of QARs output divided by the number of sentences in the document longer than five words (a result of 1.0 was assessed as full context utilization). To study this effect further, the prompt was tested on documents of varying lengths to assess where information was being utilized and lost; six documents were used in total (Figure 4). The prompt was implemented and, from the output, the location of each reference was derived as a percentage of the full document length. A noticeable bias toward content from the beginning of the document was noted (Figure 5), with 5 of 6 documents showing between 17 and 26 percent of the generated questions originating from the first 10 percent of the document (a single outlier at 9% was observed). In the six documents examined, underrepresented QAR coverage was observed for locations at 30, 90, and 100% of document length (Figure 5).
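The following sketch reconstructs the coverage metric described above (it is not the authors' analysis script): coverage is the number of QARs produced divided by the number of source sentences longer than five words, and each reference location is expressed as a fraction of the document length.

import re

def context_coverage(document_text: str, qars: list) -> float:
    sentences = re.split(r"(?<=[.!?])\s+", document_text)
    eligible = [s for s in sentences if len(s.split()) > 5]
    return len(qars) / len(eligible) if eligible else 0.0  # 1.0 = full context utilization

def reference_location(document_text: str, reference_text: str) -> float:
    """Position of a QAR reference as a fraction of the full document length."""
    idx = document_text.find(reference_text)
    return idx / len(document_text) if idx >= 0 else float("nan")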

Figure 5. Context utilization in varying document lengths.

To mitigate the lost-in-the-middle effect, DaaDy was created. DaaDy takes a document as input and separates it into a series of nested dictionaries containing sections, subsections, and sentences. While future users could customize the base level of DaaDy to their needs, our testing used the sentence as the lowest-level value in the dictionary. SQAD calls the prompt separately on each desired section of the dictionary, creating a QAR for each sentence in the document. This also permits the storage of metadata about each sentence in the document which, by relieving the LLM of the responsibility of correctly interpreting and storing data from the text, allows the user to store and retrieve sentence-level metadata with perfect recall.

Unsurprisingly, implementation of the prompt on sentence-level DaaDy data resulted in a perfect score for context utilization: for a 105,000-character document, 910 QARs were produced in approximately 24 min and 30 s on an ARM-based system, a per-question QAR time of 1.62 s. The Chief Instructor Pilot from a USAF fighter squadron was asked to review the QARs and check them against the source document. This expert was asked to grade the utility, accuracy, and phrasing of each QAR. If the QAR needed no amendment to be useful, accurate, or well phrased, the expert was instructed to provide no remarks for that attribute. For anything less than this criterion, the expert was asked to write a statement explaining what exactly was suboptimal for each attribute. Most questions (354 out of 477) received no remarks for utility, accuracy, and phrasing. The remaining 123 QARs were considered anomalous for one or more of the attributes. The expert's notes were analyzed to understand, categorize, and describe these issues. Seven main categories of anomalous QARs emerged from the data (see Appendix B for definitions and examples): unable to answer, repetitive QA, unnecessary justification, missing context (lists), non-sequitur, misleading QA, and acronym hallucination. For both SQAD question generation and evaluation, significant degradation in LLM RAG performance was observed when niche acronyms were used or phrases were used outside of their normal context.

3.3 Test questions evaluation

Outdated test questions based on updated publications were evaluated with LLM RAG on documents via SQAD to identify whether each question was (1) still supported by the knowledge base, (2) in need of revision, or (3) based on content that had been removed. Two question-evaluation trials were run. First, each question in the test was posed using the entire source publication as the context. Second, the same queries were made using only the localized context from the DaaDy as the search context. The results of these methods were compared against an expert's assessment of the test questions. The expert compared each QAR against the current source publication and the given paragraph reference from the source document. The answer was categorized into one of three bins: (1) correct answer contained in the specified reference context, (2) correct answer not contained in the specified reference context, or (3) question verbiage so vague that a specific, correct answer could not be reasonably determined. Once this gold standard was established, the expert graded the answers generated by each of the two trial methods and categorized each response. If the answer was contained in the specified reference context and the LLM RAG query produced the correct answer, the response was categorized as correct; the opposite was a false response. If the correct answer was not contained in the specified reference context, the LLM RAG query could produce either a correct absence or an incorrect absence. Five other distinct categories emerged from the data (see Appendix C for definitions): vague response, irrelevant response, incomplete response, RAG error, and context regurgitation. The results of these two trials are summarized in Figure 6.
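A minimal sketch of the two evaluation trials is shown below, assuming a query_llm(question, context) helper that wraps the LLM RAG call; the helper and field names are illustrative, and grading of the returned answers remains a manual, expert step.

def evaluate_qars(qars: list, full_document: str, daady: dict, query_llm) -> list:
    results = []
    for qar in qars:
        full_doc_answer = query_llm(qar["Q"], context=full_document)  # trial 1: whole publication
        localized = " ".join(daady[qar["R"]]["sentences"])            # trial 2: DaaDy section only
        localized_answer = query_llm(qar["Q"], context=localized)
        results.append({"Q": qar["Q"], "R": qar["R"],
                        "full_document": full_doc_answer,
                        "localized": localized_answer})
    return results  # answers are then graded against the expert's gold standard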

Figure 6. Context-based question evaluation versus expert assessment.

3.4 AIKIT user interface

A Ruby on Rails web interface was developed for AIKIT. The AIKIT UI enables access to documents, document queries (LLM RAG), tests, and test results. LLM model queries and LangChain (2022) chaining of questions are also included. An instructor interface is included, with access to test questions and answers and to the evaluation of test questions.

3.5 Documents query

Querying knowledge base documents is implemented in AIKIT as standard RAG embedding of documents with an LLM. Queries can be run via the command line, a Jupyter notebook, or the AIKIT web interface (Figure 3). The AIKIT web interface database retains query results.
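The listing below is a minimal sketch of such a query against a previously saved vector store, not the released llm_rag_query.py; the chain class and the deserialization flag vary across LangChain versions, and the model choice, store path, and example query are assumptions.

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import HuggingFacePipeline
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector_store = FAISS.load_local("vector_store/example_afman", embeddings,
                                allow_dangerous_deserialization=True)
llm = HuggingFacePipeline.from_model_id(model_id="mistralai/Mistral-7B-Instruct-v0.2",
                                        task="text-generation",
                                        pipeline_kwargs={"max_new_tokens": 256})
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vector_store.as_retriever())
print(qa_chain.invoke({"query": "Which sections govern crew rest requirements?"}))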

4 Discussion

4.1 SQAD

The DaaDy framework combined with SQAD for QAR generation resulted in 100% content utilization on large documents, a significant improvement over current methods. As the quality of a question stems directly from the utility of the source context, and the studied documents lack an accepted metric for relative or absolute sentence utility, no quantitative data were generated from this study to determine whether question quality using DaaDy/SQAD was superior or inferior to single-prompt LLM RAG. While quantitative observations were not produced, a number of relevant qualitative assessments were made based on observation of SQAD QAR generation. By using a single sentence as the context provided to the LLM RAG, a significant portion of context/background knowledge was removed from the LLM RAG, which may have caused at least four of the seven categories of anomalous QAR generation (unable to answer, repetitive QA, missing context (lists), non-sequitur, and possibly misleading QA). Rudimentary trials (data not shown) showed that, generally, when context length was kept to less than 1,000 characters, the full context was utilized for QAR generation. Thus, we hypothesize that if the SQAD method passed groups of coherent sentences, paragraphs, or sections of 1,000 characters or fewer from the DaaDy, rather than a single sentence, the generation of anomalous QARs would decrease while context utilization remained maximal.
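A minimal sketch of this hypothesized chunking is shown below: consecutive DaaDy sentences are grouped into chunks of at most roughly 1,000 characters before QAR generation; the threshold and grouping rule are assumptions drawn from the rudimentary trials described above.

def chunk_sentences(sentences: list, max_chars: int = 1000) -> list:
    """Group consecutive sentences into chunks no longer than max_chars characters."""
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks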

In the area of SQAD QAR evaluation, three scenarios were studied. When the answer was contained in the provided context, LLM RAG of the full document performed better at QA than the localized context (72.7% vs. 64%; Figure 6). Additionally, QA on the localized context reported incorrect absences significantly more often than when queried against the full document (24% vs. 6.1%) (Figure 6). When the answer was not contained in the provided context, RAG of the full document produced significantly more false (33.3% vs. 11.8%) and irrelevant (11.1% vs. 0%) responses than querying only the localized context (Figure 6). We also observed that the full-document LLM RAG malfunctioned more than the localized-context LLM RAG, producing RAG errors (11.1%) whereas the localized-context RAG produced none (Figure 6). While the study of answering poorly phrased questions is of limited benefit, it is interesting to note that the full-document query produced irrelevant responses (50%), RAG errors (25%), and context regurgitation responses (25%), while the localized-context query either accurately recognized the vagueness and reported that insufficient context was provided to answer the question (50%), provided a correct but incomplete response (25%), or stated that the answer was not contained in the context (25%) (Figure 6). From these data, we conclude that an increased quantity of background information permits higher certainty on QA when the answer is contained explicitly in the context. However, when the answer is not contained in the provided context, the presence of extraneous material produces undesirable (irrelevant and false) responses as well as text-generation malfunctions (RAG errors and context regurgitation). Using localized context in these cases produces a more desirable and transparent result.

For the purpose of SQAD QAR evaluation, there is a key difference between the definitions of responses deemed "false" and those deemed "hallucinated." Answers were categorized as "hallucinated" when they included information not found in the source document. In this study, this was almost always the result of the LLM attempting to spell out an acronym that was not defined in the source document. This could be ameliorated in the future by including an acronym list or adding instructions to the prompt to avoid spelling out acronyms. Answers were categorized as "false" when they included only information found in the source document but the answer to the question was incorrect. This usually occurred when answering the question required synthesizing information found in multiple, separated sentences in a document or in multiple documents.

The use of DaaDy and SQAD creates a framework in which LLM RAG behavior is more predictable and the context utilized can be known with high fidelity. Due to this increase in both transparency and predictability, we assert that LLM RAG can be implemented as a tool to improve human efficiency in knowledge-intensive professions. The importance of expert supervision and quality assurance cannot be overstated. LLM RAG enhanced with SQAD and DaaDy can increase efficiency and comprehensiveness but remains susceptible to the aforementioned anomalies observed in text generation. Thus, it is absolutely critical that these methods be used with appropriate levels of supervision and a framework for quality assurance, or else the enormous increase in efficiency could turn into a rapid spread of false information (Fernando, 2023).

4.2 AIKIT user interface

Access to LLMs is currently provided via graphical user interfaces or, frequently, by developing small Python programs. New interfaces providing LLM RAG capabilities are being rapidly developed, but getting the technical details connected properly is a barrier that prevents many projects from easily accessing LLM RAG capabilities. The two Python tools docs_to_vs.py and llm_rag_query.py provide configurable access to creating LLM RAG embedded documents and querying them. The Ruby on Rails AIKIT web interface provides configurable creation and querying of documents in LLM RAG knowledge bases. AIKIT provides web viewing and downloading of knowledge base documents. AIKIT also includes support for test taking, with feedback on test questions to instructors. LLM RAG queries and responses and learners' test question responses are retained in the AIKIT database.

AIKIT tools can be used via command-line interfaces, Jupyter notebooks, or the Rails interface. The utility of AIKIT has been extended to multiple document types, including Microsoft Word, Excel, PowerPoint, plain text, voice, text within images, and automatically generated descriptions of image content for LLM RAG queries. To increase user friendliness, multiple levels of user interface capability were developed to align user needs with the desired AIKIT capabilities. Multiple unrelated research efforts currently applying AIKIT to multimodal LLM RAG applications heavily leverage the multiple supported document types.

4.3 Recommendations for knowledge base management

This study focused on a document corpus that had an associated framework for QA. Fields that lack this formal infrastructure but require professionals to learn and commit vast amounts of information to memory may want to consider creating such a QA framework. SQAD will help accelerate the process of turning documents into QARs and can minimize the time required for manual updates; both should be supervised by an expert before QARs are put into use. Finally, this study focused only on documents that were highly structured. While parsing structured documents is very simple, this structure is not required to use these methods; the parser's code could be easily updated to assign an index number to each sentence and use that index number as the reference in the absence of a paragraph header. Creating sections that provide logical context, as the Section DaaDy does for structured documents, will be a challenge for managing knowledge stored in less structured documents, as the user's available chunking mechanisms are punctuation and whitespace characters.
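A minimal sketch of that sentence-index fallback is shown below: each sentence is numbered and stored under its index so the index can serve as the QAR reference; the sentence splitter and dictionary shape are illustrative assumptions.

import re

def index_sentences(text: str) -> dict:
    """Build a DaaDy-like dictionary for unstructured text, keyed by sentence index."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return {str(i + 1): {"sentences": [s]} for i, s in enumerate(sentences)}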

Throughout this research, there were numerous roadblocks that, if avoided, will significantly improve or simplify the process by which LLM RAG can be wielded to assist knowledge-intensive professions. Well-structured documents make parsing from text to a DaaDy expedient and easy. First, maintaining the master copy of each document in the corpus in a purely textual form (void of headers, footers, page numbers, and other formatting characters) will significantly ease the burden of coding and debugging automatic parsers. Second, using word-processing software that encodes the document structure in a text form that can be parsed using regular expressions (Rossum, 1999) will simplify the process by which the knowledge can be accessed using LLM RAG. Finally, for professions that generate and maintain QARs, avoiding the following will allow straightforward usage of LLM RAG for test evaluation: (1) asking vague or open-ended questions, (2) using different verbiage in the question than in the context (e.g., "night" versus "between sunset and sunrise"), and (3) referencing the publication title in the question unless that data is included in the prompt.

5 Future work

The results of this research showed that while there is currently an upper limit to the length of context that can be fully and effectively utilized by LLM RAG, there is also a minimum length at which the context is so isolated that its utility decreases to the point of difficulty and inconvenience for the user. In future iterations of SQAD, research should determine the optimal context length and chunk size to maximize effective context utilization. Further inquiry into whether there is any relationship between chunk size and the presence (and type) of anomalous responses would be a worthwhile contribution. Once these parameters are defined, LLM RAG can be optimized for question generation and evaluation. Improvements to LLM RAG should provide sentence context metadata aligned with the document's structure.

The AIKIT UI, due to its fully offline implementation, has the potential to transition to secure systems. The ability to use AI in querying and updating a vast knowledge base while keeping one’s data and documents secure has enormous potential in many fields with highly-restrictive security requirements.

6 Conclusion

While the capability of LLMs to produce human-like, accurate, and attributable responses has improved significantly in recent years, LLM RAG utilization of text in long documents is an area in need of improvements; these deficiencies render LLM RAG unsuitable as a tool for professions which require accountable and full utilization of the profession’s knowledge base. The document organization framework, DaaDy, and the querying method, SQAD, presented in this paper significantly improve the utilization rate of LLM RAG over long documents and provide transparency for QA tasks. By utilizing SQAD and DaaDy, human expertise and intuition can be enhanced by expedient context-querying and content generation.

Additionally, the AIKIT prototype is a fully-containerized, offline solution which can be easily deployed on laptops, workstations, high-performance computing (HPC) clusters, and cloud solutions. AIKIT can thus provide easy-to-use LLM RAG to a wide audience. AIKIT runs on any platform—from a system on a chip (SOC) to HPC or cloud infrastructure. AIKIT is being released as open source at https://github.com/mit-ll/AIKIT. Please contact the authors with questions, requests, or feedback.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

CR: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing. AM: Methodology, Writing – review & editing. DR: Conceptualization, Formal analysis, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This material is based upon work supported by the Department of the Air Force under Air Force Contract (No. FA8702-15-D-0001). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the Department of the Air Force.

Acknowledgments

This research was facilitated by the Department of the Air Force Artificial Intelligence Accelerator at Massachusetts Institute of Technology (MIT) and MIT/Lincoln Laboratory. The authors acknowledge the MIT SuperCloud and Lincoln Laboratory Supercomputing Center teams for providing the HPC resources that were utilized to generate the research results reported within this paper. The authors would also like to acknowledge Jason Williams from MIT Lincoln Laboratory for providing graphic artist support.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frai.2025.1592013/full#supplementary-material

References

all-mpnet-base-v2. (2021). Hugging face sentence transformers all-mpnet-base-v2. Available online at: https://huggingface.co/sentence-transformers/all-mpnet-base-v2 [Accessed July 14, 2024].

Chroma. (2022). LangChain Chroma vector store database. Available online at: https://python.langchain.com/docs/integrations/vectorstores/chroma/ [Accessed July 14, 2024].

FAISS. (2017). LangChain Facebook AI similarity search (FAISS). Available online at: https://python.langchain.com/docs/integrations/vectorstores/faiss/ [Accessed July 14, 2024].

Fernando, R. (2023). Module 1: setting the stage. Available online at: https://www.humanetech.com/course [Accessed July 14, 2024].

GNU. (1983). GNU Parallel. Available online at: https://www.gnu.org/software/parallel/ [Accessed July 14, 2024].

Kurtzer, G. M., Sochat, V., and Bauer, M. W. (2017). Singularity: scientific containers for mobility of compute. PLoS One 12:e0177459. doi: 10.1371/journal.pone.0177459

LangChain. (2022). LangChain. Available online at: https://pypi.org/project/langchain/ [Accessed July 14, 2024].

LangChain. (2022). LangChain framework for developing large language models (LLMs) applications. Available online at: https://python.langchain.com/docs/introduction/ [Accessed July 14, 2024].

Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Presented at: NeurIPS. doi: 10.48550/arXiv.2005.11401

Liu, N. F., Lin, K., Hewitt, J., Paranjape, A., Bevilacqua, M., Petroni, F., et al. (2023). Lost in the middle: how language models use long contexts. arXiv. doi: 10.48550/arXiv.2307.03172

Merkel, D. (2024). Docker: lightweight Linux containers for consistent development and deployment. Linux J. Available online at: https://www.linuxjournal.com/content/docker-lightweight-linux-containers-consistent-development-and-deployment [Accessed July 14, 2024].

Mistral-7B-instruct-v0.2. (2023). Available online at: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2 [Accessed July 14, 2024].

Mixtral-8x7B-Instruct-v0.1. (2024). Available online at: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 [Accessed July 14, 2024].

PyPDFLoader. (2021). LangChain PyPDFLoader. Available online at: https://python.langchain.com/docs/integrations/document_loaders/pypdfloader/ [Accessed July 14, 2024].

Python programming language. (1991). Available online at: https://www.python.org [Accessed July 14, 2024].

Reuther, A., Byun, C., Arcand, W., Bestor, D., Bergeron, B., Hubbell, M., et al. (2018). Scalable system scheduling for HPC and big data. J. Parallel Distrib. Comput. 111, 76–92. doi: 10.1016/j.jpdc.2017.06.009

Rossum, G. V. (1999). Python library reference. To Excel Inc. Available online at: https://www.amazon.com/Python-Library-Reference-Open-Source/dp/158348373X

Ruby. (1995). Ruby Programming Language. Available online at: https://www.ruby-lang.org/en/ [Accessed July 14, 2024].

RubyOnRails. (2004). Ruby on rails. Available online at: https://rubyonrails.org [Accessed July 14, 2024].

TX-Green. (2024). MIT Lincoln Laboratory TX-green supercomputer. Available online at: https://www.top500.org/system/178939/ [Accessed July 14, 2024].

Wu, K., Wu, E., Cassasola, A., Zhang, A., Wei, K., Nguyen, T., et al. (2024). How well do LLMs cite relevant medical references? An evaluation framework and analyses. arXiv. doi: 10.48550/arXiv.2402.02008

Xu, P., Ping, W., Wu, X., McAfee, L., Zhu, C., Liu, Z., et al. (2023). Retrieval meets long context large language models. Presented at: ICLR 2024. doi: 10.48550/arXiv.2310.03025

Keywords: large language models, LLM, retrieval-augmented generation, RAG, LangChain

Citation: Randolph C, Michaleas A and Ricke DO (2025) Large language models for closed-library multi-document query, test generation, and evaluation. Front. Artif. Intell. 8:1592013. doi: 10.3389/frai.2025.1592013

Received: 11 March 2025; Accepted: 21 July 2025;
Published: 06 August 2025.

Edited by:

Gokhan Tur, University of Illinois at Urbana-Champaign, United States

Reviewed by:

Voula (Paraskevi) Giouli, Aristotle University of Thessaloniki, Greece
Sumuk Shashidhar, University of Illinois at Urbana-Champaign, United States
Purbesh Mitra, University of Maryland, College Park, United States

Copyright © 2025 Randolph, Michaleas and Ricke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Adam Michaleas, Adam.Michaleas@ll.mit.edu