A Review on AI Based Business Lead Generation and a Case Study: Scrapus

ŞEKER, Evren Şadi

doi:10.3389/frai.2025.1606431

TECHNOLOGY AND CODE article

Front. Artif. Intell.

Sec. AI in Business

Volume 8 - 2025

This article is part of the Research Topic: Advancing Knowledge-Based Economies and Societies through AI and Optimization: Innovations, Challenges, and Implications

Provisionally accepted

Evren Şadi ŞEKER*

Istanbul University, Istanbul, Türkiye

The final, formatted version of the article will be published soon.

The exponential growth of open web data provides unprecedented opportunities for business-to-business (B2B) lead generation. However, automating the discovery and qualification of new leads from unstructured web content is a complex challenge that requires integrating web crawling, information extraction, and data-driven analytics. This article presents a comprehensive review of artificial intelligence (AI) methods for automated lead generation and introduces Scrapus, an AI-driven web prospecting platform that unifies these methods into an end-to-end system. Scrapus autonomously crawls the open web for company information, extracts and enriches relevant data using natural language processing and knowledge graphs, matches findings to user-defined ideal customer profiles, and generates concise natural-language lead summaries using large language models. We survey relevant literature in web mining, focused crawling, entity resolution, and text summarization, highlighting how Scrapus builds upon and extends prior work. The system's modular architecture and AI components are described in detail. We also report an experimental evaluation on real-world data, in which Scrapus significantly outperforms baseline approaches in lead discovery rate, extraction accuracy, lead qualification (achieving ~90% precision and recall), and summary usefulness. The results show a ~3× higher yield of relevant leads from web crawling, attributed to reinforcement learning; an increase in extraction F1 from ~0.77 to ~0.92 through transformer-based NLP; and greatly improved lead scoring over traditional methods. This review and case study demonstrate that combining reinforcement learning, transformer-based NLP, and knowledge-enhanced analysis can effectively automate B2B lead generation. The advances surveyed here point toward a new generation of intelligent sales prospecting tools, in which AI techniques augment human expertise to identify and engage leads at scale.
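To make the lead qualification step and its reported metrics more concrete, the following is a minimal sketch, not the Scrapus implementation. It assumes a simple rule-based match of company records against an ideal customer profile and uses the standard definitions of precision, recall, and F1. All names (`Company`, `IdealCustomerProfile`, `matches_profile`, `precision_recall_f1`) and all sample values are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    # Hypothetical record for an extracted company; fields are illustrative.
    name: str
    industry: str
    employees: int


@dataclass(frozen=True)
class IdealCustomerProfile:
    # Hypothetical user-defined profile: allowed industries and a size range.
    industries: frozenset[str]
    min_employees: int
    max_employees: int


def matches_profile(company: Company, profile: IdealCustomerProfile) -> bool:
    """Return True if the company falls inside the profile (rule-based assumption)."""
    return (
        company.industry in profile.industries
        and profile.min_employees <= company.employees <= profile.max_employees
    )


def precision_recall_f1(predicted: set[str], relevant: set[str]) -> tuple[float, float, float]:
    """Standard precision, recall, and F1 over sets of lead identifiers."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only; these companies and labels are made up.
companies = [
    Company("A", "software", 50),
    Company("B", "software", 5000),
    Company("C", "retail", 100),
    Company("D", "fintech", 200),
]
profile = IdealCustomerProfile(frozenset({"software", "fintech"}), 10, 1000)

qualified = {c.name for c in companies if matches_profile(c, profile)}
human_labeled_relevant = {"A", "B"}  # hypothetical ground truth
precision, recall, f1 = precision_recall_f1(qualified, human_labeled_relevant)
```

In this toy setting the qualified leads are compared against a hand-labeled set, which is one common way figures such as precision and recall are obtained; the abstract does not specify the evaluation protocol, so this should be read only as an illustration of the metric definitions.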

Keywords: Lead generation, Web scraping, LLM, Text processing, B2B

Received: 05 Apr 2025; Accepted: 22 May 2025.

Copyright: © 2025 ŞEKER. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Evren Şadi ŞEKER, Istanbul University, Istanbul, Türkiye

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.