Harnessing High-Performance Computing for Next-Generation Data Mining

About this Research Topic

This Research Topic is still accepting articles.

Background

In fields ranging from healthcare to finance, demand has surged for data mining techniques that can efficiently process vast datasets and extract actionable intelligence. Traditional sequential computer systems, despite improvements in their performance, struggle to meet the growing requirements of data mining applications. Data volumes often outgrow the main memory capacity of these systems, exposing their limits in scalability and processing power. This has driven a growing shift toward the design and implementation of parallel and distributed data mining algorithms, which can use multiple processing units simultaneously and thereby improve computational efficiency and memory utilization.

This Research Topic aims to address the complex challenge of developing and optimizing parallel data mining algorithms that can scale effectively with large datasets. The primary focus is on devising methods that improve runtime efficiency and data management in distributed environments. Recognizing the typical obstacles — such as suboptimal data decomposition, excessive synchronization, and high communication overhead — this topic seeks contributions that propose novel data organization strategies, advanced parallel computing techniques, and innovative algorithms that minimize I/O costs and optimize workload distribution across multiple computing nodes.

To gather further insights into the cutting-edge advancements in this domain, we welcome articles addressing, but not limited to, the following themes:
- Parallel data mining and machine learning algorithms using MPI and/or OpenMP
- GPU-accelerated data mining and machine learning tools
- FPGA-based applications in parallel data mining
- Distributed algorithms for scalable machine learning
- Performance benchmarks and evaluations of high-speed data mining applications
- Emerging programming paradigms for distributed data mining
- Theoretical performance models for middleware in distributed systems
- Advanced programming tools and environments tailored for high-performance data mining
- Optimization techniques like caching, streaming, and pipelining for data management in machine learning platforms

This call for papers seeks to unite thinkers and innovators across various domains to contribute their research, findings, and theoretical advancements to foster the development of robust, scalable, and efficient data mining and machine learning technologies.

Topic Editor Italo Epicoco is a Principal Scientist with the Euro-Mediterranean Center on Climate Change. All other Topic Editors declare no conflict of interest.

Article types and fees

This Research Topic accepts the following article types, unless otherwise specified in the Research Topic description:

  • Brief Research Report
  • Community Case Study
  • Conceptual Analysis
  • Data Report
  • Editorial
  • FAIR² Data
  • FAIR² DATA Direct Submission
  • Hypothesis and Theory
  • Methods

Articles that are accepted for publication by our external editors following rigorous peer review incur a publishing fee charged to authors, institutions, or funders.

Keywords: parallel computing, distributed computing, deep learning, data mining, machine learning

Important note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

Manuscripts can be submitted to this Research Topic via the main journal or any other participating journal.
