Research Topic

Automatic Performance Management and Optimization on Large-scale Heterogeneous Clusters

About this Research Topic

Modern industrial, government, and academic organizations are collecting massive amounts of data at an unprecedented scale and pace; this data is then analyzed on large compute clusters to extract value and deep insights. These insights can drive automated processes for advertisement placement, improve customer relationship management, and lead to major scientific breakthroughs. Ensuring good and robust system performance at such a scale is the foundation for timely and cost-effective analytics. However, as these systems have grown in scale and complexity, the administration and management of system resources have become very expensive, with the human factor dominating the total cost of ownership. To make matters worse, computing clusters are becoming increasingly heterogeneous in both the compute and the storage tiers. Heterogeneity, if not addressed appropriately, has been shown to have detrimental effects on overall system performance.

As organizations often own multiple generations of hardware and data centers are starting to use virtualization to consolidate servers, heterogeneous environments are becoming common in practice. On the compute side, nodes can have CPUs with different capacities and numbers of cores, making performance-based resource allocation and workload scheduling both important and challenging. In addition, the presence of GPUs and FPGAs on modern clusters has inspired their use by various big data frameworks. On the storage front, cluster nodes can have multiple hard drives, SSDs, and large memory, all of different sizes, while emerging storage technologies (e.g., NVMe, SCM) are becoming more popular. At the same time, applications exhibit a variety of I/O patterns: batch-processing applications care about raw sequential throughput, interactive query processing benefits from lower-latency storage media, and other applications display random I/O patterns. Hence, it is desirable to have a variety of storage types and let each application choose the one that best fits its performance or cost requirements. Administrators and systems will need mechanisms to manage the fair distribution of scarce storage resources across all users, ideally in an automated manner. The goal of this Research Topic is to report recent advances in automating (fully or partially) any aspect of resource management and performance optimization in heterogeneous cluster environments.

We are inviting submissions of original research articles or comprehensive reviews related to one or more of the following topics:
• Automated resource allocation in heterogeneous clusters
• Workload and task scheduling in heterogeneous environments
• Performance optimization and tuning of data-parallel applications
• Automated data management in heterogeneous and emerging storage systems
• Automatic parameter tuning in big data processing systems
• Automatic big data systems tuning that is robust to workload and resource uncertainty
• Query processing, indexing, and optimization in heterogeneous clusters
• Data stream processing in heterogeneous environments
• Automated provisioning of heterogeneous cluster resources
• System administration and manageability

Manuscripts can be submitted at any time until the manuscript deadline. All papers will move to peer review upon submission, and accepted papers will be published in the Research Topic as soon as they are accepted.


Keywords: cluster computing, intelligent data management, performance optimization and tuning, automatic resource management, automatic resource provisioning


Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

About Frontiers Research Topics

With their unique mixes of varied contributions from Original Research to Review Articles, Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author.

Submission Deadlines

Abstract: 15 May 2021
Manuscript: 31 July 2021
