About this Research Topic
Global financial institutions hold a vast array of data, the majority of which (~80%) is unstructured. The greatest potential for next-generation financial models lies at the nexus of traditional and alternative unstructured data sources. For example, younger retail investors increasingly rely on social media to find investment strategies. In doing so, they may move the financial markets, as in the recent GameStop mania driven by users of Reddit’s WallStreetBets. Despite this potential, extracting value from unstructured data poses several unique challenges compared with structured data. On top of these challenges come the significant pre-processing efforts needed to convert unstructured data into a structured representation for downstream analysis. Given these challenges, unstructured data have historically been underutilized in supporting critical decisions in the financial industry.
Traditional data analytics in finance relies mostly on manual knowledge discovery and understanding processes, which are inefficient, error-prone, inconsistent, and challenging to scale. In this traditional approach, valuable knowledge may also be adulterated by human bias or by misinformation such as fake news, which can distort the strategies and decisions of both institutional and retail investors. Advances over the past decades in Artificial Intelligence (AI) methods and, more specifically, in natural language processing and knowledge understanding, have improved the ability to extract knowledge from unstructured data sources. However, many contemporary knowledge extraction and representation efforts focus on a single modality of textual data: news, web, social media, etc. Significantly less work has addressed knowledge extraction and representation across multiple unstructured data modalities. This gap presents huge challenges for the financial services industry, which consistently generates immense amounts of unstructured data in a variety of formats and from a variety of financial (SEC filings, loan documents, industry reports, etc.) and alternative sources. Furthermore, unstructured data frequently come from alternative data sources (e.g., social media feeds on Twitter, Reddit, and Facebook).
The goal of this Research Topic is to bring together AI research with applications to any aspect of financial analysis, including knowledge discovery, extraction, understanding, representation, and, finally, the use of knowledge from these processes to drive decision making. We understand that designing and implementing these techniques to solve real problems in financial services requires a joint effort between academic researchers and industry practitioners across different disciplines. We particularly welcome academic and industry researchers and practitioners to submit articles that focus on knowledge discovery and understanding from unstructured data in the financial services domain. The scope of this Research Topic includes, but is not limited to, the following areas:
● Representation learning, distributed representation learning, and encoding in natural language processing for financial documents
● Synthetic or genuine financial datasets and benchmarking baseline models
● Transfer learning applied to financial data; knowledge distillation as a method for compressing pre-trained models or adapting them to financial datasets
● Search and question answering systems designed for financial corpora
● Named-entity disambiguation, recognition, relationship discovery, ontology learning, and extraction in financial documents
● Knowledge alignment and integration from heterogeneous data
● Using multi-modal data in knowledge discovery for financial applications
● AI-assisted data tagging and labeling
● Data acquisition, augmentation, feature engineering, and analysis for investment and risk management
● Automatic data extraction from financial filings and quality verification
● Event discovery from alternative data and its impact on organizations' equity prices
● AI systems for relationship extraction and risk assessment from legal documents
● Accounting for Black-Swan events in knowledge discovery methods
● Big data analytics and knowledge discovery from financial transactions, sensors, mobile devices, satellites, social media, etc.
Topic Editor Sameena Shu is employed by JP Morgan, and Topic Editor Xiaomo Liu is employed by S&P Global. All other Topic Editors declare no competing interests with regard to the Research Topic subject.
Keywords: big data analytics, data science, data mining, machine learning, financial documents, financial data analytics, computational finance, financial datasets, financial corpora, financial applications
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.