2:40 PM - 3:00 PM
[1H3-OS-8a-04] Development of Company Similarities from Both the Textual and Numerical Types of Financial Data
Keywords: Text Mining, Financial Documents, Embedding
Similarity among companies is important information that forms the basis of analysis in various financial practices, such as corporate valuation, investment and loan decisions, portfolio risk management, partner selection for business promotion, and in-house investor relations activities.
A useful tool for calculating the degree of similarity between companies is a company embedding, a vector representation that can be obtained by applying BERT and other methods to textual information.
While such text-based embeddings are effective, the economic and financial fields also offer a large amount of numerical data that is expected to be useful for measuring the degree of similarity between companies. Combining these numerical financial data with textual data is expected to enable a search for more useful “similar companies”.
Therefore, this study proposes a method for searching for similar companies that utilizes both “textual information” and “numerical information.” Our method utilizes not only textual information on stocks, but also numerical information such as sales by segment, stock price time-series data, and shareholder composition.
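The abstract does not say how the textual and numerical signals are combined. As a rough illustration of the general idea only, and not the authors' method, the sketch below blends the cosine similarity of text embeddings with the cosine similarity of standardized numerical features. The function names, the `text_weight` parameter, the weighted-average combination, and the toy data are all assumptions made for this example; in practice the text vectors would come from a model such as BERT.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def zscore(x):
    """Standardize each column so features on different scales are comparable."""
    std = x.std(axis=0, keepdims=True)
    return (x - x.mean(axis=0, keepdims=True)) / np.where(std == 0, 1.0, std)

def combined_similarity(text_emb, numeric_feats, text_weight=0.5):
    """Weighted average of text-based and numeric-based cosine similarity.

    The weighted average is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    t = l2_normalize(text_emb)
    n = l2_normalize(zscore(numeric_feats))
    return text_weight * (t @ t.T) + (1.0 - text_weight) * (n @ n.T)

def most_similar(sim, idx, k=1):
    """Indices of the k companies most similar to company idx, excluding itself."""
    order = np.argsort(-sim[idx])
    return [int(j) for j in order if j != idx][:k]

# Toy data (hypothetical): 3 companies, 2-dimensional text embeddings,
# and 2 numerical features per company.
text_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
numeric_feats = np.array([[10.0, 1.0], [11.0, 1.1], [100.0, 5.0]])

sim = combined_similarity(text_emb, numeric_feats, text_weight=0.5)
nearest_to_first = most_similar(sim, 0, k=1)
```

Standardizing the numerical features before comparing them keeps a large-valued feature from dominating the similarity, and `text_weight` gives a single knob for how much the textual signal counts relative to the numerical one.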