City Research Online

When methods matter: how implementation choices shape topic discovery in financial text

Gad, M., Park, G. ORCID: 0000-0002-1009-7462, Rawsthorne, S. & Young, S. (2026). When methods matter: how implementation choices shape topic discovery in financial text. Accounting and Business Research, doi: 10.1080/00014788.2026.2625716

Abstract

This paper examines the application of LDA topic modelling to risk disclosures in FTSE350 firms’ annual reports. We show that LDA implementation choices significantly impact topic representations and subsequent inferences. Using a corpus of FTSE350 annual reports, we show that preprocessing decisions, multiword expressions and labelling strategies materially affect topic interpretability and granularity. Our analysis reveals that while risk reporting addresses key business risks at an aggregate level, the degree of firm-specific commentary is sensitive to topic granularity. Hierarchical linear modelling suggests that 27% of topic variation is within firms for broad topics, increasing to 75% for granular topics. We leverage GPT to enhance topic labelling, showcasing the potential of LLMs in financial text analysis. We also compare LDA to modern embedding-based topic models, finding that while they often generate more coherent topics, they introduce a new set of critical implementation choices and do not eliminate the need for researcher discretion. These findings challenge the claims of LDA objectivity and highlight the importance of domain expertise. We propose a practical checklist for LDA implementation in accounting and finance research emphasising transparency and robustness checks.

Publication Type: Article
Additional Information: © 2026 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.
Publisher Keywords: textual analysis, topic modelling, risk disclosure, annual reports, Latent Dirichlet Allocation, GPT
Subjects: H Social Sciences > HG Finance
Departments: Bayes Business School
Bayes Business School > Faculty of Finance
Text - Published Version
Available under License Creative Commons Attribution.
