Bibliographic Details
Main Authors: Li, Mingshu, Desai, Dhruv, Jeyapaulraj, Jerinsh, Sommer, Philip, Jain, Riya, Chu, Peter, Mehta, Dhagash
Format: Preprint
Published: 2025
Subjects: Statistical Finance, Machine Learning
Online Access: https://arxiv.org/abs/2509.24151
_version_ 1866916975073034240
author Li, Mingshu
Desai, Dhruv
Jeyapaulraj, Jerinsh
Sommer, Philip
Jain, Riya
Chu, Peter
Mehta, Dhagash
contents Accurately measuring portfolio similarity is critical for a wide range of financial applications, including exchange-traded fund (ETF) recommendation, portfolio trading, and risk alignment. Existing similarity measures often rely on exact asset overlap or static distance metrics, which capture neither similarities among constituents (e.g., securities within the portfolio) nor nuanced relationships between partially overlapping portfolios with heterogeneous weights. We introduce STRAPSim (Semantic, Two-level, Residual-Aware Portfolio Similarity), a novel method that computes portfolio similarity by matching constituents based on semantic similarity, weighting them according to their portfolio share, and aggregating results via residual-aware greedy alignment. We benchmark our approach against Jaccard, weighted Jaccard, and BERTScore-inspired variants across public classification, regression, and recommendation tasks, as well as on corporate bond ETF datasets. Empirical results show that our method consistently outperforms baselines in predictive accuracy and ranking alignment, achieving the highest Spearman correlation with return-based similarity. By leveraging constituent-aware matching and dynamic reweighting, STRAPSim offers a scalable, interpretable framework for comparing structured asset baskets, with demonstrated utility in ETF benchmarking, portfolio construction, and systematic execution.
format Preprint
id arxiv_https___arxiv_org_abs_2509_24151
institution arXiv
publishDate 2025
record_format arxiv
title STRAPSim: A Portfolio Similarity Metric for ETF Alignment and Portfolio Trades
topic Statistical Finance
Machine Learning
url https://arxiv.org/abs/2509.24151
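
The contents field above describes matching constituents by semantic similarity, weighting by portfolio share, and aggregating through residual-aware greedy alignment, with weighted Jaccard as one baseline. The following is a minimal sketch of one plausible reading of that description, not the authors' implementation: the function names, the pairwise similarity matrix `sim`, and the rule of matching the smaller of the two remaining weights at each step are all assumptions made for illustration.

```python
import numpy as np


def weighted_jaccard(a: dict, b: dict) -> float:
    """Baseline on exact asset overlap: sum of min weights / sum of max weights."""
    keys = set(a) | set(b)
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den > 0 else 0.0


def greedy_residual_similarity(w_a, w_b, sim, tol=1e-12) -> float:
    """Hypothetical residual-aware greedy alignment (assumed, not from the paper).

    w_a, w_b : constituent weights of the two portfolios (each summing to 1).
    sim      : sim[i, j] is the similarity between constituent i of the first
               portfolio and constituent j of the second, assumed in [0, 1].
    """
    res_a = np.asarray(w_a, dtype=float).copy()  # unmatched weight left in A
    res_b = np.asarray(w_b, dtype=float).copy()  # unmatched weight left in B
    sim = np.asarray(sim, dtype=float)

    # Visit constituent pairs from most similar to least similar.
    order = np.argsort(-sim, axis=None, kind="stable")
    score = 0.0
    for flat in order:
        i, j = np.unravel_index(flat, sim.shape)
        matched = min(res_a[i], res_b[j])  # weight both sides can still commit
        if matched <= tol:
            continue
        score += matched * sim[i, j]
        res_a[i] -= matched  # only the residual weight remains available
        res_b[j] -= matched
    return score


# Toy portfolios: A holds X and Y, B holds X and Z; Y and Z are similar assets.
port_a = {"X": 0.6, "Y": 0.4}
port_b = {"X": 0.5, "Z": 0.5}
toy_sim = [[1.0, 0.2],   # X vs (X, Z)
           [0.3, 0.9]]   # Y vs (X, Z)

overlap_score = weighted_jaccard(port_a, port_b)
aligned_score = greedy_residual_similarity([0.6, 0.4], [0.5, 0.5], toy_sim)
```

In this toy case the overlap-based baseline only credits the shared holding X, while the alignment sketch also credits the Y and Z pairing, which is the kind of partial-overlap relationship the abstract says exact-overlap measures miss.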