Towards Personalized Similarity Search for Vector Databases

Mahrík, Marek; Šikyňa, Matúš; Míč, Vladimír; Zezula, Pavel

Authors	MAHRÍK Marek; ŠIKYŇA Matúš; MÍČ Vladimír; ZEZULA Pavel
Year of publication	2025
Type	Article in Proceedings
Conference	17th International Conference on Similarity Search and Applications (SISAP 2024)
MU Faculty or unit	Faculty of Informatics
Doi	http://dx.doi.org/10.1007/978-3-031-75823-2_11
Keywords	Similarity search; Personalized similarity; Vector databases
Description	Similarity search has become central to the fast-evolving field of vector databases, which apply content embedding techniques to complex data to produce and manage large collections of high-dimensional vectors. Such data can only be stored, organized, and retrieved by means of a similarity function. However, when multiple users access the same collection, their views of similarity can differ, because similarity is in general subjective and context-dependent. In this article, we elaborate on the problem of implementing a similarity search engine in which users share a common index but search with personalised views of similarity, each implemented by a possibly different similarity model. Specifically, we define a foundational theoretical framework and conduct experiments on real-life data to confirm the viability of this approach. The experiments also indicate the future research directions needed to propose and implement an effective and efficient personalised similarity search engine.
Related projects:	Automated digital data forensics lab for complex crime detection; Using artificial intelligence techniques for data processing, complex analysis and visualization of large-scale data
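
The setting outlined in the Description (one common index, several user-specific similarity models) can be illustrated with a generic filter-and-refine sketch. This is not the authors' framework or code: the brute-force scan standing in for the shared index, the weighted Euclidean distance used as a personal similarity model, and all names (`euclidean`, `weighted_euclidean`, `personalized_knn`, the `candidates` parameter) are assumptions made purely for illustration.

```python
import heapq
import math
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Shared distance that the (hypothetical) common index is built on."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_euclidean(weights: Vector) -> Distance:
    """Assumed per-user model: each dimension gets a user-specific weight."""
    def dist(a: Vector, b: Vector) -> float:
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    return dist


def personalized_knn(
    data: Sequence[Vector],
    query: Vector,
    k: int,
    user_distance: Distance,
    candidates: Optional[int] = None,
) -> List[int]:
    """Return indices of the k objects closest to query under user_distance."""
    n_candidates = candidates if candidates is not None else 4 * k
    # Stage 1 (filter): a linear scan with the shared distance stands in
    # for a real shared index and yields a candidate shortlist.
    shortlist = heapq.nsmallest(
        n_candidates, range(len(data)), key=lambda i: euclidean(data[i], query)
    )
    # Stage 2 (refine): re-rank the shortlist with the user's own distance.
    return heapq.nsmallest(
        k, shortlist, key=lambda i: user_distance(data[i], query)
    )


data = [(0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
query = (0.2, 0.2)

# Two users weight the dimensions differently and get different answers
# from the same shortlist.
user_a = weighted_euclidean((1.0, 0.0))  # only the first dimension matters
user_b = weighted_euclidean((0.0, 1.0))  # only the second dimension matters

print(personalized_knn(data, query, 1, user_a, candidates=2))  # [0]
print(personalized_knn(data, query, 1, user_b, candidates=2))  # [1]
```

In a sketch like this, the shortlist size trades cost for accuracy: if the shared distance and a user's distance disagree strongly, a small shortlist may miss objects that the user would rank among the nearest.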