Since the release of BERT in 2018, encoder-only transformer models have been widely used in natural language processing (NLP) applications due to their efficiency in retrieval and classification tasks ...