Embedding models have become an important part of LLM applications, enabling tasks such as measuring text similarity, information retrieval, and clustering. However, most embedding models are built on encoder-style transformer architectures that differ from the decoder-only architectures used for generative tasks. This gap makes it difficult to transfer the extensive work being done on generative models over to embedding models, and instead requires separate, parallel efforts.
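To make the similarity use case concrete, here is a minimal sketch of how embedding vectors are typically compared with cosine similarity. The vectors below are made-up placeholders; in practice they would be produced by an embedding model.

```python
import numpy as np

# Hypothetical embedding vectors for two sentences; in a real application
# these would come from an embedding model (e.g. an encoder transformer).
emb_a = np.array([0.2, 0.8, 0.1])
emb_b = np.array([0.25, 0.75, 0.05])

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(emb_a, emb_b))
```

The same scoring function underlies retrieval (rank documents by similarity to a query embedding) and clustering (group vectors that are close to each other).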