Embedding
Embeddings compress a string of tokens into a single high-dimensional vector representation. They are ideally contextually aware: the embedding of a token depends on the surrounding tokens, so different strings of tokens map to different embeddings.
Embeddings can be used to generate the next expected token, to evaluate text similarity, and, through similarity search, to retrieve relevant context in RAG.
Embeddings generally depend on the tokenization method.
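As a minimal sketch (the vectors below are made up for illustration; a real system would obtain them from an embedding model), similarity between two embeddings is typically scored with cosine similarity, which is also what RAG retrieval ranks on:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 for identical directions, ~0.0 for unrelated ones."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.1, 0.9, 0.2])     # hypothetical embedding of a query
doc_a = np.array([0.12, 0.85, 0.25])  # a semantically close document
doc_b = np.array([0.9, -0.1, 0.3])    # an unrelated document

print(cosine_similarity(query, doc_a))  # high score -> retrieved in RAG
print(cosine_similarity(query, doc_b))  # low score  -> filtered out
```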
```mermaid
graph LR
    Text --> Token
    Token --> C[Token Embedding]
    C --> D[Sequence Embedding]
    D --> E[Changeable LLM]
    subgraph Embedding["Embedding Model"]
        C
        D
    end
```
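A minimal sketch of the pipeline in the diagram, assuming a random lookup table and mean pooling (real models may use a [CLS] token or attention pooling instead):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 64
token_embedding_table = rng.normal(size=(vocab_size, dim))  # learned in practice

token_ids = [17, 42, 256, 3]                          # output of a tokenizer
token_embeddings = token_embedding_table[token_ids]   # (4, dim): one vector per token

# Mean pooling is one common way to collapse token embeddings into a
# single sequence embedding for the downstream model or vector store.
sequence_embedding = token_embeddings.mean(axis=0)    # (dim,)
print(sequence_embedding.shape)
```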
Embedding models can be components of larger, more complex sequence-generation models. Keeping the representation separate allows greater freedom in swapping and evaluating downstream architectures, and gives RAG a durable lookup space.
Text and Code Embeddings by Contrastive Pre-Training
The authors demonstrate that contrastive pre-training can yield high-quality vector representations of text and code.
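A rough sketch of a symmetric InfoNCE-style objective of the kind used in contrastive pre-training: paired texts should land close together, while every other pair in the batch serves as a negative. Batch size, dimensions, and temperature below are illustrative assumptions:

```python
import numpy as np

def info_nce_loss(a: np.ndarray, b: np.ndarray, temperature: float = 0.07) -> float:
    """a, b: (batch, dim) L2-normalized embeddings of paired inputs."""
    logits = (a @ b.T) / temperature  # (batch, batch) similarity matrix
    n = len(a)                        # positives sit on the diagonal

    def xent(l: np.ndarray) -> float:
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # cross-entropy in both directions (a->b and b->a), averaged
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
a = rng.normal(size=(8, 32)); a /= np.linalg.norm(a, axis=1, keepdims=True)
b = a + 0.01 * rng.normal(size=a.shape); b /= np.linalg.norm(b, axis=1, keepdims=True)
print(info_nce_loss(a, b))  # small loss: the pairs are nearly identical
```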
Matryoshka Representation Learning
The authors introduce MRL, which encodes information at different granularities, allowing a single embedding to be used for different downstream tasks.
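A minimal sketch of how a Matryoshka embedding is consumed downstream: the same vector is truncated to nested prefixes and re-normalized. The 768-dim vector and the chosen granularities are assumptions, not values from the paper:

```python
import numpy as np

full = np.random.default_rng(0).normal(size=768)  # one MRL-trained embedding

for d in (64, 128, 256, 768):                 # coarse -> fine granularities
    prefix = full[:d]
    prefix = prefix / np.linalg.norm(prefix)  # re-normalize after truncation
    print(d, prefix.shape)                    # smaller d: cheaper search, less detail
```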
ELE Embeddings
ELE provides spherical embeddings based on description logic, yielding representations that work well with knowledge graphs and ontologies.
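A very rough sketch of the n-ball idea, under the assumption that each concept is embedded as a ball (center, radius) and subsumption C ⊑ D is satisfied when C's ball fits inside D's; the concepts and numbers here are hypothetical:

```python
import numpy as np

def subsumption_loss(center_c, radius_c, center_d, radius_d) -> float:
    """0 when ball C fits inside ball D; positive otherwise."""
    gap = np.linalg.norm(center_c - center_d) + radius_c - radius_d
    return float(max(0.0, gap))

cat    = (np.array([0.0, 0.0]), 0.5)    # hypothetical concept "Cat"
animal = (np.array([0.1, 0.0]), 1.0)    # hypothetical concept "Animal"
print(subsumption_loss(*cat, *animal))  # 0.0 -> "Cat is an Animal" is satisfied
```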
FastEmbed with Qdrant
A light and fast embedding library.
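A short usage sketch; the import, class, and default model name follow FastEmbed's documented API as I understand it, but check the current docs before relying on it:

```python
from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
docs = ["Qdrant is a vector database.", "Embeddings map text to vectors."]

# embed() returns a generator of numpy arrays, one vector per document
vectors = list(model.embed(docs))
print(len(vectors), vectors[0].shape)  # 2 documents, 384-dim each for this model
```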