all-minilm-l12-v2
The all-minilm-l12-v2 embedding model is built by the open source natural language processing (NLP) and computer vision (CV) community and provided by Hugging Face.
Supported natural languages: English
multilingual-e5-large
Usage: For use cases where you want to generate text embeddings for text in a language other than English.
Supported natural languages: Up to 100 languages. See the model card for details.
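With the Python SDK shown in the quickstart below, you select this model by passing its model ID when constructing the Embeddings client. The following is a minimal sketch, not a verified listing: the ID string "intfloat/multilingual-e5-large" is an assumption (confirm it against the model card for your instance), and credentials and project_id are assumed to be defined as in the quickstart.
from ibm_watsonx_ai.foundation_models import Embeddings

# Assumption: the watsonx model ID for multilingual-e5-large; confirm against
# the model card. `credentials` and `project_id` are defined as in the
# quickstart below.
multilingual_embedding = Embeddings(
    model_id="intfloat/multilingual-e5-large",
    credentials=credentials,
    project_id=project_id,
)

# Per the E5 model card, each input should start with "query: " or "passage: ".
vectors = multilingual_embedding.embed_documents(
    texts=["passage: Un modèle de fondation est un grand modèle d'IA générative."]
)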
For more information regarding supported embedding models, see the watsonx documentation.
Quickstart with watsonx embeddings Python SDK
Install the ibm-watsonx-ai Python library
pip install -U ibm-watsonx-ai
Use the watsonx embeddings API and the available embedding models to generate text embeddings.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes
from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames as EmbedParams

# Replace the placeholders with your own service URL, API key, and project ID.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="<your IBM Cloud API key>",
)
project_id = "<your project ID>"

# Set TRUNCATE_INPUT_TOKENS to a value that is equal to or less than the
# maximum number of tokens allowed for the embedding model that you are using.
# If you don't specify this value and the input has more tokens than the model
# can process, an error is generated.
embed_params = {
    EmbedParams.TRUNCATE_INPUT_TOKENS: 128,
    EmbedParams.RETURN_OPTIONS: {
        'input_text': True
    }
}

embedding = Embeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
    credentials=credentials,
    params=embed_params,
    project_id=project_id,
    space_id=None,
    verify=False  # skips SSL certificate verification; remove in production
)

q = [
    "A foundation model is a large scale generative AI model that can be adapted to a wide range of downstream tasks.",
    "Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data."
]

embedding_vectors = embedding.embed_documents(texts=q)
print(embedding_vectors)
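embed_documents returns one embedding vector (a list of floats) per input string. As a quick sanity check of the output, you can compare the two vectors with cosine similarity; the sketch below is not part of the watsonx SDK and assumes NumPy is installed.
import numpy as np

# Each entry in embedding_vectors is one embedding (a list of floats).
v1, v2 = (np.array(v) for v in embedding_vectors)
cosine = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
print(f"Cosine similarity between the two texts: {cosine:.4f}")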
Sample Notebook
Use watsonx Granite Model Series, Chroma, and LangChain to answer questions (RAG)
Integrations