Ollama Embeddings#
Ollama can also serve embedding models locally. Before executing the following code, you need to run ollama pull embeddinggemma
once to download the embedding model. Also, depending on how you installed Ollama, you may have to start the server in a terminal window before executing this notebook:
ollama serve
As you will see, we access the local embedding models served by Ollama through the OpenAI API, as shown in previous examples. We only exchange the base_url
and do not need to provide an API key.
import openai
openai.__version__
'1.41.0'
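As a quick sanity check, you can ask the local server which models it has available. This is a minimal sketch, assuming Ollama runs on its default port 11434 and exposes the OpenAI-compatible /v1/models endpoint:
# connect to the local Ollama server (default port 11434 assumed)
client = openai.OpenAI(base_url="http://localhost:11434/v1", api_key="none")

# list the locally available models; "embeddinggemma" should appear after pulling it
for model in client.models.list():
    print(model.id)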
Creating Embeddings with Ollama#
We define a helper function to generate text embeddings using the local Ollama endpoint. The function connects to the local Ollama server and uses the “embeddinggemma” model to create vector representations of text.
def embed_ollama(text, model="embeddinggemma"):
    """A helper function that generates embeddings using Ollama and returns the embedding vector."""
    # set up the connection to the local Ollama server;
    # no real API key is needed, but the client expects a non-empty value
    client = openai.OpenAI(
        base_url="http://localhost:11434/v1",
        api_key="none",
    )

    # create the embedding
    response = client.embeddings.create(
        input=text,
        model=model
    )

    # extract the embedding vector
    return response.data[0].embedding
Let’s test the embedding function with a simple example:
# Test with a simple text
test_text = "Hello, this is a test sentence for embeddings."
embedding = embed_ollama(test_text)
print(f"Text: {test_text}")
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
Text: Hello, this is a test sentence for embeddings.
Embedding dimension: 768
First 5 values: [-0.16048418, -0.002961286, 0.014041578, -0.029707532, -0.009763586]
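The embeddings endpoint also accepts a list of strings, so several texts can be embedded in a single request. Here is a short sketch; it assumes that Ollama's OpenAI-compatible endpoint supports array input for embeddings, as the OpenAI API does:
# embed several texts in one request (assumes the endpoint accepts a list of strings)
client = openai.OpenAI(base_url="http://localhost:11434/v1", api_key="none")
response = client.embeddings.create(
    input=["First sentence.", "Second sentence."],
    model="embeddinggemma"
)
print(len(response.data))  # one embedding object per input text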
Working with Multiple Texts#
Let’s generate embeddings for multiple texts and compare them:
# Define some sample texts
texts = [
    "The cat sat on the mat.",
    "A feline rested on the carpet.",
    "The dog ran in the park.",
    "Machine learning is fascinating.",
    "Artificial intelligence transforms technology."
]

# Generate embeddings for all texts
embeddings = {}
for i, text in enumerate(texts):
    embeddings[f"text_{i+1}"] = embed_ollama(text)
    print(f"Generated embedding for text {i+1}: {text[:30]}...")

print(f"\nGenerated {len(embeddings)} embeddings successfully!")
Generated embedding for text 1: The cat sat on the mat....
Generated embedding for text 2: A feline rested on the carpet....
Generated embedding for text 3: The dog ran in the park....
Generated embedding for text 4: Machine learning is fascinatin...
Generated embedding for text 5: Artificial intelligence transf...

Generated 5 embeddings successfully!
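To compare the texts directly, we can compute the pairwise cosine similarity between all embedding vectors. This is a small sketch that only uses NumPy and the embeddings dictionary we just filled:
import numpy as np

# stack the embedding vectors into a matrix with one row per text
vectors = np.array(list(embeddings.values()))

# normalize each row to unit length; the dot product of unit vectors is the cosine similarity
normalized = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
similarity_matrix = normalized @ normalized.T

# print the upper triangle of the symmetric matrix
for i in range(len(texts)):
    for j in range(i + 1, len(texts)):
        print(f"text {i+1} vs. text {j+1}: {similarity_matrix[i, j]:.3f}")
Semantically related sentences, such as texts 1 and 2, should receive a noticeably higher score than unrelated pairs.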
Visualizing Embeddings#
Let’s use PCA to reduce the dimensionality and visualize the embeddings in 2D space:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

# Apply PCA to reduce to 2 dimensions; PCA expects an array-like input,
# so we convert the dictionary values into a NumPy array first
pca = PCA(n_components=2)
embeddings_2d = pca.fit_transform(np.array(list(embeddings.values())))

# Create scatter plot
plt.figure(figsize=(10, 8))
plt.scatter(embeddings_2d[:, 0], embeddings_2d[:, 1], s=100, alpha=0.7)

# Add text labels for each point
for i, text in enumerate(texts):
    plt.annotate(f"{i+1}: {text[:25]}...",
                 (embeddings_2d[i, 0], embeddings_2d[i, 1]),
                 xytext=(5, 5), textcoords='offset points',
                 fontsize=9, alpha=0.8)

plt.title('Text Embeddings Visualization (PCA Projection)', fontsize=14)
plt.xlabel(f'Principal Component 1 (explained variance: {pca.explained_variance_ratio_[0]:.3f})')
plt.ylabel(f'Principal Component 2 (explained variance: {pca.explained_variance_ratio_[1]:.3f})')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print(f"Total explained variance: {sum(pca.explained_variance_ratio_):.3f}")
Semantic Search Example#
We can use embeddings for semantic search - finding the most similar text to a query:
def semantic_search(query, texts, top_k=3):
    """Find the most similar texts to a query using embeddings."""
    # Get the embedding for the query
    query_embedding = embed_ollama(query)

    # Get embeddings for all texts
    text_embeddings = [embed_ollama(text) for text in texts]

    # Calculate similarities
    similarities = []
    for text_emb in text_embeddings:
        # Cosine similarity between the query and text embeddings
        similarity = np.dot(query_embedding, text_emb) / (
            np.linalg.norm(query_embedding) * np.linalg.norm(text_emb)
        )
        similarities.append(similarity)

    # Get the top-k most similar texts
    indexed_similarities = [(i, sim) for i, sim in enumerate(similarities)]
    indexed_similarities.sort(key=lambda x: x[1], reverse=True)
    return indexed_similarities[:top_k]
# Example search
query = "animal sitting down"
results = semantic_search(query, texts)

print(f"Query: '{query}'")
print("\nMost similar texts:")
for rank, (idx, similarity) in enumerate(results, 1):
    print(f"{rank}. Text {idx+1} (similarity: {similarity:.3f}): {texts[idx]}")
Exercise#
Try different texts and see how the embeddings cluster in the visualization
Experiment with different queries in the semantic search function
Explore other embedding models available in Ollama by running ollama list in your terminal
Compare the results with different embedding models (if you have others installed)
# Your experiments here