Simple retrieval augmented generation#
In this notebook we see how retrieval augmented generation (RAG) works using OpenAI and numpy. The implementation intentionally avoids complex libraries. To keep the code simple, we use Euclidean distances to determine related entries in the knowledge base, although maximum inner product search is more common in the field.
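As a side note on the choice of distance: for vectors normalized to unit length, ranking by Euclidean distance and ranking by inner product give the same order, because the squared distance equals 2 minus twice the inner product. The following minimal sketch illustrates this with random toy vectors; the data and variable names are made up for illustration and are not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 5 knowledge-base vectors and 1 query vector
kb = rng.normal(size=(5, 8))
query = rng.normal(size=8)

# Normalize all vectors to unit length
kb = kb / np.linalg.norm(kb, axis=1, keepdims=True)
query = query / np.linalg.norm(query)

# Rank by ascending Euclidean distance ...
by_distance = np.argsort(np.linalg.norm(kb - query, axis=1))
# ... and by descending inner product
by_inner_product = np.argsort(-(kb @ query))

print(by_distance)
print(by_inner_product)
```

For vectors that are not normalized, the two rankings can differ, so the equivalence only holds under the unit-length assumption.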
import numpy as np
import openai
from IPython.display import Markdown, display
def show(text):
    display(Markdown(text))
We aim to answer this question:
question = "How can I label objects in an image?"
… using these code snippets (and more):
with open('code_snippets.txt', 'r') as file:
    all_code_snippets = file.read()
splits = all_code_snippets.split("\n\n")
[show(s) for s in splits[:3]];
Displays an image with a slider and label showing mouse position and intensity.
stackview.annotate(image, labels)
Allows cropping an image along all axes.
stackview.crop(image)
Showing an image stored in variable `image` and a segmented image stored in variable `labels` on top. Also works with two images or two label images.
stackview.curtain(image, labels, alpha: float = 1)
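Splitting at blank lines, as done above, can produce empty or whitespace-padded chunks if the file contains more than one blank line in a row. A minimal sketch of a slightly more defensive split follows; the helper name `split_snippets` and the example text are hypothetical and not part of the notebook.

```python
def split_snippets(text):
    """Split text at blank lines, strip whitespace and drop empty chunks."""
    chunks = [chunk.strip() for chunk in text.split("\n\n")]
    return [chunk for chunk in chunks if chunk]

# Made-up example with several blank lines between two snippets
example = "First snippet\ncode_a()\n\n\n\nSecond snippet\ncode_b()\n"
print(split_snippets(example))
```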
Vector embeddings#
To make our code snippets searchable, we need to create vector embeddings from them, i.e. we need to turn them into vectors.
def embed(text):
    from openai import OpenAI
    client = OpenAI()
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding
vector = embed("Hello world")
len(vector)
1536
vector[:3]
[-0.002119065960869193, -0.04909009113907814, 0.02101006731390953]
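If no API key is at hand, the rest of the pipeline can still be tried with a stand-in for `embed`. The sketch below is a hypothetical hashed bag-of-words function, named `toy_embed` here; it only captures shared words, not meaning, so its search results will be far worse than with a real embedding model.

```python
import hashlib
import numpy as np

def toy_embed(text, dim=64):
    """Hypothetical offline stand-in for embed(): hashed bag-of-words, unit length."""
    vector = np.zeros(dim)
    for word in text.lower().split():
        # Map each word to a fixed position using a stable hash
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dim] += 1.0
    norm = np.linalg.norm(vector)
    if norm > 0:
        vector = vector / norm
    return vector.tolist()

toy_vector = toy_embed("Hello world")
print(len(toy_vector))
```

Like `embed`, the function returns a plain list of floats, so it could be swapped in wherever `embed` is called.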
Vector store#
We also need a vector store, which is basically just a dictionary that maps vectors to texts. It allows us to find the text for a given vector, or the texts whose vectors have a short distance to it.
class VectorStore:
    def __init__(self, texts=None):
        self._store = {}
        if texts is not None:
            for text in texts:
                self._store[tuple(embed(text))] = text

    def search(self, text, n_best_results=3):
        single_vector = embed(text)
        # Step 1: Compute Euclidean distances
        distances = [(np.linalg.norm(np.asarray(single_vector) - np.asarray(vector)), vector) for vector in self._store.keys()]
        # Step 2: Sort distances and get the n_best_results vectors with the shortest distances
        distances.sort()  # Sort based on the first element in the tuple (distance)
        closest_vectors = [vec for _, vec in distances[:n_best_results]]  # Extract only the vectors
        self.distances = distances
        return [self._store[tuple(v)] for v in closest_vectors]

    def get_text(self, vector):
        return self._store[vector]
vectore_store = VectorStore(splits)
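The search above loops over all stored vectors in Python. The same nearest-neighbor lookup can be sketched in vectorized form with numpy; the function name `nearest_texts` and the two-dimensional toy vectors are assumptions for illustration and not part of the notebook.

```python
import numpy as np

def nearest_texts(query_vector, vectors, texts, n_best_results=3):
    """Return the texts whose vectors have the smallest Euclidean distance to the query."""
    matrix = np.asarray(vectors, dtype=float)  # shape: (n_texts, n_dims)
    query = np.asarray(query_vector, dtype=float)
    distances = np.linalg.norm(matrix - query, axis=1)
    best = np.argsort(distances)[:n_best_results]
    return [texts[i] for i in best]

# Made-up toy data instead of real embeddings
texts = ["a", "b", "c"]
vectors = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
print(nearest_texts([0.9, 0.1], vectors, texts, n_best_results=2))
```

Both variants are exhaustive linear scans; for a knowledge base of a few hundred snippets that is usually sufficient.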
Searching the vector store#
We can then search the store for vectors, and their corresponding texts, that are close to a given question.
question
'How can I label objects in an image?'
vector = embed(question)
vector[:3]
[-0.004170199856162071, 0.03236572816967964, -0.0011563869193196297]
related_code_snippets = vectore_store.search(question)
show("\n\n".join(related_code_snippets))
Labels objects in grey-value images using Gaussian blurs, spot detection, Otsu-thresholding, and Voronoi-labeling from isotropic input images.
cle.voronoi_otsu_labeling(source: ndarray, label_image_destination: ndarray = None, spot_sigma: float = 2, outline_sigma: float = 2) -> ndarray
Draw a mesh between close-by objects in a label image:
mesh = cle.draw_mesh_between_proximal_labels(labels, maximum_distance:int)
Apply morphological opening operation, fill label gaps with voronoi-labeling, and mask background pixels in label image.
cle.smooth_labels(labels_input: ndarray, labels_destination: ndarray = None, radius: int = 0) -> ndarray
Prompting OpenAI#
We also need access to a large language model (LLM) that combines the code snippets and the question into an answer that makes use of the code snippets.
def prompt_chatGPT(message:str, model="gpt-3.5-turbo"):
    """A prompt helper function that sends a message to openAI
    and returns only the text response.
    """
    import os
    import openai

    # convert message in the right format if necessary
    if isinstance(message, str):
        message = [{"role": "user", "content": message}]

    # setup connection to the LLM
    # todo: enter your API key here:
    client = openai.OpenAI(api_key = os.environ.get('OPENAI_API_KEY'))

    # submit prompt
    response = client.chat.completions.create(
        model=model,
        messages=message
    )

    # extract answer
    return response.choices[0].message.content
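The helper above wraps a plain string into a list with a single user message and passes lists through unchanged. A minimal sketch of building such a list by hand, for example to add a system message, could look like this; the function name `build_messages` and the example texts are hypothetical.

```python
def build_messages(user_message, system_message=None):
    """Build a chat message list with an optional system message."""
    messages = []
    if system_message is not None:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": user_message})
    return messages

messages = build_messages(
    "How can I label objects in an image?",
    system_message="You are a helpful bio-image analysis assistant.",
)
print(messages)
```

The resulting list could be handed to `prompt_chatGPT` in place of a string.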
We can then assemble the code snippets and the question into a prompt.
context = "\n\n".join(related_code_snippets)
prompt = f"""
Answer the question by the very end and consider given code snippets.
Choose at least one of the code-snippets.
Only write Python code that answers the question.
## Code snippets
{context}
## Question
{question}
"""
print(prompt)
Answer the question by the very end and consider given code snippets.
Choose at least one of the code-snippets.
Only write Python code that answers the question.
## Code snippets
* Labels objects in grey-value images using Gaussian blurs, spot detection, Otsu-thresholding, and Voronoi-labeling from isotropic input images.
```python
cle.voronoi_otsu_labeling(source: ndarray, label_image_destination: ndarray = None, spot_sigma: float = 2, outline_sigma: float = 2) -> ndarray
```
* Draw a mesh between close-by objects in a label image:
```python
mesh = cle.draw_mesh_between_proximal_labels(labels, maximum_distance:int)
```
* Apply morphological opening operation, fill label gaps with voronoi-labeling, and mask background pixels in label image.
```python
cle.smooth_labels(labels_input: ndarray, labels_destination: ndarray = None, radius: int = 0) -> ndarray
```
## Question
How can I label objects in an image?
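To reuse the prompt assembly for other questions, it can be wrapped in a small function. The sketch below uses the same prompt text as above; the function name `build_rag_prompt` and the example snippets are assumptions for illustration.

```python
def build_rag_prompt(question, snippets):
    """Combine retrieved snippets and a question into a single prompt string."""
    context = "\n\n".join(snippets)
    return f"""
Answer the question by the very end and consider given code snippets.
Choose at least one of the code-snippets.
Only write Python code that answers the question.

## Code snippets
{context}

## Question
{question}
"""

# Made-up snippets standing in for the search results
example_snippets = ["* Snippet A\ncode_a()", "* Snippet B\ncode_b()"]
example_prompt = build_rag_prompt("How can I do A?", example_snippets)
print(example_prompt)
```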
Answering our question#
Finally, we can answer our question.
answer = prompt_chatGPT(prompt)
show(answer)
You can label objects in an image using the voronoi_otsu_labeling function from the first code snippet. Here is an example code snippet:
import numpy as np
import pyclesperanto_prototype as cle
# Load your image data
image = np.array([[0, 0, 0, 0, 0],
                  [0, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 0, 0],
                  [0, 0, 0, 0, 0]])
# Label objects in the image
labels = cle.voronoi_otsu_labeling(image)
# Display the labeled image
print(labels)
This code snippet uses the voronoi_otsu_labeling function to label objects in the input image.
Prompting without RAG#
For comparison, we send the same question together with minimal instructions to ChatGPT, without additional code snippets.
answer = prompt_chatGPT(f"""
Write Python code to answer this question:
{question}
""")
show(answer)
You can label objects in an image using image processing techniques such as contour detection and bounding box drawing. Here is an example code using OpenCV library in Python:
import cv2
# Load the image
image = cv2.imread('image.jpg')
# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Apply thresholding to get binary image
ret, thresh = cv2.threshold(gray, 127, 255, 0)
# Find contours of objects in the image
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Draw bounding boxes around objects
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
# Display the image
cv2.imshow('Labeled Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
This code will find the objects in the image, draw bounding boxes around them, and display the labeled image. Make sure to replace ‘image.jpg’ with the path to your image file.
Exercise#
Modify the question and ask for drawing a mesh between near neighbors. Prompt ChatGPT with and without the RAG approach.