Simple retrieval augmented generation#

In this notebook we see how retrieval augmented generation (RAG) works using OpenAI and numpy. The implementation intentionally avoids complex libraries. To keep the code simple, we use Euclidean distances to find related entries in the knowledge base, although maximum inner product search is more common in the field.
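
The two similarity criteria are closely related. The following minimal sketch uses made-up random vectors (not real embeddings) and assumes all vectors are scaled to unit length, which OpenAI's documentation states is the case for its embeddings. Under that assumption, the squared Euclidean distance equals 2 minus twice the inner product, so both criteria rank the entries in the same order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five made-up "knowledge base" vectors and one query, all scaled to unit length
entries = rng.normal(size=(5, 8))
entries /= np.linalg.norm(entries, axis=1, keepdims=True)
query = rng.normal(size=8)
query /= np.linalg.norm(query)

euclidean = np.linalg.norm(entries - query, axis=1)  # smaller means closer
inner_product = entries @ query                      # larger means closer

# For unit vectors: ||a - b||^2 = 2 - 2 * (a . b), so both rankings agree
print(np.argsort(euclidean))
print(np.argsort(-inner_product))
```

If the vectors were not normalized, the two rankings could differ.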

import numpy as np
import openai
from IPython.display import Markdown, display

def show(text):
    display(Markdown(text))

We aim to answer this question:

question = "How can I label objects in an image?"

… using these code snippets (and more):

with open('code_snippets.txt', 'r') as file:
    all_code_snippets = file.read()
splits = all_code_snippets.split("\n\n")
for s in splits[:3]:
    show(s)
  • Displays an image with a slider and label showing mouse position and intensity.

stackview.annotate(image, labels)
  • Allows cropping an image along all axes.

stackview.crop(image)
  • Showing an image stored in variable image and a segmented image stored in variable labels on top. Also works with two images or two label images.

stackview.curtain(image, labels, alpha: float = 1)

Vector embeddings#

To make our code snippets searchable, we need to create vector embeddings from them, i.e. we need to turn them into vectors.

def embed(text):
    from openai import OpenAI
    client = OpenAI()

    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding
vector = embed("Hello world")
len(vector)
1536
vector[:3]
[-0.002119065960869193, -0.04909009113907814, 0.02101006731390953]
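
To experiment with the rest of the notebook without an API key, a crude offline stand-in can be useful. The function `toy_embed` below is hypothetical and not part of the notebook: it hashes each word into one of `n_dims` buckets and normalizes the counts. It only captures word overlap, so it is not comparable to a learned embedding model; it merely illustrates the idea of turning text into a fixed-length vector.

```python
import hashlib
import numpy as np

def toy_embed(text, n_dims=64):
    """Hypothetical offline stand-in for embed(): a normalized word-hash count vector."""
    vector = np.zeros(n_dims)
    for word in text.lower().split():
        # md5 yields the same bucket in every Python session (unlike the built-in hash())
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % n_dims
        vector[bucket] += 1
    norm = np.linalg.norm(vector)
    if norm > 0:
        vector = vector / norm
    return vector.tolist()

a = toy_embed("label objects in an image")
b = toy_embed("label objects in a picture")
print(len(a))
print(np.dot(a, b))  # positive, because the two texts share words
```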

Vector store#

We also need a vector store, which is basically just a dictionary that maps each vector to its text. It allows us to find the text for a given vector, or for the vectors that have a short distance to it.

class VectorStore:
    def __init__(self, texts=None):
        self._store = {}
        if texts is not None:
            for text in texts:
                self._store[tuple(embed(text))] = text
    
    def search(self, text, n_best_results=3):
        single_vector = embed(text)
        
        # Step 1: Compute Euclidean distances
        distances = [(np.linalg.norm(np.asarray(single_vector) - np.asarray(vector)), vector) for vector in self._store.keys()]

        # Step 2: Sort distances and get the n_best_results vectors with the shortest distances
        distances.sort()  # Sort based on the first element in the tuple (distance)
        closest_vectors = [vec for _, vec in distances[:n_best_results]]  # Extract only the vectors

        self.distances = distances
        
        return [self._store[tuple(v)] for v in closest_vectors]
    
    def get_text(self, vector):
        return self._store[vector]
vectore_store = VectorStore(splits)
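
The search above loops over all stored vectors in Python. As an alternative sketch, the same Euclidean nearest-neighbour lookup can be written with a single numpy matrix operation. The function name `nearest_texts` and the 2D toy vectors are made up for illustration; real embeddings would come from `embed`.

```python
import numpy as np

def nearest_texts(query_vector, vectors, texts, n_best_results=3):
    """Return the texts whose vectors have the smallest Euclidean distance to the query."""
    matrix = np.asarray(vectors, dtype=float)  # shape: (n_texts, n_dims)
    query = np.asarray(query_vector, dtype=float)
    distances = np.linalg.norm(matrix - query, axis=1)
    order = np.argsort(distances)[:n_best_results]
    return [texts[i] for i in order]

# Made-up 2D vectors instead of real embeddings
toy_texts = ["crop", "annotate", "curtain"]
toy_vectors = [[0.0, 1.0], [1.0, 0.0], [0.7, 0.7]]
print(nearest_texts([0.9, 0.1], toy_vectors, toy_texts, n_best_results=2))
# prints ['annotate', 'curtain']
```

Keeping vectors and texts in two parallel sequences also avoids using long float tuples as dictionary keys.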

Searching the vector store#

We can then search the store for vectors, and their corresponding texts, that are close to a given question.

question
'How can I label objects in an image?'
vector = embed(question)
vector[:3]
[-0.004170199856162071, 0.03236572816967964, -0.0011563869193196297]
related_code_snippets = vectore_store.search(question)
show("\n\n".join(related_code_snippets))
  • Labels objects in grey-value images using Gaussian blurs, spot detection, Otsu-thresholding, and Voronoi-labeling from isotropic input images.

cle.voronoi_otsu_labeling(source: ndarray, label_image_destination: ndarray = None, spot_sigma: float = 2, outline_sigma: float = 2) -> ndarray
  • Draw a mesh between close-by objects in a label image:

mesh = cle.draw_mesh_between_proximal_labels(labels, maximum_distance:int)
  • Apply morphological opening operation, fill label gaps with voronoi-labeling, and mask background pixels in label image.

cle.smooth_labels(labels_input: ndarray, labels_destination: ndarray = None, radius: int = 0) -> ndarray

Prompting OpenAI#

We also need access to a large language model (LLM) that combines the code snippets and the question into an answer that makes use of the code snippets.

def prompt_chatGPT(message:str, model="gpt-3.5-turbo"):
    """A prompt helper function that sends a message to OpenAI
    and returns only the text response.
    """
    import os
    import openai
    
    # convert message in the right format if necessary
    if isinstance(message, str):
        message = [{"role": "user", "content": message}]
        
    # setup connection to the LLM
    # the API key is read from the OPENAI_API_KEY environment variable
    client = openai.OpenAI(api_key = os.environ.get('OPENAI_API_KEY'))
    
    # submit prompt
    response = client.chat.completions.create(
        model=model,
        messages=message
    )
    
    # extract answer
    return response.choices[0].message.content
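
The helper accepts either a plain string or a list of chat messages. The offline sketch below isolates that conversion step and optionally prepends a system message. The name `to_messages` and the system-message text are illustrative assumptions, not part of the notebook; no API call is made.

```python
def to_messages(message, system_message=None):
    """Convert a string into the list-of-dicts chat format; lists pass through unchanged."""
    if isinstance(message, str):
        message = [{"role": "user", "content": message}]
    if system_message is not None:
        # A system message, if given, is placed before the other messages
        message = [{"role": "system", "content": system_message}] + list(message)
    return message

print(to_messages("How can I label objects in an image?",
                  system_message="You are a helpful bio-image analysis assistant."))
```

The resulting list could be passed to `prompt_chatGPT` directly, since the helper only converts strings.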

We can then assemble the code snippets and the question into a prompt.

context = "\n\n".join(related_code_snippets)

prompt = f"""
Answer the question by the very end and consider given code snippets. 
Choose at least one of the code-snippets.
Only write Python code that answers the question.

## Code snippets
{context}

## Question
{question}
"""

print(prompt)
Answer the question by the very end and consider given code snippets. 
Choose at least one of the code-snippets.
Only write Python code that answers the question.

## Code snippets
* Labels objects in grey-value images using Gaussian blurs, spot detection, Otsu-thresholding, and Voronoi-labeling from isotropic input images.
```python
cle.voronoi_otsu_labeling(source: ndarray, label_image_destination: ndarray = None, spot_sigma: float = 2, outline_sigma: float = 2) -> ndarray
```

* Draw a mesh between close-by objects in a label image:
```python
mesh = cle.draw_mesh_between_proximal_labels(labels, maximum_distance:int)
```


* Apply morphological opening operation, fill label gaps with voronoi-labeling, and mask background pixels in label image.
```python
cle.smooth_labels(labels_input: ndarray, labels_destination: ndarray = None, radius: int = 0) -> ndarray
```

## Question
How can I label objects in an image?

Answering our question#

Finally, we can answer our question:

answer = prompt_chatGPT(prompt)

show(answer)

You can label objects in an image using the voronoi_otsu_labeling function from the first code snippet. Here is an example code snippet:

import numpy as np
import pyclesperanto_prototype as cle

# Load your image data
image = np.array([[0, 0, 0, 0, 0],
                   [0, 1, 1, 0, 0],
                   [0, 1, 1, 1, 0],
                   [0, 0, 1, 0, 0],
                   [0, 0, 0, 0, 0]])

# Label objects in the image
labels = cle.voronoi_otsu_labeling(image)

# Display the labeled image
print(labels)

This code snippet uses the voronoi_otsu_labeling function to label objects in the input image.
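
The three steps shown so far (retrieve, assemble the prompt, generate) can be wrapped into one function. This is a minimal sketch, not the notebook's own code: `answer_with_rag` is a made-up name, and the search and LLM functions are passed in as arguments so that the example runs offline with stand-ins. In the notebook, `vectore_store.search` and `prompt_chatGPT` would take their place.

```python
def answer_with_rag(question, search, ask_llm, n_best_results=3):
    """Retrieve related snippets, put them into a prompt and ask the LLM."""
    # Step 1: retrieve related code snippets
    related = search(question, n_best_results=n_best_results)
    context = "\n\n".join(related)

    # Step 2: augment the question with the retrieved context
    prompt = f"""
Answer the question by the very end and consider given code snippets.
Choose at least one of the code-snippets.
Only write Python code that answers the question.

## Code snippets
{context}

## Question
{question}
"""
    # Step 3: generate the answer
    return ask_llm(prompt)


# Offline dry run with stand-in functions, so no API key is needed
def fake_search(text, n_best_results=3):
    return ["stackview.crop(image)", "stackview.annotate(image, labels)"][:n_best_results]

def echo_llm(prompt):
    return prompt  # simply returns the prompt it was given

print(answer_with_rag("How can I crop an image?", fake_search, echo_llm, n_best_results=1))
```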

Prompting without RAG#

For comparison, we send the same question with minimal instructions to ChatGPT, without our additional code snippets.

answer = prompt_chatGPT(f"""
Write Python code to answer this question:
{question}
""")

show(answer)

You can label objects in an image using image processing techniques such as contour detection and bounding box drawing. Here is an example code using OpenCV library in Python:

import cv2

# Load the image
image = cv2.imread('image.jpg')

# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding to get binary image
ret, thresh = cv2.threshold(gray, 127, 255, 0)

# Find contours of objects in the image
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw bounding boxes around objects
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

# Display the image
cv2.imshow('Labeled Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

This code will find the objects in the image, draw bounding boxes around them, and display the labeled image. Make sure to replace 'image.jpg' with the path to your image file.

Exercise#

Modify the question to ask how to draw a mesh between near neighbors. Prompt ChatGPT with and without the RAG approach.