Vision models with structured output


In this notebook we ask a vision model to produce structured output. An answer in a fixed format lets us classify the image content, or use the information to write Python code for analysing the image.

import openai
from skimage.io import imread
import stackview
from image_utilities import numpy_to_bytestream, prompt_scads_llm
from IPython.display import Markdown

hela = imread("data/hela-cells-8bit.tif")
stackview.insight(hela)

shape	(512, 672, 3)
dtype	uint8
size	1008.0 kB
min	0
max	255

result = prompt_scads_llm("""You are a highly experienced biologist with advanced microscopy skills.

# Task
Name the content of this image. Answer for each channel independently. 

# Options
The following structures could be in the image:
* Nuclei
* Membranes
* Cytoplasm
* Cytoskeleton
* Extra-cellular structure
* Other sub-cellular structures

# Output format
* Red channel: <structure>
* Green channel: <structure>
* Blue channel: <structure>

Keep your answer as short as possible. 
Only respond with the structures for the three channels in the format shown above.
""", hela)

Markdown(result)

Red channel: Other sub-cellular structures
Green channel: Cytoskeleton
Blue channel: Nuclei
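
Because the answer follows a fixed format, it can be turned into a Python dictionary and checked against the options listed in the prompt. The following is a minimal sketch, not part of the original notebook: the helper `parse_channels` and the constant `ALLOWED_STRUCTURES` are hypothetical names introduced here for illustration, and the parser assumes the model sticks to the `<Colour> channel: <structure>` layout requested above.

```python
import re

# Options offered to the model in the prompt (assumed to be the only valid answers)
ALLOWED_STRUCTURES = {
    "Nuclei",
    "Membranes",
    "Cytoplasm",
    "Cytoskeleton",
    "Extra-cellular structure",
    "Other sub-cellular structures",
}

# Matches lines such as "Red channel: Nuclei" or "* Red channel: Nuclei"
CHANNEL_PATTERN = re.compile(
    r"^\W*(Red|Green|Blue) channel:[ \t]*(.+?)[ \t]*$", re.MULTILINE
)


def parse_channels(text):
    """Parse the model's answer into a dict like {"red": "Nuclei", ...}.

    Raises ValueError if a structure is not among the allowed options
    or if one of the three channels is missing.
    """
    channels = {}
    for name, structure in CHANNEL_PATTERN.findall(text):
        if structure not in ALLOWED_STRUCTURES:
            raise ValueError(f"Unexpected structure for {name} channel: {structure!r}")
        channels[name.lower()] = structure

    missing = {"red", "green", "blue"} - channels.keys()
    if missing:
        raise ValueError(f"No answer found for channel(s): {sorted(missing)}")
    return channels


# Example with the answer shown above
example = """Red channel: Other sub-cellular structures
Green channel: Cytoskeleton
Blue channel: Nuclei"""

print(parse_channels(example))
```

In the notebook, `parse_channels(result)` could be called in place of `parse_channels(example)`. Raising an error on unexpected answers is one possible design choice; since language models do not always follow the requested format, it makes deviations visible instead of passing them on silently.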

Exercise

Ask the vision model to generate Python code for segmenting the image.