Vision models with structured output

In this notebook we ask a vision model to produce structured output, so that we can sort the image into categories or use the information to write Python code for analysing the image.

import openai
from skimage.io import imread
import stackview
from image_utilities import numpy_to_bytestream, prompt_scads_llm
from IPython.display import Markdown
hela = imread("data/hela-cells-8bit.tif")
stackview.insight(hela)
shape: (512, 672, 3)
dtype: uint8
size: 1008.0 kB
min: 0
max: 255
result = prompt_scads_llm("""You are a highly experienced biologist with advanced microscopy skills.

# Task
Name the content of this image. Answer for each channel independently. 

# Options
The following structures could be in the image:
* Nuclei
* Membranes
* Cytoplasm
* Cytoskeleton
* Extra-cellular structure
* Other sub-cellular structures

# Output format
* Red channel: <structure>
* Green channel: <structure>
* Blue channel: <structure>

Keep your answer as short as possible. 
Only respond with the structures for the three channels in the format shown above.
""", hela)

Markdown(result)
* Red channel: Other sub-cellular structures
* Green channel: Cytoskeleton
* Blue channel: Nuclei
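Because the answer follows a fixed format, it can be turned into a Python dictionary and checked against the option list. The following is a minimal sketch, not part of the original notebook: `parse_channel_answer`, `ALLOWED_STRUCTURES` and `CHANNEL_LINE` are hypothetical names, and the example answer is copied from the output above rather than produced by a new model call.

```python
import re

# The structures offered in the prompt's option list; anything else is rejected.
ALLOWED_STRUCTURES = {
    "Nuclei",
    "Membranes",
    "Cytoplasm",
    "Cytoskeleton",
    "Extra-cellular structure",
    "Other sub-cellular structures",
}

# Matches lines such as "* Red channel: Nuclei" or "• Red channel: Nuclei".
CHANNEL_LINE = re.compile(
    r"^\s*[*•-]?\s*(Red|Green|Blue) channel:\s*(.+?)\s*$", re.IGNORECASE
)


def parse_channel_answer(text):
    """Hypothetical helper: turn the model's answer into a {channel: structure} dict."""
    channels = {}
    for line in text.splitlines():
        match = CHANNEL_LINE.match(line)
        if match is None:
            continue  # skip blank lines and any extra commentary
        channel = match.group(1).lower()
        structure = match.group(2)
        if structure not in ALLOWED_STRUCTURES:
            raise ValueError(
                f"Unexpected structure for {channel} channel: {structure!r}"
            )
        channels[channel] = structure
    # Make sure the model answered for all three channels.
    missing = {"red", "green", "blue"} - channels.keys()
    if missing:
        raise ValueError(f"No answer found for channels: {sorted(missing)}")
    return channels


# Answer text copied from the output above (in the notebook: parse_channel_answer(result)).
example_answer = """
* Red channel: Other sub-cellular structures
* Green channel: Cytoskeleton
* Blue channel: Nuclei
"""
parsed = parse_channel_answer(example_answer)
print(parsed)
```

Raising an error on an unknown structure or a missing channel is a design choice: it makes a malformed answer visible immediately instead of letting it pass silently into later analysis steps.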

Exercise

Ask the vision model to generate Python code for segmenting the image.
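One possible starting point for the exercise is sketched below. It assumes that `prompt_scads_llm` accepts a prompt and an image and returns text, as in the cell above, and that the model wraps its code in a Markdown code fence, which may not always be the case. `SEGMENTATION_PROMPT` and `extract_python_code` are hypothetical names, and the reply used here is made up for illustration so that the example runs without calling the model.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built from single backticks

# Hypothetical prompt; the blue channel is chosen because the model labelled it "Nuclei".
SEGMENTATION_PROMPT = """You are a highly experienced bioimage analyst.

# Task
Write Python code that segments the nuclei in the blue channel of this image.

# Output format
Only respond with a single Python code block.
"""


def extract_python_code(reply):
    """Hypothetical helper: return the first fenced code block, or the whole reply if there is none."""
    pattern = re.compile(FENCE + r"(?:python)?\s*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(reply)
    return match.group(1).strip() if match else reply.strip()


# In the notebook, the reply would come from the helper used above (not run here):
# reply = prompt_scads_llm(SEGMENTATION_PROMPT, hela)
reply = f"Here is the code:\n{FENCE}python\nprint('segmenting')\n{FENCE}\n"  # made-up reply
code = extract_python_code(reply)
print(code)
```

Generated code should be read and tested before it is executed, since the model may refer to functions or parameters that do not exist.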