Claude VLM for bounding-box segmentation#
In this notebook we will use the vision language model claude to determine bounding-boxes around objects.
import anthropic
from skimage.io import imread
import stackview
from image_utilities import numpy_to_bytestream, extract_json
from prompt_utilities import prompt_anthropic
import base64
import json
import os
import pandas as pd
from skimage.io import imsave
Bounding box segmentation#
Models such as claude have some capabilities and can detect objects and tell us about their positions and size.
import stackview
from skimage import data
import numpy as np
# Load the human mitosis dataset
image = data.human_mitosis()[:100, :100]
stackview.insight(image)
|
|
reply = prompt_anthropic("""
Give me a json object of bounding boxes around ALL bright blobs in this image. Assume the image width and height are 1.
The format should be like this:
```json
[
{'x':float,'y':float, 'width': float, 'height': float},
{'x':float,'y':float, 'width': float, 'height': float},
...
]
```
If you think you can't do this accuratly, please try anyway.
""", image, model="claude-opus-4-1-20250805")
print(reply)
bb = json.loads(extract_json(reply))
bb
new_image = stackview.add_bounding_boxes(image, bb)
Looking at this image, I can identify approximately 13-14 bright blobs/circles. I'll provide bounding boxes for each visible blob, with coordinates normalized to a 1x1 image space where (0,0) is the top-left corner.
```json
[
{"x": 0.05, "y": 0.02, "width": 0.08, "height": 0.08},
{"x": 0.20, "y": 0.05, "width": 0.08, "height": 0.08},
{"x": 0.38, "y": 0.03, "width": 0.08, "height": 0.08},
{"x": 0.08, "y": 0.18, "width": 0.08, "height": 0.08},
{"x": 0.28, "y": 0.20, "width": 0.10, "height": 0.10},
{"x": 0.48, "y": 0.22, "width": 0.08, "height": 0.08},
{"x": 0.15, "y": 0.38, "width": 0.09, "height": 0.09},
{"x": 0.35, "y": 0.40, "width": 0.08, "height": 0.08},
{"x": 0.55, "y": 0.42, "width": 0.08, "height": 0.08},
{"x": 0.25, "y": 0.58, "width": 0.10, "height": 0.10},
{"x": 0.45, "y": 0.60, "width": 0.08, "height": 0.08},
{"x": 0.12, "y": 0.75, "width": 0.08, "height": 0.08},
{"x": 0.35, "y": 0.78, "width": 0.09, "height": 0.09},
{"x": 0.50, "y": 0.85, "width": 0.08, "height": 0.08}
]
```
Note: These are approximate bounding boxes based on visual estimation of the blob positions and sizes. The actual pixel-perfect coordinates might vary slightly.
new_image
|
|