Bounding box segmentation
Some vision language models are capable, to some degree, of bounding-box segmentation of objects. The goal of this task is to draw the smallest rectangle that fully encloses each object in an image.
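To make the notion of a "smallest enclosing rectangle" concrete, here is a minimal sketch that computes it from a binary mask with NumPy. The function name `bounding_box_of_mask` is hypothetical and not part of this notebook. The sketch assumes that `x` and `y` denote the top-left corner of the box, with the same `x`/`y`/`width`/`height` keys used in the prompts on this page.

```python
import numpy as np

def bounding_box_of_mask(mask):
    """Return the smallest axis-aligned box enclosing all nonzero pixels of a 2D mask.

    Hypothetical helper for illustration; returns None for an empty mask.
    """
    rows = np.any(mask, axis=1)  # True for every row containing an object pixel
    cols = np.any(mask, axis=0)  # True for every column containing an object pixel
    if not rows.any():
        return None
    y_min, y_max = np.where(rows)[0][[0, -1]]
    x_min, x_max = np.where(cols)[0][[0, -1]]
    return {
        "x": int(x_min),
        "y": int(y_min),
        "width": int(x_max - x_min + 1),
        "height": int(y_max - y_min + 1),
    }

# Small synthetic example: a 3x5 pixel object in a 10x10 image
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:8] = True
print(bounding_box_of_mask(mask))  # {'x': 3, 'y': 2, 'width': 5, 'height': 3}
```

A vision language model has no such mask available and has to estimate these four numbers directly from the pixels, which is why its boxes are usually only approximate.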
import anthropic
from skimage.io import imread
import stackview
from image_utilities import numpy_to_bytestream, extract_json, prompt_kisski
import base64
import json
cat_image = imread("data/real_cat.png")
reply = prompt_kisski("""
Give me a JSON object of a bounding box around the cat in this image.
The format should be exactly like this: {'x':int,'y':int,'width':int,'height':int}
""", cat_image)
print(reply)
bb = json.loads(extract_json(reply))
bb
stackview.add_bounding_boxes(cat_image, [bb])
```json
{
"x": 6,
"y": 10,
"width": 427,
"height": 408
}
```
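The reply above is wrapped in a fenced JSON block, so it has to be unwrapped before `json.loads` can parse it. The `extract_json` helper used here comes from the local `image_utilities` module, whose code is not shown on this page. The following is only a sketch of what such a helper might do, under the assumption that the model either returns a fenced block or plain JSON. The names `extract_json_sketch` and `is_valid_box` are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically to keep this example easy to embed

def extract_json_sketch(reply):
    """Return the text inside the first fenced block, or the whole reply if there is no fence.

    Hypothetical stand-in for the notebook's extract_json helper.
    """
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, reply, flags=re.DOTALL)
    return match.group(1).strip() if match else reply.strip()

def is_valid_box(bb):
    """Check that all four expected keys are present and hold integers."""
    return all(isinstance(bb.get(key), int) for key in ("x", "y", "width", "height"))

# Simulated model reply in the same shape as the output shown above
reply = FENCE + 'json\n{"x": 6, "y": 10, "width": 427, "height": 408}\n' + FENCE
bb = json.loads(extract_json_sketch(reply))
print(is_valid_box(bb), bb["width"], bb["height"])  # True 427 408
```

Checking the keys before drawing is a cheap safeguard, because a model may occasionally return a differently shaped object than the prompt asked for.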
Bounding-box segmentation using vision language models is an active research field. To see how well this works, we can inspect a couple of images.
visualizations = []
for filename in ["data/pinata.jpg", "data/real_cat.png", "data/sheeps.jpg", "data/guinea_pig.jpg"]:
    image = imread(filename)
    reply = prompt_kisski("""Give me a json object of a list of bounding boxes around each animal in this 512x512 pixel large image.
The format should be like this:
```json
[
{
"x":int,
"y":int,
"width":int,
"height":int,
"description":str,
"font_size":25
}
]
```
""", image)
    bb = json.loads(extract_json(reply))
    vis = stackview.add_bounding_boxes(image, bb)
    visualizations.append(vis)
stackview.animate(visualizations, frame_delay_ms=2000)
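Visual inspection can be complemented by a number. One common measure is the intersection over union (IoU) between a predicted box and a reference box. The sketch below assumes that a hand-annotated reference box is available, which this notebook does not provide. The function names and the two example boxes are made up for illustration, and `x`/`y` are again assumed to be the top-left corner.

```python
def to_corners(bb):
    """Convert an x/y/width/height box into (x0, y0, x1, y1) corner coordinates."""
    return bb["x"], bb["y"], bb["x"] + bb["width"], bb["y"] + bb["height"]

def intersection_over_union(a, b):
    """Return the IoU of two boxes: 1.0 for identical boxes, 0.0 for disjoint ones."""
    ax0, ay0, ax1, ay1 = to_corners(a)
    bx0, by0, bx1, by1 = to_corners(b)
    # Width and height of the overlapping region, clipped at zero if there is no overlap
    inter_w = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    union = a["width"] * a["height"] + b["width"] * b["height"] - intersection
    return intersection / union if union > 0 else 0.0

# Made-up example boxes: the prediction is shifted by half its width
predicted = {"x": 0, "y": 0, "width": 10, "height": 10}
reference = {"x": 5, "y": 0, "width": 10, "height": 10}
print(round(intersection_over_union(predicted, reference), 3))  # 0.333
```

Extra keys such as `description` or `font_size` in the model's reply do not interfere, because only the four geometry keys are read.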
Exercise
Load blobs.tif and ask Claude to draw bounding boxes around the white blobs.