{ "cells": [ { "cell_type": "markdown", "id": "330b3c14-42ee-4ac3-b11d-4f758230b452", "metadata": {}, "source": [ "# Vision models for image interpretation and code generation\n", "Some models support image input and can interpret the images. This might be useful to guide the large language model when deciding what to do with the image." ] }, { "cell_type": "code", "execution_count": 1, "id": "afc93deb-87e8-424f-ae9c-f0100a0774c2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " This notebook may contain text, code and images generated by artificial intelligence.\n", " Used model: claude-3-opus-20240229, vision model: claude-3-opus-20240229, endpoint: None, bia-bob version: 0.20.0.\n", " Read more about code generation using bia-bob.\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import stackview\n", "from skimage.io import imread\n", "from bia_bob import bob\n", "bob.initialize(model=\"claude-3-opus-20240229\", vision_model=\"claude-3-opus-20240229\")\n", "#bob.initialize(model=\"gpt-4o-2024-05-13\", vision_model=\"gpt-4o-2024-05-13\")\n", "#bob.initialize(model=\"gemini-1.5-pro-latest\", vision_model=\"gemini-1.5-pro-latest\")" ] }, { "cell_type": "markdown", "id": "faa75504-62a5-409d-a123-f01db9d68169", "metadata": {}, "source": [ "First, we load an example image." ] }, { "cell_type": "code", "execution_count": 2, "id": "1bea1d9f-b997-446a-9619-db40ffb56548", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
shape(512, 672, 3)
dtypeuint8
size1008.0 kB
min0
max255
\n", "\n", "
" ], "text/plain": [ "StackViewNDArray([[[ 3, 6, 1],\n", " [ 3, 7, 0],\n", " [ 3, 6, 1],\n", " ...,\n", " [11, 8, 2],\n", " [11, 7, 2],\n", " [11, 11, 2]],\n", "\n", " [[ 3, 6, 1],\n", " [ 3, 8, 1],\n", " [ 3, 7, 1],\n", " ...,\n", " [11, 10, 2],\n", " [10, 10, 2],\n", " [11, 11, 2]],\n", "\n", " [[ 4, 6, 1],\n", " [ 3, 6, 1],\n", " [ 4, 6, 1],\n", " ...,\n", " [10, 10, 2],\n", " [11, 10, 2],\n", " [11, 10, 2]],\n", "\n", " ...,\n", "\n", " [[15, 14, 8],\n", " [14, 14, 8],\n", " [15, 14, 7],\n", " ...,\n", " [10, 11, 5],\n", " [10, 12, 4],\n", " [11, 14, 5]],\n", "\n", " [[14, 16, 7],\n", " [16, 15, 7],\n", " [15, 16, 8],\n", " ...,\n", " [10, 11, 4],\n", " [11, 13, 4],\n", " [11, 16, 5]],\n", "\n", " [[15, 18, 7],\n", " [14, 17, 8],\n", " [14, 17, 8],\n", " ...,\n", " [ 9, 12, 5],\n", " [10, 13, 5],\n", " [11, 15, 5]]], dtype=uint8)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image = imread(\"hela-cells-8bit.tif\")\n", "\n", "stackview.insight(image)" ] }, { "cell_type": "markdown", "id": "565a4098-1d6a-4e22-b1ea-5b9a344fb580", "metadata": {}, "source": [ "We can use vision capabilities by passing the image like this:" ] }, { "cell_type": "code", "execution_count": 3, "id": "afa18d74-6b25-407a-bece-c7239047f27b", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "The blue channel shows cell nuclei in this microscopy image." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%bob image\n", "what's in the blue channel of this microscopy image? Answer in one short sentence." ] }, { "cell_type": "code", "execution_count": 4, "id": "2bb5ff53-9573-4067-a8cc-6308600b2eca", "metadata": {}, "outputs": [], "source": [ "%%bob \n", "Please segment the nuclei and use stackview.animate_curtain \n", "to show the resulting label image on top of the original image." ] }, { "cell_type": "code", "execution_count": 5, "id": "2db59cdd-adba-4992-b95b-8c8894fc4a9d", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\haase\\miniconda3\\envs\\genai2\\Lib\\site-packages\\stackview\\_animate.py:61: UserWarning: The image is quite large (> 10 MByte) and might not be properly shown in the notebook when rendered over the internet. Consider subsampling or cropping the image for visualization purposes.\n", " warnings.warn(\"The image is quite large (> 10 MByte) and might not be properly shown in the notebook when rendered over the internet. 
Consider subsampling or cropping the image for visualization purposes.\")\n" ] }, { "data": { "text/html": [ "" ], "text/plain": [ "<IPython.core.display.HTML object>" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from skimage.filters import threshold_otsu\n", "from skimage.morphology import remove_small_objects\n", "from skimage.measure import label\n", "\n", "# Extract the blue channel\n", "blue = image[:,:,2]\n", "\n", "# Apply a threshold to create a binary mask\n", "thresh = threshold_otsu(blue)\n", "mask = blue > thresh\n", "\n", "# Remove small objects\n", "mask = remove_small_objects(mask, min_size=50)\n", "\n", "# Label the connected components\n", "labels = label(mask)\n", "\n", "# Display the label image on top of the original image\n", "stackview.animate_curtain(image, labels)" ] },
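{ "cell_type": "markdown", "id": "added-sanity-check-md", "metadata": {}, "source": [ "As a quick sanity check of the generated workflow, we can count the segmented nuclei. The next cell is a hand-written sketch rather than model-generated code; it assumes the `labels` image from the cell above, in which `skimage.measure.label` numbers objects consecutively." ] }, { "cell_type": "code", "execution_count": null, "id": "added-sanity-check-code", "metadata": {}, "outputs": [], "source": [ "# Count the segmented nuclei: skimage.measure.label assigns consecutive\n", "# integer labels, so the maximum label value equals the number of objects.\n", "num_nuclei = labels.max()\n", "print(f\"Number of segmented nuclei: {num_nuclei}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "58bf1e84-f4c6-492c-ba44-44092a827603", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 5 }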