Statistics using Scikit-image#
We can use scikit-image for extracting features from label images. For convenience reasons we use the napari-skimage-regionprops library.
Before we can do measurements, we need an image
and a corresponding label_image
. Therefore, we recapitulate filtering, thresholding and labeling:
from skimage.io import imread
from skimage import filters
from skimage import measure
from napari_skimage_regionprops import regionprops_table
import pandas as pd
import numpy as np
import stackview
# load image
image = imread("data/blobs.tif")
stackview.insight(image)
|
# denoising
blurred_image = filters.gaussian(image, sigma=1)
# binarization
threshold = filters.threshold_otsu(blurred_image)
thresholded_image = blurred_image >= threshold
# labeling
label_image = measure.label(thresholded_image)
# visualization
stackview.insight(label_image)
|
Measurements / region properties#
We are now using the very handy function regionprops_table
. It provides features based on the scikit-image regionprops list of measurements library. Let us check first what we need to provide for this function:
regionprops_table?
Signature:
regionprops_table(
image: 'napari.types.ImageData',
labels: 'napari.types.LabelsData',
size: bool = True,
intensity: bool = True,
perimeter: bool = False,
shape: bool = False,
position: bool = False,
moments: bool = False,
napari_viewer: 'napari.Viewer' = None,
) -> 'pandas.DataFrame'
Docstring: Adds a table widget to a given napari viewer with quantitative analysis results derived from an image-label pair.
File: c:\users\haase\mambaforge\envs\tea2024\lib\site-packages\napari_skimage_regionprops\_regionprops.py
Type: function
df = pd.DataFrame(regionprops_table(image , label_image,
perimeter = True,
shape = True,
position=True))
df
label | area | bbox_area | equivalent_diameter | convex_area | max_intensity | mean_intensity | min_intensity | perimeter | perimeter_crofton | ... | bbox-0 | bbox-1 | bbox-2 | bbox-3 | weighted_centroid-0 | weighted_centroid-1 | standard_deviation_intensity | aspect_ratio | roundness | circularity | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 429.0 | 750.0 | 23.371345 | 479.0 | 232.0 | 191.440559 | 128.0 | 89.012193 | 87.070368 | ... | 0 | 10 | 30 | 35 | 13.130723 | 19.987532 | 29.793138 | 2.088249 | 0.451572 | 0.680406 |
1 | 2 | 183.0 | 231.0 | 15.264430 | 190.0 | 224.0 | 179.846995 | 128.0 | 53.556349 | 53.456120 | ... | 0 | 53 | 11 | 74 | 4.156053 | 63.178901 | 21.270534 | 1.782168 | 0.530849 | 0.801750 |
2 | 3 | 658.0 | 756.0 | 28.944630 | 673.0 | 248.0 | 205.604863 | 120.0 | 95.698485 | 93.409370 | ... | 0 | 95 | 28 | 122 | 12.485897 | 108.430312 | 29.392255 | 1.067734 | 0.918683 | 0.902871 |
3 | 4 | 433.0 | 529.0 | 23.480049 | 445.0 | 248.0 | 217.515012 | 120.0 | 77.455844 | 76.114262 | ... | 0 | 144 | 23 | 167 | 9.630850 | 154.408732 | 35.852345 | 1.061942 | 0.917813 | 0.906963 |
4 | 5 | 472.0 | 551.0 | 24.514670 | 486.0 | 248.0 | 213.033898 | 128.0 | 83.798990 | 82.127941 | ... | 0 | 237 | 29 | 256 | 13.051158 | 247.170738 | 28.741080 | 1.579415 | 0.621952 | 0.844645 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
57 | 58 | 213.0 | 285.0 | 16.468152 | 221.0 | 224.0 | 184.525822 | 120.0 | 52.284271 | 52.250114 | ... | 232 | 39 | 251 | 54 | 240.563200 | 46.034602 | 28.255467 | 1.296143 | 0.771094 | 0.979146 |
58 | 59 | 79.0 | 108.0 | 10.029253 | 84.0 | 248.0 | 184.810127 | 128.0 | 39.313708 | 39.953250 | ... | 248 | 170 | 254 | 188 | 251.276164 | 178.373151 | 33.739912 | 3.173540 | 0.300766 | 0.642316 |
59 | 60 | 88.0 | 110.0 | 10.585135 | 92.0 | 216.0 | 182.727273 | 128.0 | 45.692388 | 46.196967 | ... | 249 | 117 | 254 | 139 | 251.403483 | 127.717413 | 24.417173 | 4.021193 | 0.238521 | 0.529669 |
60 | 61 | 52.0 | 75.0 | 8.136858 | 56.0 | 248.0 | 189.538462 | 128.0 | 30.692388 | 32.924135 | ... | 249 | 227 | 254 | 242 | 251.671266 | 234.202922 | 37.867411 | 2.839825 | 0.322190 | 0.693668 |
61 | 62 | 48.0 | 68.0 | 7.817640 | 53.0 | 224.0 | 173.833333 | 128.0 | 33.071068 | 35.375614 | ... | 250 | 66 | 254 | 83 | 252.038351 | 73.570470 | 27.987596 | 4.417297 | 0.213334 | 0.551512 |
62 rows × 31 columns
As you can see, we have now plenty of features to investigate. We can print out all feature names with the keys
function:
print(df.keys())
Index(['label', 'area', 'bbox_area', 'equivalent_diameter', 'convex_area',
'max_intensity', 'mean_intensity', 'min_intensity', 'perimeter',
'perimeter_crofton', 'extent', 'local_centroid-0', 'local_centroid-1',
'solidity', 'feret_diameter_max', 'major_axis_length',
'minor_axis_length', 'orientation', 'eccentricity', 'centroid-0',
'centroid-1', 'bbox-0', 'bbox-1', 'bbox-2', 'bbox-3',
'weighted_centroid-0', 'weighted_centroid-1',
'standard_deviation_intensity', 'aspect_ratio', 'roundness',
'circularity'],
dtype='object')
We can select some columns that we want to focus on like this:
df_selection = df[['label', 'area', 'extent', 'aspect_ratio', 'roundness', 'circularity']]
df_selection
label | area | extent | aspect_ratio | roundness | circularity | |
---|---|---|---|---|---|---|
0 | 1 | 429.0 | 0.572000 | 2.088249 | 0.451572 | 0.680406 |
1 | 2 | 183.0 | 0.792208 | 1.782168 | 0.530849 | 0.801750 |
2 | 3 | 658.0 | 0.870370 | 1.067734 | 0.918683 | 0.902871 |
3 | 4 | 433.0 | 0.818526 | 1.061942 | 0.917813 | 0.906963 |
4 | 5 | 472.0 | 0.856624 | 1.579415 | 0.621952 | 0.844645 |
... | ... | ... | ... | ... | ... | ... |
57 | 58 | 213.0 | 0.747368 | 1.296143 | 0.771094 | 0.979146 |
58 | 59 | 79.0 | 0.731481 | 3.173540 | 0.300766 | 0.642316 |
59 | 60 | 88.0 | 0.800000 | 4.021193 | 0.238521 | 0.529669 |
60 | 61 | 52.0 | 0.693333 | 2.839825 | 0.322190 | 0.693668 |
61 | 62 | 48.0 | 0.705882 | 4.417297 | 0.213334 | 0.551512 |
62 rows × 6 columns
And describe
gives us basic statistics like max
, mean
, min
and std
of each feature:
df_selection.describe()
label | area | extent | aspect_ratio | roundness | circularity | |
---|---|---|---|---|---|---|
count | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 |
mean | 31.500000 | 355.370968 | 0.761363 | 1.637991 | 0.692418 | 0.894101 |
std | 18.041619 | 211.367385 | 0.065208 | 0.794366 | 0.210973 | 0.183024 |
min | 1.000000 | 7.000000 | 0.541102 | 1.048053 | 0.213334 | 0.529669 |
25% | 16.250000 | 194.750000 | 0.744329 | 1.168451 | 0.538616 | 0.805774 |
50% | 31.500000 | 366.000000 | 0.781076 | 1.316003 | 0.757485 | 0.925560 |
75% | 46.750000 | 500.750000 | 0.799519 | 1.769976 | 0.851463 | 0.966037 |
max | 62.000000 | 896.000000 | 0.870370 | 4.417297 | 0.974824 | 1.886542 |
If we’re interested in specific descriptive statistics, we can derive them directly from the columns.
df_selection['area'].mean()
355.3709677419355
Exercises#
Make a table with only area
, mean_intensity
, standard_deviation_intensity
and label
.
How many object are in the dataframe?
How large is the largest object?
What is the mean intensity of the brightest object?
What are mean and standard deviation intensity of the image?