Under the hood: Configure bia-bob’s behaviour through system messages

This notebook demonstrates how to configure what kind of Python code bia-bob generates. Note that the configuration is stored across sessions; installing a new version of bia-bob resets it. You can also reset bia-bob’s configuration manually by deleting the .cache\bia-bob directory in your home directory.
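The manual reset described above can also be scripted. The following is a minimal sketch, assuming the configuration lives in `.cache/bia-bob` inside your home directory as stated; the helper name `reset_bia_bob_config` is hypothetical and not part of bia-bob.

```python
import shutil
from pathlib import Path


def reset_bia_bob_config(cache_dir=None):
    """Delete bia-bob's cache directory; return True if something was removed.

    Assumption: the configuration lives in ~/.cache/bia-bob, as described above.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "bia-bob"
    cache_dir = Path(cache_dir)
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```

Calling `reset_bia_bob_config()` without arguments removes the directory in your home folder; pass an explicit path to try it on a scratch directory first.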

# load secret API key. You must unpack the contents of api_key.zip 
# into the same folder before going ahead.
from dotenv import load_dotenv
load_dotenv()

import os
from bia_bob import bob, DEFAULT_SYSTEM_PROMPT
bob.initialize(endpoint='https://llm.scads.ai/v1', model='openai/gpt-oss-120b', api_key=os.environ.get('SCADSAI_API_KEY'))
This notebook may contain text, code and images generated by artificial intelligence. Used model: openai/gpt-oss-120b, vision model: None, endpoint: https://llm.scads.ai/v1, bia-bob version: 0.33.0.. Do not enter sensitive or private information and verify generated contents according to good scientific practice. Read more: https://github.com/haesleinhuepf/bia-bob#disclaimer
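Because `os.environ.get` returns `None` when a variable is missing, a forgotten key file may only surface later as an authentication problem. A small guard can make this explicit. This is a sketch only; `require_api_key` is a hypothetical helper, not part of bia-bob or python-dotenv.

```python
import os


def require_api_key(name="SCADSAI_API_KEY"):
    """Return the API key stored in the environment variable `name`.

    Raises a RuntimeError with a hint instead of silently passing None on.
    """
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Check that the .env file is in the notebook's folder."
        )
    return key
```

The result could then be passed as `api_key=require_api_key()` when initializing bob.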

Custom system messages

You can configure your own system message. In the following example we instruct bob to always use Finnish variable names in the code it writes.

bob.initialize(
    endpoint='https://llm.scads.ai/v1', model='openai/gpt-oss-120b', api_key=os.environ.get('SCADSAI_API_KEY'),
    system_prompt="""
    You are an excellent Python programmer who always 
    uses Finnish variable names in their code.
    """)
%bob Write a for-loop that prints the numbers from 0 to 10.
# Print numbers 0‑10 one by one
for luku in range(0, 11):
    print(luku)

Default system message

You can reset the config to the default system message like this:

bob.initialize(
    endpoint='https://llm.scads.ai/v1', model='openai/gpt-oss-120b', api_key=os.environ.get('SCADSAI_API_KEY'),
    system_prompt=DEFAULT_SYSTEM_PROMPT)
%bob Write a for-loop that prints the numbers from 0 to 10.

I will provide a straightforward Python for loop that prints the integers from 0 through 10 inclusive.

for i in range(0, 11):
    print(i)
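Every call in this notebook repeats the same endpoint, model and key, and only the system prompt changes. One way to avoid the repetition is to collect the arguments in a small helper. This is a sketch; `bob_settings` and the two constants are hypothetical names introduced here, and the keyword names are the ones used in the cells of this notebook.

```python
import os

# Values copied from the cells in this notebook.
ENDPOINT = "https://llm.scads.ai/v1"
MODEL = "openai/gpt-oss-120b"


def bob_settings(system_prompt):
    """Collect the keyword arguments used for bob.initialize in this notebook."""
    return {
        "endpoint": ENDPOINT,
        "model": MODEL,
        "api_key": os.environ.get("SCADSAI_API_KEY"),
        "system_prompt": system_prompt,
    }


# Usage (requires bia-bob to be installed and imported):
# bob.initialize(**bob_settings(DEFAULT_SYSTEM_PROMPT))
```

Keeping the connection details in one place makes it harder to switch the model or endpoint by accident when you only meant to change the system message.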

Domain-specific system messages

You can instruct bia-bob to use common Python libraries from specific domains.

from bia_bob import bob
bob.initialize(
    endpoint='https://llm.scads.ai/v1', model='openai/gpt-oss-120b', api_key=os.environ.get('SCADSAI_API_KEY'),
    system_prompt="""
    You are an excellent astronomer and Python programmer.
    You typically use Python libraries from this domain.
""")
%bob Write a function that can detect stars in an image.

Below is a self‑contained Python routine that will detect point‑like sources (i.e. stars) in any 2‑D astronomical image.
It uses the very‑well‑tested Astropy ecosystem:

| Library | Why it’s used |
|---------|---------------|
| astropy.io.fits | Reads/writes FITS files (the de‑facto standard for astronomical images). |
| astropy.stats | Robust background/σ estimation via sigma‑clipping. |
| photutils | Dedicated source‑detection tools (DAOStarFinder, CircularAperture, etc.). |
| numpy & matplotlib | Core array handling and optional quick visualisation. |

The function is deliberately modular – you can feed it a raw numpy.ndarray or a path to a FITS file, tweak the detection parameters, and obtain a convenient astropy.table.Table with the measured centroids, fluxes and a few quality flags.


1️⃣ High‑level overview of the algorithm

  1. Read the image (or accept an already‑loaded array).

  2. Background estimation – compute a median background and its RMS using sigma‑clipping.

  3. Thresholding – define a detection threshold (e.g. nsigma × RMS).

  4. Source detection – run DAOStarFinder (a classic DAOPHOT‑style algorithm).

  5. Post‑filtering – optional cuts on sharpness, roundness, signal‑to‑noise, etc.

  6. Return a catalog (and optionally a plot showing the detections).


2️⃣ The code

# --------------------------------------------------------------
# star_detector.py
# --------------------------------------------------------------
# Minimal, well‑documented star‑finding routine.
# --------------------------------------------------------------

from __future__ import annotations

import pathlib
from typing import Union, Tuple, Optional

import numpy as np
import matplotlib.pyplot as plt

from astropy.io import fits
from astropy.table import Table
from astropy.stats import sigma_clipped_stats
from photutils.detection import DAOStarFinder
from photutils.aperture import CircularAperture

# --------------------------------------------------------------
def _load_image(image: Union[str, pathlib.Path, np.ndarray]) -> np.ndarray:
    """
    Helper that hides the FITS‑reading boiler‑plate.
    Accepts:
        * a filename (str or Path) → reads the primary HDU
        * a 2‑D numpy array → returns it unchanged
    Raises:
        ValueError if the object is not 2‑D.
    """
    if isinstance(image, (str, pathlib.Path)):
        with fits.open(image) as hdul:
            data = hdul[0].data.astype(float)   # force float for later math
    else:
        data = np.asarray(image, dtype=float)

    if data.ndim != 2:
        raise ValueError("Input image must be a 2‑D array (grayscale).")
    return data


# --------------------------------------------------------------
def detect_stars(
    image: Union[str, pathlib.Path, np.ndarray],
    *,
    fwhm: float = 3.0,
    sigma: float = 5.0,
    exclude_border: int = 5,
    sharpness_range: Tuple[float, float] = (0.2, 1.0),
    roundness_range: Tuple[float, float] = (-0.5, 0.5),
    background_boxsize: int = 50,
    background_filter_size: int = 3,
    plot: bool = False,
    ax: Optional[plt.Axes] = None,
) -> Table:
    """
    Detect point‑like sources (stars) in an astronomical image.

    Parameters
    ----------
    image : str | pathlib.Path | np.ndarray
        Either the path to a FITS file (primary HDU) or a 2‑D numpy array.
    fwhm : float, optional
        Approximate full‑width at half‑maximum of the stellar PSF in pixels.
        The DAOStarFinder kernel width is `fwhm / (2*sqrt(2*ln(2)))`.
    sigma : float, optional
        Detection threshold expressed as a multiple of the background RMS.
        Typical values: 3‑8. Larger → fewer spurious detections.
    exclude_border : int, optional
        Number of pixels to trim from each edge (helps avoid edge artefacts).
    sharpness_range : tuple(float, float), optional
        Acceptable range of the DAOFIND `sharpness` metric.  Stars usually
        lie between ~0.2 and 1.0; adjust if you have undersampled data.
    roundness_range : tuple(float, float), optional
        Acceptable range of the DAOFIND `roundness` metric.  Near‑zero values
        correspond to circular sources.
    background_boxsize : int, optional
        Intended box size for a local sigma‑clipped background estimate.
        Currently unused: this routine only estimates a single global
        background.
    background_filter_size : int, optional
        Intended size of a median filter applied to a background map.
        Currently unused, for the same reason.
    plot : bool, optional
        If True, a quick Matplotlib display of the image with over‑laid
        detection circles is produced.
    ax : matplotlib.axes.Axes, optional
        Provide an existing Axes to plot into; otherwise a new figure is created.

    Returns
    -------
    astropy.table.Table
        Table with at least the following columns:
        * ``id``          – running integer index.
        * ``xcentroid``   – x‑coordinate (0‑based, left‑to‑right).
        * ``ycentroid``   – y‑coordinate (0‑based, bottom‑to‑top).
        * ``sharpness``   – DAOFIND sharpness metric.
        * ``roundness1``  – First roundness metric.
        * ``roundness2``  – Second roundness metric.
        * ``flux``        – Estimated source flux (sum within the kernel).
        * ``npix``        – Number of pixels in the detection kernel.
        * ``background``  – Global median background that was subtracted.
        * ``threshold``   – Detection threshold that was applied (in ADU).

    Notes
    -----
    * The routine is deliberately simple.  For crowded fields or
      extreme PSF variation you may want to switch to
      ``photutils.psf`` tools, but DAOStarFinder works remarkably well
      for most ground‑based or space‑based imaging with a roughly
      constant PSF.
    * All coordinates are **0‑based** (as used by `numpy`).  If you need
      1‑based FITS coordinates, simply add 1.
    """
    # ------------------------------------------------------------------
    # 1. Load the image & trim the border
    #    (centroids reported below are relative to the trimmed image)
    # ------------------------------------------------------------------
    data = _load_image(image)
    if exclude_border > 0:
        data = data[
            exclude_border : -exclude_border,
            exclude_border : -exclude_border,
        ]

    # ------------------------------------------------------------------
    # 2. Estimate global background and RMS (sigma‑clipped)
    # ------------------------------------------------------------------
    mean, median, std = sigma_clipped_stats(data, sigma=3.0, maxiters=5)
    # The median is subtracted from the image before detection (step 3),
    # so the threshold is expressed relative to a zero background.
    threshold = sigma * std

    # ------------------------------------------------------------------
    # 3. Run DAOStarFinder
    # ------------------------------------------------------------------
    daofind = DAOStarFinder(
        fwhm=fwhm,
        threshold=threshold,
        sharplo=sharpness_range[0],
        sharphi=sharpness_range[1],
        roundlo=roundness_range[0],
        roundhi=roundness_range[1],
    )
    sources = daofind(data - median)  # subtract median -> background‑subtracted

    # ------------------------------------------------------------------
    # 4. No detections? Return an empty table but keep a helpful column set.
    # ------------------------------------------------------------------
    if sources is None:
        empty_tbl = Table(
            names=(
                "id",
                "xcentroid",
                "ycentroid",
                "sharpness",
                "roundness1",
                "roundness2",
                "flux",
                "npix",
                "background",
                "threshold",
            ),
            dtype=("i4", "f8", "f8", "f8", "f8", "f8", "f8", "i4", "f8", "f8"),
        )
        return empty_tbl

    # ------------------------------------------------------------------
    # 5. Add a few convenience columns
    # ------------------------------------------------------------------
    sources["id"] = np.arange(1, len(sources) + 1, dtype=int)

    # Local background for each source is just the global median (good enough for most cases)
    sources["background"] = median
    sources["threshold"] = threshold

    # ------------------------------------------------------------------
    # 6. Plot (optional)
    # ------------------------------------------------------------------
    if plot:
        if ax is None:
            fig, ax = plt.subplots(figsize=(8, 8))
        im = ax.imshow(
            data,
            cmap="gray",
            origin="lower",
            vmin=median - std,
            vmax=median + 5 * std,
        )
        ax.set_title("Star detection (DAOStarFinder)")
        # Overplot circles at the detected centroids
        positions = np.transpose((sources["xcentroid"], sources["ycentroid"]))
        apertures = CircularAperture(positions, r=fwhm)  # use FWHM as a visual radius
        apertures.plot(color="lime", lw=1.5, axes=ax)
        plt.colorbar(im, ax=ax, label="ADU")
        plt.tight_layout()
        plt.show()

    # ------------------------------------------------------------------
    # 7. Return the astropy Table (already a Table object)
    # ------------------------------------------------------------------
    return sources

# --------------------------------------------------------------
# Example usage (runs when this file is executed as a script):
# --------------------------------------------------------------
if __name__ == "__main__":
    # >>>>>>>  SIMPLE DEMO  <<<<<<<<
    # You need an actual FITS file or a numpy array.
    # Here we create a synthetic image to illustrate the API.

    from photutils.datasets import make_gaussian_sources_image
    from photutils.datasets import make_noise_image

    # Synthetic parameters: 10 stars with random positions & fluxes
    np.random.seed(42)
    n_stars = 10
    shape = (200, 200)
    amplitudes = np.random.uniform(500, 2000, n_stars)
    x_mean = np.random.uniform(20, 180, n_stars)
    y_mean = np.random.uniform(20, 180, n_stars)
    x_stddev = np.full(n_stars, 1.5)    # ~FWHM ≈ 3.5 px
    y_stddev = np.full(n_stars, 1.5)

    sources_tbl = Table(
        {
            "amplitude": amplitudes,
            "x_mean": x_mean,
            "y_mean": y_mean,
            "x_stddev": x_stddev,
            "y_stddev": y_stddev,
            "theta": np.zeros(n_stars),
        }
    )
    synthetic = make_gaussian_sources_image(shape, sources_tbl)
    noisy = synthetic + make_noise_image(shape, kind="gaussian", sigma=30)

    # Run the detector on the synthetic image
    catalog = detect_stars(
        noisy,
        fwhm=3.5,
        sigma=5,
        plot=True,
    )
    print("\nDetected sources:")
    print(catalog["id", "xcentroid", "ycentroid", "flux"])

    from star_detector import detect_stars

    # 1️⃣  Locate your calibrated science frame (must be a 2‑D image)
    fits_path = "calibrated_image.fits"

    # 2️⃣  Call the detector – tweak parameters for your data
    stars = detect_stars(
        fits_path,
        fwhm=2.8,               # typical PSF width in pixels (e.g. Hubble ACS ~2.0‑2.5)
        sigma=4.0,              # 4‑σ detection threshold
        exclude_border=10,      # avoid edge artefacts
        sharpness_range=(0.1, 1.2),
        roundness_range=(-0.5, 0.5),
        plot=True,              # Quick look‑check
    )

    # 3️⃣  Inspect the resulting catalog
    print(stars[:5])   # first five detections
    # → columns include xcentroid, ycentroid, flux, etc.

    # 4️⃣  Save catalog for later astrometry/photometry steps
    stars.write("star_list.ecsv", format="ascii.ecsv", overwrite=True)
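Step 2 of the generated answer relies on sigma clipping to estimate the background. To illustrate the idea without installing Astropy, here is a minimal pure-Python sketch. `sigma_clipped_stats_sketch` is a hypothetical stand-in and not Astropy's implementation, and the pixel values are synthetic.

```python
import random
import statistics


def sigma_clipped_stats_sketch(values, sigma=3.0, maxiters=5):
    """Return (mean, median, std) after iteratively rejecting outliers."""
    kept = list(values)
    for _ in range(maxiters):
        med = statistics.median(kept)
        std = statistics.pstdev(kept)
        # Keep only values within sigma * std of the current median.
        clipped = [v for v in kept if abs(v - med) <= sigma * std]
        if len(clipped) == len(kept):
            break  # nothing was rejected, so the estimate has converged
        kept = clipped
    return statistics.fmean(kept), statistics.median(kept), statistics.pstdev(kept)


# Synthetic "image": flat background of 100 with noise, plus a few bright pixels.
rng = random.Random(42)
pixels = [rng.gauss(100.0, 5.0) for _ in range(2000)]
pixels += [1000.0] * 10  # stand-ins for star pixels

mean, median, std = sigma_clipped_stats_sketch(pixels)

# Detection threshold for a background-subtracted image (pixels minus median).
threshold = 5.0 * std
print(f"median={median:.1f}, std={std:.1f}, threshold={threshold:.1f}")
```

Without clipping, the ten bright pixels would inflate the standard deviation considerably, which in turn would raise the detection threshold and hide faint sources.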

Terrible system messages

You can also instruct bia-bob to produce deliberately bad code. Be careful with this functionality, because the system message is stored across sessions.

bob.initialize(
    endpoint='https://llm.scads.ai/v1', model='openai/gpt-oss-120b', api_key=os.environ.get('SCADSAI_API_KEY'),
    system_prompt="""
    You always write super complicated, hard-to-read Python code
    with a lot of unnecessary loops and weird variable names,
    surrounded with markdown fences.
    Make sure it consists of at least 5 lines of code.
""")
%bob print hello world

Here’s a deliberately over‑engineered way to print “hello world” in Python. The code uses needless loops, obscure variable names, and extra functions, but it still produces the correct output when run.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# A streamlined routine that prints "hello world"
def _orchestrate_the_messy_output():
    # List of characters paired with their intended positions
    _scrambled = [
        ('h', 0), ('e', 1), ('l', 2), ('l', 3), ('o', 4),
        (' ', 5), ('w', 6), ('o', 7), ('r', 8), ('l', 9), ('d', 10)
    ]

    # Re‑order based on the stored positions and join into the final string
    _ordered = [char for char, _ in sorted(_scrambled, key=lambda p: p[1])]
    print(''.join(_ordered))

# Invoke the streamlined function
if __name__ == "__main__":
    _orchestrate_the_messy_output()

# Reset the default behaviour; otherwise the system message above would be stored permanently.
bob.initialize(
    endpoint='https://llm.scads.ai/v1', model='openai/gpt-oss-120b', api_key=os.environ.get('SCADSAI_API_KEY'),
    system_prompt=DEFAULT_SYSTEM_PROMPT)
%bob print hello world

I will output the classic “Hello World” message using Python’s print function.

print("hello world")
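Since a custom system message stays stored until it is reset, it is easy to forget the reset step after experimenting. One possible safeguard is a context manager that restores the default automatically. This is a sketch under assumptions: `temporary_system_prompt` is a hypothetical helper, and `initialize` stands for any callable that accepts a `system_prompt` keyword, such as a `functools.partial` around `bob.initialize` with the endpoint, model and key already filled in.

```python
from contextlib import contextmanager


@contextmanager
def temporary_system_prompt(initialize, prompt, default_prompt):
    """Apply `prompt` inside the with-block and restore `default_prompt` afterwards.

    `initialize` is any callable accepting a `system_prompt` keyword argument.
    """
    initialize(system_prompt=prompt)
    try:
        yield
    finally:
        # Runs even if the body raises, so the custom prompt never stays stored.
        initialize(system_prompt=default_prompt)
```

Inside a `with temporary_system_prompt(...)` block the custom message is active; once the block ends, with or without an error, the default is written back.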