histox.model

This module provides the ModelParams class, which organizes model and training hyperparameters and assists with model building, and the Trainer class, which executes model training and evaluation. RegressionTrainer and SurvivalTrainer extend Trainer to support regression and Cox proportional hazards outcomes, respectively. The function build_trainer() selects and returns the appropriate trainer instance based on the provided hyperparameters.

Note

To support both Tensorflow and PyTorch backends, the histox.model module imports either histox.model.tensorflow or histox.model.torch according to the currently active backend, which is indicated by the environment variable SF_BACKEND.
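The selection described in this note can be pictured with a small sketch. The helper name resolve_backend_module and its default argument are hypothetical and exist only for illustration. The sketch maps the value of SF_BACKEND to a submodule name without importing anything, and it is not the library's own loading code.

```python
import os

# Hypothetical lookup table for this sketch: backend name -> submodule
# named in the note above.
_BACKEND_MODULES = {
    "tensorflow": "histox.model.tensorflow",
    "torch": "histox.model.torch",
}


def resolve_backend_module(environ=None, default="torch"):
    """Return the dotted submodule name for the active backend.

    `default` is an assumption made for this sketch; consult the library
    for the value it actually uses when SF_BACKEND is unset.
    """
    environ = os.environ if environ is None else environ
    backend = environ.get("SF_BACKEND", default).strip().lower()
    if backend not in _BACKEND_MODULES:
        raise ValueError(f"Unknown backend: {backend!r}")
    return _BACKEND_MODULES[backend]
```

In practice the variable would typically be set before the package is imported, so that the matching submodule is the one loaded.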

See Training for a detailed look at how to train models.

Trainer

class histox.model.Trainer(hp: ModelParams, outdir: str, labels: Dict[str, Any], *, slide_input: Dict[str, Any] | None = None, name: str = 'Trainer', feature_sizes: List[int] | None = None, feature_names: List[str] | None = None, outcome_names: List[str] | None = None, mixed_precision: bool = True, allow_tf32: bool = False, config: Dict[str, Any] | None = None, use_neptune: bool = False, neptune_api: str | None = None, neptune_workspace: str | None = None, load_method: str = 'weights', custom_objects: Dict[str, Any] | None = None, device: str | None = None, transform: Callable | Dict[str, Callable] | None = None, pin_memory: bool = True, num_workers: int = 4, chunk_size: int = 8)[source]

Base trainer class containing functionality for model building, input processing, training, and evaluation.

This base class requires categorical outcome(s). Additional outcome types are supported by histox.model.RegressionTrainer and histox.model.SurvivalTrainer.

Slide-level (e.g. clinical) features can be used as additional model input by providing slide labels in the slide annotations dictionary, under the key ‘input’.

model.Trainer.load(*args, **kwargs)

model.Trainer.evaluate(*args, **kwargs)

model.Trainer.predict(*args, **kwargs)

model.Trainer.train(*args, **kwargs)

RegressionTrainer

class histox.model.RegressionTrainer(*args, **kwargs)[source]

Extends the base histox.model.Trainer class to add support for continuous outcomes. Requires that all outcomes be continuous, with an appropriate regression loss function. Uses R-squared as the evaluation metric, rather than AUROC.

For the PyTorch backend, support for continuous outcomes is already built into the base Trainer class, so no additional modifications are required. This class inherits from Trainer without modification to maintain consistency with the Tensorflow backend.
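For reference, the R-squared metric mentioned above is the coefficient of determination, 1 - SS_res / SS_tot. The following plain-Python sketch shows the calculation for a single continuous outcome. The function name r_squared is chosen for illustration, and this is not the trainer's own implementation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

A value of 1.0 means perfect predictions, while 0.0 means the predictions do no better than always predicting the mean of the true values.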

SurvivalTrainer

class histox.model.SurvivalTrainer(*args, **kwargs)[source]: Cox proportional hazards (CPH) models are not yet implemented, but are planned for a future update.

Features

class histox.model.Features(path: str | None, layers: str | List[str] | None = 'postconv', *, include_preds: bool = False, mixed_precision: bool = True, channels_last: bool = True, device: torch.device | None = None, apply_softmax: bool | None = None, pooling: Any | None = None, load_method: str = 'weights')[source]

Interface for obtaining predictions and features from intermediate layer activations from Slideflow models.

Use by calling on either a batch of images (returning outputs for a single batch), or by calling on a histox.WSI object, which will generate an array of spatially-mapped activations matching the slide.

Examples

Calling on batch of images:

from histox.model import Features

interface = Features('/model/path', layers='postconv')
for image_batch in train_data:
    # Return shape: (batch_size, num_features)
    batch_features = interface(image_batch)

Calling on a slide:

import histox as hx
from histox.model import Features

slide = hx.slide.WSI(...)
interface = Features('/model/path', layers='postconv')
# Return shape:
# (slide.grid.shape[0], slide.grid.shape[1], num_features)
activations_grid = interface(slide)

Note

When this interface is called on a batch of images, no image processing or stain normalization will be performed, as it is assumed that normalization will occur during data loader image processing. When the interface is called on a histox.WSI, the normalization strategy will be read from the model configuration file, and normalization will be performed on image tiles extracted from the WSI. If this interface was created from an existing model and there is no model configuration file to read, a histox.norm.StainNormalizer object may be passed during initialization via the argument wsi_normalizer.

model.Features.from_model(*args, **kwargs)

model.Features.__call__(*args, **kwargs): Call self as a function.

Other functions

histox.model.build_trainer(hp: ModelParams, outdir: str, labels: Dict[str, Any], **kwargs) → Trainer[source]

From the given histox.ModelParams object, returns the appropriate instance of histox.model.Trainer.

Parameters:

hp (histox.ModelParams) – ModelParams object.
outdir (str) – Path for event logs and checkpoints.
labels (dict) – Dict mapping slide names to outcome labels (int or float format).

Keyword Arguments:

slide_input (dict) – Dict mapping slide names to additional slide-level input, concatenated after post-conv.
name (str, optional) – Optional name describing the model, used for model saving. Defaults to ‘Trainer’.
feature_sizes (list, optional) – List of sizes of input features. Required if providing additional input features as input to the model.
feature_names (list, optional) – List of names for input features. Used when permuting feature importance.
outcome_names (list, optional) – Name of each outcome. Defaults to “Outcome {X}” for each outcome.
mixed_precision (bool, optional) – Use FP16 mixed precision (rather than FP32). Defaults to True.
allow_tf32 (bool) – Allow internal use of TensorFloat-32 format. Defaults to False.
config (dict, optional) – Training configuration dictionary, used for logging. Defaults to None.
use_neptune (bool, optional) – Use Neptune API logging. Defaults to False.
neptune_api (str, optional) – Neptune API token, used for logging. Defaults to None.
neptune_workspace (str, optional) – Neptune workspace. Defaults to None.
load_method (str) – Either ‘full’ or ‘weights’. Method to use when loading a Tensorflow model. If ‘full’, loads the model with tf.keras.models.load_model(). If ‘weights’, will read the params.json configuration file, build the model architecture, and then load weights from the given model with Model.load_weights(). Loading with ‘full’ may improve compatibility across Slideflow versions. Loading with ‘weights’ may improve compatibility across hardware & environments.
custom_objects (dict, optional) – Dictionary mapping names (strings) to custom classes or functions. Defaults to None.
num_workers (int) – Number of dataloader workers. Only used for PyTorch. Defaults to 4.
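The dispatch performed by build_trainer() can be pictured as a lookup from outcome type to trainer class. The sketch below uses stand-in classes and a hypothetical "outcome_type" key on a plain dict. None of these names come from the library, and how ModelParams actually exposes the outcome type is not shown in this reference.

```python
class TrainerStub:
    """Stand-in for the categorical base trainer (illustration only)."""

    def __init__(self, hp, outdir, labels, **kwargs):
        self.hp = hp
        self.outdir = outdir
        self.labels = labels
        self.kwargs = kwargs  # e.g. name, mixed_precision, num_workers


class RegressionTrainerStub(TrainerStub):
    """Stand-in for the continuous-outcome trainer."""


class SurvivalTrainerStub(TrainerStub):
    """Stand-in for the Cox proportional hazards trainer."""


# Hypothetical outcome-type names, assumed for this sketch.
_TRAINERS = {
    "categorical": TrainerStub,
    "regression": RegressionTrainerStub,
    "survival": SurvivalTrainerStub,
}


def build_trainer_sketch(hp, outdir, labels, **kwargs):
    """Pick a trainer class from hp and forward all remaining arguments."""
    outcome_type = hp["outcome_type"]
    if outcome_type not in _TRAINERS:
        raise ValueError(f"Unsupported outcome type: {outcome_type!r}")
    return _TRAINERS[outcome_type](hp, outdir, labels, **kwargs)
```

Forwarding **kwargs unchanged is what lets a single entry point accept the keyword arguments listed above regardless of which trainer class is chosen.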

histox.model.build_feature_extractor(name: str, backend: str | None = None, **kwargs) → BaseFeatureExtractor[source]

Build a feature extractor.

The returned feature extractor is a callable object, which returns features (often layer activations) for either a batch of images or a histox.WSI object.

If generating features for a batch of images, images are expected to be in (B, W, H, C) format and non-standardized (scaled 0-255) with dtype uint8. The feature extractors perform all needed preprocessing on the fly.

If generating features for a slide, the slide is expected to be a histox.WSI object. The feature extractor will generate features for each tile in the slide, returning a numpy array of shape (W, H, F), where F is the number of features.

Parameters:

name (str) – Name of the feature extractor to build. Available feature extractors are listed with histox.model.list_extractors().

Keyword Arguments:

tile_px (int) – Tile size (input image size), in pixels.
**kwargs (Any) – All remaining keyword arguments are passed to the feature extractor factory function, and may be different for each extractor.
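The relationship between extractor names, factory functions, and forwarded keyword arguments resembles a registry pattern. The sketch below is a generic illustration of that pattern. The names register_extractor, list_extractors_sketch, build_extractor_sketch, and the "toy_mean" extractor are all hypothetical, and the library's internal registry may be organized differently.

```python
_EXTRACTORS = {}  # name -> factory function


def register_extractor(name):
    """Decorator that records a factory function under `name`."""
    def decorator(factory):
        _EXTRACTORS[name] = factory
        return factory
    return decorator


def list_extractors_sketch():
    """Return the registered extractor names in sorted order."""
    return sorted(_EXTRACTORS)


def build_extractor_sketch(name, **kwargs):
    """Look up the factory by name and pass all keyword arguments to it."""
    if name not in _EXTRACTORS:
        raise ValueError(
            f"Unknown extractor {name!r}; available: {list_extractors_sketch()}"
        )
    return _EXTRACTORS[name](**kwargs)


@register_extractor("toy_mean")
def toy_mean(tile_px=299):
    """Toy factory: the extractor returns one mean-intensity feature per image."""
    def extract(batch):
        # `batch` is a list of flat pixel lists in this toy example.
        return [[sum(img) / len(img)] for img in batch]
    extract.tile_px = tile_px
    return extract
```

Because each factory receives the remaining keyword arguments directly, different extractors can accept different options without changing the builder.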

Returns:

A callable object which accepts a batch of images (B, W, H, C) of dtype uint8 and returns a batch of features (dtype float32).

Examples

Create an extractor that calculates post-convolutional layer activations from an imagenet-pretrained Resnet50 model.

import histox as hx

extractor = hx.build_feature_extractor(
    'resnet50_imagenet'
)

Create an extractor that calculates ‘conv4_block4_2_relu’ activations from an imagenet-pretrained Resnet50 model.

extractor = hx.build_feature_extractor(
    'resnet50_imagenet',
    layers='conv4_block4_2_relu'
)

Create a pretrained “CTransPath” extractor.

extractor = hx.build_feature_extractor('ctranspath')

Use an extractor to calculate layer activations for an entire dataset.

import histox as hx

# Load a project and dataset
P = hx.load_project(...)
dataset = P.dataset(...)

# Create a feature extractor
resnet = hx.build_feature_extractor(
    'resnet50_imagenet'
)

# Calculate features for the entire dataset
features = hx.DatasetFeatures(
    resnet,
    dataset=dataset
)

Generate a map of features across a slide.

import histox as hx

# Load a slide
wsi = hx.WSI(...)

# Create a feature extractor
retccl = hx.build_feature_extractor(
    'retccl',
    resize=True
)

# Create a feature map, a 2D array of shape
# (W, H, F), where F is the number of features.
features = retccl(wsi)

histox.model.list_extractors()[source]: Return a list of all available feature extractors.

histox.model.load(path: str) → torch.nn.Module[source]

Load a model trained with Slideflow.

Parameters:

path (str) – Path to saved model. Must be a model trained in Slideflow.

Returns:

Loaded model.

Return type:

torch.nn.Module

histox.model.is_tensorflow_model(arg: Any) → bool[source]: Checks if the object is a Tensorflow Model or path to Tensorflow model.

histox.model.is_tensorflow_tensor(arg: Any) → bool[source]: Checks if the given object is a Tensorflow Tensor.

histox.model.is_torch_model(arg: Any) → bool[source]: Checks if the object is a PyTorch Module or path to PyTorch model.

histox.model.is_torch_tensor(arg: Any) → bool[source]: Checks if the given object is a PyTorch Tensor.

histox.model.read_hp_sweep(filename: str, models: List[str] | None = None) → Dict[str, ModelParams][source]

Organizes a list of hyperparameter objects and associated model names.

Parameters:

filename (str) – Path to hyperparameter sweep JSON file.
models (list(str)) – List of model names. Defaults to None. If not supplied, returns all valid models from batch file.

Returns:

Dict mapping each model name to its ModelParams object, with one entry per hyperparameter combination.
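The filtering behavior of the models argument can be sketched as follows. This sketch assumes, purely for illustration, that the sweep file is a JSON object mapping model names to parameter dictionaries. The real file layout is not specified here, and the function name read_sweep_sketch is hypothetical. It returns plain dicts rather than ModelParams objects.

```python
import json


def read_sweep_sketch(filename, models=None):
    """Load a sweep file and optionally keep only the named models.

    Assumes a JSON object of the form {model_name: {param: value, ...}}.
    """
    with open(filename) as f:
        sweep = json.load(f)
    if models is None:
        # No filter supplied: return every entry in the file.
        return dict(sweep)
    missing = [name for name in models if name not in sweep]
    if missing:
        raise KeyError(f"Models not found in sweep file: {missing}")
    return {name: sweep[name] for name in models}
```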

histox.model.rebuild_extractor(bags_or_model: str, allow_errors: bool = False, native_normalizer: bool = True) → Tuple[BaseFeatureExtractor | None, StainNormalizer | None][source]

Recreate the extractor used to generate features stored in bags.

Parameters:

bags_or_model (str) – Either a path to directory containing feature bags, or a path to a trained MIL model. If a path to a trained MIL model, the extractor used to generate features will be recreated.
allow_errors (bool) – If True, return None if the extractor cannot be rebuilt. If False, raise an error. Defaults to False.
native_normalizer (bool, optional) – Whether to use PyTorch/Tensorflow-native stain normalization, if applicable. If False, will use the OpenCV/Numpy implementations. Defaults to True.

Returns:

Optional[BaseFeatureExtractor] – Extractor function, or None if allow_errors is True and the extractor cannot be rebuilt.
Optional[StainNormalizer] – Stain normalizer used when generating feature bags, or None if no stain normalization was used.

Return type:

Tuple[Optional[BaseFeatureExtractor], Optional[StainNormalizer]]
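The effect of allow_errors can be illustrated with a small wrapper. In this sketch, the name rebuild_or_none and the rebuild callable are hypothetical, and returning a (None, None) pair on failure is an assumption based on the tuple return type in the signature above.

```python
def rebuild_or_none(rebuild, allow_errors=False):
    """Call `rebuild()` and return its (extractor, normalizer) pair.

    If rebuilding fails and allow_errors is True, return (None, None)
    instead of raising. The (None, None) shape is an assumption.
    """
    try:
        return rebuild()
    except Exception:
        if allow_errors:
            return None, None
        raise
```

With allow_errors=False (the default), the original exception propagates, which makes a missing or incompatible extractor configuration visible immediately.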