histox.util
This module contains a variety of utility functions used throughout the package.
- class histox.util.EasyDict[source]
Convenience class that behaves like a dict but allows access with the attribute syntax.
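The attribute-access pattern described above can be sketched in a few lines. This is a minimal illustration, not the histox implementation; the class name `AttrDict` and the example keys are hypothetical.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dict item.
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


cfg = AttrDict(tile_px=299)
cfg.tile_um = 302  # stored as cfg["tile_um"]
print(cfg.tile_px, cfg["tile_um"])
```

Because the class subclasses dict, the object can still be passed anywhere a plain dict is expected.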
- class histox.util.FeatureExtractionProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)[source]
- class histox.util.ImgBatchSpeedColumn(batch_size=1, *args, **kwargs)[source]
Renders human readable transfer speed.
- class histox.util.LabeledMofNCompleteColumn(unit: str, *args, **kwargs)[source]
Renders a completion column with labels.
- class histox.util.MultiprocessProgress(pb)[source]
Wrapper for a rich.progress bar that can be shared across processes.
- class histox.util.MultiprocessProgressTracker(tasks)[source]
Wrapper for a rich.progress tracker that can be shared across processes.
- class histox.util.TileExtractionProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)[source]
- class histox.util.TileExtractionSpeedColumn(table_column: Column | None = None)[source]
Renders human readable transfer speed.
- class histox.util.ValidJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]
- default(obj)[source]
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
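As a usage sketch, a subclass that overrides default is passed to json.dumps through the cls argument. The class name `IterableJSONEncoder` is hypothetical and only demonstrates the standard-library mechanism; it does not describe which types ValidJSONEncoder itself handles.

```python
import json


class IterableJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder that serializes any iterable as a JSON list."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class raise the TypeError for unsupported objects.
        return super().default(o)


# range objects are not serializable by default, so default() is consulted.
encoded = json.dumps({"tiles": range(3)}, cls=IterableJSONEncoder)
print(encoded)
```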
- histox.util.about(console=None) None[source]
Print a summary of the histox version and active backends.
Example
```
>>> hx.about()
╭=======================╮
│       Slideflow       │
│    Version: 3.0.1     │
│    Backend: torch     │
│ Slide Backend: cucim  │
│  https://histox.dev   │
╰=======================╯
```
- Parameters:
console (rich.console.Console, optional) – Active console, if one exists. Defaults to None.
- histox.util.batch(iterable: List, n: int = 1) Iterable[source]
Separates an iterable into batches of maximum size n.
- histox.util.batch_generator(iterable: Iterable, n: int = 1) Iterable[source]
Separates an iterable into batches of maximum size n.
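The difference between the list-based and generator-based variants can be sketched as follows. Both function names are hypothetical stand-ins, and the sketch assumes that the final batch may be shorter than n.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batch_sketch(items: List, n: int = 1) -> Iterator[List]:
    """Yield consecutive slices of at most n items from a list."""
    for start in range(0, len(items), n):
        yield items[start:start + n]


def batch_generator_sketch(iterable: Iterable, n: int = 1) -> Iterator[List]:
    """Same idea for any iterable, without needing len() or slicing."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


list_batches = list(batch_sketch([1, 2, 3, 4, 5], n=2))
gen_batches = list(batch_generator_sketch((i for i in range(5)), n=2))
print(list_batches, gen_batches)
```

The generator variant consumes its input lazily, which matters when the source is a stream that cannot be indexed.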
- histox.util.bin_values_to_slide_grid(locations: ndarray, values: ndarray, wsi: WSI, background: str = 'min') ndarray[source]
Bin heatmap values to a slide grid, using tile location information.
- Parameters:
locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.
values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.
wsi (histox.wsi.WSI) – WSI object.
- Keyword Arguments:
background (str, optional) – Background strategy for heatmap. Can be ‘min’, ‘mean’, ‘median’, ‘max’, or ‘mask’. Defaults to ‘min’.
- histox.util.choice_input(prompt, valid_choices, default=None, multi_choice=False, input_type=<class 'str'>)[source]
Prompts user for multi-choice input.
- histox.util.create_triangles(vertices, hole_vertices=None, hole_points=None)[source]
Tessellate a complex polygon, possibly with holes.
- Parameters:
vertices – A list of vertices [(x1, y1), (x2, y2), …] defining the polygon boundary.
holes – An optional list of points [(hx1, hy1), (hx2, hy2), …] inside each hole in the polygon.
- Returns:
A numpy array of vertices for the tessellated triangles.
- histox.util.download_from_tcga(uuid: str, dest: str, message: str = 'Downloading...') None[source]
Download a file from TCGA (GDC) by UUID.
- histox.util.get_ensemble_model_config(model_path: str) Dict[source]
Loads ensemble model configuration JSON file.
- histox.util.get_gan_config(model_path: str) Dict[source]
Loads a GAN training_options.json for an associated network PKL.
- histox.util.get_model_normalizer(model_path: str) StainNormalizer | None[source]
Loads and fits normalizer using configuration at a model path.
- histox.util.get_preprocess_fn(model_path: str)[source]
Returns a function which preprocesses a uint8 image for a model.
- Parameters:
model_path (str) – Path to a saved Slideflow model.
- Returns:
A function which accepts a single image or batch of uint8 images, and returns preprocessed (and stain normalized) float32 images.
- histox.util.get_relative_tfrecord_paths(root: str, directory: str = '') List[str][source]
Returns relative tfrecord paths with respect to the given directory.
- histox.util.get_slide_paths(slides_dir: str) List[str][source]
Get all slide paths from a given directory containing slides.
- histox.util.get_slides_from_model_manifest(model_path: str, dataset: str | None = None) List[str][source]
Get list of slides from a model manifest.
- Parameters:
model_path (str) – Path to model from which to load the model manifest.
dataset (str) – ‘training’ or ‘validation’. Will return only slides from this dataset. Defaults to None (all).
- Returns:
List of slide names.
- Return type:
list(str)
- histox.util.get_valid_model_dir(root: str) List[source]
Returns the path of the first nested directory within root. This only works when the nested folder name starts with a 5-digit number, like “00000%”.
- Examples
If the root contains three directories:
root/00000-foldername/
root/00001-foldername/
root/00002-foldername/
The function returns “root/00000-foldername/”
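The lookup described above can be sketched with the standard library. The helper name `first_numbered_subdir` is hypothetical, and two details are assumptions: "first" is taken to mean first in sorted order, and the sketch returns a single path (or None) rather than the List shown in the signature.

```python
import os
import re
import tempfile
from typing import Optional


def first_numbered_subdir(root: str) -> Optional[str]:
    """Return the first subdirectory of root whose name starts with 5 digits."""
    candidates = sorted(
        name for name in os.listdir(root)
        if re.match(r"\d{5}", name) and os.path.isdir(os.path.join(root, name))
    )
    return os.path.join(root, candidates[0]) if candidates else None


# Build a throwaway directory tree to exercise the helper.
with tempfile.TemporaryDirectory() as root:
    for name in ("00001-foldername", "00000-foldername", "notes"):
        os.mkdir(os.path.join(root, name))
    found = os.path.basename(first_numbered_subdir(root))

print(found)
```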
- histox.util.global_path(root: str, path_string: str)[source]
Returns global path from a local path.
- histox.util.infer_stride(locations, wsi)[source]
Infer the stride of a grid of locations from a set of locations.
- Parameters:
locations (np.ndarray) – Nx2 array of locations
wsi (histox.wsi.WSI) – WSI object
- Returns:
inferred stride divisor in pixels
- Return type:
float
- histox.util.is_project(path: str) bool[source]
Checks if the given path is a valid Slideflow project.
- histox.util.is_simclr_model_path(path: Any) bool[source]
Checks if the given path is a valid SimCLR model or checkpoint.
- histox.util.is_tensorflow_model_path(path: str) bool[source]
Checks if the given path is a valid Slideflow/Tensorflow model.
- histox.util.is_tile_size_compatible(tile_px1: int, tile_um1: str | int, tile_px2: int, tile_um2: str | int) bool[source]
Check whether tile sizes are compatible.
- Compatibility is defined as:
Equal size in pixels
If tile width (tile_um) is defined in microns (int) for both, these must be equal
If tile width (tile_um) is defined as a magnification (str) for both, these must be equal
If one is defined in microns and the other as a magnification, the magnification calculated from the micron width must be within +/- 2 of the stated magnification.
Example 1:
- tile_px1=299, tile_um1=302
- tile_px2=299, tile_um2=304
- Incompatible (unequal micron width)
Example 2:
- tile_px1=299, tile_um1=10x
- tile_px2=299, tile_um2=9x
- Incompatible (unequal magnification)
Example 3:
- tile_px1=299, tile_um1=302
- tile_px2=299, tile_um2=10x
- Compatible (first has an equivalent magnification of 9.9x, which is within +/- 2 of 10x)
- Parameters:
tile_px1 (int) – Tile size (in pixels) of first slide.
tile_um1 (int or str) – Tile size (in microns) of first slide. Can also be expressed as a magnification level, e.g. '10x'
tile_px2 (int) – Tile size (in pixels) of second slide.
tile_um2 (int or str) – Tile size (in microns) of second slide. Can also be expressed as a magnification level, e.g. '10x'
- Returns:
Whether the tile sizes are compatible.
- Return type:
bool
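The four compatibility rules listed above can be sketched as a standalone function. This is not the library's code: the function names are hypothetical, and the magnification conversion (10x at roughly 1 micron per pixel) is an assumption chosen because it reproduces the 9.9x figure in Example 3.

```python
from typing import Union

TileUm = Union[int, str]


def approx_magnification(tile_px: int, tile_um: int) -> float:
    """Assumed conversion: 10x corresponds to roughly 1 micron per pixel."""
    return 10 / (tile_um / tile_px)


def tile_sizes_compatible(px1: int, um1: TileUm, px2: int, um2: TileUm) -> bool:
    """Apply the four compatibility rules from the description."""
    if px1 != px2:
        return False
    if isinstance(um1, str) == isinstance(um2, str):
        # Both microns or both magnification: the values must match exactly.
        return um1 == um2
    # Mixed case: compare the calculated magnification with the stated one.
    if isinstance(um1, str):
        stated, calc = float(um1.rstrip("x")), approx_magnification(px2, um2)
    else:
        stated, calc = float(um2.rstrip("x")), approx_magnification(px1, um1)
    return abs(calc - stated) <= 2


print(tile_sizes_compatible(299, 302, 299, 304))    # Example 1
print(tile_sizes_compatible(299, "10x", 299, "9x"))  # Example 2
print(tile_sizes_compatible(299, 302, 299, "10x"))   # Example 3
```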
- histox.util.is_torch_model_path(path: str) bool[source]
Checks if the given path is a valid Slideflow/PyTorch model.
- histox.util.is_uq_model(model_path: str) bool[source]
Checks if the given model path points to a UQ-enabled model.
- histox.util.isnumeric(val: Any) bool[source]
Check if the given value is numeric (numpy or python).
Tensors will return False.
Specifically checks if the value is a python int or float, or if the value is a numpy array with a numeric dtype (int or float).
- histox.util.load_predictions(path: str, **kwargs) DataFrame[source]
Loads a ‘csv’, ‘parquet’ or ‘feather’ file to a pandas dataframe.
- Parameters:
path (str) – Path to the file to be read.
- Returns:
The dataframe read from the path.
- Return type:
df (pd.DataFrame)
- histox.util.location_heatmap(locations: ndarray, values: ndarray, slide: str, tile_px: int, tile_um: int | str, filename: str, *, interpolation: str | None = 'bicubic', cmap: str = 'inferno', norm: str | None = None, background: str = 'min') None[source]
Generate a heatmap for a slide.
- Parameters:
locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.
values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.
slide (str) – Path to corresponding slide.
tile_px (int) – Tile pixel size.
tile_um (int, str) – Tile micron or magnification size.
filename (str) – Destination filename for heatmap.
- Keyword Arguments:
interpolation (str, optional) – Interpolation strategy for smoothing heatmap. Defaults to ‘bicubic’.
cmap (str, optional) – Matplotlib colormap for heatmap. Can be any valid matplotlib colormap. Defaults to ‘inferno’.
norm (str, optional) – Normalization strategy for assigning heatmap values to colors. Either ‘two_slope’, or any other valid value for the norm argument of matplotlib.pyplot.imshow. If ‘two_slope’, normalizes values less than 0 and greater than 0 separately. Defaults to None.
- histox.util.log_manifest(train_tfrecords: List[str] | None = None, val_tfrecords: List[str] | None = None, *, labels: Dict[str, Any] | None = None, filename: str | None = None, remove_extension: bool = True) str[source]
Saves the training manifest in CSV format and returns as a string.
- Parameters:
train_tfrecords (list(str), optional) – List of training TFRecords. Defaults to None.
val_tfrecords (list(str), optional) – List of validation TFRecords. Defaults to None.
- Keyword Arguments:
labels (dict, optional) – TFRecord outcome labels. Defaults to None.
filename (str, optional) – Path to CSV file to save. Defaults to None.
remove_extension (bool, optional) – Remove file extension from slide names. Defaults to True.
- Returns:
Saved manifest in str format.
- Return type:
str
- histox.util.make_dir(_dir: str) None[source]
Makes a directory if one does not already exist, in a manner compatible with multithreading.
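One common way to get this behavior is `os.makedirs` with `exist_ok=True`, which avoids the race between checking for a directory and creating it. The sketch below assumes that approach; the function name is hypothetical and the library may implement this differently.

```python
import os
import tempfile


def make_dir_sketch(path: str) -> None:
    """Create path if missing; exist_ok avoids a check-then-create race."""
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "tiles", "slide1")
    make_dir_sketch(target)
    make_dir_sketch(target)  # Second call is a no-op rather than an error.
    created = os.path.isdir(target)

print(created)
```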
- histox.util.map_values_to_slide_grid(locations: ndarray, values: ndarray, wsi: WSI, background: str = 'min', *, interpolation: str | None = 'bicubic') ndarray[source]
Map heatmap values to a slide grid, using tile location information.
- Parameters:
locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.
values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.
wsi (histox.wsi.WSI) – WSI object.
- Keyword Arguments:
background (str, optional) – Background strategy for heatmap. Can be ‘min’, ‘mean’, ‘median’, ‘max’, or ‘mask’. Defaults to ‘min’.
interpolation (str, optional) – Interpolation strategy for smoothing heatmap. Defaults to ‘bicubic’.
- histox.util.multi_warn(arr: List, compare: Callable, msg: Callable | str) int[source]
Logs multiple warnings.
- Parameters:
arr (List) – Array to compare.
compare (Callable) – Comparison to perform on array. If True, will warn.
msg (Callable or str) – Warning message.
- Returns:
Number of warnings.
- Return type:
int
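A minimal sketch of this pattern follows. It assumes compare is applied to each item in turn and that a callable msg receives the offending item; neither detail is confirmed by the description above, and the function and logger names are hypothetical.

```python
import logging
from typing import Any, Callable, List, Union

log = logging.getLogger("multi_warn_sketch")  # hypothetical logger name


def multi_warn_sketch(
    arr: List[Any],
    compare: Callable[[Any], bool],
    msg: Union[Callable[[Any], str], str],
) -> int:
    """Warn once per item for which compare(item) is true; return the count."""
    num_warned = 0
    for item in arr:
        if compare(item):
            # Assumption: a callable msg builds its text from the item.
            log.warning(msg(item) if callable(msg) else msg)
            num_warned += 1
    return num_warned


count = multi_warn_sketch(
    ["a.svs", "b.txt", "c.svs"],
    compare=lambda name: not name.endswith(".svs"),
    msg=lambda name: f"Unexpected file type: {name}",
)
print(count)
```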
- histox.util.path_input(prompt: str, root: str, default: str | None = None, create_on_invalid: bool = False, filetype: str | None = None, verify: bool = True) str[source]
Prompts user for directory input.
- histox.util.path_to_name(path: str) str[source]
Returns name of a file, without extension, from a given full path string.
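The equivalent standard-library operation can be sketched as below. The function name is hypothetical, and the sketch assumes only the final extension is removed, so a name such as `archive.tar.gz` would become `archive.tar`.

```python
import os


def path_to_name_sketch(path: str) -> str:
    """Drop the directory and the final extension from a path."""
    return os.path.splitext(os.path.basename(path))[0]


name = path_to_name_sketch("/data/slides/patient1.svs")
print(name)
```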
- histox.util.read_annotations(path: str) Tuple[List[str], List[Dict]][source]
Read an annotations file.
- histox.util.relative_path(path: str, root: str)[source]
Returns a relative path, from a given root directory.
- histox.util.setLoggingLevel(level)[source]
Set the logging level.
Uses standard python logging levels:
50: CRITICAL
40: ERROR
30: WARNING
20: INFO
10: DEBUG
0: NOTSET
- Parameters:
level (int) – Logging level numeric value.
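The numeric values above are the constants defined by Python's logging module, so either form can be used. The sketch below demonstrates them on a standalone logger; the logger name `histox_example` is hypothetical and is not the logger that setLoggingLevel configures.

```python
import logging

# Hypothetical logger name, used only to demonstrate the standard levels.
logger = logging.getLogger("histox_example")

logger.setLevel(logging.INFO)  # equivalent to logger.setLevel(20)

info_enabled = logger.isEnabledFor(logging.INFO)    # INFO (20) passes
debug_enabled = logger.isEnabledFor(logging.DEBUG)  # DEBUG (10) is filtered out
level_name = logging.getLevelName(20)
print(info_enabled, debug_enabled, level_name)
```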
- histox.util.split_list(a: List, n: int) List[List][source]
Splits a list into n components.
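One possible splitting strategy is sketched below: n contiguous parts whose lengths differ by at most one. How the library actually distributes items (contiguous or interleaved, and where any remainder goes) is not stated, so this is an assumption, and the function name is hypothetical.

```python
from typing import List


def split_list_sketch(a: List, n: int) -> List[List]:
    """Split a into n contiguous parts whose lengths differ by at most one."""
    size, extra = divmod(len(a), n)
    parts, start = [], 0
    for i in range(n):
        # The first `extra` parts each take one additional item.
        end = start + size + (1 if i < extra else 0)
        parts.append(a[start:end])
        start = end
    return parts


parts = split_list_sketch([1, 2, 3, 4, 5, 6, 7], 3)
print(parts)
```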
- histox.util.tfrecord_heatmap(tfrecord: str, slide: str, tile_px: int, tile_um: int | str, tile_dict: Dict[int, float], filename: str, **kwargs) None[source]
Creates a tfrecord-based WSI heatmap using a dictionary of tile values for heatmap display.
- Parameters:
tfrecord (str) – Path to tfrecord.
slide (str) – Path to whole-slide image.
tile_dict (dict) – Dictionary mapping tfrecord indices to a tile-level value for display in heatmap format.
tile_px (int) – Tile width in pixels.
tile_um (int or str) – Tile width in microns (int) or magnification (str, e.g. “20x”).
filename (str) – Destination filename for heatmap.
- histox.util.tile_size_label(tile_px: int, tile_um: str | int) str[source]
Return the string label of the given tile size.
- histox.util.to_onehot(val: int, max: int) ndarray[source]
Converts value to one-hot encoding
- Parameters:
val (int) – Value to encode
max (int) – Maximum value (length of onehot encoding)
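The encoding can be sketched without dependencies using a plain list. The library returns an ndarray, so this is only an illustration of the idea; the function name and the `length` parameter (standing in for `max`) are hypothetical, and the range check is an assumption.

```python
from typing import List


def to_onehot_sketch(val: int, length: int) -> List[int]:
    """Return a list of `length` zeros with a one at index val."""
    # Assumption: out-of-range values are rejected rather than wrapped.
    if not 0 <= val < length:
        raise ValueError(f"val must be in [0, {length}), got {val}")
    onehot = [0] * length
    onehot[val] = 1
    return onehot


encoded = to_onehot_sketch(2, 5)
print(encoded)
```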
- histox.util.update_results_log(results_log_path: str, model_name: str, results_dict: Dict) None[source]
Dynamically update results_log when recording training metrics.