Data#
Generic Interfaces#
Dataset#
- class monai.data.Dataset(data, transform=None)[source]#
A generic dataset with a length property and an optional callable data transform when fetching a data sample. If passing slicing indices, will return a PyTorch Subset, for example: data: Subset = dataset[1:4], for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
For example, typical input data can be a list of dictionaries:
```
[{'img': 'image1.nii.gz', 'seg': 'label1.nii.gz', 'extra': 123},
 {'img': 'image2.nii.gz', 'seg': 'label2.nii.gz', 'extra': 456},
 {'img': 'image3.nii.gz', 'seg': 'label3.nii.gz', 'extra': 789}]
```
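As an illustration, a minimal usage sketch (the NIfTI paths are hypothetical placeholders from the example above):

```python
# Build a Dataset over the dictionary items above; slicing yields a
# torch.utils.data.Subset, and the transform runs lazily at fetch time.
from monai.data import Dataset
from monai.transforms import LoadImaged

data = [
    {'img': 'image1.nii.gz', 'seg': 'label1.nii.gz', 'extra': 123},
    {'img': 'image2.nii.gz', 'seg': 'label2.nii.gz', 'extra': 456},
    {'img': 'image3.nii.gz', 'seg': 'label3.nii.gz', 'extra': 789},
]
dataset = Dataset(data=data, transform=LoadImaged(keys=['img', 'seg']))
subset = dataset[1:3]  # a torch.utils.data.Subset with two items
item = dataset[0]      # LoadImaged is applied when the item is fetched
```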
- __getitem__(index)[source]#
Returns a Subset if index is a slice or Sequence, a data item otherwise.
- __init__(data, transform=None)[source]#
- Parameters:
data (Sequence) – input data to load and transform to generate dataset for model.
transform (UnionType[Sequence[Callable], Callable, None]) – a callable, a sequence of callables, or None. If transform is not a Compose instance, it will be wrapped in a Compose instance. Sequences of callables are applied in order, and if None is passed, the data is returned as is.
IterableDataset#
- class monai.data.IterableDataset(data, transform=None)[source]#
A generic dataset for iterable data sources with an optional callable data transform applied when fetching a data sample. Inherits from PyTorch IterableDataset: https://pytorch.org/docs/stable/data.html?highlight=iterabledataset#torch.utils.data.IterableDataset. For example, typical input data can be a web data stream which supports multi-process access.
To accelerate loading, it supports multi-processing through PyTorch DataLoader workers: every worker process executes transforms on its share of the loaded data. Note that the order of the output data may not match the data source in multi-processing mode, and each worker process holds a different copy of the dataset object, so the data source (or the DataLoader) must be process-safe.
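As a minimal sketch, a plain Python generator can stand in for a web data stream (the doubling transform is only illustrative):

```python
from monai.data import DataLoader, IterableDataset

def data_stream():
    # a stand-in for a real stream source, e.g. web data
    for i in range(8):
        yield {"value": i}

ds = IterableDataset(data=data_stream(), transform=lambda item: {"value": item["value"] * 2})
for batch in DataLoader(ds, batch_size=2):
    print(batch["value"])
```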
DatasetFunc#
- class monai.data.DatasetFunc(data, func, **kwargs)[source]#
Execute a function on the input dataset and use the output as a new Dataset. It can be used to load or fetch basic dataset items, such as a list of image and label paths, or chained together to execute more complicated logic, such as partition_dataset, resample_datalist, etc. The data arg of Dataset will be applied to the first arg of the callable func. Usage example:
```python
data_list = DatasetFunc(
    data="path to file",
    func=monai.data.load_decathlon_datalist,
    data_list_key="validation",
    base_dir="path to base dir",
)
# partition dataset for every rank
data_partition = DatasetFunc(
    data=data_list,
    func=lambda **kwargs: monai.data.partition_dataset(**kwargs)[torch.distributed.get_rank()],
    num_partitions=torch.distributed.get_world_size(),
)
dataset = Dataset(data=data_partition, transform=transforms)
```
- Parameters:
data (Any) – input data for the func to process; applied to func as the first arg.
func (Callable) – callable function to generate dataset items.
kwargs – other arguments for the func except for the first arg.
- reset(data=None, func=None, **kwargs)[source]#
Reset the dataset items with specified func.
- Parameters:
data (UnionType[Any, None]) – if not None, execute func on it; defaults to self.src.
func (UnionType[Callable, None]) – if not None, execute this func with the specified kwargs; defaults to self.func.
kwargs – other arguments for the func except for the first arg.
ShuffleBuffer#
- class monai.data.ShuffleBuffer(data, transform=None, buffer_size=512, seed=0, epochs=1)[source]#
Extends the IterableDataset with a buffer from which items are randomly popped.
- Parameters:
data – input data source to load and transform to generate dataset for model.
transform – a callable data transform on input data.
buffer_size (int) – size of the buffer in which to store items before randomly popping them, default to 512.
seed (int) – random seed to initialize the random state of all workers; seed += 1 is set in every iter() call, following the PyTorch idea: pytorch/pytorch.
epochs (int) – number of epochs to iterate over the dataset, default to 1; -1 means infinite epochs.
Note
Both monai.data.DataLoader and torch.utils.data.DataLoader do not seed this class (as a subclass of IterableDataset) at run time. The persistent_workers=True flag (and pytorch>1.8) is therefore required for multiple epochs of loading when num_workers>0. For example:

```python
import monai

def run():
    dss = monai.data.ShuffleBuffer([1, 2, 3, 4], buffer_size=30, seed=42)

    dataloader = monai.data.DataLoader(
        dss, batch_size=1, num_workers=2, persistent_workers=True)
    for epoch in range(3):
        for item in dataloader:
            print(f"epoch: {epoch} item: {item}.")

if __name__ == '__main__':
    run()
```
- randomize(size)[source]#
Within this method, self.R should be used, instead of np.random, to introduce random factors. All self.R calls happen here so that we have a better chance of identifying errors in synchronizing the random state. This method can generate the random factors based on properties of the input data.
- Raises:
NotImplementedError – When the subclass does not override this method.
- Return type:
None
CSVIterableDataset#
- class monai.data.CSVIterableDataset(src, chunksize=1000, buffer_size=None, col_names=None, col_types=None, col_groups=None, transform=None, shuffle=False, seed=0, kwargs_read_csv=None, **kwargs)[source]#
Iterable dataset to load CSV files and generate dictionary data. It is particularly useful when data come from a stream; it inherits from PyTorch IterableDataset: https://pytorch.org/docs/stable/data.html?highlight=iterabledataset#torch.utils.data.IterableDataset.
It can also be helpful when loading extremely big CSV files that can't be read into memory directly: just treat the big CSV file as a stream input and call reset() of CSVIterableDataset for every epoch. Note that, as a stream input, the length of the dataset is not available.
To effectively shuffle the data in the big dataset, users can set a big buffer to continuously store the loaded data, then randomly pick data from the buffer for following tasks.
To accelerate loading, it supports multi-processing through PyTorch DataLoader workers: every worker process executes transforms on its share of the loaded data. Note: the order of the output data may not match the data source in multi-processing mode.
It can load data from multiple CSV files and join the tables with the additional kwargs arg. It supports loading only specific columns, and it can also group several loaded columns to generate a new column; for example, set col_groups={"meta": ["meta_0", "meta_1", "meta_2"]}, and the output can be:
[ {"image": "./image0.nii", "meta_0": 11, "meta_1": 12, "meta_2": 13, "meta": [11, 12, 13]}, {"image": "./image1.nii", "meta_0": 21, "meta_1": 22, "meta_2": 23, "meta": [21, 22, 23]}, ]
- Parameters:
src (UnionType[str, Sequence[str], Iterable, Sequence[Iterable]]) – if provided the filename of a CSV file, it can be a str, URL, path object or file-like object to load. An iterable for stream input is also supported directly, skipping loading from a filename. If a list of filenames or iterables is provided, the tables will be joined.
chunksize (int) – rows of a chunk when loading iterable data from CSV files, default to 1000. More details: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html.
buffer_size (UnionType[int, None]) – size of the buffer to store the loaded chunks; if None, set to 2 x chunksize.
col_names (UnionType[Sequence[str], None]) – names of the expected columns to load. If None, load all the columns.
col_types (UnionType[dict[str, UnionType[dict[str, Any], None]], None]) – type and default value to convert the loaded columns; if None, use the original data. It should be a dictionary where every item maps to an expected column: the key is the column name and the value is None or a dictionary defining the default value and data type. The supported keys in the dictionary are ["type", "default"]. For example:

```python
col_types = {
    "subject_id": {"type": str},
    "label": {"type": int, "default": 0},
    "ehr_0": {"type": float, "default": 0.0},
    "ehr_1": {"type": float, "default": 0.0},
    "image": {"type": str, "default": None},
}
```

col_groups (UnionType[dict[str, Sequence[str]], None]) – args to group the loaded columns to generate a new column. It should be a dictionary where every item maps to a group: the key will be the new column name and the value is the names of the columns to combine. For example: col_groups={"ehr": [f"ehr_{i}" for i in range(10)], "meta": ["meta_1", "meta_2"]}.
transform (UnionType[Callable, None]) – transform to apply on the loaded items of a dictionary data.
shuffle (bool) – whether to shuffle all the data in the buffer every time a new chunk is loaded.
seed (int) – random seed to initialize the random state for all the workers if shuffle is True; seed += 1 is set in every iter() call, following the PyTorch idea: pytorch/pytorch.
kwargs_read_csv (UnionType[dict, None]) – dictionary args to pass to the pandas read_csv function. Default to {"chunksize": chunksize}.
kwargs – additional arguments for the pandas.merge() API to join tables.
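A minimal usage sketch, assuming a file "data.csv" exists with the columns shown in the example above:

```python
from monai.data import CSVIterableDataset, DataLoader

ds = CSVIterableDataset(
    src="data.csv",
    chunksize=100,
    col_groups={"meta": ["meta_0", "meta_1", "meta_2"]},
    shuffle=True,
    seed=0,
)
for item in DataLoader(ds, batch_size=2):
    print(item["image"], item["meta"])
ds.close()  # release the internally created pandas TextFileReader
```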
- close()[source]#
Close the pandas TextFileReader iterable objects. If the input src is a file path, the TextFileReader was created internally and needs to be closed here. If the input src is an iterable object, it is up to the user's requirements whether to close it in this function. For more details, please check: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html?#iteration.
- reset(src=None)[source]#
Reset the pandas TextFileReader iterable object to read data. For more details, please check: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html?#iteration.
- Parameters:
src (UnionType[str, Sequence[str], Iterable, Sequence[Iterable], None]) – if not None and the filename of a CSV file is provided, it can be a str, URL, path object or file-like object to load. An iterable for stream input is also supported directly, skipping loading from a filename. If a list of filenames or iterables is provided, the tables will be joined. Defaults to self.src.
PersistentDataset#
- class monai.data.PersistentDataset(data, transform, cache_dir, hash_func=<function pickle_hashing>, pickle_module='pickle', pickle_protocol=2, hash_transform=None, reset_ops_id=True)[source]#
Persistent storage of pre-computed values, used to efficiently manage larger-than-memory dictionary-format data; transforms can be operated on specific fields. Results from the non-random transform components are computed when first used, and stored in the cache_dir for rapid retrieval on subsequent uses. If passing slicing indices, will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
The transforms which are supposed to be cached must implement the monai.transforms.Transform interface and should not be Randomizable. This dataset will cache the outcomes before the first Randomizable Transform within a Compose instance.
For example, typical input data can be a list of dictionaries:
```
[{'image': 'image1.nii.gz', 'label': 'label1.nii.gz', 'extra': 123},
 {'image': 'image2.nii.gz', 'label': 'label2.nii.gz', 'extra': 456},
 {'image': 'image3.nii.gz', 'label': 'label3.nii.gz', 'extra': 789}]
```
For a composite transform like
```python
[
    LoadImaged(keys=['image', 'label']),
    Orientationd(keys=['image', 'label'], axcodes='RAS'),
    ScaleIntensityRanged(keys=['image'], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
    RandCropByPosNegLabeld(keys=['image', 'label'], label_key='label',
                           spatial_size=(96, 96, 96), pos=1, neg=1, num_samples=4,
                           image_key='image', image_threshold=0),
    ToTensord(keys=['image', 'label']),
]
```
Upon first use, a filename-based dataset will be processed by the transforms [LoadImaged, Orientationd, ScaleIntensityRanged] and the resulting tensor written to cache_dir, before the remaining random-dependent transforms [RandCropByPosNegLabeld, ToTensord] are applied for use in the analysis.
Subsequent uses of the dataset directly read the pre-processed results from cache_dir, followed by applying the random-dependent parts of the transform processing.
During training call set_data() to update input data and recompute cache content.
Note
The input data must be a list of file paths, which are hashed as cache keys.
The filenames of the cached files also try to contain the hash of the transforms. In this fashion, PersistentDataset should be robust to changes in transforms. This, however, is not guaranteed, so caution should be used when modifying transforms to avoid unexpected errors. If in doubt, it is advisable to clear the cache directory.
Cached data is expected to be tensors, primitives, or dictionaries keying to these values. Numpy arrays will be converted to tensors, however any other object type returned by transforms will not be loadable since torch.load will be used with weights_only=True to prevent loading of potentially malicious objects. Legacy cache files may not be loadable and may need to be recomputed.
- Lazy Resampling:
If you make use of the lazy resampling feature of monai.transforms.Compose, please refer to its documentation to familiarize yourself with the interaction between PersistentDataset and lazy resampling.
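A minimal usage sketch, assuming the file paths exist: the two deterministic transforms are cached to cache_dir on first use, while the random flip is re-executed on every fetch.

```python
from monai.data import PersistentDataset
from monai.transforms import Compose, LoadImaged, Orientationd, RandFlipd

transforms = Compose([
    LoadImaged(keys=["image", "label"]),                   # cached
    Orientationd(keys=["image", "label"], axcodes="RAS"),  # cached
    RandFlipd(keys=["image", "label"], prob=0.5),          # random: not cached
])
data = [{"image": "image1.nii.gz", "label": "label1.nii.gz"}]
ds = PersistentDataset(data=data, transform=transforms, cache_dir="./persistent_cache")
```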
- __init__(data, transform, cache_dir, hash_func=<function pickle_hashing>, pickle_module='pickle', pickle_protocol=2, hash_transform=None, reset_ops_id=True)[source]#
- Parameters:
data (Sequence) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects the input data to be a list of serializable items and hashes them as cache keys using hash_func.
transform (UnionType[Sequence[Callable], Callable]) – transforms to execute operations on input data.
cache_dir (UnionType[Path, str, None]) – if specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once and persists on disk until explicitly removed. Different runs, programs and experiments may share a common cache dir, provided that the transform pre-processing is consistent. If cache_dir doesn't exist, it will be created automatically. If cache_dir is None, there is effectively no caching.
hash_func (Callable[..., bytes]) – a callable to compute hash from data items to be cached. Defaults to monai.data.utils.pickle_hashing.
pickle_module (str) – string representing the module used for pickling metadata and objects, default to "pickle". Due to the pickle limitation in multi-processing of DataLoader, we can't use a pickle module as the arg directly, so here we use a string name instead. To use another pickle module at runtime, register it like: >>> from monai.data import utils >>> utils.SUPPORTED_PICKLE_MOD["test"] = other_pickle. This arg is used by torch.save; for more details, please check: https://pytorch.org/docs/stable/generated/torch.save.html#torch.save and monai.data.utils.SUPPORTED_PICKLE_MOD.
pickle_protocol (int) – specifies the pickle protocol when saving with torch.save. Defaults to torch.serialization.DEFAULT_PROTOCOL. For more details, please check: https://pytorch.org/docs/stable/generated/torch.save.html#torch.save.
hash_transform (UnionType[Callable[..., bytes], None]) – a callable to compute hash from the transform information when caching. This may reduce errors due to transforms changing during experiments. Default to None (no hash). Other options are the pickle_hashing and json_hashing functions from monai.data.utils.
reset_ops_id (bool) – whether to set TraceKeys.ID to TraceKeys.NONE, defaults to True. When this is enabled, the traced transform instance IDs will be removed from the cached MetaTensors. This is useful for skipping the transform instance checks when inverting applied operations using the cached content and with re-created transform instances.
GDSDataset#
- class monai.data.GDSDataset(data, transform, cache_dir, device, hash_func=<function pickle_hashing>, hash_transform=None, reset_ops_id=True, **kwargs)[source]#
An extension of the PersistentDataset using a direct memory access (DMA) data path between GPU memory and storage, thus avoiding a bounce buffer through the CPU. This direct path can increase system bandwidth while decreasing latency and utilization load on the CPU and GPU.
A tutorial is available: Project-MONAI/tutorials.
See also: rapidsai/kvikio
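A minimal usage sketch, assuming CuPy/kvikio are installed, a GPU is available, and the file path exists (device=0 selects the first GPU):

```python
from monai.data import GDSDataset
from monai.transforms import Compose, EnsureChannelFirstd, LoadImaged

transforms = Compose([
    LoadImaged(keys=["image"]),
    EnsureChannelFirstd(keys=["image"]),
])
ds = GDSDataset(
    data=[{"image": "image1.nii.gz"}],
    transform=transforms,
    cache_dir="./gds_cache",
    device=0,  # cached tensors are placed on this GPU
)
```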
- __init__(data, transform, cache_dir, device, hash_func=<function pickle_hashing>, hash_transform=None, reset_ops_id=True, **kwargs)[source]#
- Parameters:
data (Sequence) – input data file paths to load and transform to generate dataset for model. GDSDataset expects the input data to be a list of serializable items and hashes them as cache keys using hash_func.
transform (UnionType[Sequence[Callable], Callable]) – transforms to execute operations on input data.
cache_dir (UnionType[Path, str, None]) – if specified, this is the location for GPU direct storage of pre-computed transformed data tensors. The cache_dir is computed once and persists on disk until explicitly removed. Different runs, programs and experiments may share a common cache dir, provided that the transform pre-processing is consistent. If cache_dir doesn't exist, it will be created automatically. If cache_dir is None, there is effectively no caching.
device (int) – target device to put the output Tensor data. Note that only an int can be used to specify the GPU to be used.
hash_func (Callable[..., bytes]) – a callable to compute hash from data items to be cached. Defaults to monai.data.utils.pickle_hashing.
hash_transform (UnionType[Callable[..., bytes], None]) – a callable to compute hash from the transform information when caching. This may reduce errors due to transforms changing during experiments. Default to None (no hash). Other options are the pickle_hashing and json_hashing functions from monai.data.utils.
reset_ops_id (bool) – whether to set TraceKeys.ID to TraceKeys.NONE, defaults to True. When this is enabled, the traced transform instance IDs will be removed from the cached MetaTensors. This is useful for skipping the transform instance checks when inverting applied operations using the cached content and with re-created transform instances.
CacheNTransDataset#
- class monai.data.CacheNTransDataset(data, transform, cache_n_trans, cache_dir, hash_func=<function pickle_hashing>, pickle_module='pickle', pickle_protocol=2, hash_transform=None, reset_ops_id=True)[source]#
Extension of PersistentDataset that can also cache the result of the first N transforms, whether they are random or not.
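A minimal usage sketch, assuming the file path exists: the first two transforms, including the random flip, are cached; only the remaining transforms run on every fetch.

```python
from monai.data import CacheNTransDataset
from monai.transforms import Compose, LoadImaged, RandFlipd, ScaleIntensityd

transforms = Compose([
    LoadImaged(keys=["image"]),           # cached
    RandFlipd(keys=["image"], prob=0.5),  # cached, although random
    ScaleIntensityd(keys=["image"]),      # re-run on every fetch
])
ds = CacheNTransDataset(
    data=[{"image": "image1.nii.gz"}],
    transform=transforms,
    cache_n_trans=2,
    cache_dir="./cache_n_trans",
)
```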
- __init__(data, transform, cache_n_trans, cache_dir, hash_func=<function pickle_hashing>, pickle_module='pickle', pickle_protocol=2, hash_transform=None, reset_ops_id=True)[source]#
- Parameters:
data (Sequence) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects the input data to be a list of serializable items and hashes them as cache keys using hash_func.
transform (UnionType[Sequence[Callable], Callable]) – transforms to execute operations on input data.
cache_n_trans (int) – cache the result of the first N transforms.
cache_dir (UnionType[Path, str, None]) – if specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once and persists on disk until explicitly removed. Different runs, programs and experiments may share a common cache dir, provided that the transform pre-processing is consistent. If cache_dir doesn't exist, it will be created automatically. If cache_dir is None, there is effectively no caching.
hash_func (Callable[..., bytes]) – a callable to compute hash from data items to be cached. Defaults to monai.data.utils.pickle_hashing.
pickle_module (str) – string representing the module used for pickling metadata and objects, default to "pickle". Due to the pickle limitation in multi-processing of DataLoader, we can't use a pickle module as the arg directly, so here we use a string name instead. To use another pickle module at runtime, register it like: >>> from monai.data import utils >>> utils.SUPPORTED_PICKLE_MOD["test"] = other_pickle. This arg is used by torch.save; for more details, please check: https://pytorch.org/docs/stable/generated/torch.save.html#torch.save and monai.data.utils.SUPPORTED_PICKLE_MOD.
pickle_protocol (int) – specifies the pickle protocol when saving with torch.save. Defaults to torch.serialization.DEFAULT_PROTOCOL. For more details, please check: https://pytorch.org/docs/stable/generated/torch.save.html#torch.save.
hash_transform (UnionType[Callable[..., bytes], None]) – a callable to compute hash from the transform information when caching. This may reduce errors due to transforms changing during experiments. Default to None (no hash). Other options are the pickle_hashing and json_hashing functions from monai.data.utils.
reset_ops_id (bool) – whether to set TraceKeys.ID to TraceKeys.NONE, defaults to True. When this is enabled, the traced transform instance IDs will be removed from the cached MetaTensors. This is useful for skipping the transform instance checks when inverting applied operations using the cached content and with re-created transform instances.
LMDBDataset#
- class monai.data.LMDBDataset(data, transform, cache_dir='cache', hash_func=<function pickle_hashing>, db_name='monai_cache', progress=True, pickle_protocol=2, hash_transform=None, reset_ops_id=True, lmdb_kwargs=None)[source]#
Extension of PersistentDataset using LMDB as the backend.
See also
Examples
```python
>>> items = [{"data": i} for i in range(5)]
# [{'data': 0}, {'data': 1}, {'data': 2}, {'data': 3}, {'data': 4}]
>>> lmdb_ds = monai.data.LMDBDataset(items, transform=monai.transforms.SimulateDelayd("data", delay_time=1))
>>> print(list(lmdb_ds))  # using the cached results
```
- __init__(data, transform, cache_dir='cache', hash_func=<function pickle_hashing>, db_name='monai_cache', progress=True, pickle_protocol=2, hash_transform=None, reset_ops_id=True, lmdb_kwargs=None)[source]#
- Parameters:
data (Sequence) – input data file paths to load and transform to generate dataset for model. LMDBDataset expects the input data to be a list of serializable items and hashes them as cache keys using hash_func.
transform (UnionType[Sequence[Callable], Callable]) – transforms to execute operations on input data.
cache_dir (UnionType[Path, str]) – if specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once and persists on disk until explicitly removed. Different runs, programs and experiments may share a common cache dir, provided that the transform pre-processing is consistent. If the cache_dir doesn't exist, it will be created automatically. Defaults to "./cache".
hash_func (Callable[..., bytes]) – a callable to compute hash from data items to be cached. Defaults to monai.data.utils.pickle_hashing.
db_name (str) – lmdb database file name. Defaults to "monai_cache".
progress (bool) – whether to display a progress bar.
pickle_protocol – specifies the pickle protocol when saving with torch.save. Defaults to torch.serialization.DEFAULT_PROTOCOL. For more details, please check: https://pytorch.org/docs/stable/generated/torch.save.html#torch.save.
hash_transform (UnionType[Callable[..., bytes], None]) – a callable to compute hash from the transform information when caching. This may reduce errors due to transforms changing during experiments. Default to None (no hash). Other options are the pickle_hashing and json_hashing functions from monai.data.utils.
reset_ops_id (bool) – whether to set TraceKeys.ID to TraceKeys.NONE, defaults to True. When this is enabled, the traced transform instance IDs will be removed from the cached MetaTensors. This is useful for skipping the transform instance checks when inverting applied operations using the cached content and with re-created transform instances.
lmdb_kwargs (UnionType[dict, None]) – additional keyword arguments for the lmdb environment. For more details please visit: https://lmdb.readthedocs.io/en/release/#environment-class
CacheDataset#
- class monai.data.CacheDataset(data, transform=None, cache_num=9223372036854775807, cache_rate=1.0, num_workers=1, progress=True, copy_cache=True, as_contiguous=True, hash_as_key=False, hash_func=<function pickle_hashing>, runtime_cache=False)[source]#
Dataset with cache mechanism that can load data and cache deterministic transforms’ result during training.
By caching the results of non-random preprocessing transforms, it accelerates the training data pipeline. If the requested data is not in the cache, all transforms will run normally (see also monai.data.dataset.Dataset).
Users can set the cache rate or number of items to cache. It is recommended to experiment with different cache_num or cache_rate to identify the best training speed.
The transforms which are supposed to be cached must implement the monai.transforms.Transform interface and should not be Randomizable. This dataset will cache the outcomes before the first Randomizable Transform within a Compose instance. To improve caching efficiency, always put as many non-random transforms as possible before the randomized ones when composing the chain of transforms. If passing slicing indices, will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
For example, if the transform is a Compose of:
```python
transforms = Compose([
    LoadImaged(),
    EnsureChannelFirstd(),
    Spacingd(),
    Orientationd(),
    ScaleIntensityRanged(),
    RandCropByPosNegLabeld(),
    ToTensord()
])
```
when transforms is used in a multi-epoch training pipeline, before the first training epoch this dataset will cache the results up to ScaleIntensityRanged, as all the non-random transforms LoadImaged, EnsureChannelFirstd, Spacingd, Orientationd, ScaleIntensityRanged can be cached. During training, the dataset will load the cached results and run RandCropByPosNegLabeld and ToTensord, as RandCropByPosNegLabeld is a randomized transform and its outcome is not cached.
During training, call set_data() to update the input data and recompute the cache content; note that this requires persistent_workers=False in the PyTorch DataLoader.
Note
CacheDataset executes non-random transforms and prepares the cache content in the main process before the first epoch; all the subprocesses of DataLoader then read the same cache content from the main process during training. It may take a long time to prepare the cache content, depending on the size of the expected cache data. To debug or verify the program before real training, users can set cache_rate=0.0 or cache_num=0 to temporarily skip caching.
- Lazy Resampling:
If you make use of the lazy resampling feature of monai.transforms.Compose, please refer to its documentation to familiarize yourself with the interaction between CacheDataset and lazy resampling.
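A minimal usage sketch, assuming the file paths exist: half of the items are cached at construction time using 4 worker threads.

```python
from monai.data import CacheDataset
from monai.transforms import Compose, LoadImaged, RandFlipd

transforms = Compose([
    LoadImaged(keys=["image"]),           # deterministic: cached
    RandFlipd(keys=["image"], prob=0.5),  # random: applied on top of the cache
])
ds = CacheDataset(
    data=[{"image": f"image{i}.nii.gz"} for i in range(10)],
    transform=transforms,
    cache_rate=0.5,
    num_workers=4,
)
```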
- __init__(data, transform=None, cache_num=9223372036854775807, cache_rate=1.0, num_workers=1, progress=True, copy_cache=True, as_contiguous=True, hash_as_key=False, hash_func=<function pickle_hashing>, runtime_cache=False)[source]#
- Parameters:
data (Sequence) – input data to load and transform to generate dataset for model.
transform (UnionType[Sequence[Callable], Callable, None]) – transforms to execute operations on input data.
cache_num (int) – number of items to be cached. Default is sys.maxsize. Will take the minimum of (cache_num, data_length x cache_rate, data_length).
cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). Will take the minimum of (cache_num, data_length x cache_rate, data_length).
num_workers (UnionType[int, None]) – the number of worker threads if computing cache in the initialization. If num_workers is None, the number returned by os.cpu_count() is used. If a value less than 1 is specified, 1 will be used instead.
progress (bool) – whether to display a progress bar.
copy_cache (bool) – whether to deepcopy the cache content before applying the random transforms, default to True. If the random transforms don't modify the cached content (for example, randomly cropping from the cached image and deepcopying the crop region) or if every cache item is only used once in a multi-processing environment, copy=False may be set for better performance.
as_contiguous (bool) – whether to convert the cached NumPy array or PyTorch tensor to be contiguous. It may help improve the performance of the following logic.
hash_as_key (bool) – whether to compute the hash value of the input data as the key to save cache; if the key exists, avoid saving duplicated content. It can help save memory when the dataset has duplicated items or an augmented dataset.
hash_func (Callable[..., bytes]) – if hash_as_key, a callable to compute hash from data items to be cached. Defaults to monai.data.utils.pickle_hashing.
runtime_cache (UnionType[bool, str, list, ListProxy]) – mode of cache at runtime. Default to False, which prepares the cache content for the entire data during initialization; this potentially largely increases the time between the constructor call and the first mini-batch. Three options are provided to compute the cache on the fly after dataset initialization:
1. "threads" or True: use a regular list to store the cache items.
2. "processes": use a ListProxy to store the cache items; it can be shared among processes.
3. A list-like object: a user-provided container to store the cache items.
For thread-based caching (typically for caching cuda tensors), option 1 is recommended. For single-process workflows with multiprocessing data loading, option 2 is recommended. For multiprocessing workflows (typically for distributed training), where this class is initialized in subprocesses, option 3 is recommended, and the list-like object should be prepared in the main process and passed to all subprocesses. Not following these recommendations may lead to runtime errors or duplicated cache across processes.
- set_data(data)[source]#
Set the input data and run deterministic transforms to generate cache content.
Note: this function should be called after an entire epoch, and persistent_workers=False must be set in the PyTorch DataLoader, because new worker processes need to be created based on the newly generated cache content.
- Return type:
None
SmartCacheDataset#
- class monai.data.SmartCacheDataset(data, transform=None, replace_rate=0.1, cache_num=9223372036854775807, cache_rate=1.0, num_init_workers=1, num_replace_workers=1, progress=True, shuffle=True, seed=0, copy_cache=True, as_contiguous=True, runtime_cache=False)[source]#
Re-implementation of the SmartCache mechanism in NVIDIA Clara-train SDK. At any time, the cache pool only keeps a subset of the whole dataset. In each epoch, only the items in the cache are used for training. This ensures that data needed for training is readily available, keeping GPU resources busy. Note that cached items may still have to go through a non-deterministic transform sequence before being fed to GPU. At the same time, another thread is preparing replacement items by applying the transform sequence to items not in cache. Once one epoch is completed, Smart Cache replaces the same number of items with replacement items. Smart Cache uses a simple running window algorithm to determine the cache content and replacement items. Let N be the configured number of objects in cache; and R be the number of replacement objects (R = ceil(N * r), where r is the configured replace rate). For more details, please refer to: https://docs.nvidia.com/clara/clara-train-archive/3.1/nvmidl/additional_features/smart_cache.html If passing slicing indices, will return a PyTorch Subset, for example: data: Subset = dataset[1:4], for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
For example, if we have 5 images: [image1, image2, image3, image4, image5], and cache_num=4, replace_rate=0.25. so the actual training images cached and replaced for every epoch are as below:
```
epoch 1: [image1, image2, image3, image4]
epoch 2: [image2, image3, image4, image5]
epoch 3: [image3, image4, image5, image1]
epoch 4: [image4, image5, image1, image2]
epoch N: [image[N % 5] ...]
```
The usage of SmartCacheDataset contains 4 steps:
Initialize SmartCacheDataset object and cache for the first epoch.
Call start() to run replacement thread in background.
Call update_cache() before every epoch to replace training items.
Call shutdown() when training ends.
During training call set_data() to update input data and recompute cache content, note to call shutdown() to stop first, then update data and call start() to restart.
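A minimal sketch of the 4-step workflow above; the file paths are hypothetical and the training step is elided.

```python
from monai.data import DataLoader, SmartCacheDataset
from monai.transforms import Compose, LoadImaged, RandFlipd

transforms = Compose([LoadImaged(keys=["image"]), RandFlipd(keys=["image"], prob=0.5)])
ds = SmartCacheDataset(                  # step 1: initialize and cache for the first epoch
    data=[{"image": f"image{i}.nii.gz"} for i in range(5)],
    transform=transforms,
    cache_num=4,
    replace_rate=0.25,
)
loader = DataLoader(ds, batch_size=2)
ds.start()                               # step 2: launch the replacement thread
for epoch in range(3):
    for batch in loader:
        pass                             # training step goes here
    ds.update_cache()                    # step 3: swap in the replacement items
ds.shutdown()                            # step 4: stop the replacement thread
```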
Note
This replacement will not work in the following cases:
1. The multiprocessing_context of DataLoader is set to spawn.
2. Distributed data parallel is launched with torch.multiprocessing.spawn.
3. Running on Windows (where the default multiprocessing method is spawn) with num_workers greater than 0.
4. The persistent_workers flag of DataLoader is set to True with num_workers greater than 0.
If using MONAI workflows, please add SmartCacheHandler to the handler list of trainer, otherwise, please make sure to call start(), update_cache(), shutdown() during training.
- Parameters:
data (Sequence) – input data to load and transform to generate dataset for model.
transform (UnionType[Sequence[Callable], Callable, None]) – transforms to execute operations on input data.
replace_rate (float) – percentage of the cached items to be replaced in every epoch (default to 0.1).
cache_num (int) – number of items to be cached. Default is sys.maxsize. Will take the minimum of (cache_num, data_length x cache_rate, data_length).
cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). Will take the minimum of (cache_num, data_length x cache_rate, data_length).
num_init_workers (UnionType[int, None]) – the number of worker threads to initialize the cache for the first epoch. If num_init_workers is None, the number returned by os.cpu_count() is used. If a value less than 1 is specified, 1 will be used instead.
num_replace_workers (UnionType[int, None]) – the number of worker threads to prepare the replacement cache for every epoch. If num_replace_workers is None, the number returned by os.cpu_count() is used. If a value less than 1 is specified, 1 will be used instead.
progress (bool) – whether to display a progress bar when caching for the first epoch.
shuffle (bool) – whether to shuffle the whole data list before preparing the cache content for the first epoch. It will not modify the original input data sequence in-place.
seed (int) – random seed if shuffle is True, default to 0.
copy_cache (bool) – whether to deepcopy the cache content before applying the random transforms, default to True. If the random transforms don't modify the cache content, or every cache item is only used once in a multi-processing environment, copy=False may be set for better performance.
as_contiguous (bool) – whether to convert the cached NumPy array or PyTorch tensor to be contiguous. It may help improve the performance of the following logic.
runtime_cache – default to False; other options are not implemented yet.
- randomize(data)[source]#
Within this method, self.R should be used, instead of np.random, to introduce random factors. All self.R calls happen here so that we have a better chance of identifying errors in synchronizing the random state. This method can generate the random factors based on properties of the input data.
- Raises:
NotImplementedError – When the subclass does not override this method.
- Return type:
None
ZipDataset#
- class monai.data.ZipDataset(datasets, transform=None)[source]#
Zip several PyTorch datasets and output the data (with the same index) together in a tuple. If the output of a single dataset is already a tuple, flatten it and extend it to the result. For example: if datasetA returns (img, imgmeta) and datasetB returns (seg, segmeta), the final output is (img, imgmeta, seg, segmeta). If the datasets don't have the same length, the minimum length among them is used as the length of ZipDataset. If passing slicing indices, will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
Examples:
```python
>>> zip_data = ZipDataset([[1, 2, 3], [4, 5]])
>>> print(len(zip_data))
2
>>> for item in zip_data:
>>>     print(item)
[1, 4]
[2, 5]
```
ArrayDataset#
- class monai.data.ArrayDataset(img, img_transform=None, seg=None, seg_transform=None, labels=None, label_transform=None)[source]#
Dataset for segmentation and classification tasks based on array format input data and transforms. It ensures the same random seeds in the randomized transforms defined for image, segmentation and label. The transform can be monai.transforms.Compose or any other callable object. For example, when training on NIfTI format images without metadata, all transforms can be composed:

```python
img_transform = Compose(
    [
        LoadImage(image_only=True),
        EnsureChannelFirst(),
        RandAdjustContrast()
    ]
)
ArrayDataset(img_file_list, img_transform=img_transform)
```
If training based on images and the metadata, the array transforms cannot be composed, because several transforms receive multiple parameters or return multiple values. Users then need to define their own callable method to parse metadata from LoadImage or to set the affine matrix for the Spacing transform:
```python
class TestCompose(Compose):
    def __call__(self, input_):
        img, metadata = self.transforms[0](input_)
        img = self.transforms[1](img)
        img, _, _ = self.transforms[2](img, metadata["affine"])
        return self.transforms[3](img), metadata

img_transform = TestCompose(
    [
        LoadImage(image_only=False),
        EnsureChannelFirst(),
        Spacing(pixdim=(1.5, 1.5, 3.0)),
        RandAdjustContrast()
    ]
)
ArrayDataset(img_file_list, img_transform=img_transform)
```
Examples:
```python
>>> ds = ArrayDataset([1, 2, 3, 4], lambda x: x + 0.1)
>>> print(ds[0])
1.1

>>> ds = ArrayDataset(img=[1, 2, 3, 4], seg=[5, 6, 7, 8])
>>> print(ds[0])
[1, 5]
```
- __init__(img, img_transform=None, seg=None, seg_transform=None, labels=None, label_transform=None)[source]#
Initializes the dataset with the filename lists. The transform img_transform is applied to the images and seg_transform to the segmentations.
- Parameters:
img (Sequence) – sequence of images.
img_transform (UnionType[Callable, None]) – transform to apply to each element in img.
seg (UnionType[Sequence, None]) – sequence of segmentations.
seg_transform (UnionType[Callable, None]) – transform to apply to each element in seg.
labels (UnionType[Sequence, None]) – sequence of labels.
label_transform (UnionType[Callable, None]) – transform to apply to each element in labels.
- randomize(data=None)[source]#
Within this method, self.R should be used, instead of np.random, to introduce random factors. All self.R calls happen here so that we have a better chance of identifying errors in synchronizing the random state. This method can generate the random factors based on properties of the input data.
- Raises:
NotImplementedError – When the subclass does not override this method.
- Return type:
None
ImageDataset#
- class monai.data.ImageDataset(image_files, seg_files=None, labels=None, transform=None, seg_transform=None, label_transform=None, image_only=True, transform_with_metadata=False, dtype=<class 'numpy.float32'>, reader=None, *args, **kwargs)[source]#
Loads image/segmentation pairs of files from the given filename lists. Transformations can be specified for the image and segmentation arrays separately. The difference between this dataset and ArrayDataset is that this dataset can apply a transform chain to the images and segs, return both the images and metadata, and there is no need to specify a transform to load images from files. For more information, please see the image_dataset demo in the MONAI tutorial repo, Project-MONAI/tutorials
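A minimal usage sketch, assuming the file paths exist: both random transforms are seeded identically per item, so the image and segmentation stay spatially consistent.

```python
from monai.data import ImageDataset
from monai.transforms import RandFlip

ds = ImageDataset(
    image_files=["image1.nii.gz", "image2.nii.gz"],
    seg_files=["label1.nii.gz", "label2.nii.gz"],
    transform=RandFlip(prob=0.5),
    seg_transform=RandFlip(prob=0.5),
    image_only=True,
)
img, seg = ds[0]  # a consistently flipped (or unflipped) pair
```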
- __init__(image_files, seg_files=None, labels=None, transform=None, seg_transform=None, label_transform=None, image_only=True, transform_with_metadata=False, dtype=<class 'numpy.float32'>, reader=None, *args, **kwargs)[source]#
Initializes the dataset with the image and segmentation filename lists. The transform transform is applied to the images and seg_transform to the segmentations.
- Parameters:
image_files (Sequence[str]) – list of image filenames.
seg_files (UnionType[Sequence[str], None]) – if in a segmentation task, list of segmentation filenames.
labels (UnionType[Sequence[float], None]) – if in a classification task, list of classification labels.
transform (UnionType[Callable, None]) – transform to apply to image arrays.
seg_transform (UnionType[Callable, None]) – transform to apply to segmentation arrays.
label_transform (UnionType[Callable, None]) – transform to apply to the label data.
image_only (bool) – if True, return only the image volume; otherwise, return the image volume and the metadata.
transform_with_metadata (bool) – if True, the metadata will be passed to the transforms whenever possible.
dtype (Union[dtype, type, str, None]) – if not None, convert the loaded image to this data type.
reader (UnionType[ImageReader, str, None]) – register a reader to load the image file and metadata; if None, the default readers will be used. If a reader name string is provided, a reader object will be constructed with the *args and **kwargs parameters. Supported reader names: "NibabelReader", "PILReader", "ITKReader", "NumpyReader".
args – additional parameters for the reader if providing a reader name.
kwargs – additional parameters for the reader if providing a reader name.
- Raises:
ValueError – When seg_files length differs from image_files.
- randomize(data=None)[source]#
Within this method, self.R should be used, instead of np.random, to introduce random factors. All self.R calls happen here so that we have a better chance of identifying errors in synchronizing the random state. This method can generate the random factors based on properties of the input data.
- Raises:
NotImplementedError – When the subclass does not override this method.
- Return type:
None
NPZDictItemDataset#
- class monai.data.NPZDictItemDataset(npzfile, keys, transform=None, other_keys=())[source]#
Represents a dataset from a loaded NPZ file. The members of the file to load are selected by the keys of keys and stored under the mapped names. All loaded arrays must have the same 0-dimension (batch) size. Items are always dicts mapping names to an item extracted from the loaded arrays. If passing slicing indices, will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
- Parameters:
npzfile (UnionType[str, IO]) – path to the .npz file or a stream containing the .npz file data.
keys (dict[str, str]) – maps the keys to load from the file to the names to store in the dataset.
transform (UnionType[Callable[..., dict[str, Any]], None]) – transform to apply to the batch dict.
other_keys (UnionType[Sequence[str], None]) – secondary data to load from the file and store in the dict other_keys, not returned by __getitem__.
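A minimal self-contained sketch: write a small .npz stream in memory, then expose its arrays as dictionary items under the mapped names.

```python
import io

import numpy as np
from monai.data import NPZDictItemDataset

stream = io.BytesIO()
np.savez(stream, imgs=np.random.rand(10, 1, 8, 8), segs=np.random.rand(10, 1, 8, 8))
stream.seek(0)

# map the file members "imgs"/"segs" to the item names "image"/"label"
ds = NPZDictItemDataset(npzfile=stream, keys={"imgs": "image", "segs": "label"})
print(len(ds), ds[0]["image"].shape)  # 10 (1, 8, 8)
```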
CSVDataset#
- class monai.data.CSVDataset(src=None, row_indices=None, col_names=None, col_types=None, col_groups=None, transform=None, kwargs_read_csv=None, **kwargs)[source]#
Dataset to load data from CSV files and generate a list of dictionaries, every dictionary maps to a row of the CSV file, and the keys of dictionary map to the column names of the CSV file.
It can load multiple CSV files and join the tables with the additional kwargs arg. It supports loading only specific rows and columns, and it can also group several loaded columns to generate a new column; for example, set col_groups={"meta": ["meta_0", "meta_1", "meta_2"]}, and the output can be:
[ {"image": "./image0.nii", "meta_0": 11, "meta_1": 12, "meta_2": 13, "meta": [11, 12, 13]}, {"image": "./image1.nii", "meta_0": 21, "meta_1": 22, "meta_2": 23, "meta": [21, 22, 23]}, ]
- Parameters:
src (UnionType[str, Sequence[str], None]) – if provided the filename of a CSV file, it can be a str, URL, path object or file-like object to load. A pandas DataFrame can also be provided directly, skipping loading from a filename. If a list of filenames or pandas DataFrames is provided, the tables will be joined.
row_indices (UnionType[Sequence[UnionType[int, str]], None]) – indices of the expected rows to load. It should be a list; every item can be an int or a range [start, end) of indices, for example: row_indices=[[0, 100], 200, 201, 202, 300]. If None, load all the rows in the file.
col_names (UnionType[Sequence[str], None]) – names of the expected columns to load. If None, load all the columns.
col_types (UnionType[dict[str, UnionType[dict[str, Any], None]], None]) – type and default value to convert the loaded columns; if None, use the original data. It should be a dictionary where every item maps to an expected column: the key is the column name and the value is None or a dictionary defining the default value and data type. The supported keys in the dictionary are ["type", "default"]. For example:

```python
col_types = {
    "subject_id": {"type": str},
    "label": {"type": int, "default": 0},
    "ehr_0": {"type": float, "default": 0.0},
    "ehr_1": {"type": float, "default": 0.0},
    "image": {"type": str, "default": None},
}
```

col_groups (UnionType[dict[str, Sequence[str]], None]) – args to group the loaded columns to generate a new column. It should be a dictionary where every item maps to a group: the key will be the new column name and the value is the names of the columns to combine. For example: col_groups={"ehr": [f"ehr_{i}" for i in range(10)], "meta": ["meta_1", "meta_2"]}.
transform (UnionType[Callable, None]) – transform to apply on the loaded items of a dictionary data.
kwargs_read_csv (UnionType[dict, None]) – dictionary args to pass to the pandas read_csv function.
kwargs – additional arguments for the pandas.merge() API to join tables.
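A minimal usage sketch, assuming a file "data.csv" exists with the columns shown in the example above:

```python
from monai.data import CSVDataset

ds = CSVDataset(src="data.csv", col_groups={"meta": ["meta_0", "meta_1", "meta_2"]})
print(ds[0])  # e.g. {'image': './image0.nii', 'meta_0': 11, ..., 'meta': [11, 12, 13]}
```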
Patch-based dataset#
GridPatchDataset#
- class monai.data.GridPatchDataset(data, patch_iter, transform=None, with_coordinates=True, cache=False, cache_num=9223372036854775807, cache_rate=1.0, num_workers=1, progress=True, copy_cache=True, as_contiguous=True, hash_func=<function pickle_hashing>)[source]#
Yields patches from data read from an image dataset. Typically used with PatchIter or PatchIterd so that the patches are chosen in a contiguous grid sampling scheme.
```python
import numpy as np

from monai.data import GridPatchDataset, DataLoader, PatchIter
from monai.transforms import RandShiftIntensity

# image-level dataset
images = [np.arange(16, dtype=float).reshape(1, 4, 4),
          np.arange(16, dtype=float).reshape(1, 4, 4)]
# image-level patch generator, "grid sampling"
patch_iter = PatchIter(patch_size=(2, 2), start_pos=(0, 0))
# patch-level intensity shifts
patch_intensity = RandShiftIntensity(offsets=1.0, prob=1.0)

# construct the dataset
ds = GridPatchDataset(data=images, patch_iter=patch_iter, transform=patch_intensity)
# use the grid patch dataset
for item in DataLoader(ds, batch_size=2, num_workers=2):
    print("patch size:", item[0].shape)
    print("coordinates:", item[1])

# >>> patch size: torch.Size([2, 1, 2, 2])
#     coordinates: tensor([[[0, 1], [0, 2], [0, 2]],
#                          [[0, 1], [2, 4], [0, 2]]])
```
- Parameters:
data (UnionType[Iterable, Sequence]) – the data source to read image data from.
patch_iter (Callable) – converts an input image (an item from the dataset) into an iterable of image patches. patch_iter(dataset[idx]) must yield a tuple: (patches, coordinates). See also: monai.data.PatchIter or monai.data.PatchIterd.
transform (UnionType[Callable, None]) – a callable data transform that operates on the patches.
with_coordinates (bool) – whether to yield the coordinates of each patch, default to True.
cache (bool) – whether to use the cache mechanism, default to False. See also: monai.data.CacheDataset.
cache_num (int) – number of items to be cached. Default is sys.maxsize. Will take the minimum of (cache_num, data_length x cache_rate, data_length).
cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). Will take the minimum of (cache_num, data_length x cache_rate, data_length).
num_workers (UnionType[int, None]) – the number of worker threads if computing cache in the initialization. If num_workers is None, the number returned by os.cpu_count() is used. If a value less than 1 is specified, 1 will be used instead.
progress (bool) – whether to display a progress bar.
copy_cache (bool) – whether to deepcopy the cache content before applying the random transforms, default to True. If the random transforms don't modify the cached content (for example, randomly cropping from the cached image and deepcopying the crop region) or if every cache item is only used once in a multi-processing environment, copy=False may be set for better performance.
as_contiguous (bool) – whether to convert the cached NumPy array or PyTorch tensor to be contiguous. It may help improve the performance of the following logic.
hash_func (Callable[..., bytes]) – a callable to compute hash from data items to be cached. Defaults to monai.data.utils.pickle_hashing.
- set_data(data)[source]#
Set the input data and run deterministic transforms to generate cache content.
Note: this function should be called after an entire epoch, and persistent_workers=False must be set in the PyTorch DataLoader, because new worker processes need to be created based on the newly generated cache content.
- Return type:
None
PatchDataset#
- class monai.data.PatchDataset(data, patch_func, samples_per_image=1, transform=None)[source]#
Yields patches from data read from an image dataset. The patches are generated by a user-specified callable patch_func, and are optionally post-processed by transform. For example, to generate random patch samples from an image dataset:
```python
import numpy as np

from monai.data import PatchDataset, DataLoader
from monai.transforms import RandSpatialCropSamples, RandShiftIntensity

# image dataset
images = [np.arange(16, dtype=float).reshape(1, 4, 4),
          np.arange(16, dtype=float).reshape(1, 4, 4)]
# image patch sampler
n_samples = 5
sampler = RandSpatialCropSamples(roi_size=(3, 3), num_samples=n_samples,
                                 random_center=True, random_size=False)
# patch-level intensity shifts
patch_intensity = RandShiftIntensity(offsets=1.0, prob=1.0)
# construct the patch dataset
ds = PatchDataset(data=images,
                  patch_func=sampler, samples_per_image=n_samples,
                  transform=patch_intensity)

# use the patch dataset, length: len(images) x samples_per_image
print(len(ds))
# >>> 10

for item in DataLoader(ds, batch_size=2, shuffle=True, num_workers=2):
    print(item.shape)
# >>> torch.Size([2, 1, 3, 3])
```
- __init__(data, patch_func, samples_per_image=1, transform=None)[source]#
- Parameters:
data (Sequence) – an image dataset to extract patches from.
patch_func (Callable) – converts an input image (an item from the dataset) into a sequence of image patches. patch_func(dataset[idx]) must return a sequence of patches (of length samples_per_image).
samples_per_image (int) – patch_func should return a sequence of samples_per_image elements.
transform (UnionType[Callable, None]) – transform applied to each patch.
PatchIter#
- class monai.data.PatchIter(patch_size, start_pos=(), mode=wrap, **pad_opts)[source]#
Return a patch generator with predefined properties such as patch_size. Typically used with monai.data.GridPatchDataset.
- __call__(array)[source]#
- Parameters:
array (~NdarrayTensor) – the image to generate patches from.
- Return type:
Generator[tuple[~NdarrayTensor,ndarray],None,None]
- __init__(patch_size, start_pos=(), mode=wrap, **pad_opts)[source]#
- Parameters:
patch_size (Sequence[int]) – size of patches to generate slices for; 0/None selects the whole dimension.
start_pos (Sequence[int]) – starting position in the array; default is 0 for each dimension.
mode (UnionType[str, None]) – available modes: (Numpy) {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"}; (PyTorch) {"constant", "reflect", "replicate", "circular"}. One of the listed string values or a user-supplied function. If None, no wrapping is performed. Defaults to "wrap". See also: https://numpy.org/doc/stable/reference/generated/numpy.pad.html and https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html; requires pytorch >= 1.10 for best compatibility.
pad_opts (dict) – other arguments for the np.pad function. Note that np.pad treats the channel dimension as the first dimension.
Note
The patch_size is the size of the patch to sample from the input arrays. It is assumed the arrays first dimension is the channel dimension which will be yielded in its entirety so this should not be specified in patch_size. For example, for an input 3D array with 1 channel of size (1, 20, 20, 20) a regular grid sampling of eight patches (1, 10, 10, 10) would be specified by a patch_size of (10, 10, 10).
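A minimal sketch of calling PatchIter directly, iterating (2, 2) patches over a single-channel (1, 4, 4) array:

```python
import numpy as np
from monai.data import PatchIter

patch_iter = PatchIter(patch_size=(2, 2), start_pos=(0, 0))
image = np.arange(16, dtype=float).reshape(1, 4, 4)
for patch, coords in patch_iter(image):
    print(patch.shape, coords)  # (1, 2, 2) patches with their slice coordinates
```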
PatchIterd#
- class monai.data.PatchIterd(keys, patch_size, start_pos=(), mode=wrap, **pad_opts)[source]#
Dictionary-based wrapper of monai.data.PatchIter. Returns a patch generator for dictionary data and the coordinates. Typically used with monai.data.GridPatchDataset. It assumes all the expected fields specified by keys have the same shape.
- Parameters:
keys (Union[Collection[Hashable], Hashable]) – keys of the corresponding items to iterate patches over.
patch_size (Sequence[int]) – size of patches to generate slices for; 0/None selects the whole dimension.
start_pos (Sequence[int]) – starting position in the array; default is 0 for each dimension.
mode (UnionType[str, None]) – available modes: (Numpy) {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"}; (PyTorch) {"constant", "reflect", "replicate", "circular"}. One of the listed string values or a user-supplied function. If None, no wrapping is performed. Defaults to "wrap". See also: https://numpy.org/doc/stable/reference/generated/numpy.pad.html and https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html; requires pytorch >= 1.10 for best compatibility.
pad_opts – other arguments for the np.pad function. Note that np.pad treats the channel dimension as the first dimension.
Image reader#
ImageReader#
- class monai.data.ImageReader[source]#
An abstract class that defines APIs to load image files.
Typical usage of an implementation of this class is:
```python
image_reader = MyImageReader()
img_obj = image_reader.read(path_to_image)
img_data, meta_data = image_reader.get_data(img_obj)
```
The read call converts image filenames into image objects.
The get_data call fetches the image data, as well as the metadata.
A reader should implement verify_suffix with the logic of checking the input filename by the filename extensions.
- abstractmethod get_data(img)[source]#
Extract data array and metadata from loaded image and return them. This function must return two objects, the first is a numpy array of image data, the second is a dictionary of metadata.
- Parameters:
img – an image object loaded from an image file or a list of image objects.
- Return type:
tuple[ndarray,dict]
- abstractmethod read(data, **kwargs)[source]#
Read image data from specified file or files. Note that it returns a data object or a sequence of data objects.
- Parameters:
data (Union[Sequence[Union[str, PathLike]], str, PathLike]) – file name or a list of file names to read.
kwargs – additional args for the actual read API of 3rd party libs.
- Return type:
UnionType[Sequence[Any],Any]
- abstractmethod verify_suffix(filename)[source]#
Verify whether the specified filename is supported by the current reader. This method should return True if the reader is able to read the format suggested by the filename.
- Parameters:
filename (Union[Sequence[Union[str, PathLike]], str, PathLike]) – file name or a list of file names to read. If a list of files, verify all the suffixes.
- Return type:
bool
ITKReader#
- class monai.data.ITKReader(channel_dim=None, series_name='', reverse_indexing=False, series_meta=False, affine_lps_to_ras=True, **kwargs)[source]#
Load medical images based on the ITK library. All the supported image formats can be found at: InsightSoftwareConsortium/ITK. The loaded data array will be in C order; for example, a 3D image NumPy array index order will be CDWH.
- Parameters:
channel_dim (UnionType[str, int, None]) – the channel dimension of the input image, default is None. This is used to set original_channel_dim in the metadata; EnsureChannelFirstD reads this field. If None, original_channel_dim will be either no_channel or -1.
- A Nifti file is usually "channel last", so there is no need to specify this argument.
- A PNG file usually has GetNumberOfComponentsPerPixel()==3, so there is no need to specify this argument.
series_name (str) – the name of the DICOM series if there are multiple ones; used when loading DICOM series.
reverse_indexing (bool) – whether to use a reversed spatial indexing convention for the returned data array. If False, the spatial indexing convention is reversed to be compatible with ITK; otherwise, the spatial indexing follows the numpy convention. Default is False. This option does not affect the metadata.
series_meta (bool) – whether to load the metadata of the DICOM series (using the metadata from the first slice). This flag is checked only when loading DICOM series. Default is False.
affine_lps_to_ras (bool) – whether to convert the affine matrix from "LPS" to "RAS". Defaults to True. Set to True to be consistent with NibabelReader; otherwise the affine matrix remains in the ITK convention.
kwargs – additional args for the itk.imread API. More details about available args: InsightSoftwareConsortium/ITK
- get_data(img)[source]#
Extract data array and metadata from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of metadata. It constructs affine, original_affine, and spatial_shape and stores them in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the metadata of the first image is used to represent the output metadata.
- Parameters:
img – an ITK image object loaded from an image file or a list of ITK image objects.
- Return type:
tuple[ndarray,dict]
- read(data, **kwargs)[source]#
Read image data from specified file or files, it can read a list of images and stack them together as multi-channel data in get_data(). If a directory path is passed instead of a file path, it will be treated as a DICOM image series to read. Note that the returned object is an ITK image object or a list of ITK image objects.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike]) – file name or a list of file names to read,kwargs – additional args for itk.imread API, will override self.kwargs for existing keys. More details about available args: InsightSoftwareConsortium/ITK
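A minimal usage sketch, assuming ITK is installed ("image.nii.gz" is a placeholder path):

    from monai.data import ITKReader

    reader = ITKReader()
    img_obj = reader.read("image.nii.gz")  # placeholder path
    data, meta = reader.get_data(img_obj)
    print(data.shape, meta["affine"])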
NibabelReader#
- class monai.data.NibabelReader(channel_dim=None, as_closest_canonical=False, squeeze_non_spatial_dims=False, to_gpu=False, **kwargs)[source]#
Load NIfTI format images based on Nibabel library.
- Parameters:
channel_dim (
UnionType[str,int,None]) – the channel dimension of the input image, default is None. this is used to set original_channel_dim in the metadata, EnsureChannelFirstD reads this field. if None, original_channel_dim will be either no_channel or -1. most Nifti files are usually “channel last”, no need to specify this argument for them.as_closest_canonical (
bool) – if True, load the image as closest to canonical axis format.squeeze_non_spatial_dims (
bool) – if True, non-spatial singletons will be squeezed, e.g. (256,256,1,3) -> (256,256,3)to_gpu (
bool) – If True, load the image into GPU memory using CuPy and Kvikio. This can accelerate data loading. Default is False. CuPy and Kvikio are required for this option. Note: For compressed NIfTI files, some operations may still be performed on CPU memory, and the acceleration may not be significant. In some cases, it may be slower than loading on CPU.kwargs – additional args for nibabel.load API. more details about available args: nipy/nibabel
- get_data(img)[source]#
Extract data array and metadata from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of metadata. It constructs affine, original_affine, and spatial_shape and stores them in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the metadata of the first image is used to represent the output metadata.
- Parameters:
img – a Nibabel image object loaded from an image file or a list of Nibabel image objects.
- Return type:
tuple[ndarray,dict]
- read(data, **kwargs)[source]#
Read image data from specified file or files, it can read a list of images and stack them together as multi-channel data in get_data(). Note that the returned object is a Nibabel image object or a list of Nibabel image objects.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike]) – file name or a list of file names to read.kwargs – additional args for nibabel.load API, will override self.kwargs for existing keys. More details about available args: nipy/nibabel
NumpyReader#
- class monai.data.NumpyReader(npz_keys=None, channel_dim=None, **kwargs)[source]#
Load NPY or NPZ format data based on the Numpy library; the files can contain arrays or pickled objects. A typical usage is to load the mask data for a classification task. It can load part of the npz file with specified npz_keys.
- Parameters:
npz_keys (
Union[Collection[Hashable],Hashable,None]) – if loading npz file, only load the specified keys, if None, load all the items. stack the loaded items together to construct a new first dimension.channel_dim (
UnionType[str,int,None]) – if not None, explicitly specify the channel dim, otherwise, treat the array as no channel.kwargs – additional args for numpy.load API except allow_pickle. more details about available args: https://numpy.org/doc/stable/reference/generated/numpy.load.html
- get_data(img)[source]#
Extract data array and metadata from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of metadata. It constructs affine, original_affine, and spatial_shape and stores them in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the metadata of the first image is used to represent the output metadata.
- Parameters:
img – a Numpy array loaded from a file or a list of Numpy arrays.
- Return type:
tuple[ndarray,dict]
- read(data, **kwargs)[source]#
Read image data from specified file or files, it can read a list of data files and stack them together as multi-channel data in get_data(). Note that the returned object is Numpy array or list of Numpy arrays.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike]) – file name or a list of file names to read.kwargs – additional args for numpy.load API except allow_pickle, will override self.kwargs for existing keys. More details about available args: https://numpy.org/doc/stable/reference/generated/numpy.load.html
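A usage sketch for selecting arrays from an .npz file (the file and its keys are illustrative):

    import numpy as np
    from monai.data import NumpyReader

    # create a hypothetical npz file with two arrays
    np.savez("case0.npz", image=np.zeros((8, 8)), mask=np.ones((8, 8)))

    reader = NumpyReader(npz_keys=["image", "mask"])
    img_obj = reader.read("case0.npz")
    data, meta = reader.get_data(img_obj)
    print(data.shape)  # expected (2, 8, 8): the selected items are stacked at a new first dimension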
PILReader#
- class monai.data.PILReader(converter=None, reverse_indexing=True, **kwargs)[source]#
Load common 2D image formats (PNG, JPG, BMP are supported) from a provided path or paths.
- Parameters:
converter (
UnionType[Callable,None]) – additional function to convert the image data after read(). for example, use converter=lambda image: image.convert(“LA”) to convert image format.reverse_indexing (
bool) – whether to swap axis 0 and 1 after loading the array, this is enabled by default, so that output of the reader is consistent with the other readers. Set this option toFalseto use the PIL backend’s original spatial axes convention.kwargs – additional args for Image.open API in read(), more details about available args: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.open
- get_data(img)[source]#
Extract data array and metadata from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of metadata. It computes spatial_shape and stores it in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the metadata of the first image is used to represent the output metadata. Note that by default self.reverse_indexing is set to
True, which swaps axis 0 and 1 after loading the array because the spatial axes definition in PIL is different from other common medical packages.- Parameters:
img – a PIL Image object loaded from a file or a list of PIL Image objects.
- Return type:
tuple[ndarray,dict]
- read(data, **kwargs)[source]#
Read image data from specified file or files, it can read a list of images and stack them together as multi-channel data in get_data(). Note that the returned object is a PIL image or a list of PIL images.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike,ndarray]) – file name or a list of file names to read.kwargs – additional args for Image.open API in read(), will override self.kwargs for existing keys. More details about available args: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.open
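A usage sketch ("photo.png" is a placeholder path; the converter follows the example above):

    from monai.data import PILReader

    reader = PILReader(converter=lambda image: image.convert("LA"))
    img_obj = reader.read("photo.png")  # placeholder path
    data, meta = reader.get_data(img_obj)
    print(data.shape, meta["spatial_shape"])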
NrrdReader#
- class monai.data.NrrdReader(channel_dim=None, dtype=<class 'numpy.float32'>, index_order='F', affine_lps_to_ras=True, **kwargs)[source]#
Load NRRD format images based on pynrrd library.
- Parameters:
channel_dim (
UnionType[str,int,None]) – the channel dimension of the input image, default is None. This is used to set original_channel_dim in the metadata, EnsureChannelFirstD reads this field. If None, original_channel_dim will be either no_channel or 0. NRRD files are usually “channel first”.dtype (
UnionType[dtype,type,str,None]) – dtype of the data array when loading image.index_order (
str) – Specify whether the returned data array should be in C-order ('C') or Fortran-order ('F'). Numpy is usually in C-order, but the default in the NRRD header is F.affine_lps_to_ras (
bool) – whether to convert the affine matrix from “LPS” to “RAS”. Defaults toTrue. Set toTrueto be consistent withNibabelReader, otherwise the affine matrix is unmodified.kwargs – additional args for nrrd.read API. more details about available args: mhe/pynrrd
- get_data(img)[source]#
Extract data array and metadata from loaded image and return them. This function must return two objects, the first is a numpy array of image data, the second is a dictionary of metadata.
- Parameters:
img (
UnionType[NrrdImage,list[NrrdImage]]) – a NrrdImage loaded from an image file or a list of image objects.- Return type:
tuple[ndarray,dict]
- read(data, **kwargs)[source]#
Read image data from specified file or files. Note that it returns a data object or a sequence of data objects.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike]) – file name or a list of file names to read.kwargs – additional args for actual read API of 3rd party libs.
- Return type:
UnionType[Sequence[Any],Any]
Image writer#
resolve_writer#
- monai.data.resolve_writer(ext_name, error_if_not_found=True)[source]#
Resolves to a tuple of available
ImageWriterinSUPPORTED_WRITERSaccording to the filename extension keyext_name.- Parameters:
ext_name – the filename extension of the image. As an indexing key it will be converted to a lower case string.
error_if_not_found – whether to raise an error if no suitable image writer is found. If True, raise an
OptionalImportError, otherwise return an empty tuple. Default isTrue.
- Return type:
Sequence
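A brief sketch (the resolved tuple depends on which optional backends are installed):

    from monai.data import resolve_writer

    writers = resolve_writer("nii")
    print(writers)  # a tuple of ImageWriter classes able to handle ".nii" files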
register_writer#
- monai.data.register_writer(ext_name, *im_writers)[source]#
Register
ImageWriter, so that writing a file with filename extension ext_name could be resolved to a tuple of potentially appropriate ImageWriter. The customised writers could be registered by:

    from monai.data import register_writer

    # `MyWriter` must implement the `ImageWriter` interface
    register_writer("nii", MyWriter)
- Parameters:
ext_name – the filename extension of the image. As an indexing key, it will be converted to a lower case string.
im_writers – one or multiple ImageWriter classes with high priority ones first.
ImageWriter#
- class monai.data.ImageWriter(**kwargs)[source]#
The class is a collection of utilities to write images to disk.
Main aspects to be considered are:
- dimensionality of the data array, arrangement of spatial dimensions and channel/time dimensions: convert_to_channel_last()
- metadata of the current affine and output affine, the data array should be converted accordingly: get_meta_info(), resample_if_needed()
- data type handling of the output image (as part of resample_if_needed())
Subclasses of this class should implement the backend-specific functions:
- set_data_array() to set the data array (input must be a numpy array or torch tensor); this method sets the backend object's data part
- set_metadata() to set the metadata and output affine; this method sets the metadata including affine handling and image resampling
- create_backend_obj() to create the backend-specific data object
- write() the backend-specific writing function
The primary usage of subclasses of ImageWriter is:

    writer = MyWriter()  # subclass of ImageWriter
    writer.set_data_array(data_array)
    writer.set_metadata(meta_dict)
    writer.write(filename)

This creates an image writer object based on data_array and meta_dict and writes to filename. It supports up to three spatial dimensions (the resampling step supports both 2D and 3D). When saving multiple time steps or multiple channels of data_array, the time and/or modality axes should be at the channel_dim. For example, with channel_dim=0, the shape of 2D eight-class segmentation probabilities to be saved could be (8, 64, 64); in this case data_array will be converted to (64, 64, 1, 8) (the third dimension is reserved as a spatial dimension).
The metadata could optionally have the following keys:
- 'original_affine': for the data's original affine, it will be the affine of the output object, defaulting to an identity matrix.
- 'affine': it should specify the current data affine, defaulting to an identity matrix.
- 'spatial_shape': for the data's output spatial shape.
When metadata is specified, the saver may resample data from the space defined by "affine" to the space defined by "original_affine"; for more details, please refer to the resample_if_needed method.
- __init__(**kwargs)[source]#
The constructor supports adding new instance members. The current member in the base class is
self.data_obj, the subclasses can add more members, so that necessary meta information can be stored in the object and shared among the class methods.
- classmethod convert_to_channel_last(data, channel_dim=0, squeeze_end_dims=True, spatial_ndim=3, contiguous=False)[source]#
Rearrange the data array axes to make the channel_dim-th dim the last dimension and ensure there are
spatial_ndimnumber of spatial dimensions.When
squeeze_end_dimsisTrue, a postprocessing step will be applied to remove any trailing singleton dimensions.- Parameters:
data (
Union[ndarray,Tensor]) – input data to be converted to “channel-last” format.channel_dim (
UnionType[None,int,Sequence[int]]) – specifies the channel axes of the data array to move to the last.Noneindicates no channel dimension, a new axis will be appended as the channel dimension. a sequence of integers indicates multiple non-spatial dimensions.squeeze_end_dims (
bool) – ifTrue, any trailing singleton dimensions will be removed (after the channel has been moved to the end). So if input is (H,W,D,C) and C==1, then it will be saved as (H,W,D). If D is also 1, it will be saved as (H,W). IfFalse, image will always be saved as (H,W,D,C).spatial_ndim (
UnionType[int,None]) – modifying the spatial dims if needed, so that output to have at least this number of spatial dims. IfNone, the output will have the same number of spatial dimensions as the input.contiguous (
bool) – ifTrue, the output will be contiguous.
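A small sketch of the rearrangement and squeezing behavior described above:

    import numpy as np
    from monai.data import ImageWriter

    arr = np.zeros((1, 64, 64))  # channel-first 2D image with C == 1

    out = ImageWriter.convert_to_channel_last(arr, channel_dim=0, spatial_ndim=2)
    print(out.shape)  # (64, 64): the trailing singleton channel is squeezed

    out = ImageWriter.convert_to_channel_last(arr, channel_dim=0, spatial_ndim=2, squeeze_end_dims=False)
    print(out.shape)  # (64, 64, 1): the channel axis is kept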
- classmethod create_backend_obj(data_array, **kwargs)[source]#
Subclass should implement this method to return a backend-specific data representation object. This method is used by
cls.writeand the inputdata_arrayis assumed ‘channel-last’.- Return type:
ndarray
- classmethod get_meta_info(metadata=None)[source]#
Extracts relevant meta information from the metadata object (using
.get). Optional keys are"spatial_shape",MetaKeys.AFFINE,"original_affine".
- classmethod resample_if_needed(data_array, affine=None, target_affine=None, output_spatial_shape=None, mode=bilinear, padding_mode=border, align_corners=False, dtype=<class 'numpy.float64'>)[source]#
Convert the
data_arrayinto the coordinate system specified bytarget_affine, from the current coordinate definition ofaffine.If the transform between
affineandtarget_affinecould be achieved by simply transposing and flippingdata_array, no resampling will happen. Otherwise, this function resamplesdata_arrayusing the transformation computed fromaffineandtarget_affine.This function assumes the NIfTI dimension notations. Spatially it supports up to three dimensions, that is, H, HW, HWD for 1D, 2D, 3D respectively. When saving multiple time steps or multiple channels, time and/or modality axes should be appended after the first three dimensions. For example, shape of 2D eight-class segmentation probabilities to be saved could be (64, 64, 1, 8). Also, data in shape (64, 64, 8) or (64, 64, 8, 1) will be considered as a single-channel 3D image. The
convert_to_channel_lastmethod can be used to convert the data to the format described here.Note that the shape of the resampled
data_array may be subject to some rounding errors. For example, resampling a 20x20-pixel image from pixel size (1.5, 1.5)-mm to (3.0, 3.0)-mm space will return a 10x10-pixel image. However, resampling a 20x20-pixel image from pixel size (2.0, 2.0)-mm to (3.0, 3.0)-mm space will output a 14x14-pixel image, where the image shape is rounded from 13.333x13.333 pixels. In this case output_spatial_shape could be specified so that this function writes image data to a designated shape.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array to be converted.affine (
Union[ndarray,Tensor,None]) – the current affine ofdata_array. Defaults to identitytarget_affine (
Union[ndarray,Tensor,None]) – the designated affine ofdata_array. The actual output affine might be different from this value due to precision changes.output_spatial_shape (
UnionType[Sequence[int],int,None]) – spatial shape of the output image. This option is used when resampling is needed.mode (
str) – available options are {"bilinear","nearest","bicubic"}. This option is used when resampling is needed. Interpolation mode to calculate output values. Defaults to"bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-samplepadding_mode (
str) – available options are {"zeros","border","reflection"}. This option is used when resampling is needed. Padding mode for outside grid values. Defaults to"border". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-samplealign_corners (
bool) – boolean option ofgrid_sampleto handle the corner convention. See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sampledtype (
Union[dtype,type,str,None]) – data type for resampling computation. Defaults tonp.float64for best precision. IfNone, use the data type of input data. The output data type of this method is alwaysnp.float32.
ITKWriter#
- class monai.data.ITKWriter(output_dtype=<class 'numpy.float32'>, affine_lps_to_ras=True, **kwargs)[source]#
Write data and metadata into files on disk using ITK-python.
    import numpy as np
    from monai.data import ITKWriter

    np_data = np.arange(48).reshape(3, 4, 4)

    # write as 3d spatial image no channel
    writer = ITKWriter(output_dtype=np.float32)
    writer.set_data_array(np_data, channel_dim=None)
    # optionally set metadata affine
    writer.set_metadata({"affine": np.eye(4), "original_affine": -1 * np.eye(4)})
    writer.write("test1.nii.gz")

    # write as 2d image, channel-first
    writer = ITKWriter(output_dtype=np.uint8)
    writer.set_data_array(np_data, channel_dim=0)
    writer.set_metadata({"spatial_shape": (5, 5)})
    writer.write("test1.png")
- __init__(output_dtype=<class 'numpy.float32'>, affine_lps_to_ras=True, **kwargs)[source]#
- Parameters:
output_dtype (
Union[dtype,type,str,None]) – output data type.affine_lps_to_ras (
UnionType[bool,None]) – whether to convert the affine matrix from “LPS” to “RAS”. Defaults toTrue. Set toTrueto be consistent withNibabelWriter, otherwise the affine matrix is assumed already in the ITK convention. Set toNoneto usedata_array.meta[MetaKeys.SPACE]to determine the flag.kwargs – keyword arguments passed to
ImageWriter.
The constructor will create self.output_dtype internally. affine and channel_dim are initialized as instance members (default None and 0 respectively):
- user-specified affine should be set in set_metadata,
- user-specified channel_dim should be set in set_data_array.
- classmethod create_backend_obj(data_array, channel_dim=0, affine=None, dtype=<class 'numpy.float32'>, affine_lps_to_ras=True, **kwargs)[source]#
Create an ITK object from
data_array. This method assumes a ‘channel-last’data_array.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array.channel_dim (
UnionType[int,None]) – channel dimension of the data array. This is used to create a Vector Image if it is notNone.affine (
Union[ndarray,Tensor,None]) – affine matrix of the data array. This is used to compute spacing, direction and origin.dtype (
Union[dtype,type,str,None]) – output data type.affine_lps_to_ras (
UnionType[bool,None]) – whether to convert the affine matrix from “LPS” to “RAS”. Defaults toTrue. Set toTrueto be consistent withNibabelWriter, otherwise the affine matrix is assumed already in the ITK convention. Set toNoneto usedata_array.meta[MetaKeys.SPACE]to determine the flag.kwargs – keyword arguments. Current itk.GetImageFromArray will read
ttypefrom this dictionary.
- set_data_array(data_array, channel_dim=0, squeeze_end_dims=True, **kwargs)[source]#
Convert
data_arrayinto ‘channel-last’ numpy ndarray.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array with the channel dimension specified bychannel_dim.channel_dim (
UnionType[int,None]) – channel dimension of the data array. Defaults to 0.Noneindicates data without any channel dimension.squeeze_end_dims (
bool) – ifTrue, any trailing singleton dimensions will be removed.kwargs – keyword arguments passed to
self.convert_to_channel_last, currently support spatial_ndim and contiguous, defaulting to 3 and False respectively.
- set_metadata(meta_dict=None, resample=True, **options)[source]#
Resample
self.dataobjif needed. This method assumesself.data_objis a ‘channel-last’ ndarray.- Parameters:
meta_dict (
UnionType[Mapping,None]) – a metadata dictionary for affine, original affine and spatial shape information. Optional keys are"spatial_shape","affine","original_affine".resample (
bool) – ifTrue, the data will be resampled to the original affine (specified inmeta_dict).options – keyword arguments passed to
self.resample_if_needed, currently supportmode,padding_mode,align_corners, anddtype, defaulting tobilinear,border,False, andnp.float64respectively.
- write(filename, verbose=False, **kwargs)[source]#
Create an ITK object from
self.create_backend_obj(self.obj, ...)and callitk.imwrite.- Parameters:
filename (
Union[str,PathLike]) – filename or PathLike object.verbose (
bool) – ifTrue, log the progress.kwargs – keyword arguments passed to
itk.imwrite, currently supportcompressionandimageio.
NibabelWriter#
- class monai.data.NibabelWriter(output_dtype=<class 'numpy.float32'>, **kwargs)[source]#
Write data and metadata into files on disk using Nibabel.
    import numpy as np
    from monai.data import NibabelWriter

    np_data = np.arange(48).reshape(3, 4, 4)
    writer = NibabelWriter()
    writer.set_data_array(np_data, channel_dim=None)
    writer.set_metadata({"affine": np.eye(4), "original_affine": np.eye(4)})
    writer.write("test1.nii.gz", verbose=True)
- __init__(output_dtype=<class 'numpy.float32'>, **kwargs)[source]#
- Parameters:
output_dtype (
Union[dtype,type,str,None]) – output data type.kwargs – keyword arguments passed to
ImageWriter.
The constructor will create
self.output_dtype internally. affine is initialized as an instance member (default None); user-specified affine should be set in set_metadata.
- classmethod create_backend_obj(data_array, affine=None, dtype=None, **kwargs)[source]#
Create a Nifti1Image object from
data_array. This method assumes a ‘channel-last’data_array.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array.affine (
Union[ndarray,Tensor,None]) – affine matrix of the data array.dtype (
Union[dtype,type,str,None]) – output data type.kwargs – keyword arguments. Current
nib.nifti1.Nifti1Imagewill readheader,extra,file_mapfrom this dictionary.
- set_data_array(data_array, channel_dim=0, squeeze_end_dims=True, **kwargs)[source]#
Convert
data_arrayinto ‘channel-last’ numpy ndarray.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array with the channel dimension specified bychannel_dim.channel_dim (
UnionType[int,None]) – channel dimension of the data array. Defaults to 0.Noneindicates data without any channel dimension.squeeze_end_dims (
bool) – ifTrue, any trailing singleton dimensions will be removed.kwargs – keyword arguments passed to
self.convert_to_channel_last, currently support spatial_ndim, defaulting to 3.
- set_metadata(meta_dict, resample=True, **options)[source]#
Resample
self.dataobjif needed. This method assumesself.data_objis a ‘channel-last’ ndarray.- Parameters:
meta_dict (
UnionType[Mapping,None]) – a metadata dictionary for affine, original affine and spatial shape information. Optional keys are"spatial_shape","affine","original_affine".resample (
bool) – ifTrue, the data will be resampled to the original affine (specified inmeta_dict).options – keyword arguments passed to
self.resample_if_needed, currently supportmode,padding_mode,align_corners, anddtype, defaulting tobilinear,border,False, andnp.float64respectively.
- write(filename, verbose=False, **obj_kwargs)[source]#
Create a Nibabel object from
self.create_backend_obj(self.obj, ...)and callnib.save.- Parameters:
filename (
Union[str,PathLike]) – filename or PathLike object.verbose (
bool) – ifTrue, log the progress.obj_kwargs – keyword arguments passed to
self.create_backend_obj.
PILWriter#
- class monai.data.PILWriter(output_dtype=<class 'numpy.float32'>, channel_dim=0, scale=255, **kwargs)[source]#
Write image data into files on disk using pillow.
It’s based on the Image module in PIL library: https://pillow.readthedocs.io/en/stable/reference/Image.html
    import numpy as np
    from monai.data import PILWriter

    np_data = np.arange(48).reshape(3, 4, 4)
    writer = PILWriter(np.uint8)
    writer.set_data_array(np_data, channel_dim=0)
    writer.write("test1.png", verbose=True)
- __init__(output_dtype=<class 'numpy.float32'>, channel_dim=0, scale=255, **kwargs)[source]#
- Parameters:
output_dtype (
Union[dtype,type,str,None]) – output data type.channel_dim (
UnionType[int,None]) – channel dimension of the data array. Defaults to 0.Noneindicates data without any channel dimension.scale (
UnionType[int,None]) – {255,65535} postprocess data by clipping to [0, 1] and scaling [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.kwargs – keyword arguments passed to
ImageWriter.
- classmethod create_backend_obj(data_array, dtype=None, scale=255, reverse_indexing=True, **kwargs)[source]#
Create a PIL object from
data_array.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array.dtype (
Union[dtype,type,str,None]) – output data type.scale (
UnionType[int,None]) – {255,65535} postprocess data by clipping to [0, 1] and scaling [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.reverse_indexing (
bool) – ifTrue, the data array’s first two dimensions will be swapped.kwargs – keyword arguments. Currently
PILImage.fromarraywill readimage_modefrom this dictionary, defaults toNone.
- classmethod get_meta_info(metadata=None)[source]#
Extracts relevant meta information from the metadata object (using
.get). Optional keys are"spatial_shape",MetaKeys.AFFINE,"original_affine".
- classmethod resample_and_clip(data_array, output_spatial_shape=None, mode=bicubic)[source]#
Resample data_array to output_spatial_shape if needed.
- Parameters:
data_array (Union[ndarray,Tensor]) – input data array. This method assumes the 'channel-last' format.
output_spatial_shape (UnionType[Sequence[int],None]) – output spatial shape.
mode (str) – interpolation mode, default is InterpolateMode.BICUBIC.
- Return type:
ndarray
- set_data_array(data_array, channel_dim=0, squeeze_end_dims=True, contiguous=False, **kwargs)[source]#
Convert
data_arrayinto ‘channel-last’ numpy ndarray.- Parameters:
data_array (
Union[ndarray,Tensor]) – input data array with the channel dimension specified bychannel_dim.channel_dim (
UnionType[int,None]) – channel dimension of the data array. Defaults to 0.Noneindicates data without any channel dimension.squeeze_end_dims (
bool) – ifTrue, any trailing singleton dimensions will be removed.contiguous (
bool) – ifTrue, the data array will be converted to a contiguous array. Default isFalse.kwargs – keyword arguments passed to
self.convert_to_channel_last, currently support spatial_ndim, defaulting to 2.
- set_metadata(meta_dict=None, resample=True, **options)[source]#
Resample
self.dataobjif needed. This method assumesself.data_objis a ‘channel-last’ ndarray.- Parameters:
meta_dict (
UnionType[Mapping,None]) – a metadata dictionary for affine, original affine and spatial shape information. Optional key is"spatial_shape".resample (
bool) – ifTrue, the data will be resampled to the spatial shape specified inmeta_dict.options – keyword arguments passed to
self.resample_if_needed, currently supportmode, defaulting tobicubic.
- write(filename, verbose=False, **kwargs)[source]#
Create a PIL image object from
self.create_backend_obj(self.obj, ...)and callsave.- Parameters:
filename (
Union[str,PathLike]) – filename or PathLike object.verbose (
bool) – ifTrue, log the progress.kwargs – optional keyword arguments passed to
self.create_backend_objcurrently supportreverse_indexing,image_mode, defaulting toTrue,Nonerespectively.
Synthetic#
- monai.data.synthetic.create_test_image_2d(height, width, num_objs=12, rad_max=30, rad_min=5, noise_max=0.0, num_seg_classes=5, channel_dim=None, random_state=None)[source]#
Return a noisy 2D image with num_objs circles and a 2D mask image. The maximum and minimum radii of the circles are given as rad_max and rad_min. The mask will have num_seg_classes number of classes for segmentations labeled sequentially from 1, plus a background class represented as 0. If noise_max is greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). If channel_dim is None, will create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim.
- Parameters:
height (
int) – height of the image. The value should be larger than 2 * rad_max.width (
int) – width of the image. The value should be larger than 2 * rad_max.num_objs (
int) – number of circles to generate. Defaults to 12.rad_max (
int) – maximum circle radius. Defaults to 30.rad_min (
int) – minimum circle radius. Defaults to 5.noise_max (
float) – if greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). Defaults to 0.num_seg_classes (
int) – number of classes for segmentations. Defaults to 5.channel_dim (
UnionType[int,None]) – if None, create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim. Defaults to None.random_state (
UnionType[RandomState,None]) – the random generator to use. Defaults to np.random.
- Return type:
tuple[ndarray,ndarray]- Returns:
Randomised Numpy array with shape (height, width)
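A quick usage sketch (pixel values depend on the supplied random state):

    import numpy as np
    from monai.data.synthetic import create_test_image_2d

    img, seg = create_test_image_2d(128, 128, num_seg_classes=2, random_state=np.random.RandomState(42))
    print(img.shape, seg.shape)  # (128, 128) (128, 128)
    print(np.unique(seg))        # background 0 plus labels drawn from {1, 2}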
- monai.data.synthetic.create_test_image_3d(height, width, depth, num_objs=12, rad_max=30, rad_min=5, noise_max=0.0, num_seg_classes=5, channel_dim=None, random_state=None)[source]#
Return a noisy 3D image and segmentation.
- Parameters:
height (
int) – height of the image. The value should be larger than 2 * rad_max.width (
int) – width of the image. The value should be larger than 2 * rad_max.depth (
int) – depth of the image. The value should be larger than 2 * rad_max.num_objs (
int) – number of circles to generate. Defaults to 12.rad_max (
int) – maximum circle radius. Defaults to 30.rad_min (
int) – minimum circle radius. Defaults to 5.noise_max (
float) – if greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). Defaults to 0.num_seg_classes (
int) – number of classes for segmentations. Defaults to 5.channel_dim (
UnionType[int,None]) – if None, create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim. Defaults to None.random_state (
UnionType[RandomState,None]) – the random generator to use. Defaults to np.random.
- Return type:
tuple[ndarray,ndarray]- Returns:
Randomised Numpy array with shape (height, width, depth)
Output folder layout#
- class monai.data.folder_layout.FolderLayout(output_dir, postfix='', extension='', parent=False, makedirs=False, data_root_dir='')[source]#
A utility class to create organized filenames within
output_dir. Thefilenamemethod could be used to create a filename following the folder structure.Example:
    from monai.data import FolderLayout

    layout = FolderLayout(
        output_dir="/test_run_1/",
        postfix="seg",
        extension="nii",
        makedirs=False,
    )
    layout.filename(subject="Sub-A", idx="00", modality="T1")
    # return value: "/test_run_1/Sub-A_seg_00_modality-T1.nii"
The output filename is a string starting with a
subjectID, and includes additional information about a customized index and image modality. This utility class doesn’t alter the underlying image data, but provides a convenient way to create filenames.- __init__(output_dir, postfix='', extension='', parent=False, makedirs=False, data_root_dir='')[source]#
- Parameters:
output_dir (
Union[str,PathLike]) – output directory.postfix (
str) – a postfix string for output file name appended tosubject.extension (
str) – output file extension to be appended to the end of an output filename.parent (
bool) – whether to add a level of parent folder to contain each image to the output filename.makedirs (
bool) – whether to create the output parent directories if they do not exist.data_root_dir (
Union[str,PathLike]) – an optional PathLike object to preserve the folder structure of the input subject. Please seemonai.data.utils.create_file_basename()for more details.
- filename(subject='subject', idx=None, **kwargs)[source]#
Create a filename based on the input
subjectandidx.The output filename is formed as:
output_dir/[subject/]subject[_postfix][_idx][_key-value][ext]- Parameters:
subject (
Union[str,PathLike]) – subject name, used as the primary id of the output filename. When a PathLike object is provided, the base filename will be used as the subject name, the extension name of subject will be ignored, in favor ofextensionfrom this class’s constructor.idx – additional index name of the image.
kwargs – additional keyword arguments to be used to form the output filename. The key-value pairs will be appended to the output filename as
f"_{k}-{v}".
- Return type:
Union[str,PathLike]
- class monai.data.folder_layout.FolderLayoutBase[source]#
Abstract base class to define a common interface for FolderLayout and derived classes Mainly, defines the
filename(**kwargs) -> PathLikefunction, which must be defined by the deriving class.Example:
    from pathlib import Path

    from monai.data import FolderLayoutBase

    class MyFolderLayout(FolderLayoutBase):
        def __init__(self, basepath: Path, extension: str = "", makedirs: bool = False):
            self.basepath = basepath
            if not extension:
                self.extension = ""
            elif extension.startswith("."):
                self.extension = extension
            else:
                self.extension = f".{extension}"
            self.makedirs = makedirs

        def filename(self, patient_no: int, image_name: str, **kwargs) -> Path:
            # convert the patient number to str so the Path division works
            sub_path = self.basepath / str(patient_no)
            if not sub_path.exists():
                sub_path.mkdir(parents=True)
            file = image_name
            for k, v in kwargs.items():
                file += f"_{k}-{v}"
            file += self.extension
            return sub_path / file
- monai.data.folder_layout.default_name_formatter(metadict, saver)[source]#
Returns a kwargs dict for
FolderLayout.filename(), according to the input metadata and SaveImage transform.- Return type:
dict
Utilities#
- monai.data.utils.affine_to_spacing(affine, r=3, dtype=<class 'float'>, suppress_zeros=True)[source]#
Compute the current spacing from the affine matrix.
- Parameters:
affine (~NdarrayTensor) – a d x d affine matrix.
r (
int) – indexing based on the spatial rank, spacing is computed from affine[:r, :r].dtype – data type of the output.
suppress_zeros (
bool) – whether to suppress the zeros with ones.
- Return type:
~NdarrayTensor
- Returns:
an r dimensional vector of spacing.
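For example, for a purely diagonal affine the spacing is simply the diagonal scaling factors:

    import numpy as np
    from monai.data.utils import affine_to_spacing

    affine = np.diag([2.0, 3.0, 4.0, 1.0])
    print(affine_to_spacing(affine, r=3))  # [2. 3. 4.]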
- monai.data.utils.compute_importance_map(patch_size, mode=constant, sigma_scale=0.125, device='cpu', dtype=torch.float32)[source]#
Get importance map for different weight modes.
- Parameters:
patch_size (
tuple[int, …]) – Size of the required importance map. This should be either H, W [,D].mode (
UnionType[BlendMode,str]) –{
"constant","gaussian"} How to blend output of overlapping windows. Defaults to"constant"."constant”: gives equal weight to all predictions."gaussian”: gives less weight to predictions on edges of windows.
sigma_scale (
UnionType[Sequence[float],float]) – Sigma_scale to calculate sigma for each dimension (sigma = sigma_scale * dim_size). Used for gaussian mode only.device (
UnionType[device,int,str]) – Device to put importance map on.dtype (
UnionType[dtype,str,None]) – Data type of the output importance map.
- Raises:
ValueError – When
modeis not one of [“constant”, “gaussian”].- Return type:
Tensor- Returns:
Tensor of size patch_size.
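For example:

    from monai.data.utils import compute_importance_map

    imp = compute_importance_map((64, 64), mode="gaussian", sigma_scale=0.125)
    print(imp.shape)  # torch.Size([64, 64]); center weights are larger than edge weights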
- monai.data.utils.compute_shape_offset(spatial_shape, in_affine, out_affine, scale_extent=False)[source]#
Given input and output affine, compute appropriate shapes in the output space based on the input array’s shape. This function also returns the offset to put the shape in a good position with respect to the world coordinate system.
- Parameters:
spatial_shape (
UnionType[ndarray,Sequence[int]]) – input array’s shapein_affine (matrix) – 2D affine matrix
out_affine (matrix) – 2D affine matrix
scale_extent (
bool) –whether the scale is computed based on the spacing or the full extent of voxels, for example, for a factor of 0.5 scaling:
option 1, “o” represents a voxel, scaling the distance between voxels:
o--o--o o-----o
option 2, each voxel has a physical extent, scaling the full voxel extent:
| voxel 1 | voxel 2 | voxel 3 | voxel 4 | | voxel 1 | voxel 2 |
Option 1 may reduce the number of locations that require interpolation. Option 2 is more resolution agnostic, that is, resampling coordinates depend on the scaling factor, not on the number of voxels. Default is False, using option 1 to compute the shape and offset.
- Return type:
tuple[ndarray,ndarray]
- monai.data.utils.convert_tables_to_dicts(dfs, row_indices=None, col_names=None, col_types=None, col_groups=None, **kwargs)[source]#
Utility to join pandas tables, select rows, columns and generate groups. Will return a list of dictionaries, every dictionary maps to a row of data in tables.
- Parameters:
dfs – data table in pandas Dataframe format. if providing a list of tables, will join them.
row_indices (
UnionType[Sequence[UnionType[int,str]],None]) – indices of the expected rows to load. it should be a list, every item can be a int number or a range [start, end) for the indices. for example: row_indices=[[0, 100], 200, 201, 202, 300]. if None, load all the rows in the file.col_names (
UnionType[Sequence[str],None]) – names of the expected columns to load. if None, load all the columns.col_types (
UnionType[dict[str,UnionType[dict[str,Any],None]],None]) –type and default value to convert the loaded columns, if None, use original data. it should be a dictionary, every item maps to an expected column, the key is the column name and the value is None or a dictionary to define the default value and data type. the supported keys in dictionary are: [“type”, “default”], and note that the value of default should not be None. for example:
    col_types = {
        "subject_id": {"type": str},
        "label": {"type": int, "default": 0},
        "ehr_0": {"type": float, "default": 0.0},
        "ehr_1": {"type": float, "default": 0.0},
    }
col_groups (
UnionType[dict[str,Sequence[str]],None]) – args to group the loaded columns to generate a new column, it should be a dictionary, every item maps to a group, the key will be the new column name, the value is the names of columns to combine. for example: col_groups={“ehr”: [f”ehr_{i}” for i in range(10)], “meta”: [“meta_1”, “meta_2”]}kwargs – additional arguments for pandas.merge() API to join tables.
- Return type:
list[dict[str,Any]]
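A minimal sketch with an in-memory table (the column names are illustrative):

    import pandas as pd
    from monai.data.utils import convert_tables_to_dicts

    df = pd.DataFrame({"subject_id": ["s1", "s2"], "label": [0, 1]})
    items = convert_tables_to_dicts(dfs=df, col_names=["subject_id", "label"])
    print(items)  # two dictionaries, one per row, e.g. {'subject_id': 's1', 'label': 0}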
- monai.data.utils.correct_nifti_header_if_necessary(img_nii)[source]#
Check nifti object header’s format, update the header if needed. In the updated image pixdim matches the affine.
- Parameters:
img_nii – nifti image object
- monai.data.utils.create_file_basename(postfix, input_file_name, folder_path, data_root_dir='', separate_folder=True, patch_index=None, makedirs=True)[source]#
Utility function to create the path to the output file based on the input filename (file name extension is not added by this function). When
data_root_diris not specified, the output file name is:folder_path/input_file_name (no ext.) /input_file_name (no ext.)[_postfix][_patch_index]
otherwise the relative path with respect to data_root_dir will be inserted, for example:

    from monai.data import create_file_basename

    create_file_basename(
        postfix="seg",
        input_file_name="/foo/bar/test1/image.png",
        folder_path="/output",
        data_root_dir="/foo/bar",
        separate_folder=True,
        makedirs=False,
    )
    # output: /output/test1/image/image_seg
- Parameters:
postfix (
str) – output name’s postfixinput_file_name (
Union[str,PathLike]) – path to the input image file.folder_path (
Union[str,PathLike]) – path for the output filedata_root_dir (
Union[str,PathLike]) – if not empty, it specifies the beginning parts of the input file’s absolute path. This is used to compute input_file_rel_path, the relative path to the file from data_root_dir to preserve folder structure when saving in case there are files in different folders with the same file names.separate_folder (
bool) – whether to save every file in a separate folder, for example: if input filename is image.nii, postfix is seg and folder_path is output, if True, save as: output/image/image_seg.nii, if False, save as output/image_seg.nii. default to True.patch_index – if not None, append the patch index to filename.
makedirs (
bool) – whether to create the folder if it does not exist.
- Return type:
str
- monai.data.utils.decollate_batch(batch, detach=True, pad=True, fill_value=None)[source]#
De-collate a batch of data (for example, as produced by a DataLoader).
Returns a list of structures with the original tensor’s 0-th dimension sliced into elements using torch.unbind.
Images originally stored as (B,C,H,W,[D]) will be returned as (C,H,W,[D]). Other information, such as metadata, may have been stored in a list (or a list inside nested dictionaries). In this case we return the element of the list corresponding to the batch idx.
Return types aren’t guaranteed to be the same as the original, since numpy arrays will have been converted to torch.Tensor, sequences may be converted to lists of tensors, mappings may be converted into dictionaries.
For example:
    batch_data = {
        "image": torch.rand((2, 1, 10, 10)),
        DictPostFix.meta("image"): {"scl_slope": torch.Tensor([0.0, 0.0])},
    }
    out = decollate_batch(batch_data)
    print(len(out))
    >>> 2
    print(out[0])
    >>> {'image': tensor([[[4.3549e-01...43e-01]]]), DictPostFix.meta("image"): {'scl_slope': 0.0}}

    batch_data = [torch.rand((2, 1, 10, 10)), torch.rand((2, 3, 5, 5))]
    out = decollate_batch(batch_data)
    print(out[0])
    >>> [tensor([[[4.3549e-01...43e-01]]], tensor([[[5.3435e-01...45e-01]]])]

    batch_data = torch.rand((2, 1, 10, 10))
    out = decollate_batch(batch_data)
    print(out[0])
    >>> tensor([[[4.3549e-01...43e-01]]])

    batch_data = {
        "image": [1, 2, 3],
        "meta": [4, 5],  # undetermined batch size
    }
    out = decollate_batch(batch_data, pad=True, fill_value=0)
    print(out)
    >>> [{'image': 1, 'meta': 4}, {'image': 2, 'meta': 5}, {'image': 3, 'meta': 0}]
    out = decollate_batch(batch_data, pad=False)
    print(out)
    >>> [{'image': 1, 'meta': 4}, {'image': 2, 'meta': 5}]
- Parameters:
batch – data to be de-collated.
detach (
bool) – whether to detach the tensors. Scalar tensors will be detached into number types instead of torch tensors.
fill_value – when pad is True, the fillvalue to use when padding, defaults to None.
- monai.data.utils.dense_patch_slices(image_size, patch_size, scan_interval, return_slice=True)[source]#
Enumerate all slices defining ND patches of size patch_size from an image_size input image.
- Parameters:
image_size (
Sequence[int]) – dimensions of image to iterate overpatch_size (
Sequence[int]) – size of patches to generate slicesscan_interval (
Sequence[int]) – dense patch sampling intervalreturn_slice (
bool) – whether to return a list of slices (or tuples of indices), defaults to True
- Return type:
list[tuple[slice, …]]- Returns:
a list of slice objects defining each patch
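For example:

    from monai.data.utils import dense_patch_slices

    slices = dense_patch_slices(image_size=(10, 10), patch_size=(5, 5), scan_interval=(5, 5))
    print(len(slices))  # 4: the 5x5 windows tile the 10x10 image
    print(slices[0])    # (slice(0, 5, None), slice(0, 5, None))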
- monai.data.utils.get_extra_metadata_keys()[source]#
Get a list of unnecessary keys for metadata that can be removed.
- Return type:
list[str]- Returns:
List of keys to be removed.
- monai.data.utils.get_random_patch(dims, patch_size, rand_state=None)[source]#
Returns a tuple of slices to define a random patch in an array of shape dims with size patch_size, or as close to it as possible within the given dimension. It is expected that patch_size is a valid patch for a source of shape dims as returned by get_valid_patch_size.
- Parameters:
dims (
Sequence[int]) – shape of source arraypatch_size (
Sequence[int]) – shape of patch size to generaterand_state (
UnionType[RandomState,None]) – a random state object to generate random numbers from
- Returns:
a tuple of slice objects defining the patch
- Return type:
(tuple of slice)
- monai.data.utils.get_valid_patch_size(image_size, patch_size)[source]#
Given an image of dimensions image_size, return a patch size tuple taking the dimension from patch_size if this is not 0/None. Otherwise, or if patch_size is shorter than image_size, the dimension from image_size is taken. This ensures the returned patch size is within the bounds of image_size. If patch_size is a single number this is interpreted as a patch of the same dimensionality of image_size with that size in each dimension.
- Return type:
tuple[int, …]
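A combined sketch of get_valid_patch_size and get_random_patch above:

    import numpy as np
    from monai.data.utils import get_random_patch, get_valid_patch_size

    dims = (64, 64)
    patch_size = get_valid_patch_size(dims, (32, 0))  # 0 selects the whole dimension
    print(patch_size)  # (32, 64)
    slices = get_random_patch(dims, patch_size, rand_state=np.random.RandomState(0))
    print(slices)  # e.g. (slice(12, 44, None), slice(0, 64, None))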
- monai.data.utils.is_no_channel(val)[source]#
Returns whether val indicates “no_channel”, for MetaKeys.ORIGINAL_CHANNEL_DIM.
- Return type:
bool
- monai.data.utils.is_supported_format(filename, suffixes)[source]#
Verify whether the format of the specified file or files matches the supported suffixes. If suffixes is None, skip the verification and return True.
- Parameters:
filename (
Union[Sequence[Union[str,PathLike]],str,PathLike]) – file name or a list of file names to read. if a list of files, verify all the suffixes.suffixes (
Sequence[str]) – all the supported image suffixes of current reader, must be a list of lower case suffixes.
- Return type:
bool
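For example:

    from monai.data.utils import is_supported_format

    print(is_supported_format("image.nii.gz", ["nii", "nii.gz"]))  # True
    print(is_supported_format(["a.png", "b.jpg"], ["png"]))        # False: "b.jpg" does not match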
- monai.data.utils.iter_patch(arr, patch_size=0, start_pos=(), overlap=0.0, copy_back=True, mode=wrap, **pad_opts)[source]#
Yield successive patches from arr of size patch_size. The iteration can start from position start_pos in arr but drawing from a padded array extended by the patch_size in each dimension (so these coordinates can be negative to start in the padded region). If copy_back is True the values from each patch are written back to arr.
- Parameters:
arr (
Union[ndarray,Tensor]) – array to iterate overpatch_size (
UnionType[Sequence[int],int]) – size of patches to generate slices for, 0 or None selects whole dimension. For 0 or None, padding and overlap ratio of the corresponding dimension will be 0.start_pos (
Sequence[int]) – starting position in the array, default is 0 for each dimensionoverlap (
UnionType[Sequence[float],float]) – the amount of overlap of neighboring patches in each dimension (a value between 0.0 and 1.0). If only one float number is given, it will be applied to all dimensions. Defaults to 0.0.copy_back (
bool) – if True data from the yielded patches is copied back to arr once the generator completesmode (
UnionType[str,None]) – available modes: (Numpy) {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"} (PyTorch) {"constant", "reflect", "replicate", "circular"}. One of the listed string values or a user supplied function. If None, no wrapping is performed. Defaults to "wrap". See also: https://numpy.org/doc/stable/reference/generated/numpy.pad.html https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html requires pytorch >= 1.10 for best compatibility.pad_opts (
dict) – other arguments for the np.pad or torch.pad function. note that np.pad treats channel dimension as the first dimension.
- Yields:
Patches of array data from arr which are views into a padded array which can be modified, if copy_back is True these changes will be reflected in arr once the iteration completes.
Note
coordinate format is:
- [1st_dim_start, 1st_dim_end, 2nd_dim_start, 2nd_dim_end, …, Nth_dim_start, Nth_dim_end]
- Return type:
Generator[tuple[Union[ndarray,Tensor],ndarray],None,None]
- monai.data.utils.iter_patch_position(image_size, patch_size, start_pos=(), overlap=0.0, padded=False)[source]#
Yield successive tuples of upper left corner of patches of size patch_size from an array of dimensions image_size. The iteration starts from position start_pos in the array, or starting at the origin if this isn’t provided. Each patch is chosen in a contiguous grid using a row-major ordering.
- Parameters:
image_size (
Sequence[int]) – dimensions of array to iterate overpatch_size (
UnionType[Sequence[int],int,ndarray]) – size of patches to generate slices for, 0 or None selects whole dimensionstart_pos (
Sequence[int]) – starting position in the array, default is 0 for each dimensionoverlap (
UnionType[Sequence[float],float,Sequence[int],int]) – the amount of overlap of neighboring patches in each dimension. Either a float or list of floats between 0.0 and 1.0 to define relative overlap to patch size, or an int or list of ints to define number of pixels for overlap. If only one float/int number is given, it will be applied to all dimensions. Defaults to 0.0.padded (
bool) – if the image is padded so the patches can go beyond the borders. Defaults to False.
- Yields:
Tuples of positions defining the upper left corner of each patch
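For example:

    from monai.data.utils import iter_patch_position

    print(list(iter_patch_position(image_size=(4, 4), patch_size=(2, 2))))
    # [(0, 0), (0, 2), (2, 0), (2, 2)]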
- monai.data.utils.iter_patch_slices(image_size, patch_size, start_pos=(), overlap=0.0, padded=True)[source]#
Yield successive tuples of slices defining patches of size patch_size from an array of dimensions image_size. The iteration starts from position start_pos in the array, or starting at the origin if this isn’t provided. Each patch is chosen in a contiguous grid using a row-major ordering.
- Parameters:
image_size (
Sequence[int]) – dimensions of array to iterate overpatch_size (
UnionType[Sequence[int],int]) – size of patches to generate slices for, 0 or None selects whole dimensionstart_pos (
Sequence[int]) – starting position in the array, default is 0 for each dimensionoverlap (
UnionType[Sequence[float],float]) – the amount of overlap of neighboring patches in each dimension (a value between 0.0 and 1.0). If only one float number is given, it will be applied to all dimensions. Defaults to 0.0.padded (
bool) – if the image is padded so the patches can go beyond the borders. Defaults to True.
- Yields:
Tuples of slice objects defining each patch
- Return type:
Generator[tuple[slice, …],None,None]
- monai.data.utils.json_hashing(item)[source]#
- Parameters:
item – data item to be hashed
Returns: the corresponding hash key
- Return type:
bytes
- monai.data.utils.list_data_collate(batch)[source]#
Enhancement for the PyTorch DataLoader default collate. If the dataset already returns a list of batch data generated by the transforms, all the data needs to be merged into one list. Then it is the same as the default collate behavior.
Note
Use this collate function when applying transforms that can generate batch data.
- monai.data.utils.orientation_ras_lps(affine)[source]#
Convert the
affinebetween the RAS and LPS orientation by flipping the first two spatial dimensions.- Parameters:
affine (~NdarrayTensor) – a 2D affine matrix.
- Return type:
~NdarrayTensor
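For example, applied to an identity affine:

    import numpy as np
    from monai.data.utils import orientation_ras_lps

    print(orientation_ras_lps(np.eye(4)))
    # the first two spatial axes are flipped, i.e. diag(-1, -1, 1, 1)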
- monai.data.utils.pad_list_data_collate(batch, method=symmetric, mode=constant, **kwargs)[source]#
Function version of
monai.transforms.croppad.batch.PadListDataCollate.Same as MONAI’s
list_data_collate, except any tensors are centrally padded to match the shape of the biggest tensor in each dimension. This transform is useful if some of the applied transforms generate batch data of different sizes.This can be used on both list and dictionary data. Note that in the case of the dictionary data, this decollate function may add the transform information of PadListDataCollate to the list of invertible transforms if input batch have different spatial shape, so need to call static method: monai.transforms.croppad.batch.PadListDataCollate.inverse before inverting other transforms.
- Parameters:
batch (
Sequence) – batch of data to pad-collatemethod (
str) – padding method (seemonai.transforms.SpatialPad)mode (
str) – padding mode (seemonai.transforms.SpatialPad)kwargs – other arguments for the np.pad or torch.pad function. note that np.pad treats channel dimension as the first dimension.
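A short sketch of the padding behavior (the shapes are illustrative):

    import torch
    from monai.data.utils import pad_list_data_collate

    batch = [torch.zeros(1, 8, 8), torch.zeros(1, 10, 10)]
    out = pad_list_data_collate(batch)
    print(out.shape)  # torch.Size([2, 1, 10, 10]): the smaller item is padded to match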
- monai.data.utils.partition_dataset(data, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]#
Split the dataset into N partitions. It supports shuffling based on a specified random seed. It returns a set of datasets, where every dataset contains one partition of the original dataset. The dataset can be split based on specified ratios, or evenly into num_partitions. Refer to: https://pytorch.org/docs/stable/distributed.html#module-torch.distributed.launch.
Note
It also can be used to partition dataset for ranks in distributed training. For example, partition dataset before training and use CacheDataset, every rank trains with its own data. It can avoid duplicated caching content in each rank, but will not do global shuffle before every epoch:
    data_partition = partition_dataset(
        data=train_files,
        num_partitions=dist.get_world_size(),
        shuffle=True,
        even_divisible=True,
    )[dist.get_rank()]

    train_ds = SmartCacheDataset(
        data=data_partition,
        transform=train_transforms,
        replace_rate=0.2,
        cache_num=15,
    )
- Parameters:
data (
Sequence) – input dataset to split, expect a list of data.ratios (
UnionType[Sequence[float],None]) – a list of ratio number to split the dataset, like [8, 1, 1].num_partitions (
UnionType[int,None]) – expected number of the partitions to evenly split, only works when ratios not specified.shuffle (
bool) – whether to shuffle the original dataset before splitting.seed (
int) – random seed to shuffle the dataset, only works when shuffle is True.drop_last (
bool) – only works when even_divisible is False and no ratios specified. if True, will drop the tail of the data to make it evenly divisible across partitions. if False, will add extra indices to make the data evenly divisible across partitions.even_divisible (
bool) – if True, guarantee every partition has same length.
Examples:
    >>> data = [1, 2, 3, 4, 5]
    >>> partition_dataset(data, ratios=[0.6, 0.2, 0.2], shuffle=False)
    [[1, 2, 3], [4], [5]]
    >>> partition_dataset(data, num_partitions=2, shuffle=False)
    [[1, 3, 5], [2, 4]]
    >>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=True)
    [[1, 3], [2, 4]]
    >>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=False)
    [[1, 3, 5], [2, 4, 1]]
    >>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=False, drop_last=False)
    [[1, 3, 5], [2, 4]]
- monai.data.utils.partition_dataset_classes(data, classes, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]#
Split the dataset into N partitions based on the given class labels. It ensures the same ratio of classes in every partition. Other behaviors are the same as
monai.data.partition_dataset.- Parameters:
data (
Sequence) – input dataset to split, expect a list of data.classes (
Sequence[int]) – a list of labels to help split the data, the length must match the length of data.ratios (
UnionType[Sequence[float],None]) – a list of ratio number to split the dataset, like [8, 1, 1].num_partitions (
UnionType[int,None]) – expected number of the partitions to evenly split, only works when no ratios.shuffle (
bool) – whether to shuffle the original dataset before splitting.seed (
int) – random seed to shuffle the dataset, only works when shuffle is True.drop_last (
bool) – only works when even_divisible is False and no ratios specified. if True, will drop the tail of the data to make it evenly divisible across partitions. if False, will add extra indices to make the data evenly divisible across partitions.even_divisible (
bool) – if True, guarantee every partition has same length.
Examples:
    >>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
    >>> classes = [2, 0, 2, 1, 3, 2, 2, 0, 2, 0, 3, 3, 1, 3]
    >>> partition_dataset_classes(data, classes, shuffle=False, ratios=[2, 1])
    [[2, 8, 4, 1, 3, 6, 5, 11, 12], [10, 13, 7, 9, 14]]
- monai.data.utils.pickle_hashing(item, protocol=5)[source]#
- Parameters:
item – data item to be hashed
protocol – protocol version used for pickling, defaults to pickle.HIGHEST_PROTOCOL.
Returns: the corresponding hash key
- Return type:
bytes
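For instance, the hash key can serve as a deterministic cache identifier for a datalist item; a minimal sketch (the item dict below is hypothetical):
>>> from monai.data.utils import pickle_hashing
>>> item = {"image": "chest_19.nii.gz", "label": "chest_19_seg.nii.gz"}
>>> key = pickle_hashing(item)  # the same dict content always yields the same key
>>> isinstance(key, bytes)
True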
- monai.data.utils.rectify_header_sform_qform(img_nii)[source]#
Look at the sform and qform of the nifti object and correct them if there are any incompatibilities with the pixel dimensions.
Adapted from NifTK/NiftyNet
- Parameters:
img_nii – nifti image object
- monai.data.utils.remove_extra_metadata(meta)[source]#
Remove extra metadata from the dictionary. Operates in-place so nothing is returned.
- Parameters:
meta (dict) – dictionary containing metadata to be modified.
- Return type:
None
- Returns:
None
- monai.data.utils.remove_keys(data, keys)[source]#
Remove keys from a dictionary. Operates in-place so nothing is returned.
- Parameters:
data (dict) – dictionary to be modified.
keys (list[str]) – keys to be deleted from the dictionary.
- Return type:
None
- Returns:
None
- monai.data.utils.reorient_spatial_axes(data_shape, init_affine, target_affine)[source]#
Given the input init_affine, compute the orientation transform between it and target_affine by rearranging/flipping the axes.
Returns the orientation transform and the updated affine (tensor or ndarray, depending on the input affine data type). Note that this function requires the external module nibabel.orientations.
- Return type:
tuple[ndarray, Union[ndarray, Tensor]]
- monai.data.utils.resample_datalist(data, factor, random_pick=False, seed=0)[source]#
Utility function to resample the loaded datalist for training. For example: if factor < 1.0, randomly pick part of the datalist to build the Dataset, which is useful to quickly test the program; if factor > 1.0, repeat the datalist to enlarge the Dataset.
- Parameters:
data (Sequence) – original datalist to scale.
factor (float) – scale factor for the datalist. For example, factor=4.5 repeats the datalist 4 times plus 50% of the original datalist.
random_pick (bool) – whether to randomly pick data if the scale factor has a decimal part.
seed (int) – random seed to randomly pick data.
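A minimal sketch of the scaling behaviour on a toy datalist (the outputs are indicative; with random_pick=True the kept subset depends on the seed):
from monai.data.utils import resample_datalist

data = [1, 2, 3, 4]
# factor > 1.0: repeat the list, e.g. 2.5 -> two full copies plus half of the list
print(resample_datalist(data, 2.5))
# factor < 1.0: keep only part of the list, optionally picked at random
print(resample_datalist(data, 0.5, random_pick=True, seed=0))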
- monai.data.utils.select_cross_validation_folds(partitions, folds)[source]#
Select cross validation data based on data partitions and a specified fold index. If a list of fold indices is provided, the partitions of these folds are concatenated.
- Parameters:
partitions (Sequence[Iterable]) – a sequence of datasets, each item is an iterable.
folds (UnionType[Sequence[int], int]) – the indices of the partitions to be combined.
- Return type:
list
- Returns:
A list of combined datasets.
Example:
>>> partitions = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
>>> select_cross_validation_folds(partitions, 2)
[5, 6]
>>> select_cross_validation_folds(partitions, [1, 2])
[3, 4, 5, 6]
>>> select_cross_validation_folds(partitions, [-1, 2])
[9, 10, 5, 6]
- monai.data.utils.set_rnd(obj, seed)[source]#
Set seed or random state for all randomizable properties of obj.
- Parameters:
obj – object to set seed or random state for.
seed (int) – set the random state with an integer seed.
- Return type:
int
- monai.data.utils.sorted_dict(item, key=None, reverse=False)[source]#
Return a new sorted dictionary from the item.
- monai.data.utils.to_affine_nd(r, affine, dtype=<class 'numpy.float64'>)[source]#
Using elements from affine, create a new affine matrix by assigning the rotation/zoom/scaling matrix and the translation vector.
When r is an integer, the output is an (r+1)x(r+1) matrix, where the top-left kxk elements are copied from affine and the last column of the output affine is copied from affine’s last column. k is determined by min(r, len(affine) - 1).
When r is an affine matrix, the output has the same shape as r; the top-left kxk elements are copied from affine and the last column of the output affine is copied from affine’s last column. k is determined by min(len(r) - 1, len(affine) - 1).
- Parameters:
r (int or matrix) – number of spatial dimensions or an output affine to be filled.
affine (matrix) – 2D affine matrix
dtype – data type of the output array.
- Raises:
ValueError – When affine is not 2-dimensional.
ValueError – When r is nonpositive.
- Return type:
~NdarrayTensor
- Returns:
an (r+1) x (r+1) matrix (tensor or ndarray, depending on the input affine data type)
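As a quick sketch, promoting a 2D affine to 3D copies the top-left 2x2 block and the translation column, while the remaining entries come from the identity:
import numpy as np
from monai.data.utils import to_affine_nd

affine_2d = np.array([[2.0, 0.0, 10.0],
                      [0.0, 3.0, 20.0],
                      [0.0, 0.0, 1.0]])
affine_3d = to_affine_nd(3, affine_2d)  # 4x4 output; here k = min(3, 2) = 2
print(affine_3d)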
- monai.data.utils.worker_init_fn(worker_id)[source]#
Callback function for PyTorch DataLoader worker_init_fn. It can set different random seed for the transforms in different workers.
- Return type:
None
- monai.data.utils.zoom_affine(affine, scale, diagonal=True)[source]#
Make the column norm of affine the same as scale. If diagonal is False, returns an affine that combines orthogonal rotation and the new scale. This is done by first decomposing affine, then setting the zoom factors to scale, and composing a new affine; the shearing factors are removed. If diagonal is True, returns a diagonal matrix; the scaling factors are set to the diagonal elements. This function always returns an affine with zero translations.
- Parameters:
affine (nxn matrix) – a square matrix.
scale (UnionType[ndarray, Sequence[float]]) – new scaling factor along each dimension. If the components of the scale are non-positive values, the corresponding components of the original pixdim, which is computed from the affine, will be used instead.
diagonal (bool) – whether to return a diagonal scaling matrix. Defaults to True.
- Raises:
ValueError – When affine is not a square matrix.
ValueError – When scale contains a nonpositive scalar.
- Returns:
the updated n x n affine.
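A minimal sketch: rescaling an anisotropic affine so that every column norm (voxel size) becomes 1.0:
import numpy as np
from monai.data.utils import zoom_affine

affine = np.diag([3.0, 2.0, 1.0, 1.0])  # original voxel sizes (3, 2, 1)
new_affine = zoom_affine(affine, scale=(1.0, 1.0, 1.0), diagonal=True)
print(new_affine)  # diagonal matrix with unit scaling and zero translations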
Partition Dataset#
- monai.data.partition_dataset(data, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]#
Split the dataset into N partitions. It supports shuffling based on a specified random seed, and returns a set of datasets, where every dataset contains one partition of the original data. The dataset can be split based on specified ratios or evenly split into num_partitions. Refer to: https://pytorch.org/docs/stable/distributed.html#module-torch.distributed.launch.
Note
It can also be used to partition the dataset for ranks in distributed training. For example, partition the dataset before training and use CacheDataset, so that every rank trains with its own data. This avoids duplicating cached content across ranks, but will not do a global shuffle before every epoch:
data_partition = partition_dataset(
    data=train_files,
    num_partitions=dist.get_world_size(),
    shuffle=True,
    even_divisible=True,
)[dist.get_rank()]

train_ds = SmartCacheDataset(
    data=data_partition,
    transform=train_transforms,
    replace_rate=0.2,
    cache_num=15,
)
- Parameters:
data (Sequence) – input dataset to split, expects a list of data.
ratios (UnionType[Sequence[float], None]) – a list of ratio numbers to split the dataset, like [8, 1, 1].
num_partitions (UnionType[int, None]) – expected number of partitions to evenly split, only works when ratios is not specified.
shuffle (bool) – whether to shuffle the original dataset before splitting.
seed (int) – random seed to shuffle the dataset, only works when shuffle is True.
drop_last (bool) – only works when even_divisible is False and no ratios are specified. If True, will drop the tail of the data to make it evenly divisible across partitions; if False, will add extra indices to make the data evenly divisible across partitions.
even_divisible (bool) – if True, guarantees every partition has the same length.
Examples:
>>> data = [1, 2, 3, 4, 5]
>>> partition_dataset(data, ratios=[0.6, 0.2, 0.2], shuffle=False)
[[1, 2, 3], [4], [5]]
>>> partition_dataset(data, num_partitions=2, shuffle=False)
[[1, 3, 5], [2, 4]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=True)
[[1, 3], [2, 4]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=False)
[[1, 3, 5], [2, 4, 1]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=False, drop_last=False)
[[1, 3, 5], [2, 4]]
Partition Dataset based on classes#
- monai.data.partition_dataset_classes(data, classes, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]#
Split the dataset into N partitions based on the given class labels. It can ensure the same ratio of classes in every partition. Other behaviour is the same as monai.data.partition_dataset.
- Parameters:
data (Sequence) – input dataset to split, expects a list of data.
classes (Sequence[int]) – a list of labels to help split the data; the length must match the length of data.
ratios (UnionType[Sequence[float], None]) – a list of ratio numbers to split the dataset, like [8, 1, 1].
num_partitions (UnionType[int, None]) – expected number of partitions to evenly split, only works when no ratios are specified.
shuffle (bool) – whether to shuffle the original dataset before splitting.
seed (int) – random seed to shuffle the dataset, only works when shuffle is True.
drop_last (bool) – only works when even_divisible is False and no ratios are specified. If True, will drop the tail of the data to make it evenly divisible across partitions; if False, will add extra indices to make the data evenly divisible across partitions.
even_divisible (bool) – if True, guarantees every partition has the same length.
Examples:
>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> classes = [2, 0, 2, 1, 3, 2, 2, 0, 2, 0, 3, 3, 1, 3]
>>> partition_dataset_classes(data, classes, shuffle=False, ratios=[2, 1])
[[2, 8, 4, 1, 3, 6, 5, 11, 12], [10, 13, 7, 9, 14]]
DistributedSampler#
- class monai.data.DistributedSampler(dataset, even_divisible=True, num_replicas=None, rank=None, shuffle=True, **kwargs)[source]#
Enhance PyTorch DistributedSampler to support non-evenly divisible sampling.
- Parameters:
dataset (Dataset) – Dataset used for sampling.
even_divisible (bool) – if False, different ranks can have different data lengths. For example, input data: [1, 2, 3, 4, 5]; rank 0: [1, 3, 5]; rank 1: [2, 4].
num_replicas (UnionType[int, None]) – number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group.
rank (UnionType[int, None]) – rank of the current process within num_replicas. By default, rank is retrieved from the current distributed group.
shuffle (bool) – if True, the sampler will shuffle the indices. Defaults to True.
kwargs – additional arguments for the DistributedSampler super class, e.g. seed and drop_last.
For more information about DistributedSampler, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler.
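A minimal usage sketch; it assumes torch.distributed has already been initialised (e.g. by torchrun), and train_files / train_transforms are hypothetical placeholders defined elsewhere:
from monai.data import DataLoader, Dataset, DistributedSampler

# train_files / train_transforms: hypothetical datalist and transforms
dataset = Dataset(data=train_files, transform=train_transforms)
sampler = DistributedSampler(dataset, even_divisible=True, shuffle=True)
loader = DataLoader(dataset, batch_size=2, sampler=sampler)
for epoch in range(10):
    sampler.set_epoch(epoch)  # vary the shuffling across epochs
    for batch in loader:
        ...  # training step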
DistributedWeightedRandomSampler#
- class monai.data.DistributedWeightedRandomSampler(dataset, weights, num_samples_per_rank=None, generator=None, even_divisible=True, num_replicas=None, rank=None, **kwargs)[source]#
Extend the DistributedSampler to support weighted sampling. Refer to torch.utils.data.WeightedRandomSampler, for more details please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.WeightedRandomSampler.
- Parameters:
dataset (Dataset) – Dataset used for sampling.
weights (Sequence[float]) – a sequence of weights, not necessarily summing up to one; the length should exactly match the full dataset.
num_samples_per_rank (UnionType[int, None]) – number of samples to draw for every rank, sampled from the distributed subset of the dataset. If None, defaults to the length of the dataset split by DistributedSampler.
generator (UnionType[Generator, None]) – PyTorch Generator used in sampling.
even_divisible (bool) – if False, different ranks can have different data lengths. For example, input data: [1, 2, 3, 4, 5]; rank 0: [1, 3, 5]; rank 1: [2, 4].
num_replicas (UnionType[int, None]) – number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group.
rank (UnionType[int, None]) – rank of the current process within num_replicas. By default, rank is retrieved from the current distributed group.
kwargs – additional arguments for the DistributedSampler super class, e.g. seed and drop_last.
DatasetSummary#
- class monai.data.DatasetSummary(dataset, image_key='image', label_key='label', meta_key=None, meta_key_postfix='meta_dict', num_workers=0, **kwargs)[source]#
This class provides a way to calculate a reasonable output voxel spacing according to the input dataset. The achieved values can be used to resample the input in 3d segmentation tasks (e.g., as the pixdim parameter in monai.transforms.Spacingd). In addition, it also supports computing the mean, std, min and max intensities of the input, and these statistics are helpful for image normalization (as parameters of monai.transforms.ScaleIntensityRanged and monai.transforms.NormalizeIntensityd).
The algorithm for calculation refers to: Automated Design of Deep Learning Methods for Biomedical Image Segmentation.
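A minimal sketch of the typical workflow; datalist is a hypothetical list of {"image": ..., "label": ...} dicts, and the spacing/intensity helpers used are this class's get_target_spacing and calculate_statistics methods:
from monai.data import Dataset, DatasetSummary
from monai.transforms import LoadImaged

ds = Dataset(data=datalist, transform=LoadImaged(keys=["image", "label"]))
summary = DatasetSummary(ds, num_workers=4)
pixdim = summary.get_target_spacing()       # e.g. pass to monai.transforms.Spacingd
summary.calculate_statistics()
print(summary.data_mean, summary.data_std)  # intensity statistics for normalization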
Decathlon Datalist#
- monai.data.load_decathlon_datalist(data_list_file_path, is_segmentation=True, data_list_key='training', base_dir=None)[source]#
Load image/label paths of the Decathlon challenge from a JSON file.
JSON file should follow the format of the Medical Segmentation Decathlon datalist.json files, see http://medicaldecathlon.com. The files are structured as follows:
{ "metadata_key_0": "metadata_value_0", "metadata_key_1": "metadata_value_1", ..., "training": [ {"image": "path/to/image_1.nii.gz", "label": "path/to/label_1.nii.gz"}, {"image": "path/to/image_2.nii.gz", "label": "path/to/label_2.nii.gz"}, ... ], "test": [ "path/to/image_3.nii.gz", "path/to/image_4.nii.gz", ... ] }
- The metadata keys are optional for loading the datalist, but include:
some string items: name, description, reference, licence, release, tensorImageSize;
two dict items: modality (keyed by channel index) and labels (keyed by label index);
and two integer items: numTraining and numTest, with the number of items.
The training key contains a list of dictionaries, each of which has at least the image and label keys. The image and label are loaded by monai.transforms.LoadImaged(), so both can be either a single file path or a list of file paths, in which case they are loaded as multi-channel images. Each item can also include a fold key for cross-validation purposes. The “test” key contains a list of image paths without labels. MONAI also supports a “validation” list with the same format as the “training” list.
- Parameters:
data_list_file_path (Union[str, PathLike]) – the path to the JSON file of the datalist.
is_segmentation (bool) – whether the datalist is for a segmentation task. Default is True.
data_list_key (str) – the key to get a list of dictionaries to be used. Default is “training”.
base_dir (Union[str, PathLike, None]) – the base directory of the dataset. If None, use the datalist directory.
- Raises:
ValueError – When data_list_file_path does not point to a file.
ValueError – When data_list_key is not specified in the data list file.
Returns a list of data items, each of which is a dict keyed by element names, for example:
[
    {'image': '/workspace/data/chest_19.nii.gz', 'label': '/workspace/labels/chest_19.nii.gz'},
    {'image': '/workspace/data/chest_31.nii.gz', 'label': '/workspace/labels/chest_31.nii.gz'},
]
- Return type:
list[dict]
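A minimal usage sketch (the dataset.json path is hypothetical and follows the format shown above):
from monai.data import load_decathlon_datalist

train_list = load_decathlon_datalist(
    "Task09_Spleen/dataset.json", data_list_key="training", base_dir="Task09_Spleen"
)
print(train_list[0])  # e.g. {'image': '.../imagesTr/...', 'label': '.../labelsTr/...'}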
- monai.data.load_decathlon_properties(data_property_file_path, property_keys)[source]#
Extract the properties with the specified keys from the Decathlon JSON file. See under load_decathlon_datalist for the expected keys in the Decathlon challenge.
- Parameters:
data_property_file_path (Union[str, PathLike]) – the path to the JSON file of data properties.
property_keys (UnionType[Sequence[str], str]) – expected keys to load from the JSON file. For example, the Decathlon challenge has these keys: name, description, reference, licence, tensorImageSize, modality, labels, numTraining, numTest, etc.
- Return type:
dict
- monai.data.check_missing_files(datalist, keys, root_dir=None, allow_missing_keys=False)[source]#
Checks whether some files in the Decathlon datalist are missing. It would be helpful to check missing files before a heavy training run.
- Parameters:
datalist (list[dict]) – a list of data items; every item is a dictionary, usually generated by the load_decathlon_datalist API.
keys (Union[Collection[Hashable], Hashable]) – expected keys to check in the datalist.
root_dir (Union[str, PathLike, None]) – if not None, provides the root dir for the relative file paths in the datalist.
allow_missing_keys (bool) – whether to allow missing keys in the datalist items. If False, raise an exception if any are missing. Defaults to False.
- Returns:
A list of missing filenames.
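For example, a datalist could be validated before training with a sketch like this (datalist is assumed to come from load_decathlon_datalist):
from monai.data import check_missing_files

missing = check_missing_files(datalist, keys=["image", "label"])
if missing:
    raise RuntimeError(f"missing files in datalist: {missing}")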
- monai.data.create_cross_validation_datalist(datalist, nfolds, train_folds, val_folds, train_key='training', val_key='validation', filename=None, shuffle=True, seed=0, check_missing=False, keys=None, root_dir=None, allow_missing_keys=False, raise_error=True)[source]#
Utility to create new Decathlon style datalist based on cross validation partition.
- Parameters:
datalist (list[dict]) – loaded list of dictionaries for all the items to partition.
nfolds (int) – number of folds in the k-fold split.
train_folds (UnionType[Sequence[int], int]) – indices of folds for the training part.
val_folds (UnionType[Sequence[int], int]) – indices of folds for the validation part.
train_key (str) – the key of the training part in the new datalist. Defaults to “training”.
val_key (str) – the key of the validation part in the new datalist. Defaults to “validation”.
filename (UnionType[Path, str, None]) – if not None and ends with “.json”, save the new datalist into a JSON file.
shuffle (bool) – whether to shuffle the datalist before partitioning. Defaults to True.
seed (int) – if shuffle is True, set the random seed. Defaults to 0.
check_missing (bool) – whether to check that all the files specified by keys exist.
keys (Union[Collection[Hashable], Hashable, None]) – if not None and check_missing is True, the expected keys to check in the datalist.
root_dir (UnionType[str, None]) – if not None, provides the root dir for the relative file paths in the datalist.
allow_missing_keys (bool) – if check_missing is True, whether to allow missing keys in the datalist items; if False, raise an exception if any are missing. Defaults to False.
raise_error (bool) – when missing files are found, if True, raise an exception and stop; if False, print a warning.
DataLoader#
- class monai.data.DataLoader(dataset, num_workers=0, **kwargs)[source]#
Provides an iterable over the given dataset. It inherits the PyTorch DataLoader and adds an enhanced collate_fn and worker_init_fn by default.
Although this class could be configured to be the same as torch.utils.data.DataLoader, its default configuration is recommended, mainly for the following extra features:
It handles MONAI randomizable objects with appropriate random state management for deterministic behaviour.
It is aware of the patch-based transform (such as monai.transforms.RandSpatialCropSamplesDict) samples for preprocessing with enhanced data collating behaviour. See: monai.transforms.Compose.
For more details about torch.utils.data.DataLoader, please see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader.
For example, to construct a randomized dataset and iterate with the data loader:
import torch
from monai.data import DataLoader
from monai.transforms import Randomizable


class RandomDataset(torch.utils.data.Dataset, Randomizable):
    def __getitem__(self, index):
        return self.R.randint(0, 1000, (1,))

    def __len__(self):
        return 16


dataset = RandomDataset()
dataloader = DataLoader(dataset, batch_size=2, num_workers=4)
for epoch in range(2):
    for i, batch in enumerate(dataloader):
        print(epoch, i, batch.data.numpy().flatten().tolist())
- Parameters:
dataset (Dataset) – dataset from which to load the data.
num_workers (int) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
collate_fn – defaults to monai.data.utils.list_data_collate().
worker_init_fn – defaults to monai.data.utils.worker_init_fn().
kwargs – other parameters for the PyTorch DataLoader.
ThreadBuffer#
- class monai.data.ThreadBuffer(src, buffer_size=1, timeout=0.01)[source]#
Iterates over values from self.src in a separate thread while yielding them in the current thread. This allows values to be queued up asynchronously. The internal thread will continue running so long as the source has values or until the stop() method is called.
One issue raised by using a thread in this way is that, during the lifetime of the thread, the source object is being iterated over; if the thread hasn’t finished, another attempt to iterate over the source will raise an exception or yield unexpected results. To ensure the thread releases the iteration and proper cleanup is done, the stop() method must be called, which will join with the thread.
- Parameters:
src – Source data iterable
buffer_size (int) – Number of items to buffer from the source.
timeout (float) – Time to wait for an item from the buffer, or to wait while the buffer is full when adding items.
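A minimal sketch wrapping a plain iterable; in practice src is often a DataLoader:
from monai.data import ThreadBuffer

buffer = ThreadBuffer(src=range(10), buffer_size=2)
for item in buffer:
    print(item)  # items are produced by a background thread
buffer.stop()    # explicitly join the worker thread (required if iteration stops early)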
ThreadDataLoader#
- class monai.data.ThreadDataLoader(dataset, buffer_size=1, buffer_timeout=0.01, repeats=1, use_thread_workers=False, **kwargs)[source]#
Subclass of DataLoader using a ThreadBuffer object to implement __iter__ method asynchronously. This will iterate over data from the loader as expected however the data is generated on a separate thread. Use this class where a DataLoader instance is required and not just an iterable object.
The default behaviour with repeats set to 1 is to yield each batch as it is generated; with a higher value, the generated batch is yielded that many times while the underlying dataset asynchronously generates the next one. Typically not all relevant information is learned from a batch in a single iteration, so training multiple times on the same batch will still produce good training with minimal short-term overfitting, while allowing a slow batch-generation process more time to produce a result. This duplication is done by simply yielding the same object many times, not by regenerating the data.
Another typical usage is to accelerate light-weight preprocessing (usually when all the deterministic transforms are cached and there are no IO operations), because it leverages the separate thread to execute preprocessing and so avoids unnecessary IPC between multiple DataLoader workers. And as CUDA may not work well with the multi-processing of DataLoader, ThreadDataLoader can be useful for GPU transforms. For more details: Project-MONAI/tutorials.
Setting use_thread_workers will cause workers to be created as threads rather than processes, although everything else about how the class works is unchanged. This allows multiple workers to be used on Windows, for example, or in any other situation where thread semantics are desired. Please note that some MONAI components, like several datasets and random transforms, are not thread-safe and cannot work as expected with thread workers; check all the preprocessing components carefully before enabling use_thread_workers.
- See:
Fischetti et al. “Faster SGD training by minibatch persistency.” ArXiv (2018) https://arxiv.org/abs/1806.07353
Dami et al., “Faster Neural Network Training with Data Echoing” ArXiv (2020) https://arxiv.org/abs/1907.05550
Ramezani et al. “GCN meets GPU: Decoupling “When to Sample” from “How to Sample”.” NeurIPS (2020). https://proceedings.neurips.cc/paper/2020/file/d714d2c5a796d5814c565d78dd16188d-Paper.pdf
- Parameters:
dataset (Dataset) – input dataset.
buffer_size (int) – number of items to buffer from the data source.
buffer_timeout (float) – time to wait for an item from the buffer, or to wait while the buffer is full when adding items.
repeats (int) – number of times to yield the same batch.
use_thread_workers (bool) – if True and num_workers > 0, the workers are created as threads instead of processes.
kwargs – other arguments for DataLoader except for dataset.
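A minimal sketch showing the repeats behaviour on a toy dataset:
from monai.data import Dataset, ThreadDataLoader

ds = Dataset(data=list(range(8)))
# each generated batch is yielded twice while the next one is prepared
loader = ThreadDataLoader(ds, batch_size=4, repeats=2)
for batch in loader:
    print(batch)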
TestTimeAugmentation#
- class monai.data.TestTimeAugmentation(transform, batch_size, num_workers=0, inferrer_fn=<function _identity>, device='cpu', image_key='image', orig_key='label', nearest_interp=True, orig_meta_keys=None, meta_key_postfix='meta_dict', to_tensor=True, output_device='cpu', post_func=<function _identity>, return_full_data=False, progress=True)[source]#
Class for performing test time augmentations. This will pass the same image through the network multiple times.
The user passes transform(s) to be applied to each realization, and provided that at least one of those transforms is random, the network’s output will vary. Provided that inverse transformations exist for all supplied spatial transforms, the inverse can be applied to each realization of the network’s output. Once in the same spatial reference, the results can then be combined and metrics computed.
Test time augmentations are a useful feature for computing network uncertainty, as well as observing the network’s dependency on the applied random transforms.
- Reference:
Wang et al., Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks, https://doi.org/10.1016/j.neucom.2019.01.103
- Parameters:
transform (InvertibleTransform) – transform (or composed transform) to be applied to each realization. At least one transform must be of type RandomizableTrait (i.e., Randomizable, RandomizableTransform, or RandomizableTrait). All random transforms must be of type InvertibleTransform.
batch_size (int) – number of realizations to infer at once.
num_workers (int) – how many subprocesses to use for data loading.
inferrer_fn (Callable) – function to use to perform inference.
device (UnionType[str, device]) – device on which to perform inference.
image_key – key used to extract the image from the input dictionary.
orig_key – the key of the original input data in the dict. The applied transform information for this input data will be fetched, then inverted for the expected data with image_key.
orig_meta_keys (UnionType[str, None]) – the key of the metadata of the original input data; the affine, data_shape, etc. will be fetched from it. The metadata is a dictionary object which contains: filename, original_shape, etc. If None, will try to construct meta_keys by {orig_key}_{meta_key_postfix}.
meta_key_postfix – use key_{postfix} to fetch the metadata according to the key data; default is meta_dict, and the metadata is a dictionary object. For example, to handle key image, read/write affine matrices from the metadata image_meta_dict dictionary’s affine field. This arg only works when meta_keys=None.
to_tensor (bool) – whether to convert the inverted data into a PyTorch Tensor first. Defaults to True.
output_device (UnionType[str, device]) – if the inverted data is converted to Tensor, move the inverted results to the target device before post_func. Defaults to “cpu”.
post_func (Callable) – post processing for the inverted data, should be a callable function.
return_full_data (bool) – normally, metrics are returned (mode, mean, std, vvc). Setting this flag to True will return the full data. Dimensions will be the same size as when passing a single image through inferrer_fn, with a dimension appended equal in size to num_examples (N), i.e., [N, C, H, W, [D]].
progress (bool) – whether to display a progress bar.
Example
model = UNet(...).to(device)
transform = Compose([RandAffined(keys, ...), ...])
transform.set_random_state(seed=123)  # ensure deterministic evaluation

tt_aug = TestTimeAugmentation(
    transform, batch_size=5, num_workers=0, inferrer_fn=model, device=device
)
mode, mean, std, vvc = tt_aug(test_data)
N-Dim Fourier Transform#
- monai.data.fft_utils.fftn_centered(im, spatial_dims, is_complex=True)[source]#
Pytorch-based fft for spatial_dims-dim signals. “centered” means this function automatically takes care of the required ifft and fft shifts. This function calls monai.networks.blocks.fft_utils_t.fftn_centered_t. This is equivalent to doing fft in numpy based on numpy.fft.fftn, numpy.fft.fftshift, and numpy.fft.ifftshift.
- Parameters:
im (Union[ndarray, Tensor]) – image that can be 1) real-valued: the shape is (C,H,W) for 2D spatial inputs and (C,H,W,D) for 3D, or 2) complex-valued: the shape is (C,H,W,2) for 2D spatial data and (C,H,W,D,2) for 3D. C is the number of channels.
spatial_dims (int) – number of spatial dimensions (e.g., 2 for an image, 3 for a volume).
is_complex (bool) – if True, then the last dimension of the input im is expected to be 2 (representing the real and imaginary channels).
- Return type:
Union[ndarray, Tensor]
- Returns:
“out”, which is the output k-space (the Fourier transform of im)
Example
import torch

im = torch.ones(1, 3, 3, 2)  # the last dim belongs to real/imaginary parts

# output1 and output2 will be identical
output1 = torch.fft.fftn(
    torch.view_as_complex(torch.fft.ifftshift(im, dim=(-3, -2))), dim=(-2, -1), norm="ortho"
)
output1 = torch.fft.fftshift(torch.view_as_real(output1), dim=(-3, -2))

output2 = fftn_centered(im, spatial_dims=2, is_complex=True)
- monai.data.fft_utils.ifftn_centered(ksp, spatial_dims, is_complex=True)[source]#
Pytorch-based ifft for spatial_dims-dim signals. “centered” means this function automatically takes care of the required ifft and fft shifts. This function calls monai.networks.blocks.fft_utils_t.ifftn_centered_t. This is equivalent to doing ifft in numpy based on numpy.fft.ifftn, numpy.fft.fftshift, and numpy.fft.ifftshift.
- Parameters:
ksp (Union[ndarray, Tensor]) – k-space data that can be 1) real-valued: the shape is (C,H,W) for 2D spatial inputs and (C,H,W,D) for 3D, or 2) complex-valued: the shape is (C,H,W,2) for 2D spatial data and (C,H,W,D,2) for 3D. C is the number of channels.
spatial_dims (int) – number of spatial dimensions (e.g., 2 for an image, 3 for a volume).
is_complex (bool) – if True, then the last dimension of the input ksp is expected to be 2 (representing the real and imaginary channels).
- Return type:
Union[ndarray, Tensor]
- Returns:
“out”, which is the output image (the inverse Fourier transform of ksp)
Example
import torch

ksp = torch.ones(1, 3, 3, 2)  # the last dim belongs to real/imaginary parts

# output1 and output2 will be identical
output1 = torch.fft.ifftn(
    torch.view_as_complex(torch.fft.ifftshift(ksp, dim=(-3, -2))), dim=(-2, -1), norm="ortho"
)
output1 = torch.fft.fftshift(torch.view_as_real(output1), dim=(-3, -2))

output2 = ifftn_centered(ksp, spatial_dims=2, is_complex=True)
ITK Torch Bridge#
- monai.data.itk_torch_bridge.get_itk_image_center(image)[source]#
Calculates the center of the ITK image based on its origin, size, and spacing. This center is equivalent to the implicit image center that MONAI uses.
- Parameters:
image – The ITK image.
- Returns:
The center of the image as a list of coordinates.
- monai.data.itk_torch_bridge.itk_image_to_metatensor(image, channel_dim=None, dtype=<class 'float'>)[source]#
Converts an ITK image to a MetaTensor object.
- Parameters:
image – The ITK image to be converted.
channel_dim (UnionType[str, int, None]) – the channel dimension of the input image. Default is None. This is used to set original_channel_dim in the metadata, which EnsureChannelFirst reads. If None, the channel_dim is inferred automatically. If the input array doesn’t have a channel dim, this value should be 'no_channel'.
dtype (Union[dtype, type, str, None]) – output dtype, defaults to the Python built-in float.
- Return type:
MetaTensor
- Returns:
A MetaTensor object containing the array data and metadata in ChannelFirst format.
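A minimal sketch (requires the itk package; the file path is hypothetical):
import itk

from monai.data.itk_torch_bridge import itk_image_to_metatensor

image = itk.imread("spleen_19.nii.gz")
meta_tensor = itk_image_to_metatensor(image, channel_dim="no_channel")
print(meta_tensor.shape, meta_tensor.affine)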
- monai.data.itk_torch_bridge.itk_to_monai_affine(image, matrix, translation, center_of_rotation=None, reference_image=None)[source]#
Converts an ITK affine matrix (2x2 for 2D or 3x3 for 3D matrix and translation vector) to a MONAI affine matrix.
- Parameters:
image – The ITK image object. This is used to extract the spacing and direction information.
matrix – The 2x2 or 3x3 ITK affine matrix.
translation – The 2-element or 3-element ITK affine translation vector.
center_of_rotation – The center of rotation. If provided, the affine matrix will be adjusted to account for the difference between the center of the image and the center of rotation.
reference_image – The coordinate space that matrix and translation were defined in respect to. If not supplied, the coordinate space of image is used.
- Return type:
Tensor
- Returns:
A 4x4 MONAI affine matrix.
- monai.data.itk_torch_bridge.metatensor_to_itk_image(meta_tensor, channel_dim=0, dtype=<class 'numpy.float32'>, **kwargs)[source]#
Converts a MetaTensor object to an ITK image. Expects the MetaTensor to be in ChannelFirst format.
- Parameters:
meta_tensor (MetaTensor) – The MetaTensor to be converted.
channel_dim (UnionType[int, None]) – channel dimension of the data array. Defaults to 0 (channel-first). None indicates no channel dimension. This is used to create a Vector Image if it is not None.
dtype (Union[dtype, type, str, None]) – output data type, defaults to np.float32.
kwargs – additional keyword arguments. Currently itk.GetImageFromArray will get ttype from this dictionary.
- Returns:
The ITK image.
See also:
ITKWriter.create_backend_obj()
- monai.data.itk_torch_bridge.monai_to_itk_affine(image, affine_matrix, center_of_rotation=None)[source]#
Converts a MONAI affine matrix to an ITK affine matrix (2x2 for 2D or 3x3 for 3D matrix and translation vector). See also ‘itk_to_monai_affine’.
- Parameters:
image – The ITK image object. This is used to extract the spacing and direction information.
affine_matrix – The 3x3 for 2D or 4x4 for 3D MONAI affine matrix.
center_of_rotation – The center of rotation. If provided, the affine matrix will be adjusted to account for the difference between the center of the image and the center of rotation.
- Returns:
The ITK matrix and the translation vector.
- monai.data.itk_torch_bridge.monai_to_itk_ddf(image, ddf)[source]#
Convert the dense displacement field from the MONAI space to the ITK space.
- Parameters:
image – ITK image of array shape 2D: (H, W) or 3D: (D, H, W)
ddf – numpy array of shape 2D: (2, H, W) or 3D: (3, D, H, W)
- Returns:
itk image of the corresponding displacement field
- Return type:
displacement_field
Meta Object#
- class monai.data.meta_obj.MetaObj[source]#
Abstract base class that stores data as well as any extra metadata.
This allows for subclassing torch.Tensor and np.ndarray through multiple inheritance.
Metadata is stored in the form of a dictionary.
Behavior should be the same as extended class (e.g., torch.Tensor or np.ndarray) aside from the extended meta functionality.
Copying of information:
For c = a + b, then auxiliary data (e.g., metadata) will be copied from the first instance of MetaObj if a.is_batch is False (For batched data, the metadata will be shallow copied for efficiency purposes).
- property applied_operations: list[dict]#
Get the applied operations. Defaults to [].
- Return type:
list[dict]
- static copy_items(data)[source]#
Returns a copy of the data. Lists and dicts are shallow copied for efficiency purposes.
- copy_meta_from(input_objs, copy_attr=True, keys=None)[source]#
Copy metadata from a MetaObj or an iterable of MetaObj instances.
- Parameters:
input_objs – list of MetaObj to copy data from.
copy_attr – whether to copy each attribute with MetaObj.copy_item. Note that if the attribute is a nested list or dict, only a shallow copy will be done.
keys – the keys of attributes to copy from the input_objs. If None, all keys from the input_objs will be copied.
- static flatten_meta_objs(*args)[source]#
Recursively flatten input and yield all instances of MetaObj. This means that for both torch.add(a, b), torch.stack([a, b]) (and their numpy equivalents), we return [a, b] if both a and b are of type MetaObj.
- Parameters:
args (Iterable) – Iterables of inputs to be flattened.
- Returns:
List of nested MetaObj from the input.
- static get_default_applied_operations()[source]#
Get the default applied operations.
- Return type:
list
- Returns:
default applied operations.
- static get_default_meta()[source]#
Get the default meta.
- Return type:
dict
- Returns:
default metadata.
- property has_pending_operations: bool#
Determine whether there are pending operations.
- Return type:
bool
- Returns:
True if there are pending operations; False if not.
- property is_batch: bool#
Return whether object is part of batch or not.
- Return type:
bool
- property meta: dict#
Get the meta. Defaults to {}.
- Return type:
dict
- property pending_operations: list[dict]#
Get the pending operations. Defaults to [].
- Return type:
list[dict]
- monai.data.meta_obj.get_track_meta()[source]#
Return the boolean as to whether metadata is tracked. If True, metadata will be associated with its data by using subclasses of MetaObj. If False, then data will be returned with empty metadata.
If set_track_meta is False, then standard data objects will be returned (e.g., torch.Tensor and np.ndarray) as opposed to MONAI’s enhanced objects.
By default, this is True, and most users will want to leave it this way. However, if you are experiencing any problems regarding metadata, and aren’t interested in preserving metadata, then you can disable it.
- Return type:
bool
- monai.data.meta_obj.set_track_meta(val)[source]#
Boolean to set whether metadata is tracked. If True, metadata will be associated with its data by using subclasses of MetaObj. If False, then data will be returned with empty metadata.
If set_track_meta is False, then standard data objects will be returned (e.g., torch.Tensor and np.ndarray) as opposed to MONAI’s enhanced objects.
By default, this is True, and most users will want to leave it this way. However, if you are experiencing any problems regarding metadata, and aren’t interested in preserving metadata, then you can disable it.
- Return type:
None
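A minimal sketch of toggling the flag around a random transform:
import torch
from monai.data.meta_obj import get_track_meta, set_track_meta
from monai.transforms import RandFlip

flip = RandFlip(prob=1.0, spatial_axis=0)
img = torch.ones(1, 2, 2)

print(get_track_meta())  # True by default
print(type(flip(img)))   # MetaTensor: metadata and applied operations are tracked

set_track_meta(False)
print(type(flip(img)))   # plain torch.Tensor: no metadata is attached
set_track_meta(True)     # restore the default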
MetaTensor#
- class monai.data.MetaTensor(x, affine=None, meta=None, applied_operations=None, *_args, **_kwargs)[source]#
Bases: MetaObj, Tensor
Class that inherits from both torch.Tensor and MetaObj, adding support for metadata.
Metadata is stored in the form of a dictionary. Nested within it, an affine matrix is stored in the form of a torch.Tensor.
Behavior should be the same as torch.Tensor aside from the extended meta functionality.
Copying of information:
For c = a + b, then auxiliary data (e.g., metadata) will be copied from the first instance of MetaTensor if a.is_batch is False (For batched data, the metadata will be shallow copied for efficiency purposes).
Example
import torch
from monai.data import MetaTensor

t = torch.tensor([1, 2, 3])
affine = torch.as_tensor(
    [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]], dtype=torch.float64
)
meta = {"some": "info"}

m = MetaTensor(t, affine=affine, meta=meta)
m2 = m + m
assert isinstance(m2, MetaTensor)
assert m2.meta["some"] == "info"
assert torch.all(m2.affine == affine)
Notes
Requires pytorch 1.9 or newer for full compatibility.
With older versions of pytorch (<=1.8), torch.jit.trace(net, im) may not work if im is of type MetaTensor. This can be resolved with torch.jit.trace(net, im.as_tensor()).
For pytorch < 1.8, sharing MetaTensor instances across processes may not be supported.
For pytorch < 1.9, next(iter(meta_tensor)) returns a torch.Tensor. see: pytorch/pytorch#54457
A warning will be raised if in the constructor affine is not None and meta already contains the key affine.
You can query whether the MetaTensor is a batch with the is_batch attribute.
With a batch of data, batch[0] will return the 0th image with the 0th metadata. When the batch dimension is non-singleton, e.g., batch[:, 0], batch[…, -1] and batch[1:3], then all (or a subset in the last example) of the metadata will be returned, and is_batch will return True.
When creating a batch with this class, use monai.data.DataLoader as opposed to torch.utils.data.DataLoader, as this will take care of collating the metadata properly.
- H#
Returns a view of a matrix (2-D tensor) conjugated and transposed.
x.H is equivalent to x.transpose(0, 1).conj() for complex matrices and x.transpose(0, 1) for real matrices.
See also
mH: An attribute that also works on batches of matrices.
- T#
Returns a view of this tensor with its dimensions reversed.
If n is the number of dimensions in x, x.T is equivalent to x.permute(n-1, n-2, ..., 0).
Warning
The use of Tensor.T() on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider mT to transpose batches of matrices or x.permute(*torch.arange(x.ndim - 1, -1, -1)) to reverse the dimensions of a tensor.
- __init__(x, affine=None, meta=None, applied_operations=None, *_args, **_kwargs)[source]#
- Parameters:
x – initial array for the MetaTensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
affine (UnionType[Tensor, None]) – optional 4x4 array.
meta (UnionType[dict, None]) – dictionary of metadata.
applied_operations (UnionType[list, None]) – list of previously applied operations on the MetaTensor; the list is typically maintained by monai.transforms.TraceableTransform. See also: monai.transforms.TraceableTransform.
_args – additional args (currently not in use in this constructor).
_kwargs – additional kwargs (currently not in use in this constructor).
Note
If a meta dictionary is given, use it. Else, if meta exists in the input tensor x, use it. Else, use the default value. Similar for the affine, except this could come from four places, priority: affine, meta[“affine”], x.affine, get_default_affine.
- abs() Tensor#
See
torch.abs()
- abs_() Tensor#
In-place version of
abs()
- acos() Tensor#
See
torch.acos()
- acos_() Tensor#
In-place version of
acos()
- acosh() Tensor#
See
torch.acosh()
- acosh_() Tensor#
In-place version of
acosh()
- add(other, *, alpha=1) Tensor#
Add a scalar or tensor to the self tensor. If both alpha and other are specified, each element of other is scaled by alpha before being used.
When other is a tensor, the shape of other must be broadcastable with the shape of the underlying tensor.
See torch.add()
- add_(other, *, alpha=1) Tensor#
In-place version of
add()
- addbmm(batch1, batch2, *, beta=1, alpha=1) Tensor#
See
torch.addbmm()
- addbmm_(batch1, batch2, *, beta=1, alpha=1) Tensor#
In-place version of
addbmm()
- addcdiv(tensor1, tensor2, *, value=1) Tensor#
See
torch.addcdiv()
- addcdiv_(tensor1, tensor2, *, value=1) Tensor#
In-place version of
addcdiv()
- addcmul(tensor1, tensor2, *, value=1) Tensor#
See
torch.addcmul()
- addcmul_(tensor1, tensor2, *, value=1) Tensor#
In-place version of
addcmul()
- addmm(mat1, mat2, *, beta=1, alpha=1) Tensor#
See
torch.addmm()
- addmm_(mat1, mat2, *, beta=1, alpha=1) Tensor#
In-place version of
addmm()
- addmv(mat, vec, *, beta=1, alpha=1) Tensor#
See
torch.addmv()
- addmv_(mat, vec, *, beta=1, alpha=1) Tensor#
In-place version of
addmv()
- addr(vec1, vec2, *, beta=1, alpha=1) Tensor#
See
torch.addr()
- addr_(vec1, vec2, *, beta=1, alpha=1) Tensor#
In-place version of
addr()
- property affine: Tensor#
Get the affine. Defaults to torch.eye(4, dtype=torch.float64).
- Return type:
Tensor
- align_as(other) Tensor#
Permutes the dimensions of the self tensor to match the dimension order in the other tensor, adding size-one dims for any new names.
This operation is useful for explicit broadcasting by names (see examples).
All of the dims of self must be named in order to use this method. The resulting tensor is a view on the original tensor.
All dimension names of self must be present in other.names. other may contain named dimensions that are not in self.names; the output tensor has a size-one dimension for each of those new names.
To align a tensor to a specific order, use align_to().
Examples:
# Example 1: Applying a mask
>>> mask = torch.randint(2, [127, 128], dtype=torch.bool).refine_names('W', 'H')
>>> imgs = torch.randn(32, 128, 127, 3, names=('N', 'H', 'W', 'C'))
>>> imgs.masked_fill_(mask.align_as(imgs), 0)

# Example 2: Applying a per-channel-scale
>>> def scale_channels(input, scale):
>>>     scale = scale.refine_names('C')
>>>     return input * scale.align_as(input)

>>> num_channels = 3
>>> scale = torch.randn(num_channels, names=('C',))
>>> imgs = torch.rand(32, 128, 128, num_channels, names=('N', 'H', 'W', 'C'))
>>> more_imgs = torch.rand(32, num_channels, 128, 128, names=('N', 'C', 'H', 'W'))
>>> videos = torch.randn(3, num_channels, 128, 128, 128, names=('N', 'C', 'H', 'W', 'D'))

# scale_channels is agnostic to the dimension order of the input
>>> scale_channels(imgs, scale)
>>> scale_channels(more_imgs, scale)
>>> scale_channels(videos, scale)
Warning
The named tensor API is experimental and subject to change.
- align_to(*names)#
Permutes the dimensions of the self tensor to match the order specified in names, adding size-one dims for any new names.
All of the dims of self must be named in order to use this method. The resulting tensor is a view on the original tensor.
All dimension names of self must be present in names. names may contain additional names that are not in self.names; the output tensor has a size-one dimension for each of those new names.
names may contain up to one Ellipsis (...). The Ellipsis is expanded to be equal to all dimension names of self that are not mentioned in names, in the order that they appear in self.
Python 2 does not support Ellipsis but one may use a string literal instead ('...').
- Parameters:
names (iterable of str) – The desired dimension ordering of the output tensor. May contain up to one Ellipsis that is expanded to all unmentioned dim names of self.
Examples:
>>> tensor = torch.randn(2, 2, 2, 2, 2, 2)
>>> named_tensor = tensor.refine_names('A', 'B', 'C', 'D', 'E', 'F')
# Move the F and E dims to the front while keeping the rest in order
>>> named_tensor.align_to('F', 'E', ...)
Warning
The named tensor API is experimental and subject to change.
- all(dim=None, keepdim=False) Tensor#
See
torch.all()
- allclose(other, rtol=1e-05, atol=1e-08, equal_nan=False) Tensor#
See
torch.allclose()
- amax(dim=None, keepdim=False) Tensor#
See
torch.amax()
- amin(dim=None, keepdim=False) Tensor#
See
torch.amin()
- aminmax(*, dim=None, keepdim=False) -> (Tensor min, Tensor max)#
See
torch.aminmax()
- angle() Tensor#
See
torch.angle()
- any(dim=None, keepdim=False) Tensor#
See
torch.any()
- apply_(callable) Tensor#
Applies the function callable to each element in the tensor, replacing each element with the value returned by callable.
Note
This function only works with CPU tensors and should not be used in code sections that require high performance.
- arccos() Tensor#
See
torch.arccos()
- arccos_() Tensor#
In-place version of
arccos()
- arccosh()#
acosh() -> Tensor
See
torch.arccosh()
- arccosh_()#
acosh_() -> Tensor
In-place version of
arccosh()
- arcsin() Tensor#
See
torch.arcsin()
- arcsin_() Tensor#
In-place version of
arcsin()
- arcsinh() Tensor#
See
torch.arcsinh()
- arcsinh_() Tensor#
In-place version of
arcsinh()
- arctan() Tensor#
See
torch.arctan()
- arctan2(other) Tensor#
See
torch.arctan2()
- arctan2_()#
atan2_(other) -> Tensor
In-place version of
arctan2()
- arctan_() Tensor#
In-place version of
arctan()
- arctanh() Tensor#
See
torch.arctanh()
- arctanh_(other) Tensor#
In-place version of
arctanh()
- argmax(dim=None, keepdim=False) LongTensor#
See
torch.argmax()
- argmin(dim=None, keepdim=False) LongTensor#
See
torch.argmin()
- argsort(dim=-1, descending=False) LongTensor#
See
torch.argsort()
- argwhere() Tensor#
See
torch.argwhere()
- property array#
Returns a numpy array of self. The array and self share the same underlying storage if self is on cpu. Changes to self (it’s a subclass of torch.Tensor) will be reflected in the ndarray and vice versa. If self is not on cpu, the call will move the array to cpu and then the storage is not shared.
- Getter:
see also: MetaTensor.get_array()
- Setter:
see also: MetaTensor.set_array()
- as_dict(key, output_type=<class 'torch.Tensor'>, dtype=None)[source]#
Get the object as a dictionary for backwards compatibility. This method does not make a deep copy of the objects.
- Parameters:
key (str) – Base key to store the main data. The key for the metadata will be determined using PostFix.
output_type – torch.Tensor or np.ndarray for the main data.
dtype – dtype of output data. Converted to the correct library type (e.g., np.float32 is converted to torch.float32 if output type is torch.Tensor). If left blank, it remains unchanged.
- Return type:
dict
- Returns:
A dictionary consisting of three keys: the main data (stored under key) and the metadata.
- as_strided(size, stride, storage_offset=None) Tensor#
See
torch.as_strided()
- as_strided_(size, stride, storage_offset=None) Tensor#
In-place version of
as_strided()
- as_strided_scatter(src, size, stride, storage_offset=None) Tensor#
See
torch.as_strided_scatter()
- as_subclass(cls) Tensor#
Makes a cls instance with the same data pointer as self. Changes in the output mirror changes in self, and the output stays attached to the autograd graph. cls must be a subclass of Tensor.
- as_tensor()[source]#
Return the MetaTensor as a torch.Tensor. It is OS dependent as to whether this will be a deep copy or not.
- Return type:
Tensor
- asin() Tensor#
See
torch.asin()
- asin_() Tensor#
In-place version of
asin()
- asinh() Tensor#
See
torch.asinh()
- asinh_() Tensor#
In-place version of
asinh()
- astype(dtype, device=None, *_args, **_kwargs)[source]#
Cast to dtype, sharing data whenever possible.
- Parameters:
dtype – dtypes such as np.float32, torch.float, “np.float32”, float.
device – the device if dtype is a torch data type.
_args – additional args (currently unused).
_kwargs – additional kwargs (currently unused).
- Returns:
data array instance
- atan() Tensor#
See
torch.atan()
- atan2(other) Tensor#
See
torch.atan2()
- atan2_(other) Tensor#
In-place version of
atan2()
- atan_() Tensor#
In-place version of
atan()
- atanh() Tensor#
See
torch.atanh()
- atanh_(other) Tensor#
In-place version of
atanh()
- backward(gradient=None, retain_graph=None, create_graph=False, inputs=None)#
Computes the gradient of current tensor wrt graph leaves.
The graph is differentiated using the chain rule. If the tensor is non-scalar (i.e. its data has more than one element) and requires gradient, the function additionally requires specifying a gradient. It should be a tensor of matching type and shape, that represents the gradient of the differentiated function w.r.t. self.
This function accumulates gradients in the leaves - you might need to zero .grad attributes or set them to None before calling it. See Default gradient layouts for details on the memory layout of accumulated gradients.
Note
If you run any forward ops, create gradient, and/or call backward in a user-specified CUDA stream context, see Stream semantics of backward passes.
Note
When inputs are provided and a given input is not a leaf, the current implementation will call its grad_fn (though it is not strictly needed to get these gradients). It is an implementation detail on which the user should not rely. See pytorch/pytorch#60521 for more details.
- Parameters:
gradient (Tensor, optional) – The gradient of the function being differentiated w.r.t. self. This argument can be omitted if self is a scalar. Defaults to None.
retain_graph (bool, optional) – If False, the graph used to compute the grads will be freed; if True, it will be retained. The default is None, in which case the value is inferred from create_graph (i.e., the graph is retained only when higher-order derivative tracking is requested). Note that in nearly all cases setting this option to True is not needed and often can be worked around in a much more efficient way.
create_graph (bool, optional) – If True, graph of the derivative will be constructed, allowing to compute higher order derivative products. Defaults to False.
inputs (Sequence[Tensor], optional) – Inputs w.r.t. which the gradient will be accumulated into .grad. All other tensors will be ignored. If not provided, the gradient is accumulated into all the leaf Tensors that were used to compute the tensors. Defaults to None.
- baddbmm(batch1, batch2, *, beta=1, alpha=1) Tensor#
See
torch.baddbmm()
- baddbmm_(batch1, batch2, *, beta=1, alpha=1) Tensor#
In-place version of
baddbmm()
- bernoulli(*, generator=None) Tensor#
Returns a result tensor where each \(\texttt{result[i]}\) is independently sampled from \(\text{Bernoulli}(\texttt{self[i]})\).
self must have floating point dtype, and the result will have the same dtype.
See torch.bernoulli()
- bernoulli_(p=0.5, *, generator=None) Tensor#
Fills each location of self with an independent sample from \(\text{Bernoulli}(\texttt{p})\). self can have integral dtype. p should either be a scalar or a tensor containing probabilities to be used for drawing the binary random number.
If it is a tensor, the \(\text{i}^{th}\) element of the self tensor will be set to a value sampled from \(\text{Bernoulli}(\texttt{p\_tensor[i]})\). In this case p must have floating point dtype.
See also bernoulli() and torch.bernoulli()
- bfloat16(memory_format=torch.preserve_format) Tensor#
self.bfloat16() is equivalent to self.to(torch.bfloat16). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- bincount(weights=None, minlength=0) Tensor#
See
torch.bincount()
- bitwise_and() Tensor#
See
torch.bitwise_and()
- bitwise_and_() Tensor#
In-place version of
bitwise_and()
- bitwise_left_shift(other) Tensor#
See
torch.bitwise_left_shift()
- bitwise_left_shift_(other) Tensor#
In-place version of
bitwise_left_shift()
- bitwise_not() Tensor#
See
torch.bitwise_not()
- bitwise_not_() Tensor#
In-place version of
bitwise_not()
- bitwise_or() Tensor#
See
torch.bitwise_or()
- bitwise_or_() Tensor#
In-place version of
bitwise_or()
- bitwise_right_shift(other) Tensor#
See
torch.bitwise_right_shift()
- bitwise_right_shift_(other) Tensor#
In-place version of
bitwise_right_shift()
- bitwise_xor() Tensor#
See
torch.bitwise_xor()
- bitwise_xor_() Tensor#
In-place version of
bitwise_xor()
- bmm(batch2) Tensor#
See
torch.bmm()
- bool(memory_format=torch.preserve_format) Tensor#
self.bool() is equivalent to self.to(torch.bool). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- broadcast_to(shape) Tensor#
See
torch.broadcast_to().
- byte(memory_format=torch.preserve_format) Tensor#
self.byte() is equivalent to self.to(torch.uint8). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- cauchy_(median=0, sigma=1, *, generator=None) Tensor#
Fills the tensor with numbers drawn from the Cauchy distribution:
\[f(x) = \dfrac{1}{\pi} \dfrac{\sigma}{(x - \text{median})^2 + \sigma^2}\]
Note
Sigma (\(\sigma\)) is used to denote the scale parameter in Cauchy distribution.
- cdouble(memory_format=torch.preserve_format) Tensor#
self.cdouble() is equivalent to self.to(torch.complex128). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- ceil() Tensor#
See
torch.ceil()
- ceil_() Tensor#
In-place version of
ceil()
- cfloat(memory_format=torch.preserve_format) Tensor#
self.cfloat() is equivalent to self.to(torch.complex64). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- chalf(memory_format=torch.preserve_format) Tensor#
self.chalf() is equivalent to self.to(torch.complex32). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- char(memory_format=torch.preserve_format) Tensor#
self.char() is equivalent to self.to(torch.int8). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of the returned Tensor. Default: torch.preserve_format.
- cholesky(upper=False) Tensor#
See
torch.cholesky()
- cholesky_inverse(upper=False) Tensor#
See
torch.cholesky_inverse()
- cholesky_solve(input2, upper=False) Tensor#
See
torch.cholesky_solve()
- chunk(chunks, dim=0) List of Tensors#
See
torch.chunk()
- clamp(min=None, max=None) Tensor#
See
torch.clamp()
- clamp_(min=None, max=None) Tensor#
In-place version of
clamp()
- clip(min=None, max=None) Tensor#
Alias for
clamp().
- clip_(min=None, max=None) Tensor#
Alias for
clamp_().
- clone(**kwargs)[source]#
Returns a copy of the MetaTensor instance.
- Parameters:
kwargs – additional keyword arguments to torch.clone.
See also: https://pytorch.org/docs/stable/generated/torch.clone.html
- coalesce() Tensor#
Returns a coalesced copy of self if self is an uncoalesced tensor.
Returns self if self is a coalesced tensor.
Warning
Throws an error if self is not a sparse COO tensor.
- col_indices() IntTensor#
Returns the tensor containing the column indices of the self tensor when self is a sparse CSR tensor of layout sparse_csr. The col_indices tensor is strictly of shape (self.nnz()) and of type int32 or int64. When using MKL routines such as sparse matrix multiplication, it is necessary to use int32 indexing in order to avoid downcasting and potentially losing information.
Example:
>>> csr = torch.eye(5,5).to_sparse_csr()
>>> csr.col_indices()
tensor([0, 1, 2, 3, 4], dtype=torch.int32)
- conj() Tensor#
See
torch.conj()
- conj_physical() Tensor#
See
torch.conj_physical()
- conj_physical_() Tensor#
In-place version of
conj_physical()
- contiguous(memory_format=torch.contiguous_format) Tensor#
Returns a contiguous in memory tensor containing the same data as
self tensor. If the self tensor is already in the specified memory format, this function returns the self tensor.
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.contiguous_format.
- copy_(src, non_blocking=False) Tensor#
Copies the elements from
src into the self tensor and returns self.
The src tensor must be broadcastable with the self tensor. It may be of a different data type or reside on a different device.
- Parameters:
src (Tensor) – the source tensor to copy from
non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.
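A small sketch of the broadcasting and dtype conversion described above (the values are illustrative):
>>> import torch
>>> dst = torch.zeros(2, 3)           # float32 destination
>>> src = torch.tensor([1, 2, 3])     # int64 source, broadcastable to (2, 3)
>>> dst.copy_(src)                    # values are cast to dst's dtype
tensor([[1., 2., 3.],
        [1., 2., 3.]])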
- copysign(other) Tensor#
See
torch.copysign()
- copysign_(other) Tensor#
In-place version of
copysign()
- corrcoef() Tensor#
See
torch.corrcoef()
- cos() Tensor#
See
torch.cos()
- cos_() Tensor#
In-place version of
cos()
- cosh() Tensor#
See
torch.cosh()
- cosh_() Tensor#
In-place version of
cosh()
- count_nonzero(dim=None) Tensor#
See
torch.count_nonzero()
- cov(*, correction=1, fweights=None, aweights=None) Tensor#
See
torch.cov()
- cpu(memory_format=torch.preserve_format) Tensor#
Returns a copy of this object in CPU memory.
If this object is already in CPU memory, then no copy is performed and the original object is returned.
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- cross(other, dim=None) Tensor#
See
torch.cross()
- crow_indices() IntTensor#
Returns the tensor containing the compressed row indices of the
self tensor when self is a sparse CSR tensor of layout sparse_csr. The crow_indices tensor is strictly of shape (self.size(0) + 1) and of type int32 or int64. When using MKL routines such as sparse matrix multiplication, it is necessary to use int32 indexing in order to avoid downcasting and potentially losing information.
Example:
>>> csr = torch.eye(5,5).to_sparse_csr() >>> csr.crow_indices() tensor([0, 1, 2, 3, 4, 5], dtype=torch.int32)
- cuda(device=None, non_blocking=False, memory_format=torch.preserve_format) Tensor#
Returns a copy of this object in CUDA memory.
If this object is already in CUDA memory and on the correct device, then no copy is performed and the original object is returned.
- Parameters:
device (torch.device) – The destination GPU device. Defaults to the current CUDA device.
non_blocking (bool) – If True and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. Default: False.
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- cummax(dim)#
See
torch.cummax()
- cummin(dim)#
See
torch.cummin()
- cumprod(dim, dtype=None) Tensor#
See
torch.cumprod()
- cumprod_(dim, dtype=None) Tensor#
In-place version of
cumprod()
- cumsum(dim, dtype=None) Tensor#
See
torch.cumsum()
- cumsum_(dim, dtype=None) Tensor#
In-place version of
cumsum()
- data_ptr() int#
Returns the address of the first element of
self tensor.
- deg2rad() Tensor#
See
torch.deg2rad()
- deg2rad_() Tensor#
In-place version of
deg2rad()
- dense_dim() int#
Return the number of dense dimensions in a sparse tensor self.
Note
Returns len(self.shape) if self is not a sparse tensor.
See also
Tensor.sparse_dim() and hybrid tensors.
- dequantize() Tensor#
Given a quantized Tensor, dequantize it and return the dequantized float Tensor.
- det() Tensor#
See
torch.det()
- detach()#
Returns a new Tensor, detached from the current graph.
The result will never require gradient.
This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.
Note
Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks.
- detach_()#
Detaches the Tensor from the graph that created it, making it a leaf. Views cannot be detached in-place.
This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.
- device#
Is the
torch.device where this Tensor is.
- diag(diagonal=0) Tensor#
See
torch.diag()
- diag_embed(offset=0, dim1=-2, dim2=-1) Tensor#
See
torch.diag_embed()
- diagflat(offset=0) Tensor#
See
torch.diagflat()
- diagonal(offset=0, dim1=0, dim2=1) Tensor#
See
torch.diagonal()
- diagonal_scatter(src, offset=0, dim1=0, dim2=1) Tensor#
See
torch.diagonal_scatter()
- diff(n=1, dim=-1, prepend=None, append=None) Tensor#
See
torch.diff()
- digamma() Tensor#
See
torch.digamma()
- digamma_() Tensor#
In-place version of
digamma()
- dim() int#
Returns the number of dimensions of
self tensor.
- dim_order(*, ambiguity_check=False)#
Returns the uniquely determined tuple of int describing the dim order or physical layout of
self.
The dim order represents how dimensions are laid out in memory of dense tensors, starting from the outermost to the innermost dimension.
Note that the dim order may not always be uniquely determined. If ambiguity_check is True, this function raises a RuntimeError when the dim order cannot be uniquely determined; if ambiguity_check is a list of memory formats, this function raises a RuntimeError when the tensor cannot be interpreted as exactly one of the given memory formats, or when the dim order cannot be uniquely determined. If ambiguity_check is False, it will return one of the legal dim orders without checking uniqueness. Otherwise, it will raise a TypeError.
- Parameters:
ambiguity_check (bool or List[torch.memory_format]) – The check method for ambiguity of dim order.
Examples:
>>> torch.empty((2, 3, 5, 7)).dim_order() (0, 1, 2, 3) >>> torch.empty((2, 3, 5, 7)).transpose(1, 2).dim_order() (0, 2, 1, 3) >>> torch.empty((2, 3, 5, 7), memory_format=torch.channels_last).dim_order() (0, 2, 3, 1) >>> torch.empty((1, 2, 3, 4)).dim_order() (0, 1, 2, 3) >>> try: ... torch.empty((1, 2, 3, 4)).dim_order(ambiguity_check=True) ... except RuntimeError as e: ... print(e) The tensor does not have unique dim order, or cannot map to exact one of the given memory formats. >>> torch.empty((1, 2, 3, 4)).dim_order( ... ambiguity_check=[torch.contiguous_format, torch.channels_last] ... ) # It can be mapped to contiguous format (0, 1, 2, 3) >>> try: ... torch.empty((1, 2, 3, 4)).dim_order(ambiguity_check="ILLEGAL") ... except TypeError as e: ... print(e) The ambiguity_check argument must be a bool or a list of memory formats.
Warning
The dim_order tensor API is experimental and subject to change.
- dist(other, p=2) Tensor#
See
torch.dist()
- div(value, *, rounding_mode=None) Tensor#
See
torch.div()
- div_(value, *, rounding_mode=None) Tensor#
In-place version of
div()
- divide(value, *, rounding_mode=None) Tensor#
See
torch.divide()
- divide_(value, *, rounding_mode=None) Tensor#
In-place version of
divide()
- dot(other) Tensor#
See
torch.dot()
- double(memory_format=torch.preserve_format) Tensor#
self.double() is equivalent to self.to(torch.float64). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- dsplit(split_size_or_sections) List of Tensors#
See
torch.dsplit()
- element_size() int#
Returns the size in bytes of an individual element.
Example:
>>> torch.tensor([]).element_size() 4 >>> torch.tensor([], dtype=torch.uint8).element_size() 1
- static ensure_torch_and_prune_meta(im, meta, simple_keys=False, pattern=None, sep='.')[source]#
Convert the image to MetaTensor (when meta is not None). If affine is in the meta dictionary, convert that to torch.Tensor, too. Remove any superfluous metadata.
- Parameters:
im (~NdarrayTensor) – Input image (np.ndarray or torch.Tensor)
meta (UnionType[dict, None]) – Metadata dictionary. When it is None, the metadata is not tracked and this method returns a torch.Tensor.
simple_keys (bool) – whether to keep only a simple subset of metadata keys.
pattern (UnionType[str, None]) – combined with sep, a regular expression used to match and prune keys in the metadata (nested dictionary); defaults to None (no key deletion).
sep (str) – combined with pattern, used to match and delete keys in the metadata (nested dictionary). Defaults to ".", see also monai.transforms.DeleteItemsd. E.g. pattern=".*_code$", sep=" " removes any meta keys that end with "_code".
- Returns:
By default, a MetaTensor is returned. However, if get_track_meta() is False or meta=None, a torch.Tensor is returned.
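A minimal sketch of both return paths (assuming a default MONAI install where get_track_meta() is True; the metadata values are made up):
>>> import numpy as np
>>> from monai.data import MetaTensor
>>> im = np.zeros((1, 4, 4), dtype=np.float32)
>>> mt = MetaTensor.ensure_torch_and_prune_meta(im, {"affine": np.eye(4)})
>>> type(mt).__name__  # meta given: a MetaTensor, with `affine` converted to torch.Tensor
'MetaTensor'
>>> type(MetaTensor.ensure_torch_and_prune_meta(im, None)).__name__  # meta=None: plain torch.Tensor
'Tensor'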
- eq(other) Tensor#
See
torch.eq()
- eq_(other) Tensor#
In-place version of
eq()
- equal(other) bool#
See
torch.equal()
- erf() Tensor#
See
torch.erf()
- erf_() Tensor#
In-place version of
erf()
- erfc() Tensor#
See
torch.erfc()
- erfc_() Tensor#
In-place version of
erfc()
- erfinv() Tensor#
See
torch.erfinv()
- erfinv_() Tensor#
In-place version of
erfinv()
- exp() Tensor#
See
torch.exp()
- exp2() Tensor#
See
torch.exp2()
- exp2_() Tensor#
In-place version of
exp2()
- exp_() Tensor#
In-place version of
exp()
- expand(*sizes) Tensor#
Returns a new view of the
self tensor with singleton dimensions expanded to a larger size.
Passing -1 as the size for a dimension means not changing the size of that dimension.
A tensor can also be expanded to a larger number of dimensions, and the new ones will be appended at the front. For the new dimensions, the size cannot be set to -1.
Expanding a tensor does not allocate new memory, but only creates a new view on the existing tensor where a dimension of size one is expanded to a larger size by setting the stride to 0. Any dimension of size 1 can be expanded to an arbitrary value without allocating new memory.
- Parameters:
*sizes (torch.Size or int...) – the desired expanded size
Warning
More than one element of an expanded tensor may refer to a single memory location. As a result, in-place operations (especially ones that are vectorized) may result in incorrect behavior. If you need to write to the tensors, please clone them first.
Example:
>>> x = torch.tensor([[1], [2], [3]]) >>> x.size() torch.Size([3, 1]) >>> x.expand(3, 4) tensor([[ 1, 1, 1, 1], [ 2, 2, 2, 2], [ 3, 3, 3, 3]]) >>> x.expand(-1, 4) # -1 means not changing the size of that dimension tensor([[ 1, 1, 1, 1], [ 2, 2, 2, 2], [ 3, 3, 3, 3]])
- expand_as(other) Tensor#
Expand this tensor to the same size as other. self.expand_as(other) is equivalent to self.expand(other.size()).
Please see expand() for more information about expand.
- Parameters:
other (torch.Tensor) – The result tensor has the same size as other.
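For example:
>>> import torch
>>> col = torch.tensor([[1], [2], [3]])   # shape (3, 1)
>>> col.expand_as(torch.zeros(3, 4))      # same as col.expand(3, 4)
tensor([[1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]])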
- expm1() Tensor#
See
torch.expm1()
- expm1_() Tensor#
In-place version of
expm1()
- exponential_(lambd=1, *, generator=None) Tensor#
Fills
self tensor with elements drawn from the PDF (probability density function):
\[f(x) = \lambda e^{-\lambda x}, \quad x > 0\]
Note
In probability theory, the exponential distribution is supported on the interval [0, \(\infty\)) (i.e., \(x \ge 0\)), implying that zero can be sampled from the exponential distribution. However, torch.Tensor.exponential_() does not sample zero, which means that its actual support is the interval (0, \(\infty\)).
Note that torch.distributions.exponential.Exponential() is supported on the interval [0, \(\infty\)) and can sample zero.
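A short sketch of the support described in the notes (the sample size is arbitrary):
>>> import torch
>>> t = torch.empty(100000).exponential_(lambd=2.0)
>>> bool((t > 0).all())   # zero itself is never sampled
True
>>> t.mean()              # expected to be close to 1/lambd = 0.5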
- fill_(value) Tensor#
Fills
self tensor with the specified value.
- fill_diagonal_(fill_value, wrap=False) Tensor#
Fill the main diagonal of a tensor that has at least 2 dimensions. When dims > 2, all dimensions of input must be of equal length. This function modifies the input tensor in-place, and returns the input tensor.
- Parameters:
fill_value (Scalar) – the fill value
wrap (bool) – the diagonal ‘wrapped’ after N columns for tall matrices.
Example:
>>> a = torch.zeros(3, 3) >>> a.fill_diagonal_(5) tensor([[5., 0., 0.], [0., 5., 0.], [0., 0., 5.]]) >>> b = torch.zeros(7, 3) >>> b.fill_diagonal_(5) tensor([[5., 0., 0.], [0., 5., 0.], [0., 0., 5.], [0., 0., 0.], [0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]) >>> c = torch.zeros(7, 3) >>> c.fill_diagonal_(5, wrap=True) tensor([[5., 0., 0.], [0., 5., 0.], [0., 0., 5.], [0., 0., 0.], [5., 0., 0.], [0., 5., 0.], [0., 0., 5.]])
- fix() Tensor#
See
torch.fix().
- fix_() Tensor#
In-place version of
fix()
- flatten(start_dim=0, end_dim=-1) Tensor#
See
torch.flatten()
- flip(dims) Tensor#
See
torch.flip()
- fliplr() Tensor#
See
torch.fliplr()
- flipud() Tensor#
See
torch.flipud()
- float(memory_format=torch.preserve_format) Tensor#
self.float() is equivalent to self.to(torch.float32). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- float_power(exponent) Tensor#
See
torch.float_power()
- float_power_(exponent) Tensor#
In-place version of
float_power()
- floor() Tensor#
See
torch.floor()
- floor_() Tensor#
In-place version of
floor()
- floor_divide(value) Tensor#
See
torch.floor_divide()
- floor_divide_(value) Tensor#
In-place version of
floor_divide()
- fmax(other) Tensor#
See
torch.fmax()
- fmin(other) Tensor#
See
torch.fmin()
- fmod(divisor) Tensor#
See
torch.fmod()
- fmod_(divisor) Tensor#
In-place version of
fmod()
- frac() Tensor#
See
torch.frac()
- frac_() Tensor#
In-place version of
frac()
- frexp(input) -> (Tensor mantissa, Tensor exponent)#
See
torch.frexp()
- gather(dim, index) Tensor#
See
torch.gather()
- gcd(other) Tensor#
See
torch.gcd()
- gcd_(other) Tensor#
In-place version of
gcd()
- ge(other) Tensor#
See
torch.ge().
- ge_(other) Tensor#
In-place version of
ge().
- geometric_(p, *, generator=None) Tensor#
Fills
self tensor with elements drawn from the geometric distribution:
\[P(X=k) = (1 - p)^{k - 1} p, \quad k = 1, 2, \ldots\]
Note
torch.Tensor.geometric_() treats the k-th trial as the first success and hence draws samples in \(\{1, 2, \ldots\}\), whereas torch.distributions.geometric.Geometric() treats the \((k+1)\)-th trial as the first success and hence draws samples in \(\{0, 1, \ldots\}\).
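A small sketch contrasting the two supports (sample sizes are arbitrary):
>>> import torch
>>> t = torch.empty(1000).geometric_(0.5)
>>> t.min()   # samples live in {1, 2, ...}
tensor(1.)
>>> g = torch.distributions.geometric.Geometric(probs=0.5).sample((1000,))
>>> g.min()   # samples live in {0, 1, ...}, so zero can occur
tensor(0.)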
- geqrf()#
See
torch.geqrf()
- ger(vec2) Tensor#
See
torch.ger()
- get_array(output_type=<class 'numpy.ndarray'>, dtype=None, device=None, *_args, **_kwargs)[source]#
Returns a new array in output_type; the array shares the same underlying storage when the output is a numpy array. Changes to the self tensor will be reflected in the ndarray and vice versa.
- Parameters:
output_type – output type, see also: monai.utils.convert_data_type().
dtype – dtype of output data. Converted to correct library type (e.g., np.float32 is converted to torch.float32 if output type is torch.Tensor). If left blank, it remains unchanged.
device – if the output is a torch.Tensor, select device (if None, unchanged).
_args – currently unused parameters.
_kwargs – currently unused parameters.
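A small sketch of the shared-storage behavior (for a CPU tensor converted to a numpy array):
>>> import numpy as np
>>> import torch
>>> from monai.data import MetaTensor
>>> mt = MetaTensor(torch.zeros(2, 2))
>>> arr = mt.get_array(output_type=np.ndarray)
>>> arr[0, 0] = 7.0   # writes through to the tensor's storage
>>> float(mt[0, 0])
7.0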
- get_device() -> Device ordinal (Integer)#
For CUDA tensors, this function returns the device ordinal of the GPU on which the tensor resides. For CPU tensors, this function returns -1.
Example:
>>> x = torch.randn(3, 4, 5, device='cuda:0') >>> x.get_device() 0 >>> x.cpu().get_device() -1
- grad#
This attribute is
None by default and becomes a Tensor the first time a call to backward() computes gradients for self. The attribute will then contain the gradients computed and future calls to backward() will accumulate (add) gradients into it.
- greater(other) Tensor#
See
torch.greater().
- greater_(other) Tensor#
In-place version of
greater().
- greater_equal(other) Tensor#
See
torch.greater_equal().
- greater_equal_(other) Tensor#
In-place version of
greater_equal().
- gt(other) Tensor#
See
torch.gt().
- gt_(other) Tensor#
In-place version of
gt().
- half(memory_format=torch.preserve_format) Tensor#
self.half() is equivalent to self.to(torch.float16). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- hardshrink(lambd=0.5) Tensor#
See
torch.nn.functional.hardshrink()
- has_names()#
Is
True if any of this tensor’s dimensions are named. Otherwise, is False.
- heaviside(values) Tensor#
See
torch.heaviside()
- heaviside_(values) Tensor#
In-place version of
heaviside()
- histc(bins=100, min=0, max=0) Tensor#
See
torch.histc()
- histogram(input, bins, *, range=None, weight=None, density=False)#
See
torch.histogram()
- hsplit(split_size_or_sections) List of Tensors#
See
torch.hsplit()
- hypot(other) Tensor#
See
torch.hypot()
- hypot_(other) Tensor#
In-place version of
hypot()
- i0() Tensor#
See
torch.i0()
- i0_() Tensor#
In-place version of
i0()
- igamma(other) Tensor#
See
torch.igamma()
- igamma_(other) Tensor#
In-place version of
igamma()
- igammac(other) Tensor#
See
torch.igammac()
- igammac_(other) Tensor#
In-place version of
igammac()
- imag#
Returns a new tensor containing imaginary values of the
self tensor. The returned tensor and self share the same underlying storage.
Warning
imag() is only supported for tensors with complex dtypes.
Example:
>>> x=torch.randn(4, dtype=torch.cfloat) >>> x tensor([(0.3100+0.3553j), (-0.5445-0.7896j), (-1.6492-0.0633j), (-0.0638-0.8119j)]) >>> x.imag tensor([ 0.3553, -0.7896, -0.0633, -0.8119])
- index_add(dim, index, source, *, alpha=1) Tensor#
Out-of-place version of
torch.Tensor.index_add_().
- index_add_(dim, index, source, *, alpha=1) Tensor#
Accumulate the elements of
alpha times source into the self tensor by adding to the indices in the order given in index. For example, if dim == 0, index[i] == j, and alpha=-1, then the ith row of source is subtracted from the jth row of self.
The dimth dimension of source must have the same size as the length of index (which must be a vector), and all other dimensions must match self, or an error will be raised.
For a 3-D tensor the output is given as:
self[index[i], :, :] += alpha * src[i, :, :] # if dim == 0 self[:, index[i], :] += alpha * src[:, i, :] # if dim == 1 self[:, :, index[i]] += alpha * src[:, :, i] # if dim == 2
Note
This operation may behave nondeterministically when given tensors on a CUDA device. See /notes/randomness for more information.
- Parameters:
dim (int) – dimension along which to index
index (Tensor) – indices of
source to select from, should have dtype either torch.int64 or torch.int32
source (Tensor) – the tensor containing values to add
- Keyword Arguments:
alpha (Number) – the scalar multiplier for
source
Example:
>>> x = torch.ones(5, 3) >>> t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float) >>> index = torch.tensor([0, 4, 2]) >>> x.index_add_(0, index, t) tensor([[ 2., 3., 4.], [ 1., 1., 1.], [ 8., 9., 10.], [ 1., 1., 1.], [ 5., 6., 7.]]) >>> x.index_add_(0, index, t, alpha=-1) tensor([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])
- index_copy(dim, index, tensor2) Tensor#
Out-of-place version of
torch.Tensor.index_copy_().
- index_copy_(dim, index, tensor) Tensor#
Copies the elements of
tensor into the self tensor by selecting the indices in the order given in index. For example, if dim == 0 and index[i] == j, then the ith row of tensor is copied to the jth row of self.
The dimth dimension of tensor must have the same size as the length of index (which must be a vector), and all other dimensions must match self, or an error will be raised.
Note
If index contains duplicate entries, multiple elements from tensor will be copied to the same index of self. The result is nondeterministic since it depends on which copy occurs last.
- Parameters:
dim (int) – dimension along which to index
index (LongTensor) – indices of
tensor to select from
tensor (Tensor) – the tensor containing values to copy
Example:
>>> x = torch.zeros(5, 3) >>> t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float) >>> index = torch.tensor([0, 4, 2]) >>> x.index_copy_(0, index, t) tensor([[ 1., 2., 3.], [ 0., 0., 0.], [ 7., 8., 9.], [ 0., 0., 0.], [ 4., 5., 6.]])
- index_fill(dim, index, value) Tensor#
Out-of-place version of
torch.Tensor.index_fill_().
- index_fill_(dim, index, value) Tensor#
Fills the elements of the
self tensor with value value by selecting the indices in the order given in index.
- Parameters:
dim (int) – dimension along which to index
index (LongTensor) – indices of the self tensor to fill in
value (float) – the value to fill with
Example:
>>> x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float) >>> index = torch.tensor([0, 2]) >>> x.index_fill_(1, index, -1) tensor([[-1., 2., -1.], [-1., 5., -1.], [-1., 8., -1.]])
- index_put(indices, values, accumulate=False) Tensor#
Out-place version of
index_put_().
- index_put_(indices, values, accumulate=False) Tensor#
Puts values from the tensor
values into the tensor self using the indices specified in indices (which is a tuple of Tensors). The expression tensor.index_put_(indices, values) is equivalent to tensor[indices] = values. Returns self.
If accumulate is True, the elements in values are added to self. If accumulate is False, the behavior is undefined if indices contain duplicate elements.
- Parameters:
indices (tuple of LongTensor) – tensors used to index into self.
values (Tensor) – tensor of same dtype as self.
accumulate (bool) – whether to accumulate into self
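For example:
>>> import torch
>>> t = torch.zeros(3, 3)
>>> rows, cols = torch.tensor([0, 2]), torch.tensor([1, 2])
>>> t.index_put_((rows, cols), torch.tensor([10.0, 20.0]))  # same as t[rows, cols] = values
tensor([[ 0., 10.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0., 20.]])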
- index_reduce_(dim, index, source, reduce, *, include_self=True) Tensor#
Accumulate the elements of
source into the self tensor by accumulating to the indices in the order given in index using the reduction given by the reduce argument. For example, if dim == 0, index[i] == j, reduce == prod and include_self == True then the ith row of source is multiplied by the jth row of self. If include_self=True, the values in the self tensor are included in the reduction; otherwise, rows in the self tensor that are accumulated to are treated as if they were filled with the reduction identities.
The dimth dimension of source must have the same size as the length of index (which must be a vector), and all other dimensions must match self, or an error will be raised.
For a 3-D tensor with reduce="prod" and include_self=True the output is given as:
self[index[i], :, :] *= src[i, :, :] # if dim == 0 self[:, index[i], :] *= src[:, i, :] # if dim == 1 self[:, :, index[i]] *= src[:, :, i] # if dim == 2
Note
This operation may behave nondeterministically when given tensors on a CUDA device. See /notes/randomness for more information.
Note
This function only supports floating point tensors.
Warning
This function is in beta and may change in the near future.
- Parameters:
dim (int) – dimension along which to index
index (Tensor) – indices of
source to select from, should have dtype either torch.int64 or torch.int32
source (FloatTensor) – the tensor containing values to accumulate
reduce (str) – the reduction operation to apply (
"prod","mean","amax","amin")
- Keyword Arguments:
include_self (bool) – whether the elements from the
self tensor are included in the reduction
Example:
>>> x = torch.empty(5, 3).fill_(2) >>> t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=torch.float) >>> index = torch.tensor([0, 4, 2, 0]) >>> x.index_reduce_(0, index, t, 'prod') tensor([[20., 44., 72.], [ 2., 2., 2.], [14., 16., 18.], [ 2., 2., 2.], [ 8., 10., 12.]]) >>> x = torch.empty(5, 3).fill_(2) >>> x.index_reduce_(0, index, t, 'prod', include_self=False) tensor([[10., 22., 36.], [ 2., 2., 2.], [ 7., 8., 9.], [ 2., 2., 2.], [ 4., 5., 6.]])
- index_select(dim, index) Tensor#
See
torch.index_select()
- indices() Tensor#
Return the indices tensor of a sparse COO tensor.
Warning
Throws an error if
self is not a sparse COO tensor.
See also
Tensor.values().
Note
This method can only be called on a coalesced sparse tensor. See
Tensor.coalesce() for details.
- inner(other) Tensor#
See
torch.inner().
- int(memory_format=torch.preserve_format) Tensor#
self.int() is equivalent to self.to(torch.int32). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- int_repr() Tensor#
Given a quantized Tensor,
self.int_repr() returns a CPU Tensor with uint8_t as data type that stores the underlying uint8_t values of the given Tensor.
- inverse() Tensor#
See
torch.inverse()
- ipu(device=None, non_blocking=False, memory_format=torch.preserve_format) Tensor#
Returns a copy of this object in IPU memory.
If this object is already in IPU memory and on the correct device, then no copy is performed and the original object is returned.
- Parameters:
device (torch.device) – The destination IPU device. Defaults to the current IPU device.
non_blocking (bool) – If True and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. Default: False.
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- is_coalesced() bool#
Returns True if self is a sparse COO tensor that is coalesced, False otherwise.
Warning
Throws an error if self is not a sparse COO tensor.
See
coalesce() and uncoalesced tensors.
- is_complex() bool#
Returns True if the data type of self is a complex data type.
- is_conj() bool#
Returns True if the conjugate bit of self is set to true.
- is_contiguous(memory_format=torch.contiguous_format) bool#
Returns True if the self tensor is contiguous in memory in the order specified by memory format.
- Parameters:
memory_format (torch.memory_format, optional) – Specifies memory allocation order. Default: torch.contiguous_format.
- is_cpu#
Is True if the Tensor is stored on the CPU, False otherwise.
- is_cuda#
Is True if the Tensor is stored on the GPU, False otherwise.
- is_floating_point() bool#
Returns True if the data type of self is a floating point data type.
- is_inference() bool#
See
torch.is_inference()
- is_ipu#
Is True if the Tensor is stored on the IPU, False otherwise.
- is_leaf#
All Tensors that have requires_grad which is False will be leaf Tensors by convention.
For Tensors that have requires_grad which is True, they will be leaf Tensors if they were created by the user. This means that they are not the result of an operation and so grad_fn is None.
Only leaf Tensors will have their grad populated during a call to backward(). To get grad populated for non-leaf Tensors, you can use retain_grad().
Example:
>>> a = torch.rand(10, requires_grad=True) >>> a.is_leaf True >>> b = torch.rand(10, requires_grad=True).cuda() >>> b.is_leaf False # b was created by the operation that cast a cpu Tensor into a cuda Tensor >>> c = torch.rand(10, requires_grad=True) + 2 >>> c.is_leaf False # c was created by the addition operation >>> d = torch.rand(10).cuda() >>> d.is_leaf True # d does not require gradients and so has no operation creating it (that is tracked by the autograd engine) >>> e = torch.rand(10).cuda().requires_grad_() >>> e.is_leaf True # e requires gradients and has no operations creating it >>> f = torch.rand(10, requires_grad=True, device="cuda") >>> f.is_leaf True # f requires grad, has no operation creating it
- is_meta#
Is True if the Tensor is a meta tensor, False otherwise. Meta tensors are like normal tensors, but they carry no data.
- is_mps#
Is True if the Tensor is stored on the MPS device, False otherwise.
- is_neg() bool#
Returns True if the negative bit of self is set to true.
- is_pinned()#
Returns true if this tensor resides in pinned memory. By default, the device on which memory is pinned will be the current accelerator.
- is_quantized#
Is True if the Tensor is quantized, False otherwise.
- is_set_to(tensor) bool#
Returns True if both tensors are pointing to the exact same memory (same storage, offset, size and stride).
- is_shared() bool#
Checks if tensor is in shared memory.
This is always True for CUDA tensors.
- is_signed() bool#
Returns True if the data type of self is a signed data type.
- is_sparse#
Is True if the Tensor uses sparse COO storage layout, False otherwise.
- is_sparse_csr#
Is True if the Tensor uses sparse CSR storage layout, False otherwise.
- is_xla#
Is True if the Tensor is stored on an XLA device, False otherwise.
- is_xpu#
Is True if the Tensor is stored on the XPU, False otherwise.
- isclose(other, rtol=1e-05, atol=1e-08, equal_nan=False) Tensor#
See
torch.isclose()
- isfinite() Tensor#
See
torch.isfinite()
- isinf() Tensor#
See
torch.isinf()
- isnan() Tensor#
See
torch.isnan()
- isneginf() Tensor#
See
torch.isneginf()
- isposinf() Tensor#
See
torch.isposinf()
- isreal() Tensor#
See
torch.isreal()
- istft(n_fft, hop_length=None, win_length=None, window=None, center=True, normalized=False, onesided=None, length=None, return_complex=False)#
See
torch.istft()
- item() number#
Returns the value of this tensor as a standard Python number. This only works for tensors with one element. For other cases, see
tolist().
This operation is not differentiable.
Example:
>>> x = torch.tensor([1.0]) >>> x.item() 1.0
- itemsize#
Alias for
element_size()
- kron(other) Tensor#
See
torch.kron()
- kthvalue(k, dim=None, keepdim=False)#
See
torch.kthvalue()
- lcm(other) Tensor#
See
torch.lcm()
- lcm_(other) Tensor#
In-place version of
lcm()
- ldexp(other) Tensor#
See
torch.ldexp()
- ldexp_(other) Tensor#
In-place version of
ldexp()
- le(other) Tensor#
See
torch.le().
- le_(other) Tensor#
In-place version of
le().
- lerp(end, weight) Tensor#
See
torch.lerp()
- lerp_(end, weight) Tensor#
In-place version of
lerp()
- less()#
lt(other) -> Tensor
See
torch.less().
- less_(other) Tensor#
In-place version of
less().
- less_equal(other) Tensor#
See
torch.less_equal().
- less_equal_(other) Tensor#
In-place version of
less_equal().
- lgamma() Tensor#
See
torch.lgamma()
- lgamma_() Tensor#
In-place version of
lgamma()
- log() Tensor#
See
torch.log()
- log10() Tensor#
See
torch.log10()
- log10_() Tensor#
In-place version of
log10()
- log1p() Tensor#
See
torch.log1p()
- log1p_() Tensor#
In-place version of
log1p()
- log2() Tensor#
See
torch.log2()
- log2_() Tensor#
In-place version of
log2()
- log_() Tensor#
In-place version of
log()
- log_normal_(mean=1, std=2, *, generator=None)#
Fills
self tensor with numbers sampled from the log-normal distribution parameterized by the given mean \(\mu\) and standard deviation \(\sigma\). Note that mean and std are the mean and standard deviation of the underlying normal distribution, and not of the returned distribution:
\[f(x) = \dfrac{1}{x \sigma \sqrt{2\pi}}\ e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}\]
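A short sketch (the sample size is arbitrary); since mean refers to the underlying normal distribution, the median of the samples is exp(mean):
>>> import torch
>>> t = torch.empty(100000).log_normal_(mean=0.0, std=0.25)
>>> t.median()   # expected to be close to exp(0) = 1.0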
- logaddexp(other) Tensor#
See
torch.logaddexp()
- logaddexp2(other) Tensor#
See
torch.logaddexp2()
- logcumsumexp(dim) Tensor#
See
torch.logcumsumexp()
- logdet() Tensor#
See
torch.logdet()
- logical_and() Tensor#
See
torch.logical_and()
- logical_and_() Tensor#
In-place version of
logical_and()
- logical_not() Tensor#
See
torch.logical_not()
- logical_not_() Tensor#
In-place version of
logical_not()
- logical_or() Tensor#
See
torch.logical_or()
- logical_or_() Tensor#
In-place version of
logical_or()
- logical_xor() Tensor#
See
torch.logical_xor()
- logical_xor_() Tensor#
In-place version of
logical_xor()
- logit() Tensor#
See
torch.logit()
- logit_() Tensor#
In-place version of
logit()
- logsumexp(dim, keepdim=False) Tensor#
See
torch.logsumexp()
- long(memory_format=torch.preserve_format) Tensor#
self.long() is equivalent to self.to(torch.int64). See to().
- Parameters:
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- lt(other) Tensor#
See
torch.lt().
- lt_(other) Tensor#
In-place version of
lt().
- lu(pivot=True, get_infos=False)#
See
torch.lu()
- lu_solve(LU_data, LU_pivots) Tensor#
See
torch.lu_solve()
- mT#
Returns a view of this tensor with the last two dimensions transposed.
x.mT is equivalent to x.transpose(-2, -1).
- map_(tensor, callable)#
Applies
Applies callable for each element in the self tensor and the given tensor and stores the results in the self tensor. The self tensor and the given tensor must be broadcastable.
The callable should have the signature: def callable(a, b) -> number
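For example (note that map_ works on CPU tensors only and is intended for convenience, not speed):
>>> import torch
>>> a = torch.tensor([1.0, 2.0, 3.0])
>>> b = torch.tensor([10.0, 20.0, 30.0])
>>> a.map_(b, lambda x, y: x + 2 * y)
tensor([21., 42., 63.])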
- masked_fill(mask, value) Tensor#
Out-of-place version of
torch.Tensor.masked_fill_()
- masked_fill_(mask, value)#
Fills elements of
self tensor with value where mask is True. The shape of mask must be broadcastable with the shape of the underlying tensor.
- Parameters:
mask (BoolTensor) – the boolean mask
value (float) – the value to fill in with
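For example (a common pattern, e.g. masking out entries before a softmax):
>>> import torch
>>> t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
>>> t.masked_fill_(t > 2, float("-inf"))
tensor([[1., 2.],
        [-inf, -inf]])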
- masked_scatter(mask, tensor) Tensor#
Out-of-place version of
torch.Tensor.masked_scatter_()Note
The inputs
self and mask broadcast.
Example
>>> self = torch.tensor([0, 0, 0, 0, 0]) >>> mask = torch.tensor( ... [[0, 0, 0, 1, 1], [1, 1, 0, 1, 1]], ... dtype=torch.bool, ... ) >>> source = torch.tensor([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) >>> self.masked_scatter(mask, source) tensor([[0, 0, 0, 0, 1], [2, 3, 0, 4, 5]])
- masked_scatter_(mask, source)#
Copies elements from
source into the self tensor at positions where the mask is True. Elements from source are copied into self starting at position 0 of source and continuing in order one-by-one for each occurrence of mask being True. The shape of mask must be broadcastable with the shape of the underlying tensor. The source should have at least as many elements as the number of ones in mask.
- Parameters:
mask (BoolTensor) – the boolean mask
source (Tensor) – the tensor to copy from
Note
The
mask operates on the self tensor, not on the given source tensor.
Example
>>> self = torch.tensor([[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]) >>> mask = torch.tensor( ... [[0, 0, 0, 1, 1], [1, 1, 0, 1, 1]], ... dtype=torch.bool, ... ) >>> source = torch.tensor([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) >>> self.masked_scatter_(mask, source) tensor([[0, 0, 0, 0, 1], [2, 3, 0, 4, 5]])
- masked_select(mask) Tensor#
See
torch.masked_select()
- matmul(tensor2) Tensor#
See
torch.matmul()
- matrix_exp() Tensor#
See
torch.matrix_exp()
- matrix_power(n) Tensor#
Note
matrix_power() is deprecated, use torch.linalg.matrix_power() instead.
Alias for
torch.linalg.matrix_power()
- max(dim=None, keepdim=False)#
See
torch.max()
- maximum(other) Tensor#
See
torch.maximum()
- mean(dim=None, keepdim=False, *, dtype=None) Tensor#
See
torch.mean()
- median(dim=None, keepdim=False)#
See
torch.median()
- min(dim=None, keepdim=False)#
See
torch.min()
- minimum(other) Tensor#
See
torch.minimum()
- mm(mat2) Tensor#
See
torch.mm()
- mode(dim=None, keepdim=False)#
See
torch.mode()
- module_load(other, assign=False)#
Defines how to transform
other when loading it into self in load_state_dict().
Used when get_swap_module_params_on_conversion() is True.
It is expected that self is a parameter or buffer in an nn.Module and other is the value in the state dictionary with the corresponding key; this method defines how other is remapped before being swapped with self via swap_tensors() in load_state_dict().
Note
This method should always return a new object that is neither self nor other. For example, the default implementation returns self.copy_(other).detach() if assign is False or other.detach() if assign is True.
- Parameters:
other (Tensor) – value in state dict with key corresponding to self
assign (bool) – the assign argument passed to nn.Module.load_state_dict()
- moveaxis(source, destination) Tensor#
See
torch.moveaxis()
- movedim(source, destination) Tensor#
See
torch.movedim()
- msort() Tensor#
See
torch.msort()
- mtia(device=None, non_blocking=False, memory_format=torch.preserve_format) Tensor#
Returns a copy of this object in MTIA memory.
If this object is already in MTIA memory and on the correct device, then no copy is performed and the original object is returned.
- Parameters:
device (torch.device) – The destination MTIA device. Defaults to the current MTIA device.
non_blocking (bool) – If True and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. Default: False.
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- mul(value) Tensor#
See
torch.mul().
- mul_(value) Tensor#
In-place version of
mul().
- multinomial(num_samples, replacement=False, *, generator=None) Tensor#
See
torch.multinomial()
- multiply(value) Tensor#
See
torch.multiply().
- multiply_(value) Tensor#
In-place version of
multiply().
- mv(vec) Tensor#
See
torch.mv()
- mvlgamma(p) Tensor#
See
torch.mvlgamma()
- mvlgamma_(p) Tensor#
In-place version of
mvlgamma()
- names#
Stores names for each of this tensor’s dimensions.
names[idx] corresponds to the name of tensor dimension idx. Names are either a string if the dimension is named or None if the dimension is unnamed.
Dimension names may contain characters or underscore. Furthermore, a dimension name must be a valid Python variable name (i.e., does not start with underscore).
Tensors may not have two named dimensions with the same name.
Warning
The named tensor API is experimental and subject to change.
- nan_to_num(nan=0.0, posinf=None, neginf=None) Tensor#
See
torch.nan_to_num().
- nan_to_num_(nan=0.0, posinf=None, neginf=None) Tensor#
In-place version of
nan_to_num().
- nanmean(dim=None, keepdim=False, *, dtype=None) Tensor#
See
torch.nanmean()
- nanmedian(dim=None, keepdim=False)#
See
torch.nanmedian()
- nanquantile(q, dim=None, keepdim=False, *, interpolation='linear') Tensor#
See
torch.nanquantile()
- nansum(dim=None, keepdim=False, dtype=None) Tensor#
See
torch.nansum()
- narrow(dimension, start, length) Tensor#
See
torch.narrow().
- narrow_copy(dimension, start, length) Tensor#
See
torch.narrow_copy().
- nbytes#
Returns the number of bytes consumed by the “view” of elements of the Tensor if the Tensor does not use sparse storage layout. Defined to be
numel() * element_size()
- ndim#
Alias for
dim()
- ndimension() int#
Alias for
dim()
- ne(other) Tensor#
See
torch.ne().
- ne_(other) Tensor#
In-place version of
ne().
- neg() Tensor#
See
torch.neg()
- neg_() Tensor#
In-place version of
neg()
- negative() Tensor#
See
torch.negative()
- negative_() Tensor#
In-place version of
negative()
- nelement() int#
Alias for
numel()
- new_empty(size, dtype=None, device=None, requires_grad=False)[source]#
Must be defined for deepcopy to work.
- new_empty_strided(size, stride, dtype=None, device=None, requires_grad=False, layout=torch.strided, pin_memory=False) Tensor#
Returns a Tensor of size
size and strides stride filled with uninitialized data. By default, the returned Tensor has the same torch.dtype and torch.device as this tensor.
- Parameters:
size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.
- Keyword Arguments:
dtype (torch.dtype, optional) – the desired type of returned tensor. Default: if None, same torch.dtype as this tensor.
device (torch.device, optional) – the desired device of returned tensor. Default: if None, same torch.device as this tensor.
requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> tensor = torch.ones(()) >>> tensor.new_empty_strided((2, 3), (3, 1)) tensor([[ 5.8182e-18, 4.5765e-41, -1.0545e+30], [ 3.0949e-41, 4.4842e-44, 0.0000e+00]])
- new_full(size, fill_value, *, dtype=None, device=None, requires_grad=False, layout=torch.strided, pin_memory=False) Tensor#
Returns a Tensor of size
size filled with fill_value. By default, the returned Tensor has the same torch.dtype and torch.device as this tensor.
- Parameters:
fill_value (scalar) – the number to fill the output tensor with.
- Keyword Arguments:
dtype (torch.dtype, optional) – the desired type of returned tensor. Default: if None, same torch.dtype as this tensor.
device (torch.device, optional) – the desired device of returned tensor. Default: if None, same torch.device as this tensor.
requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> tensor = torch.ones((2,), dtype=torch.float64) >>> tensor.new_full((3, 4), 3.141592) tensor([[ 3.1416, 3.1416, 3.1416, 3.1416], [ 3.1416, 3.1416, 3.1416, 3.1416], [ 3.1416, 3.1416, 3.1416, 3.1416]], dtype=torch.float64)
- new_ones(size, *, dtype=None, device=None, requires_grad=False, layout=torch.strided, pin_memory=False) Tensor#
Returns a Tensor of size
size filled with 1. By default, the returned Tensor has the same torch.dtype and torch.device as this tensor.
- Parameters:
size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.
- Keyword Arguments:
dtype (torch.dtype, optional) – the desired type of returned tensor. Default: if None, same torch.dtype as this tensor.
device (torch.device, optional) – the desired device of returned tensor. Default: if None, same torch.device as this tensor.
requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> tensor = torch.tensor((), dtype=torch.int32) >>> tensor.new_ones((2, 3)) tensor([[ 1, 1, 1], [ 1, 1, 1]], dtype=torch.int32)
- new_tensor(data, *, dtype=None, device=None, requires_grad=False, layout=torch.strided, pin_memory=False) Tensor#
Returns a new Tensor with
data as the tensor data. By default, the returned Tensor has the same torch.dtype and torch.device as this tensor.
Warning
new_tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a numpy array and want to avoid a copy, use torch.from_numpy().
Warning
When data is a tensor x, new_tensor() reads out ‘the data’ from whatever it is passed, and constructs a leaf variable. Therefore tensor.new_tensor(x) is equivalent to x.detach().clone() and tensor.new_tensor(x, requires_grad=True) is equivalent to x.detach().clone().requires_grad_(True). The equivalents using detach() and clone() are recommended.
- Parameters:
data (array_like) – The returned Tensor copies data.
- Keyword Arguments:
dtype (torch.dtype, optional) – the desired type of returned tensor. Default: if None, same torch.dtype as this tensor.
device (torch.device, optional) – the desired device of returned tensor. Default: if None, same torch.device as this tensor.
requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> tensor = torch.ones((2,), dtype=torch.int8) >>> data = [[0, 1], [2, 3]] >>> tensor.new_tensor(data) tensor([[ 0, 1], [ 2, 3]], dtype=torch.int8)
- new_zeros(size, *, dtype=None, device=None, requires_grad=False, layout=torch.strided, pin_memory=False) Tensor#
Returns a Tensor of size
size filled with 0. By default, the returned Tensor has the same torch.dtype and torch.device as this tensor.
- Parameters:
size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.
- Keyword Arguments:
dtype (torch.dtype, optional) – the desired type of returned tensor. Default: if None, same torch.dtype as this tensor.
device (torch.device, optional) – the desired device of returned tensor. Default: if None, same torch.device as this tensor.
requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> tensor = torch.tensor((), dtype=torch.float64) >>> tensor.new_zeros((2, 3)) tensor([[ 0., 0., 0.], [ 0., 0., 0.]], dtype=torch.float64)
- nextafter(other) Tensor#
See
torch.nextafter()
- nextafter_(other) Tensor#
In-place version of
nextafter()
- nonzero() LongTensor#
See
torch.nonzero()
- nonzero_static(input, *, size, fill_value=-1) Tensor#
Returns a 2-D tensor where each row is the index for a non-zero value. The returned Tensor has the same torch.dtype as torch.nonzero().
- Parameters:
input (Tensor) – the input tensor to count non-zero elements.
- Keyword Arguments:
size (int) – the size of non-zero elements expected to be included in the out tensor. Pad the out tensor with fill_value if the size is larger than total number of non-zero elements, truncate out tensor if size is smaller. The size must be a non-negative integer.
fill_value (int) – the value to fill the output tensor with when size is larger than the total number of non-zero elements. Default is -1 to represent invalid index.
Example
# Example 1: Padding >>> input_tensor = torch.tensor([[1, 0], [3, 2]]) >>> static_size = 4 >>> t = torch.nonzero_static(input_tensor, size=static_size) tensor([[ 0, 0], [ 1, 0], [ 1, 1], [ -1, -1]], dtype=torch.int64)
# Example 2: Truncating >>> input_tensor = torch.tensor([[1, 0], [3, 2]]) >>> static_size = 2 >>> t = torch.nonzero_static(input_tensor, size=static_size) tensor([[ 0, 0], [ 1, 0]], dtype=torch.int64)
# Example 3: 0 size >>> input_tensor = torch.tensor([10]) >>> static_size = 0 >>> t = torch.nonzero_static(input_tensor, size=static_size) tensor([], size=(0, 1), dtype=torch.int64)
# Example 4: 0 rank input >>> input_tensor = torch.tensor(10) >>> static_size = 2 >>> t = torch.nonzero_static(input_tensor, size=static_size) tensor([], size=(2, 0), dtype=torch.int64)
- norm(p='fro', dim=None, keepdim=False, dtype=None)#
See
torch.norm()
- normal_(mean=0, std=1, *, generator=None) Tensor#
Fills
self tensor with elements sampled from the normal distribution parameterized by mean and std.
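For example (a simple in-place weight initialization; the shape and std are illustrative):
>>> import torch
>>> w = torch.empty(3, 5)
>>> _ = w.normal_(mean=0.0, std=0.02)
>>> w.mean(), w.std()   # expected to be roughly 0 and 0.02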
- not_equal(other) Tensor#
See
torch.not_equal().
- not_equal_(other) Tensor#
In-place version of
not_equal().
- numel() int#
See
torch.numel()
- numpy(*, force=False) numpy.ndarray#
Returns the tensor as a NumPy
ndarray.
If force is False (the default), the conversion is performed only if the tensor is on the CPU, does not require grad, does not have its conjugate bit set, and is a dtype and layout that NumPy supports. The returned ndarray and the tensor will share their storage, so changes to the tensor will be reflected in the ndarray and vice versa.
If force is True this is equivalent to calling t.detach().cpu().resolve_conj().resolve_neg().numpy(). If the tensor isn’t on the CPU or the conjugate or negative bit is set, the tensor won’t share its storage with the returned ndarray. Setting force to True can be a useful shorthand.
- Parameters:
force (bool) – if True, the ndarray may be a copy of the tensor instead of always sharing memory, defaults to False.
- orgqr(input2) Tensor#
See
torch.orgqr()
- ormqr(input2, input3, left=True, transpose=False) Tensor#
See
torch.ormqr()
- outer(vec2) Tensor#
See
torch.outer().
- peek_pending_shape()[source]#
Get the currently expected spatial shape as if all the pending operations are executed. For tensors that have more than 3 spatial dimensions, only the shapes of the top 3 dimensions will be returned.
- permute(*dims) Tensor#
See
torch.permute()
- pin_memory() Tensor#
Copies the tensor to pinned memory, if it’s not already pinned. By default, the device on which memory is pinned will be the current accelerator.
- pinverse() Tensor#
See
torch.pinverse()
- property pixdim#
Get the spacing
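A short sketch (assuming a standard MONAI install; the affine here is illustrative) of reading the spacing encoded in the affine:
>>> import torch
>>> from monai.data import MetaTensor
>>> affine = torch.diag(torch.tensor([2.0, 2.0, 5.0, 1.0]))  # voxel spacing (2.0, 2.0, 5.0)
>>> img = MetaTensor(torch.zeros(1, 4, 4, 4), affine=affine)
>>> img.pixdim   # expected: the spacing derived from the affine, approximately (2.0, 2.0, 5.0)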
- polygamma(n) Tensor#
See
torch.polygamma()
- polygamma_(n) Tensor#
In-place version of
polygamma()
- positive() Tensor#
See
torch.positive()
- pow(exponent) Tensor#
See
torch.pow()
- pow_(exponent) Tensor#
In-place version of
pow()
- prod(dim=None, keepdim=False, dtype=None) Tensor#
See
torch.prod()
- put(input, index, source, accumulate=False) Tensor#
Out-of-place version of
torch.Tensor.put_(). input corresponds to self in torch.Tensor.put_().
- put_(index, source, accumulate=False) Tensor#
Copies the elements from
source into the positions specified by index. For the purpose of indexing, the self tensor is treated as if it were a 1-D tensor.
index and source need to have the same number of elements, but not necessarily the same shape.
If accumulate is True, the elements in source are added to self. If accumulate is False, the behavior is undefined if index contains duplicate elements.
- Parameters:
index (LongTensor) – the indices into self
source (Tensor) – the tensor containing values to copy from
accumulate (bool) – whether to accumulate into self
Example:
>>> src = torch.tensor([[4, 3, 5], ... [6, 7, 8]]) >>> src.put_(torch.tensor([1, 3]), torch.tensor([9, 10])) tensor([[ 4, 9, 5], [ 10, 7, 8]])
- q_per_channel_axis() int#
Given a Tensor quantized by linear (affine) per-channel quantization, returns the index of dimension on which per-channel quantization is applied.
- q_per_channel_scales() Tensor#
Given a Tensor quantized by linear (affine) per-channel quantization, returns a Tensor of scales of the underlying quantizer. It has the number of elements that matches the corresponding dimensions (from q_per_channel_axis) of the tensor.
- q_per_channel_zero_points() Tensor#
Given a Tensor quantized by linear (affine) per-channel quantization, returns a tensor of zero_points of the underlying quantizer. It has the number of elements that matches the corresponding dimensions (from q_per_channel_axis) of the tensor.
- q_scale() float#
Given a Tensor quantized by linear (affine) quantization, returns the scale of the underlying quantizer().
- q_zero_point() int#
Given a Tensor quantized by linear (affine) quantization, returns the zero_point of the underlying quantizer().
- qr(some=True)#
See
torch.qr()
- qscheme() torch.qscheme#
Returns the quantization scheme of a given QTensor.
- quantile(q, dim=None, keepdim=False, *, interpolation='linear') Tensor#
See
torch.quantile()
- rad2deg() Tensor#
See
torch.rad2deg()
- rad2deg_() Tensor#
In-place version of
rad2deg()
- random_(from=0, to=None, *, generator=None) Tensor#
Fills
self tensor with numbers sampled from the discrete uniform distribution over [from, to - 1]. If not specified, the values are usually only bounded by the self tensor’s data type. However, for floating point types, if unspecified, the range will be [0, 2^mantissa] to ensure that every value is representable. For example, torch.tensor(1, dtype=torch.double).random_() will be uniform in [0, 2^53].
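For example:
>>> import torch
>>> t = torch.empty(5, dtype=torch.int64)
>>> _ = t.random_(0, 10)   # uniform over [0, 9]
>>> bool((t >= 0).all() and (t < 10).all())
True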
- ravel() Tensor#
see
torch.ravel()
- real#
Returns a new tensor containing real values of the
self tensor for a complex-valued input tensor. The returned tensor and self share the same underlying storage.
Returns self if self is a real-valued tensor.
Example:
>>> x=torch.randn(4, dtype=torch.cfloat) >>> x tensor([(0.3100+0.3553j), (-0.5445-0.7896j), (-1.6492-0.0633j), (-0.0638-0.8119j)]) >>> x.real tensor([ 0.3100, -0.5445, -1.6492, -0.0638])
- reciprocal() Tensor#
See
torch.reciprocal()
- reciprocal_() Tensor#
In-place version of
reciprocal()
- record_stream(stream)#
Marks the tensor as having been used by this stream. When the tensor is deallocated, ensure the tensor memory is not reused for another tensor until all work queued on
stream at the time of deallocation is complete.
Note
The caching allocator is aware of only the stream where a tensor was allocated. Due to the awareness, it already correctly manages the life cycle of tensors on only one stream. But if a tensor is used on a stream different from the stream of origin, the allocator might reuse the memory unexpectedly. Calling this method lets the allocator know which streams have used the tensor.
Warning
This method is most suitable for use cases where you are providing a function that created a tensor on a side stream, and want users to be able to make use of the tensor without having to think carefully about stream safety when making use of them. These safety guarantees come at some performance and predictability cost (analogous to the tradeoff between GC and manual memory management), so if you are in a situation where you manage the full lifetime of your tensors, you may consider instead manually managing CUDA events so that calling this method is not necessary. In particular, when you call this method, on later allocations the allocator will poll the recorded stream to see if all operations have completed yet; you can potentially race with side stream computation and non-deterministically reuse or fail to reuse memory for an allocation.
You can safely use tensors allocated on side streams without
record_stream(); you must manually ensure that any non-creation stream uses of a tensor are synced back to the creation stream before you deallocate the tensor. As the CUDA caching allocator guarantees that the memory will only be reused with the same creation stream, this is sufficient to ensure that writes to future reallocations of the memory will be delayed until non-creation stream uses are done. (Counterintuitively, you may observe that on the CPU side we have already reallocated the tensor, even though CUDA kernels on the old tensor are still in progress. This is fine, because CUDA operations on the new tensor will appropriately wait for the old operations to complete, as they are all on the same stream.)
Concretely, this looks like this:
with torch.cuda.stream(s0): x = torch.zeros(N) s1.wait_stream(s0) with torch.cuda.stream(s1): y = some_comm_op(x) ... some compute on s0 ... # synchronize creation stream s0 to side stream s1 # before deallocating x s0.wait_stream(s1) del x
Note that some discretion is required when deciding when to perform
s0.wait_stream(s1). In particular, if we were to wait immediately after some_comm_op, there wouldn’t be any point in having the side stream; it would be equivalent to have run some_comm_op on s0. Instead, the synchronization must be placed at some appropriate, later point in time where you expect the side stream s1 to have finished work. This location is typically identified via profiling, e.g., using Chrome traces produced by torch.autograd.profiler.profile.export_chrome_trace(). If you place the wait too early, work on s0 will block until s1 has finished, preventing further overlapping of communication and computation. If you place the wait too late, you will use more memory than is strictly necessary (as you are keeping x live for longer). For a concrete example of how this guidance can be applied in practice, see this post: FSDP and CUDACachingAllocator.
- refine_names(*names)#
Refines the dimension names of
self according to names.
Refining is a special case of renaming that “lifts” unnamed dimensions. A None dim can be refined to have any name; a named dim can only be refined to have the same name.
Because named tensors can coexist with unnamed tensors, refining names gives a nice way to write named-tensor-aware code that works with both named and unnamed tensors.
names may contain up to one Ellipsis (...). The Ellipsis is expanded greedily; it is expanded in-place to fill names to the same length as self.dim() using names from the corresponding indices of self.names.
Python 2 does not support Ellipsis but one may use a string literal instead ('...').
- Parameters:
names (iterable of str) – The desired names of the output tensor. May contain up to one Ellipsis.
Examples:
>>> imgs = torch.randn(32, 3, 128, 128) >>> named_imgs = imgs.refine_names('N', 'C', 'H', 'W') >>> named_imgs.names ('N', 'C', 'H', 'W') >>> tensor = torch.randn(2, 3, 5, 7, 11) >>> tensor = tensor.refine_names('A', ..., 'B', 'C') >>> tensor.names ('A', None, None, 'B', 'C')
Warning
The named tensor API is experimental and subject to change.
- register_hook(hook)#
Registers a backward hook.
The hook will be called every time a gradient with respect to the Tensor is computed. The hook should have the following signature:
hook(grad) -> Tensor or None
The hook should not modify its argument, but it can optionally return a new gradient which will be used in place of grad.
This function returns a handle with a method handle.remove() that removes the hook from the module.
Note
See backward-hooks-execution for more information on when this hook is executed, and how its execution is ordered relative to other hooks.
Example:
>>> v = torch.tensor([0., 0., 0.], requires_grad=True) >>> h = v.register_hook(lambda grad: grad * 2) # double the gradient >>> v.backward(torch.tensor([1., 2., 3.])) >>> v.grad 2 4 6 [torch.FloatTensor of size (3,)] >>> h.remove() # removes the hook
- register_post_accumulate_grad_hook(hook)#
Registers a backward hook that runs after grad accumulation.
The hook will be called after all gradients for a tensor have been accumulated, meaning that the .grad field has been updated on that tensor. The post accumulate grad hook is ONLY applicable for leaf tensors (tensors without a .grad_fn field). Registering this hook on a non-leaf tensor will error!
The hook should have the following signature:
hook(param: Tensor) -> None
Note that, unlike other autograd hooks, this hook operates on the tensor that requires grad and not the grad itself. The hook can in-place modify and access its Tensor argument, including its .grad field.
This function returns a handle with a method
handle.remove() that removes the hook from the tensor.
Note
See backward-hooks-execution for more information on how and when this hook is executed, and how its execution is ordered relative to other hooks. Since this hook runs during the backward pass, it will run in no_grad mode (unless create_graph is True). You can use torch.enable_grad() to re-enable autograd within the hook if you need it.
Example:
>>> v = torch.tensor([0., 0., 0.], requires_grad=True)
>>> lr = 0.01
>>> # simulate a simple SGD update
>>> h = v.register_post_accumulate_grad_hook(lambda p: p.add_(p.grad, alpha=-lr))
>>> v.backward(torch.tensor([1., 2., 3.]))
>>> v
tensor([-0.0100, -0.0200, -0.0300], requires_grad=True)
>>> h.remove()  # removes the hook
- remainder(divisor) Tensor#
See
torch.remainder()
- remainder_(divisor) Tensor#
In-place version of
remainder()
- rename(*names, **rename_map)#
Renames dimension names of
self.
There are two main usages:
self.rename(**rename_map) returns a view on tensor that has dims renamed as specified in the mapping rename_map.
self.rename(*names) returns a view on tensor, renaming all dimensions positionally using names. Use self.rename(None) to drop names on a tensor.
One cannot specify both positional args names and keyword args rename_map.
Examples:
>>> imgs = torch.rand(2, 3, 5, 7, names=('N', 'C', 'H', 'W'))
>>> renamed_imgs = imgs.rename(N='batch', C='channels')
>>> renamed_imgs.names
('batch', 'channels', 'H', 'W')
>>> renamed_imgs = imgs.rename(None)
>>> renamed_imgs.names
(None, None, None, None)
>>> renamed_imgs = imgs.rename('batch', 'channel', 'height', 'width')
>>> renamed_imgs.names
('batch', 'channel', 'height', 'width')
Warning
The named tensor API is experimental and subject to change.
- rename_(*names, **rename_map)#
In-place version of
rename().
- renorm(p, dim, maxnorm) Tensor#
See
torch.renorm()
- renorm_(p, dim, maxnorm) Tensor#
In-place version of
renorm()
- repeat(*repeats) Tensor#
Repeats this tensor along the specified dimensions.
Unlike
expand(), this function copies the tensor’s data.
Warning
repeat() behaves differently from numpy.repeat, but is more similar to numpy.tile. For the operator similar to numpy.repeat, see torch.repeat_interleave().
repeats (torch.Size, int..., tuple of int or list of int) – The number of times to repeat this tensor along each dimension
Example:
>>> x = torch.tensor([1, 2, 3])
>>> x.repeat(4, 2)
tensor([[ 1,  2,  3,  1,  2,  3],
        [ 1,  2,  3,  1,  2,  3],
        [ 1,  2,  3,  1,  2,  3],
        [ 1,  2,  3,  1,  2,  3]])
>>> x.repeat(4, 2, 1).size()
torch.Size([4, 2, 3])
- repeat_interleave(repeats, dim=None, *, output_size=None) Tensor#
See
torch.repeat_interleave().
- requires_grad#
Is
True if gradients need to be computed for this Tensor, False otherwise.
- requires_grad_(requires_grad=True) Tensor#
Change if autograd should record operations on this tensor: sets this tensor’s
requires_grad attribute in-place. Returns this tensor.
requires_grad_()’s main use case is to tell autograd to begin recording operations on a Tensor tensor. If tensor has requires_grad=False (because it was obtained through a DataLoader, or required preprocessing or initialization), tensor.requires_grad_() makes it so that autograd will begin to record operations on tensor.
requires_grad (bool) – If autograd should record operations on this tensor. Default:
True.
Example:
>>> # Let's say we want to preprocess some saved weights and use
>>> # the result as new weights.
>>> saved_weights = [0.1, 0.2, 0.3, 0.25]
>>> loaded_weights = torch.tensor(saved_weights)
>>> weights = preprocess(loaded_weights)  # some function
>>> weights
tensor([-0.5503,  0.4926, -2.1158, -0.8303])
>>> # Now, start to record operations done to weights
>>> weights.requires_grad_()
>>> out = weights.pow(2).sum()
>>> out.backward()
>>> weights.grad
tensor([-1.1007,  0.9853, -4.2316, -1.6606])
- reshape(*shape) Tensor#
Returns a tensor with the same data and number of elements as
self but with the specified shape. This method returns a view if shape is compatible with the current shape. See torch.Tensor.view() on when it is possible to return a view.
See torch.reshape()
- Parameters:
shape (tuple of ints or int...) – the desired shape
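For illustration, a short example:

>>> x = torch.arange(6)
>>> x.reshape(2, 3)
tensor([[0, 1, 2],
        [3, 4, 5]])
>>> x.reshape(-1, 2).size()
torch.Size([3, 2])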
- reshape_as(other) Tensor#
Returns this tensor as the same shape as
other. self.reshape_as(other) is equivalent to self.reshape(other.sizes()). This method returns a view if other.sizes() is compatible with the current shape. See torch.Tensor.view() on when it is possible to return a view.
Please see reshape() for more information about reshape.
- Parameters:
other (
torch.Tensor) – The result tensor has the same shape as other.
- resize_(*sizes, memory_format=torch.contiguous_format) Tensor#
Resizes
self tensor to the specified size. If the number of elements is larger than the current storage size, then the underlying storage is resized to fit the new number of elements. If the number of elements is smaller, the underlying storage is not changed. Existing elements are preserved but any new memory is uninitialized.
Warning
This is a low-level method. The storage is reinterpreted as C-contiguous, ignoring the current strides (unless the target size equals the current size, in which case the tensor is left unchanged). For most purposes, you will instead want to use view(), which checks for contiguity, or reshape(), which copies data if needed. To change the size in-place with custom strides, see set_().
Note
If
torch.use_deterministic_algorithms() and torch.utils.deterministic.fill_uninitialized_memory are both set to True, new elements are initialized to prevent nondeterministic behavior from using the result as an input to an operation. Floating point and complex values are set to NaN, and integer values are set to the maximum value.
sizes (torch.Size or int...) – the desired size
memory_format (
torch.memory_format, optional) – the desired memory format of Tensor. Default: torch.contiguous_format. Note that memory format of self is going to be unaffected if self.size() matches sizes.
Example:
>>> x = torch.tensor([[1, 2], [3, 4], [5, 6]])
>>> x.resize_(2, 2)
tensor([[ 1,  2],
        [ 3,  4]])
- resize_as_(tensor, memory_format=torch.contiguous_format) Tensor#
Resizes the
self tensor to be the same size as the specified tensor. This is equivalent to self.resize_(tensor.size()).
memory_format (
torch.memory_format, optional) – the desired memory format of Tensor. Default: torch.contiguous_format. Note that memory format of self is going to be unaffected if self.size() matches tensor.size().
- resolve_conj() Tensor#
See
torch.resolve_conj()
- resolve_neg() Tensor#
See
torch.resolve_neg()
- retain_grad() None#
Enables this Tensor to have its
grad populated during backward(). This is a no-op for leaf tensors.
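For illustration, retaining the gradient of a non-leaf tensor:

>>> x = torch.tensor([1., 2.], requires_grad=True)
>>> y = x * 2  # non-leaf; its grad would normally not be populated
>>> y.retain_grad()
>>> y.sum().backward()
>>> y.grad
tensor([1., 1.])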
- retains_grad#
Is
True if this Tensor is non-leaf and its grad is enabled to be populated during backward(), False otherwise.
- roll(shifts, dims) Tensor#
See
torch.roll()
- rot90(k, dims) Tensor#
See
torch.rot90()
- round(decimals=0) Tensor#
See
torch.round()
- round_(decimals=0) Tensor#
In-place version of
round()
- rsqrt() Tensor#
See
torch.rsqrt()
- rsqrt_() Tensor#
In-place version of
rsqrt()
- scatter(dim, index, src) Tensor#
Out-of-place version of
torch.Tensor.scatter_()
- scatter_(dim, index, src, *, reduce=None) Tensor#
Writes all values from the tensor
src into self at the indices specified in the index tensor. For each value in src, its output index is specified by its index in src for dimension != dim and by the corresponding value in index for dimension = dim.
For a 3-D tensor, self is updated as:
self[index[i][j][k]][j][k] = src[i][j][k]  # if dim == 0
self[i][index[i][j][k]][k] = src[i][j][k]  # if dim == 1
self[i][j][index[i][j][k]] = src[i][j][k]  # if dim == 2
This is the reverse operation of the manner described in
gather().
self, index and src (if it is a Tensor) should all have the same number of dimensions. It is also required that index.size(d) <= src.size(d) for all dimensions d, and that index.size(d) <= self.size(d) for all dimensions d != dim. Note that index and src do not broadcast.
Moreover, as for gather(), the values of index must be between 0 and self.size(dim) - 1 inclusive.
Warning
When indices are not unique, the behavior is non-deterministic (one of the values from
src will be picked arbitrarily) and the gradient will be incorrect (it will be propagated to all locations in the source that correspond to the same index)!
The backward pass is implemented only for
src.shape == index.shape.
Additionally accepts an optional reduce argument that allows specification of an optional reduction operation, which is applied to all values in the tensor src into self at the indices specified in the index. For each value in src, the reduction operation is applied to an index in self which is specified by its index in src for dimension != dim and by the corresponding value in index for dimension = dim.
Given a 3-D tensor and reduction using the multiplication operation, self is updated as:
self[index[i][j][k]][j][k] *= src[i][j][k]  # if dim == 0
self[i][index[i][j][k]][k] *= src[i][j][k]  # if dim == 1
self[i][j][index[i][j][k]] *= src[i][j][k]  # if dim == 2
Reducing with the addition operation is the same as using
scatter_add_().
Warning
The reduce argument with Tensor
src is deprecated and will be removed in a future PyTorch release. Please use scatter_reduce_() instead for more reduction options.
dim (int) – the axis along which to index
index (LongTensor) – the indices of elements to scatter, can be either empty or of the same dimensionality as
src. When empty, the operation returns self unchanged.
src (Tensor) – the source element(s) to scatter.
- Keyword Arguments:
reduce (str, optional) – reduction operation to apply, can be either
'add' or 'multiply'.
Example:
>>> src = torch.arange(1, 11).reshape((2, 5))
>>> src
tensor([[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10]])
>>> index = torch.tensor([[0, 1, 2, 0]])
>>> torch.zeros(3, 5, dtype=src.dtype).scatter_(0, index, src)
tensor([[1, 0, 0, 4, 0],
        [0, 2, 0, 0, 0],
        [0, 0, 3, 0, 0]])
>>> index = torch.tensor([[0, 1, 2], [0, 1, 4]])
>>> torch.zeros(3, 5, dtype=src.dtype).scatter_(1, index, src)
tensor([[1, 2, 3, 0, 0],
        [6, 7, 0, 0, 8],
        [0, 0, 0, 0, 0]])
>>> torch.full((2, 4), 2.).scatter_(1, torch.tensor([[2], [3]]),
...            1.23, reduce='multiply')
tensor([[2.0000, 2.0000, 2.4600, 2.0000],
        [2.0000, 2.0000, 2.0000, 2.4600]])
>>> torch.full((2, 4), 2.).scatter_(1, torch.tensor([[2], [3]]),
...            1.23, reduce='add')
tensor([[2.0000, 2.0000, 3.2300, 2.0000],
        [2.0000, 2.0000, 2.0000, 3.2300]])
- scatter_(dim, index, value, *, reduce=None) Tensor:
Writes the value from
value into self at the indices specified in the index tensor. This operation is equivalent to the previous version, with the src tensor filled entirely with value.
dim (int) – the axis along which to index
index (LongTensor) – the indices of elements to scatter, can be either empty or of the same dimensionality as
src. When empty, the operation returns self unchanged.
value (Scalar) – the value to scatter.
- Keyword Arguments:
reduce (str, optional) – reduction operation to apply, can be either
'add' or 'multiply'.
Example:
>>> index = torch.tensor([[0, 1]])
>>> value = 2
>>> torch.zeros(3, 5).scatter_(0, index, value)
tensor([[2., 0., 0., 0., 0.],
        [0., 2., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
- scatter_add(dim, index, src) Tensor#
Out-of-place version of
torch.Tensor.scatter_add_()
- scatter_add_(dim, index, src) Tensor#
Adds all values from the tensor
src into self at the indices specified in the index tensor in a similar fashion as scatter_(). For each value in src, it is added to an index in self which is specified by its index in src for dimension != dim and by the corresponding value in index for dimension = dim.
For a 3-D tensor, self is updated as:
self[index[i][j][k]][j][k] += src[i][j][k]  # if dim == 0
self[i][index[i][j][k]][k] += src[i][j][k]  # if dim == 1
self[i][j][index[i][j][k]] += src[i][j][k]  # if dim == 2
self, index and src should have the same number of dimensions. It is also required that index.size(d) <= src.size(d) for all dimensions d, and that index.size(d) <= self.size(d) for all dimensions d != dim. Note that index and src do not broadcast.
Note
This operation may behave nondeterministically when given tensors on a CUDA device. See /notes/randomness for more information.
Note
The backward pass is implemented only for
src.shape == index.shape.
- Parameters:
dim (int) – the axis along which to index
index (LongTensor) – the indices of elements to scatter and add, can be either empty or of the same dimensionality as
src. When empty, the operation returns self unchanged.
src (Tensor) – the source elements to scatter and add
Example:
>>> src = torch.ones((2, 5))
>>> index = torch.tensor([[0, 1, 2, 0, 0]])
>>> torch.zeros(3, 5, dtype=src.dtype).scatter_add_(0, index, src)
tensor([[1., 0., 0., 1., 1.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.]])
>>> index = torch.tensor([[0, 1, 2, 0, 0], [0, 1, 2, 2, 2]])
>>> torch.zeros(3, 5, dtype=src.dtype).scatter_add_(0, index, src)
tensor([[2., 0., 0., 1., 1.],
        [0., 2., 0., 0., 0.],
        [0., 0., 2., 1., 1.]])
- scatter_reduce(dim, index, src, reduce, *, include_self=True) Tensor#
Out-of-place version of
torch.Tensor.scatter_reduce_()
- scatter_reduce_(dim, index, src, reduce, *, include_self=True) Tensor#
Reduces all values from the
src tensor to the indices specified in the index tensor in the self tensor using the applied reduction defined via the reduce argument ("sum", "prod", "mean", "amax", "amin"). For each value in src, it is reduced to an index in self which is specified by its index in src for dimension != dim and by the corresponding value in index for dimension = dim. If include_self=True, the values in the self tensor are included in the reduction.
self, index and src should all have the same number of dimensions. It is also required that index.size(d) <= src.size(d) for all dimensions d, and that index.size(d) <= self.size(d) for all dimensions d != dim. Note that index and src do not broadcast.
For a 3-D tensor with reduce="sum" and include_self=True the output is given as:
self[index[i][j][k]][j][k] += src[i][j][k]  # if dim == 0
self[i][index[i][j][k]][k] += src[i][j][k]  # if dim == 1
self[i][j][index[i][j][k]] += src[i][j][k]  # if dim == 2
Note
This operation may behave nondeterministically when given tensors on a CUDA device. See /notes/randomness for more information.
Note
The backward pass is implemented only for
src.shape == index.shape.
Warning
This function is in beta and may change in the near future.
- Parameters:
dim (int) – the axis along which to index
index (LongTensor) – the indices of elements to scatter and reduce.
src (Tensor) – the source elements to scatter and reduce
reduce (str) – the reduction operation to apply for non-unique indices (
"sum","prod","mean","amax","amin")include_self (bool) – whether elements from the
selftensor are included in the reduction
Example:
>>> src = torch.tensor([1., 2., 3., 4., 5., 6.])
>>> index = torch.tensor([0, 1, 0, 1, 2, 1])
>>> input = torch.tensor([1., 2., 3., 4.])
>>> input.scatter_reduce(0, index, src, reduce="sum")
tensor([5., 14., 8., 4.])
>>> input.scatter_reduce(0, index, src, reduce="sum", include_self=False)
tensor([4., 12., 5., 4.])
>>> input2 = torch.tensor([5., 4., 3., 2.])
>>> input2.scatter_reduce(0, index, src, reduce="amax")
tensor([5., 6., 5., 2.])
>>> input2.scatter_reduce(0, index, src, reduce="amax", include_self=False)
tensor([3., 6., 5., 2.])
- select(dim, index) Tensor#
See
torch.select()
- select_scatter(src, dim, index) Tensor#
See
torch.select_scatter()
- set_(source=None, storage_offset=0, size=None, stride=None) Tensor#
Sets the underlying storage, size, and strides. If
source is a tensor, self tensor will share the same storage and have the same size and strides as source. Changes to elements in one tensor will be reflected in the other.
If source is a Storage, the method sets the underlying storage, offset, size, and stride.
source (Tensor or Storage) – the tensor or storage to use
storage_offset (int, optional) – the offset in the storage
size (torch.Size, optional) – the desired size. Defaults to the size of the source.
stride (tuple, optional) – the desired stride. Defaults to C-contiguous strides.
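For illustration, sharing storage between two tensors:

>>> src = torch.zeros(4)
>>> t = torch.empty(0).set_(src)  # t now shares src's storage, size and strides
>>> t[0] = 1.
>>> src
tensor([1., 0., 0., 0.])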
- set_array(src, non_blocking=False, *_args, **_kwargs)[source]#
Copies the elements from src into self tensor and returns self. The src tensor must be broadcastable with the self tensor. It may be of a different data type or reside on a different device.
See also: https://pytorch.org/docs/stable/generated/torch.Tensor.copy_.html
- Parameters:
src – the source tensor to copy from.
non_blocking (
bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.
_args – currently unused parameters.
_kwargs – currently unused parameters.
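A minimal usage sketch (assuming a MONAI MetaTensor whose metadata should survive while the array contents are replaced; the metadata key is hypothetical):

import torch
from monai.data import MetaTensor

t = MetaTensor(torch.ones(2, 2), meta={"some_key": "some_value"})  # hypothetical metadata
t.set_array(torch.zeros(2, 2))   # contents replaced in place; metadata is kept
print(t.meta["some_key"])        # -> some_value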
- sgn() Tensor#
See
torch.sgn()
- sgn_() Tensor#
In-place version of
sgn()
- shape#
Returns the size of the
self tensor. Alias for size.
See also Tensor.size().
Example:
>>> t = torch.empty(3, 4, 5)
>>> t.size()
torch.Size([3, 4, 5])
>>> t.shape
torch.Size([3, 4, 5])
- share_memory_()#
Moves the underlying storage to shared memory.
This is a no-op if the underlying storage is already in shared memory and for CUDA tensors. Tensors in shared memory cannot be resized.
See
torch.UntypedStorage.share_memory_() for more details.
- short(memory_format=torch.preserve_format) Tensor#
self.short() is equivalent to self.to(torch.int16). See to().
- Parameters:
memory_format (
torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- sigmoid() Tensor#
See
torch.sigmoid()
- sigmoid_() Tensor#
In-place version of
sigmoid()
- sign() Tensor#
See
torch.sign()
- sign_() Tensor#
In-place version of
sign()
- signbit() Tensor#
See
torch.signbit()
- sin() Tensor#
See
torch.sin()
- sin_() Tensor#
In-place version of
sin()
- sinc() Tensor#
See
torch.sinc()
- sinc_() Tensor#
In-place version of
sinc()
- sinh() Tensor#
See
torch.sinh()
- sinh_() Tensor#
In-place version of
sinh()
- size(dim=None) torch.Size or int#
Returns the size of the
self tensor. If dim is not specified, the returned value is a torch.Size, a subclass of tuple. If dim is specified, returns an int holding the size of that dimension.
- Parameters:
dim (int, optional) – The dimension for which to retrieve the size.
Example:
>>> t = torch.empty(3, 4, 5)
>>> t.size()
torch.Size([3, 4, 5])
>>> t.size(dim=1)
4
- slice_scatter(src, dim=0, start=None, end=None, step=1) Tensor#
See
torch.slice_scatter()
- slogdet()#
See
torch.slogdet()
- smm(mat) Tensor#
See
torch.smm()
- softmax(dim) Tensor#
Alias for
torch.nn.functional.softmax().
- sort(dim=-1, descending=False)#
See
torch.sort()
- sparse_dim() int#
Return the number of sparse dimensions in a sparse tensor
self.
Note
Returns 0 if self is not a sparse tensor.
See also Tensor.dense_dim() and hybrid tensors.
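For illustration, a hybrid sparse tensor with one sparse and one dense dimension:

>>> i = torch.tensor([[0, 1]])                 # indices of 2 specified elements
>>> v = torch.tensor([[1., 1.], [2., 2.]])     # each value is a dense 2-vector
>>> s = torch.sparse_coo_tensor(i, v, (3, 2))  # 1 sparse dim + 1 dense dim
>>> s.sparse_dim(), s.dense_dim()
(1, 1)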
- sparse_mask(mask) Tensor#
Returns a new sparse tensor with values from a strided tensor
self filtered by the indices of the sparse tensor mask. The values of the mask sparse tensor are ignored. self and mask tensors must have the same shape.
Note
The returned sparse tensor might contain duplicate values if
mask is not coalesced. It is therefore advisable to pass mask.coalesce() if such behavior is not desired.
Note
The returned sparse tensor has the same indices as the sparse tensor
mask, even when the corresponding values in self are zeros.
- Parameters:
mask (Tensor) – a sparse tensor whose indices are used as a filter
Example:
>>> nse = 5
>>> dims = (5, 5, 2, 2)
>>> I = torch.cat([torch.randint(0, dims[0], size=(nse,)),
...                torch.randint(0, dims[1], size=(nse,))], 0).reshape(2, nse)
>>> V = torch.randn(nse, dims[2], dims[3])
>>> S = torch.sparse_coo_tensor(I, V, dims).coalesce()
>>> D = torch.randn(dims)
>>> D.sparse_mask(S)
tensor(indices=tensor([[0, 0, 0, 2],
                       [0, 1, 4, 3]]),
       values=tensor([[[ 1.6550,  0.2397],
                       [-0.1611, -0.0779]],
                      [[ 0.2326, -1.0558],
                       [ 1.4711,  1.9678]],
                      [[-0.5138, -0.0411],
                       [ 1.9417,  0.5158]],
                      [[ 0.0793,  0.0036],
                       [-0.2569, -0.1055]]]),
       size=(5, 5, 2, 2), nnz=4, layout=torch.sparse_coo)
- sparse_resize_(size, sparse_dim, dense_dim) Tensor#
Resizes
self sparse tensor to the desired size and the number of sparse and dense dimensions.
Note
If the number of specified elements in
self is zero, then size, sparse_dim, and dense_dim can be any size and positive integers such that len(size) == sparse_dim + dense_dim.
If self specifies one or more elements, however, then each dimension in size must not be smaller than the corresponding dimension of self, sparse_dim must equal the number of sparse dimensions in self, and dense_dim must equal the number of dense dimensions in self.
Warning
Throws an error if
self is not a sparse tensor.
- Parameters:
size (torch.Size) – the desired size. If
self is a non-empty sparse tensor, the desired size cannot be smaller than the original size.
sparse_dim (int) – the number of sparse dimensions
dense_dim (int) – the number of dense dimensions
- sparse_resize_and_clear_(size, sparse_dim, dense_dim) Tensor#
Removes all specified elements from a sparse tensor
self and resizes self to the desired size and the number of sparse and dense dimensions.
- Parameters:
size (torch.Size) – the desired size.
sparse_dim (int) – the number of sparse dimensions
dense_dim (int) – the number of dense dimensions
- split(split_size, dim=0)#
See
torch.split()
- sqrt() Tensor#
See
torch.sqrt()
- sqrt_() Tensor#
In-place version of
sqrt()
- square() Tensor#
See
torch.square()
- square_() Tensor#
In-place version of
square()
- squeeze(dim=None) Tensor#
See
torch.squeeze()
- squeeze_(dim=None) Tensor#
In-place version of
squeeze()
- sspaddmm(mat1, mat2, *, beta=1, alpha=1) Tensor#
See
torch.sspaddmm()
- std(dim=None, *, correction=1, keepdim=False) Tensor#
See
torch.std()
- stft(n_fft, hop_length=None, win_length=None, window=None, center=True, pad_mode='reflect', normalized=False, onesided=None, return_complex=None, align_to_window=None)#
See
torch.stft()
Warning
This function changed its signature at version 0.4.1. Calling with the previous signature may cause an error or return an incorrect result.
- storage() torch.TypedStorage#
Returns the underlying
TypedStorage.
Warning
TypedStorage is deprecated. It will be removed in the future, and UntypedStorage will be the only storage class. To access the UntypedStorage directly, use Tensor.untyped_storage().
- storage_offset() int#
Returns
self tensor’s offset in the underlying storage in terms of number of storage elements (not bytes).
Example:
>>> x = torch.tensor([1, 2, 3, 4, 5])
>>> x.storage_offset()
0
>>> x[3:].storage_offset()
3
- storage_type() type#
Returns the type of the underlying storage.
- stride(dim) tuple or int#
Returns the stride of
self tensor.
Stride is the jump necessary to go from one element to the next one in the specified dimension dim. A tuple of all strides is returned when no argument is passed in. Otherwise, an integer value is returned as the stride in the particular dimension dim.
- Parameters:
dim (int, optional) – the desired dimension in which stride is required
Example:
>>> x = torch.tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
>>> x.stride()
(5, 1)
>>> x.stride(0)
5
>>> x.stride(-1)
1
- sub(other, *, alpha=1) Tensor#
See
torch.sub().
- sub_(other, *, alpha=1) Tensor#
In-place version of
sub()
- subtract(other, *, alpha=1) Tensor#
See
torch.subtract().
- subtract_(other, *, alpha=1) Tensor#
In-place version of
subtract().
- sum(dim=None, keepdim=False, dtype=None) Tensor#
See
torch.sum()
- sum_to_size(*size) Tensor#
Sum
this tensor to size. size must be broadcastable to this tensor size.
- Parameters:
size (int...) – a sequence of integers defining the shape of the output tensor.
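For illustration:

>>> x = torch.ones(2, 3)
>>> x.sum_to_size(1, 3)  # sum over dim 0, kept as size 1
tensor([[2., 2., 2.]])
>>> x.sum_to_size(2, 1)  # sum over dim 1, kept as size 1
tensor([[3.],
        [3.]])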
- svd(some=True, compute_uv=True)#
See
torch.svd()
- swapaxes(axis0, axis1) Tensor#
See
torch.swapaxes()
- swapaxes_(axis0, axis1) Tensor#
In-place version of
swapaxes()
- swapdims(dim0, dim1) Tensor#
See
torch.swapdims()
- swapdims_(dim0, dim1) Tensor#
In-place version of
swapdims()
- t() Tensor#
See
torch.t()
- t_() Tensor#
In-place version of
t()
- take(indices) Tensor#
See
torch.take()
- take_along_dim(indices, dim) Tensor#
See
torch.take_along_dim()
- tan() Tensor#
See
torch.tan()
- tan_() Tensor#
In-place version of
tan()
- tanh() Tensor#
See
torch.tanh()
- tanh_() Tensor#
In-place version of
tanh()
- tensor_split(indices_or_sections, dim=0) List of Tensors#
See
torch.tensor_split()
- tile(dims) Tensor#
See
torch.tile()
- to(*args, **kwargs) Tensor#
Performs Tensor dtype and/or device conversion. A
torch.dtype and torch.device are inferred from the arguments of self.to(*args, **kwargs).
Note
If the
self Tensor already has the correct torch.dtype and torch.device, then self is returned. Otherwise, the returned tensor is a copy of self with the desired torch.dtype and torch.device.
Note
If
self requires gradients (requires_grad=True) but the target dtype specified is an integer type, the returned tensor will implicitly set requires_grad=False. This is because only tensors with floating-point or complex dtypes can require gradients.
Here are the ways to call to:
- to(dtype, non_blocking=False, copy=False, memory_format=torch.preserve_format) Tensor
Returns a Tensor with the specified
dtype
- Args:
memory_format (
torch.memory_format, optional): the desired memory format of returned Tensor. Default: torch.preserve_format.
Note
According to C++ type conversion rules, converting a floating point value to an integer type will truncate the fractional part. If the truncated value cannot fit into the target type (e.g., casting torch.inf to torch.long), the behavior is undefined and the result may vary across platforms.
- to(device=None, dtype=None, non_blocking=False, copy=False, memory_format=torch.preserve_format) Tensor
Returns a Tensor with the specified
device and (optional) dtype. If dtype is None it is inferred to be self.dtype. When non_blocking is set to True, the function attempts to perform the conversion asynchronously with respect to the host, if possible. This asynchronous behavior applies to both pinned and pageable memory. However, caution is advised when using this feature. For more information, refer to the tutorial on good usage of non_blocking and pin_memory. When copy is set, a new Tensor is created even when the Tensor already matches the desired conversion.
- Args:
memory_format (
torch.memory_format, optional): the desired memory format of returned Tensor. Default: torch.preserve_format.
- to(other, non_blocking=False, copy=False) Tensor
Returns a Tensor with the same
torch.dtype and torch.device as the Tensor other. When non_blocking is set to True, the function attempts to perform the conversion asynchronously with respect to the host, if possible. This asynchronous behavior applies to both pinned and pageable memory. However, caution is advised when using this feature. For more information, refer to the tutorial on good usage of non_blocking and pin_memory. When copy is set, a new Tensor is created even when the Tensor already matches the desired conversion.
Example:
>>> tensor = torch.randn(2, 2)  # Initially dtype=float32, device=cpu
>>> tensor.to(torch.float64)
tensor([[-0.5044,  0.0005],
        [ 0.3310, -0.0584]], dtype=torch.float64)
>>> cuda0 = torch.device('cuda:0')
>>> tensor.to(cuda0)
tensor([[-0.5044,  0.0005],
        [ 0.3310, -0.0584]], device='cuda:0')
>>> tensor.to(cuda0, dtype=torch.float64)
tensor([[-0.5044,  0.0005],
        [ 0.3310, -0.0584]], dtype=torch.float64, device='cuda:0')
>>> other = torch.randn((), dtype=torch.float64, device=cuda0)
>>> tensor.to(other, non_blocking=True)
tensor([[-0.5044,  0.0005],
        [ 0.3310, -0.0584]], dtype=torch.float64, device='cuda:0')
- to_dense(dtype=None, *, masked_grad=True) Tensor#
Creates a strided copy of
self if self is not a strided tensor, otherwise returns self.
- Keyword Arguments:
dtype (torch.dtype, optional) – the desired data type of the returned tensor.
masked_grad (bool, optional) – If set to
True (default) and self has a sparse layout then the backward of to_dense() returns grad.sparse_mask(self).
Example:
>>> s = torch.sparse_coo_tensor(
...     torch.tensor([[1, 1],
...                   [0, 2]]),
...     torch.tensor([9, 10]),
...     size=(3, 3))
>>> s.to_dense()
tensor([[ 0,  0,  0],
        [ 9,  0, 10],
        [ 0,  0,  0]])
- to_mkldnn() Tensor#
Returns a copy of the tensor in
torch.mkldnn layout.
- to_padded_tensor(padding, output_size=None) Tensor#
- to_sparse(sparseDims) Tensor#
Returns a sparse copy of the tensor. PyTorch supports sparse tensors in coordinate format.
- Parameters:
sparseDims (int, optional) – the number of sparse dimensions to include in the new sparse tensor
Example:
>>> d = torch.tensor([[0, 0, 0], [9, 0, 10], [0, 0, 0]])
>>> d
tensor([[ 0,  0,  0],
        [ 9,  0, 10],
        [ 0,  0,  0]])
>>> d.to_sparse()
tensor(indices=tensor([[1, 1],
                       [0, 2]]),
       values=tensor([ 9, 10]),
       size=(3, 3), nnz=2, layout=torch.sparse_coo)
>>> d.to_sparse(1)
tensor(indices=tensor([[1]]),
       values=tensor([[ 9,  0, 10]]),
       size=(3, 3), nnz=1, layout=torch.sparse_coo)
- to_sparse(*, layout=None, blocksize=None, dense_dim=None) Tensor
Returns a sparse tensor with the specified layout and blocksize. If the
self is strided, the number of dense dimensions can be specified, and a hybrid sparse tensor will be created, with dense_dim dense dimensions and self.dim() - 2 - dense_dim batch dimensions.
Note
If the
self layout and blocksize parameters match the specified layout and blocksize, return self. Otherwise, return a sparse tensor copy of self.
- Parameters:
layout (torch.layout, optional) – The desired sparse layout. One of torch.sparse_coo, torch.sparse_csr, torch.sparse_csc, torch.sparse_bsr, or torch.sparse_bsc. Default: if None, torch.sparse_coo.
blocksize (list, tuple, torch.Size, optional) – Block size of the resulting BSR or BSC tensor. For other layouts, specifying a block size that is not None will result in a RuntimeError exception. A block size must be a tuple of length two such that its items evenly divide the two sparse dimensions.
dense_dim (int, optional) – Number of dense dimensions of the resulting CSR, CSC, BSR or BSC tensor. This argument should be used only if self is a strided tensor, and must be a value between 0 and the dimension of the self tensor minus two.
Example:
>>> x = torch.tensor([[1, 0], [0, 0], [2, 3]])
>>> x.to_sparse(layout=torch.sparse_coo)
tensor(indices=tensor([[0, 2, 2],
                       [0, 0, 1]]),
       values=tensor([1, 2, 3]),
       size=(3, 2), nnz=3, layout=torch.sparse_coo)
>>> x.to_sparse(layout=torch.sparse_bsr, blocksize=(1, 2))
tensor(crow_indices=tensor([0, 1, 1, 2]),
       col_indices=tensor([0, 0]),
       values=tensor([[[1, 0]],
                      [[2, 3]]]),
       size=(3, 2), nnz=2, layout=torch.sparse_bsr)
>>> x.to_sparse(layout=torch.sparse_bsr, blocksize=(2, 1))
RuntimeError: Tensor size(-2) 3 needs to be divisible by blocksize[0] 2
>>> x.to_sparse(layout=torch.sparse_csr, blocksize=(3, 1))
RuntimeError: to_sparse for Strided to SparseCsr conversion does not use specified blocksize
>>> x = torch.tensor([[[1], [0]], [[0], [0]], [[2], [3]]])
>>> x.to_sparse(layout=torch.sparse_csr, dense_dim=1)
tensor(crow_indices=tensor([0, 1, 1, 3]),
       col_indices=tensor([0, 0, 1]),
       values=tensor([[1],
                      [2],
                      [3]]),
       size=(3, 2, 1), nnz=3, layout=torch.sparse_csr)
- to_sparse_bsc(blocksize, dense_dim) Tensor#
Convert a tensor to a block sparse column (BSC) storage format of given blocksize. If the
self is strided, then the number of dense dimensions can be specified, and a hybrid BSC tensor will be created, with dense_dim dense dimensions and self.dim() - 2 - dense_dim batch dimensions.
- Parameters:
blocksize (list, tuple, torch.Size, optional) – Block size of the resulting BSC tensor. A block size must be a tuple of length two such that its items evenly divide the two sparse dimensions.
dense_dim (int, optional) – Number of dense dimensions of the resulting BSC tensor. This argument should be used only if self is a strided tensor, and must be a value between 0 and the dimension of the self tensor minus two.
Example:
>>> dense = torch.randn(10, 10)
>>> sparse = dense.to_sparse_csr()
>>> sparse_bsc = sparse.to_sparse_bsc((5, 5))
>>> sparse_bsc.row_indices()
tensor([0, 1, 0, 1])
>>> dense = torch.zeros(4, 3, 1)
>>> dense[0:2, 0] = dense[0:2, 2] = dense[2:4, 1] = 1
>>> dense.to_sparse_bsc((2, 1), 1)
tensor(ccol_indices=tensor([0, 1, 2, 3]),
       row_indices=tensor([0, 1, 0]),
       values=tensor([[[[1.]],
                       [[1.]]],
                      [[[1.]],
                       [[1.]]],
                      [[[1.]],
                       [[1.]]]]),
       size=(4, 3, 1), nnz=3, layout=torch.sparse_bsc)
- to_sparse_bsr(blocksize, dense_dim) Tensor#
Convert a tensor to a block sparse row (BSR) storage format of given blocksize. If the
self is strided, then the number of dense dimensions can be specified, and a hybrid BSR tensor will be created, with dense_dim dense dimensions and self.dim() - 2 - dense_dim batch dimensions.
- Parameters:
blocksize (list, tuple, torch.Size, optional) – Block size of the resulting BSR tensor. A block size must be a tuple of length two such that its items evenly divide the two sparse dimensions.
dense_dim (int, optional) – Number of dense dimensions of the resulting BSR tensor. This argument should be used only if self is a strided tensor, and must be a value between 0 and the dimension of the self tensor minus two.
Example:
>>> dense = torch.randn(10, 10)
>>> sparse = dense.to_sparse_csr()
>>> sparse_bsr = sparse.to_sparse_bsr((5, 5))
>>> sparse_bsr.col_indices()
tensor([0, 1, 0, 1])
>>> dense = torch.zeros(4, 3, 1)
>>> dense[0:2, 0] = dense[0:2, 2] = dense[2:4, 1] = 1
>>> dense.to_sparse_bsr((2, 1), 1)
tensor(crow_indices=tensor([0, 2, 3]),
       col_indices=tensor([0, 2, 1]),
       values=tensor([[[[1.]],
                       [[1.]]],
                      [[[1.]],
                       [[1.]]],
                      [[[1.]],
                       [[1.]]]]),
       size=(4, 3, 1), nnz=3, layout=torch.sparse_bsr)
- to_sparse_coo()#
Convert a tensor to coordinate format.
Examples:
>>> dense = torch.randn(5, 5)
>>> sparse = dense.to_sparse_coo()
>>> sparse._nnz()
25
- to_sparse_csc() Tensor#
Convert a tensor to compressed column storage (CSC) format. Except for strided tensors, only works with 2D tensors. If the
self is strided, then the number of dense dimensions can be specified, and a hybrid CSC tensor will be created, with dense_dim dense dimensions and self.dim() - 2 - dense_dim batch dimensions.
- Parameters:
dense_dim (int, optional) – Number of dense dimensions of the resulting CSC tensor. This argument should be used only if
self is a strided tensor, and must be a value between 0 and the dimension of the self tensor minus two.
Example:
>>> dense = torch.randn(5, 5)
>>> sparse = dense.to_sparse_csc()
>>> sparse._nnz()
25
>>> dense = torch.zeros(3, 3, 1, 1)
>>> dense[0, 0] = dense[1, 2] = dense[2, 1] = 1
>>> dense.to_sparse_csc(dense_dim=2)
tensor(ccol_indices=tensor([0, 1, 2, 3]),
       row_indices=tensor([0, 2, 1]),
       values=tensor([[[1.]],
                      [[1.]],
                      [[1.]]]),
       size=(3, 3, 1, 1), nnz=3, layout=torch.sparse_csc)
- to_sparse_csr(dense_dim=None) Tensor#
Convert a tensor to compressed row storage format (CSR). Except for strided tensors, only works with 2D tensors. If the
self is strided, then the number of dense dimensions can be specified, and a hybrid CSR tensor will be created, with dense_dim dense dimensions and self.dim() - 2 - dense_dim batch dimensions.
- Parameters:
dense_dim (int, optional) – Number of dense dimensions of the resulting CSR tensor. This argument should be used only if
self is a strided tensor, and must be a value between 0 and the dimension of the self tensor minus two.
Example:
>>> dense = torch.randn(5, 5)
>>> sparse = dense.to_sparse_csr()
>>> sparse._nnz()
25
>>> dense = torch.zeros(3, 3, 1, 1)
>>> dense[0, 0] = dense[1, 2] = dense[2, 1] = 1
>>> dense.to_sparse_csr(dense_dim=2)
tensor(crow_indices=tensor([0, 1, 2, 3]),
       col_indices=tensor([0, 2, 1]),
       values=tensor([[[1.]],
                      [[1.]],
                      [[1.]]]),
       size=(3, 3, 1, 1), nnz=3, layout=torch.sparse_csr)
- tolist() list or number#
Returns the tensor as a (nested) list. For scalars, a standard Python number is returned, just like with
item(). Tensors are automatically moved to the CPU first if necessary.
This operation is not differentiable.
Examples:
>>> a = torch.randn(2, 2)
>>> a.tolist()
[[0.012766935862600803, 0.5415473580360413],
 [-0.08909505605697632, 0.7729271650314331]]
>>> a[0, 0].tolist()
0.012766935862600803
- topk(k, dim=None, largest=True, sorted=True)#
See
torch.topk()
- trace() Tensor#
See
torch.trace()
- transpose(dim0, dim1) Tensor#
See
torch.transpose()
- transpose_(dim0, dim1) Tensor#
In-place version of
transpose()
- triangular_solve(A, upper=True, transpose=False, unitriangular=False)#
See
torch.triangular_solve()
- tril(diagonal=0) Tensor#
See
torch.tril()
- tril_(diagonal=0) Tensor#
In-place version of
tril()
- triu(diagonal=0) Tensor#
See
torch.triu()
- triu_(diagonal=0) Tensor#
In-place version of
triu()
- true_divide(value) Tensor#
See
torch.true_divide()
- true_divide_(value) Tensor#
In-place version of
true_divide()
- trunc() Tensor#
See
torch.trunc()
- trunc_() Tensor#
In-place version of
trunc()
- type(dtype=None, non_blocking=False, **kwargs) str or Tensor#
Returns the type if dtype is not provided, else casts this object to the specified type.
If this is already of the correct type, no copy is performed and the original object is returned.
- Parameters:
dtype (dtype or string) – The desired type
non_blocking (bool) – If
True, and the source is in pinned memory and the destination is on the GPU or vice versa, the copy is performed asynchronously with respect to the host. Otherwise, the argument has no effect.
**kwargs – For compatibility, may contain the key async in place of the non_blocking argument. The async arg is deprecated.
- type_as(tensor) Tensor#
Returns this tensor cast to the type of the given tensor.
This is a no-op if the tensor is already of the correct type. This is equivalent to
self.type(tensor.type()).
- Parameters:
tensor (Tensor) – the tensor which has the desired type
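For illustration:

>>> a = torch.tensor([1, 2, 3])  # torch.int64
>>> b = torch.tensor([0.5])      # torch.float32
>>> a.type_as(b)
tensor([1., 2., 3.])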
- unbind(dim=0) seq#
See
torch.unbind()
- unflatten(dim, sizes) Tensor#
See
torch.unflatten().
- unfold(dimension, size, step) Tensor#
Returns a view of the original tensor which contains all slices of size
size from self tensor in the dimension dimension.
Step between two slices is given by step.
If sizedim is the size of dimension dimension for self, the size of dimension dimension in the returned tensor will be (sizedim - size) / step + 1.
An additional dimension of size size is appended in the returned tensor.
- Parameters:
dimension (int) – dimension in which unfolding happens
size (int) – the size of each slice that is unfolded
step (int) – the step between each slice
Example:
>>> x = torch.arange(1., 8)
>>> x
tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.])
>>> x.unfold(0, 2, 1)
tensor([[ 1.,  2.],
        [ 2.,  3.],
        [ 3.,  4.],
        [ 4.,  5.],
        [ 5.,  6.],
        [ 6.,  7.]])
>>> x.unfold(0, 2, 2)
tensor([[ 1.,  2.],
        [ 3.,  4.],
        [ 5.,  6.]])
- uniform_(from=0, to=1, *, generator=None) Tensor#
Fills
self tensor with numbers sampled from the continuous uniform distribution:
\[f(x) = \dfrac{1}{\text{to} - \text{from}}\]
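For illustration (the sampled values are random and vary between runs):

>>> t = torch.empty(3)
>>> t.uniform_(0, 1)  # fills t in place with samples from U[0, 1)
tensor([0.5153, 0.4414, 0.1939])  # example output; values are random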
- unique(sorted=True, return_inverse=False, return_counts=False, dim=None)#
Returns the unique elements of the input tensor.
See
torch.unique()
- unique_consecutive(return_inverse=False, return_counts=False, dim=None)#
Eliminates all but the first element from every consecutive group of equivalent elements.
See
torch.unique_consecutive()
- unsafe_chunk(chunks, dim=0) List of Tensors#
See
torch.unsafe_chunk()
- unsafe_split(split_size, dim=0) List of Tensors#
See
torch.unsafe_split()
- unsqueeze(dim) Tensor#
See
torch.unsqueeze()
- unsqueeze_(dim) Tensor#
In-place version of
unsqueeze()
- untyped_storage() torch.UntypedStorage#
Returns the underlying
UntypedStorage.
- static update_meta(rets, func, args, kwargs)[source]#
Update the metadata from the output of MetaTensor.__torch_function__.
The output of torch.Tensor.__torch_function__ could be a single object or a sequence of them. Hence, in MetaTensor.__torch_function__ we convert them to a list if they are not already one, and then we loop across each element, processing metadata as necessary. For each element, if it is not of type MetaTensor, nothing needs to be done.
- Parameters:
rets (
Sequence) – the output from torch.Tensor.__torch_function__, which has been converted to a list in MetaTensor.__torch_function__ if it wasn’t already a Sequence.
func – the torch function that was applied. Examples might be torch.squeeze or torch.Tensor.__add__. We need this since the metadata needs to be treated differently if a batch of data is considered. For example, slicing (torch.Tensor.__getitem__) the ith element of the 0th dimension of a batch of data should return the ith tensor with the ith metadata.
args – positional arguments that were passed to func.
kwargs – keyword arguments that were passed to func.
- Return type:
Sequence
- Returns:
A sequence with the same number of elements as rets. For each element, if the input type was not MetaTensor, then no modifications will have been made. If global parameters have been set to false (e.g., not get_track_meta()), then any MetaTensor will be converted to torch.Tensor. Else, metadata will be propagated as necessary (see
MetaTensor._copy_meta()).
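A brief illustration of the behaviour this method supports (a sketch assuming metadata tracking is enabled, which is the default; the metadata key is hypothetical):

>>> import torch
>>> from monai.data import MetaTensor
>>> t = MetaTensor(torch.ones(2, 1, 3), meta={"some_key": "some_value"})
>>> out = torch.squeeze(t)  # __torch_function__ routes the result through update_meta
>>> out.meta["some_key"]    # the metadata has been propagated to the output
'some_value'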
- values() Tensor#
Return the values tensor of a sparse COO tensor.
Warning
Throws an error if
self is not a sparse COO tensor.
See also Tensor.indices().
Note
This method can only be called on a coalesced sparse tensor. See Tensor.coalesce() for details.
- var(dim=None, *, correction=1, keepdim=False) Tensor#
See
torch.var()
- vdot(other) Tensor#
See
torch.vdot()
- view(*shape) Tensor#
Returns a new tensor with the same data as the
self tensor but of a different shape.
The returned tensor shares the same data and must have the same number of elements, but may have a different size. For a tensor to be viewed, the new view size must be compatible with its original size and stride, i.e., each new view dimension must either be a subspace of an original dimension, or only span across original dimensions \(d, d+1, \dots, d+k\) that satisfy the following contiguity-like condition that \(\forall i = d, \dots, d+k-1\),
\[\text{stride}[i] = \text{stride}[i+1] \times \text{size}[i+1]\]
Otherwise, it will not be possible to view self tensor as shape without copying it (e.g., via contiguous()). When it is unclear whether a view() can be performed, it is advisable to use reshape(), which returns a view if the shapes are compatible, and copies (equivalent to calling contiguous()) otherwise.
- Parameters:
shape (torch.Size or int...) – the desired size
Example:
>>> x = torch.randn(4, 4)
>>> x.size()
torch.Size([4, 4])
>>> y = x.view(16)
>>> y.size()
torch.Size([16])
>>> z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
>>> z.size()
torch.Size([2, 8])

>>> a = torch.randn(1, 2, 3, 4)
>>> a.size()
torch.Size([1, 2, 3, 4])
>>> b = a.transpose(1, 2)  # Swaps 2nd and 3rd dimension
>>> b.size()
torch.Size([1, 3, 2, 4])
>>> c = a.view(1, 3, 2, 4)  # Does not change tensor layout in memory
>>> c.size()
torch.Size([1, 3, 2, 4])
>>> torch.equal(b, c)
False
- view(dtype) Tensor
Returns a new tensor with the same data as the
self tensor but of a different dtype.
If the element size of dtype is different than that of self.dtype, then the size of the last dimension of the output will be scaled proportionally. For instance, if dtype element size is twice that of self.dtype, then each pair of elements in the last dimension of self will be combined, and the size of the last dimension of the output will be half that of self. If dtype element size is half that of self.dtype, then each element in the last dimension of self will be split in two, and the size of the last dimension of the output will be double that of self. For this to be possible, the following conditions must be true:
self.dim() must be greater than 0.
self.stride(-1) must be 1.
Additionally, if the element size of dtype is greater than that of self.dtype, the following conditions must be true as well:
self.size(-1) must be divisible by the ratio between the element sizes of the dtypes.
self.storage_offset() must be divisible by the ratio between the element sizes of the dtypes.
The strides of all dimensions, except the last dimension, must be divisible by the ratio between the element sizes of the dtypes.
If any of the above conditions are not met, an error is thrown.
Warning
This overload is not supported by TorchScript, and using it in a TorchScript program will cause undefined behavior.
- Parameters:
dtype (
torch.dtype) – the desired dtype
Example:
>>> x = torch.randn(4, 4)
>>> x
tensor([[ 0.9482, -0.0310,  1.4999, -0.5316],
        [-0.1520,  0.7472,  0.5617, -0.8649],
        [-2.4724, -0.0334, -0.2976, -0.8499],
        [-0.2109,  1.9913, -0.9607, -0.6123]])
>>> x.dtype
torch.float32

>>> y = x.view(torch.int32)
>>> y
tensor([[ 1064483442, -1124191867,  1069546515, -1089989247],
        [-1105482831,  1061112040,  1057999968, -1084397505],
        [-1071760287, -1123489973, -1097310419, -1084649136],
        [-1101533110,  1073668768, -1082790149, -1088634448]],
    dtype=torch.int32)
>>> y[0, 0] = 1000000000
>>> x
tensor([[ 0.0047, -0.0310,  1.4999, -0.5316],
        [-0.1520,  0.7472,  0.5617, -0.8649],
        [-2.4724, -0.0334, -0.2976, -0.8499],
        [-0.2109,  1.9913, -0.9607, -0.6123]])

>>> x.view(torch.cfloat)
tensor([[ 0.0047-0.0310j,  1.4999-0.5316j],
        [-0.1520+0.7472j,  0.5617-0.8649j],
        [-2.4724-0.0334j, -0.2976-0.8499j],
        [-0.2109+1.9913j, -0.9607-0.6123j]])
>>> x.view(torch.cfloat).size()
torch.Size([4, 2])

>>> x.view(torch.uint8)
tensor([[  0, 202, 154,  59, 182, 243, 253, 188, 185, 252, 191,  63, 240,  22,
           8, 191],
        [227, 165,  27, 190, 128,  72,  63,  63, 146, 203,  15,  63,  22, 106,
          93, 191],
        [205,  59,  30, 192, 112, 206,   8, 189,   7,  95, 152, 190,  12, 147,
          89, 191],
        [ 43, 246,  87, 190, 235, 226, 254,  63, 111, 240, 117, 191, 177, 191,
          28, 191]], dtype=torch.uint8)
>>> x.view(torch.uint8).size()
torch.Size([4, 16])
- view_as(other) Tensor#
View this tensor as the same size as
other. self.view_as(other) is equivalent to self.view(other.size()).
Please see view() for more information about view.
- Parameters:
other (
torch.Tensor) – The result tensor has the same size as other.
- vsplit(split_size_or_sections) List of Tensors#
See
torch.vsplit()
- where(condition, y) Tensor#
self.where(condition, y) is equivalent to torch.where(condition, self, y). See torch.where()
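For illustration:

>>> x = torch.tensor([-1., 0., 2.])
>>> y = torch.zeros(3)
>>> x.where(x > 0, y)  # keep x where the condition holds, otherwise take y
tensor([0., 0., 2.])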
- xlogy(other) Tensor#
See
torch.xlogy()
- xlogy_(other) Tensor#
In-place version of
xlogy()
- xpu(device=None, non_blocking=False, memory_format=torch.preserve_format) Tensor#
Returns a copy of this object in XPU memory.
If this object is already in XPU memory and on the correct device, then no copy is performed and the original object is returned.
- Parameters:
device (torch.device) – The destination XPU device. Defaults to the current XPU device.
non_blocking (bool) – If True and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. Default: False.
memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- zero_() Tensor#
Fills
self tensor with zeros.
Whole slide image reader#
BaseWSIReader#
- class monai.data.BaseWSIReader(level=None, mpp=None, mpp_rtol=0.05, mpp_atol=0.0, power=None, power_rtol=0.05, power_atol=0.0, channel_dim=0, dtype=<class 'numpy.uint8'>, device=None, mode='RGB', **kwargs)[source]#
An abstract class that defines APIs to load patches from whole slide image files.
- Parameters:
level (UnionType[int,None]) – the whole slide image level at which the patches are extracted.
mpp (UnionType[float,tuple[float,float],None]) – the resolution in micron per pixel at which the patches are extracted.
mpp_rtol (float) – the acceptable relative tolerance for resolution in micron per pixel.
mpp_atol (float) – the acceptable absolute tolerance for resolution in micron per pixel.
power (UnionType[int,None]) – the objective power at which the patches are extracted.
power_rtol (float) – the acceptable relative tolerance for objective power.
power_atol (float) – the acceptable absolute tolerance for objective power.
channel_dim (int) – the desired dimension for the color channel.
dtype (Union[dtype,type,str,None,dtype]) – the data type of the output image.
device (UnionType[device,str,None]) – target device to put the extracted patch. Note that if device is “cuda”, the output will be converted to a torch tensor and sent to the GPU even if the dtype is numpy.
mode (str) – the output image color mode, e.g., “RGB” or “RGBA”.
kwargs – additional args for the reader
Notes – Only one of the resolution parameters (level, mpp, or power) should be provided. If such parameters are provided to the get_data method, they will override the values provided here. If none of them are provided here or in get_data, level=0 will be used.
Typical usage of a concrete implementation of this class is:
image_reader = MyWSIReader()
wsi = image_reader.read(filepath, **kwargs)
img_data, meta_data = image_reader.get_data(wsi)
The read call converts an image filename into a whole slide image object.
The get_data call fetches the image data, as well as metadata.
The following methods need to be implemented for any concrete implementation of this class:
read reads a whole slide image object from a given file
get_size returns the size of the whole slide image of a given wsi object at a given level.
get_level_count returns the number of levels in the whole slide image
_get_patch extracts and returns a patch image from the whole slide image
_get_metadata extracts and returns metadata for a whole slide image and a specific patch.
- get_data(wsi, location=(0, 0), size=None, level=None, mpp=None, power=None, mode=None)[source]#
Verifies inputs, extracts patches from the WSI image, and generates metadata.
- Parameters:
wsi – a whole slide image object loaded from a file or a list of such objects.
location (tuple[int,int]) – (top, left) tuple giving the top left pixel in the level 0 reference frame. Defaults to (0, 0).
size (UnionType[tuple[int,int],None]) – (height, width) tuple giving the patch size at the given level (level). If not provided or None, it is set to the full image size at the given level.
level (UnionType[int,None]) – the whole slide image level at which the patches are extracted.
mpp (UnionType[float,tuple[float,float],None]) – the resolution in micron per pixel at which the patches are extracted.
power (UnionType[int,None]) – the objective power at which the patches are extracted.
dtype – the data type of the output image.
mode (UnionType[str,None]) – the output image mode, ‘RGB’ or ‘RGBA’.
- Return type:
tuple[ndarray,dict]
- Returns:
- a tuple, where the first element is an image patch [CxHxW] or a stack of patches, and the second element is a dictionary of metadata.
Notes
Only one of the resolution parameters (level, mpp, or power) should be provided. If none of them are provided, the defaults set during class instantiation are used. If none of them are set here or during class instantiation, level=0 will be used.
- abstractmethod get_downsample_ratio(wsi, level)[source]#
Returns the down-sampling ratio of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the downsample ratio is calculated.
- Return type:
float
- abstractmethod get_level_count(wsi)[source]#
Returns the number of levels in the whole slide image.
- Parameters:
wsi – a whole slide image object loaded from a file.
- Return type:
int
- abstractmethod get_mpp(wsi, level)[source]#
Returns the micron-per-pixel resolution of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the mpp is calculated.
- Return type:
tuple[float,float]
- abstractmethod get_power(wsi, level)[source]#
Returns the objective power of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the objective power is calculated.
- Return type:
float
- abstractmethod get_size(wsi, level)[source]#
Returns the size (height, width) of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the size is calculated.
- Return type:
tuple[int,int]
- get_valid_level(wsi, level, mpp, power)[source]#
Returns the level associated with the resolution parameters in the whole slide image.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (UnionType[int,None]) – the level number.
mpp (UnionType[float,tuple[float,float],None]) – the micron-per-pixel resolution.
power (UnionType[int,None]) – the objective power.
- Return type:
int
- verify_suffix(filename)[source]#
Verify whether the format of the specified file or files is supported by the WSI reader.
The list of supported suffixes is read from self.supported_suffixes.
- Parameters:
filename (
Union[Sequence[Union[str,PathLike]],str,PathLike]) – filename or a list of filenames to read.
- Return type:
bool
WSIReader#
- class monai.data.WSIReader(backend='cucim', level=None, mpp=None, mpp_rtol=0.05, mpp_atol=0.0, power=None, power_rtol=0.05, power_atol=0.0, channel_dim=0, dtype=<class 'numpy.uint8'>, device=None, mode='RGB', **kwargs)[source]#
Read whole slide images and extract patches using different backend libraries
- Parameters:
backend – the name of the backend whole slide image reader library; the default is cuCIM.
level (UnionType[int,None]) – the whole slide image level at which the patches are extracted.
mpp (UnionType[float,tuple[float,float],None]) – the resolution in micron per pixel at which the patches are extracted.
mpp_rtol (float) – the acceptable relative tolerance for resolution in micron per pixel.
mpp_atol (float) – the acceptable absolute tolerance for resolution in micron per pixel.
power (UnionType[int,None]) – the objective power at which the patches are extracted.
power_rtol (float) – the acceptable relative tolerance for objective power.
power_atol (float) – the acceptable absolute tolerance for objective power.
channel_dim (int) – the desired dimension for the color channel. Default to 0 (channel first).
dtype (Union[dtype,type,str,None,dtype]) – the data type of the output image. Defaults to np.uint8.
device (UnionType[device,str,None]) – target device to put the extracted patch. Note that if device is “cuda”, the output will be converted to a torch tensor and sent to the GPU even if the dtype is numpy.
mode (str) – the output image color mode, “RGB” or “RGBA”. Defaults to “RGB”.
num_workers – number of workers for multi-thread image loading (cucim backend only).
kwargs – additional arguments to be passed to the backend library
Notes – Only one of the resolution parameters (level, mpp, or power) should be provided. If such parameters are provided to the get_data method, they will override the values provided here. If none of them are provided here or in get_data, level=0 will be used.
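A typical usage sketch (the file path is hypothetical; any supported backend can be substituted):

from monai.data import WSIReader

reader = WSIReader(backend="cucim", level=0)  # or backend="openslide"
wsi = reader.read("path/to/slide.tiff")       # hypothetical path
patch, meta = reader.get_data(wsi, location=(0, 0), size=(256, 256))
# patch is a channel-first array [C, H, W]; meta is a dictionary of metadata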
- get_downsample_ratio(wsi, level)[source]#
Returns the down-sampling ratio of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the downsample ratio is calculated.
- Return type:
float
- get_level_count(wsi)[source]#
Returns the number of levels in the whole slide image.
- Parameters:
wsi – a whole slide image object loaded from a file.
- Return type:
int
- get_mpp(wsi, level)[source]#
Returns the micron-per-pixel resolution of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the mpp is calculated.
- Return type:
tuple[float,float]
- get_power(wsi, level)[source]#
Returns the objective power of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the objective power is calculated.
- Return type:
float
- get_size(wsi, level)[source]#
Returns the size (height, width) of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the size is calculated.
- Return type:
tuple[int,int]
- read(data, **kwargs)[source]#
Read whole slide image objects from given file or list of files.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike,ndarray]) – file name or a list of file names to read.
kwargs – additional args for the reader module (overrides self.kwargs for existing keys).
- Returns:
whole slide image object or list of such objects.
CuCIMWSIReader#
- class monai.data.CuCIMWSIReader(num_workers=0, **kwargs)[source]#
Read whole slide images and extract patches using cuCIM library.
- Parameters:
level – the whole slide image level at which the patches are extracted.
mpp – the resolution in micron per pixel at which the patches are extracted.
mpp_rtol – the acceptable relative tolerance for resolution in micron per pixel.
mpp_atol – the acceptable absolute tolerance for resolution in micron per pixel.
power – the objective power at which the patches are extracted.
power_rtol – the acceptable relative tolerance for objective power.
power_atol – the acceptable absolute tolerance for objective power.
channel_dim – the desired dimension for color channel. Default to 0 (channel first).
dtype – the data type of output image. Defaults to np.uint8.
device – target device to put the extracted patch. Note that if device is “cuda”, the output will be converted to a torch tensor and sent to the GPU even if the dtype is numpy.
mode – the output image color mode, “RGB” or “RGBA”. Defaults to “RGB”.
num_workers (
int) – number of workers for multi-thread image loading.kwargs – additional args for cucim.CuImage module: rapidsai/cucim
Notes – Only one of the resolution parameters (level, mpp, or power) should be provided. If any of these parameters are passed to the get_data method, they override the values provided here. If none are provided here or in get_data, level=0 will be used.
- get_downsample_ratio(wsi, level)[source]#
Returns the down-sampling ratio of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the downsample ratio is calculated.
- Return type:
float
- static get_level_count(wsi)[source]#
Returns the number of levels in the whole slide image.
- Parameters:
wsi – a whole slide image object loaded from a file.
- Return type:
int
- get_mpp(wsi, level)[source]#
Returns the micron-per-pixel (mpp) resolution of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the mpp is calculated.
- Return type:
tuple[float,float]
- get_power(wsi, level)[source]#
Returns the objective power of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the objective power is calculated.
- Return type:
float
- get_size(wsi, level)[source]#
Returns the size (height, width) of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the size is calculated.
- Return type:
tuple[int,int]
- read(data, **kwargs)[source]#
Read whole slide image objects from given file or list of files.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike,ndarray]) – file name or a list of file names to read.kwargs – additional args that overrides self.kwargs for existing keys. For more details look at rapidsai/cucim
- Returns:
whole slide image object or list of such objects.
OpenSlideWSIReader#
- class monai.data.OpenSlideWSIReader(**kwargs)[source]#
Read whole slide images and extract patches using OpenSlide library.
- Parameters:
level – the whole slide image level at which the patches are extracted.
mpp – the resolution in microns per pixel at which the patches are extracted.
mpp_rtol – the acceptable relative tolerance for resolution in microns per pixel.
mpp_atol – the acceptable absolute tolerance for resolution in microns per pixel.
power – the objective power at which the patches are extracted.
power_rtol – the acceptable relative tolerance for objective power.
power_atol – the acceptable absolute tolerance for objective power.
channel_dim – the desired dimension for color channel. Default to 0 (channel first).
dtype – the data type of output image. Defaults to np.uint8.
device – target device to put the extracted patch. Note that if device is “cuda”, the output will be converted to a torch tensor and sent to the GPU even if the dtype is numpy.
mode – the output image color mode, “RGB” or “RGBA”. Defaults to “RGB”.
kwargs – additional args for openslide.OpenSlide module.
Notes – Only one of the resolution parameters (level, mpp, or power) should be provided. If any of these parameters are passed to the get_data method, they override the values provided here. If none are provided here or in get_data, level=0 will be used.
- get_downsample_ratio(wsi, level)[source]#
Returns the down-sampling ratio of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the downsample ratio is calculated.
- Return type:
float
- static get_level_count(wsi)[source]#
Returns the number of levels in the whole slide image.
- Parameters:
wsi – a whole slide image object loaded from a file.
- Return type:
int
- get_mpp(wsi, level)[source]#
Returns the micron-per-pixel (mpp) resolution of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the mpp is calculated.
- Return type:
tuple[float,float]
- get_power(wsi, level)[source]#
Returns the objective power of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the objective power is calculated.
- Return type:
float
- get_size(wsi, level)[source]#
Returns the size (height, width) of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the size is calculated.
- Return type:
tuple[int,int]
- read(data, **kwargs)[source]#
Read whole slide image objects from given file or list of files.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike,ndarray]) – file name or a list of file names to read.kwargs – additional args that overrides self.kwargs for existing keys.
- Returns:
whole slide image object or list of such objects.
TiffFileWSIReader#
- class monai.data.TiffFileWSIReader(**kwargs)[source]#
Read whole slide images and extract patches using TiffFile library.
- Parameters:
level – the whole slide image level at which the patches are extracted.
mpp – the resolution in microns per pixel at which the patches are extracted.
mpp_rtol – the acceptable relative tolerance for resolution in microns per pixel.
mpp_atol – the acceptable absolute tolerance for resolution in microns per pixel.
channel_dim – the desired dimension for color channel. Default to 0 (channel first).
dtype – the data type of output image. Defaults to np.uint8.
device – target device to put the extracted patch. Note that if device is “cuda”, the output will be converted to a torch tensor and sent to the GPU even if the dtype is numpy.
mode – the output image color mode, “RGB” or “RGBA”. Defaults to “RGB”.
kwargs – additional args for tifffile.TiffFile module.
Notes –
Objective power cannot be obtained via the TiffFile backend.
- Only one of the resolution parameters (level or mpp) should be provided.
If any of these parameters are passed to the get_data method, they override the values provided here. If none are provided here or in get_data, level=0 will be used.
- get_downsample_ratio(wsi, level)[source]#
Returns the down-sampling ratio of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the downsample ratio is calculated.
- Return type:
float
- static get_level_count(wsi)[source]#
Returns the number of levels in the whole slide image.
- Parameters:
wsi – a whole slide image object loaded from a file.
- Return type:
int
- get_mpp(wsi, level)[source]#
Returns the micron-per-pixel (mpp) resolution of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the mpp is calculated.
- Return type:
tuple[float,float]
- get_power(wsi, level)[source]#
Returns the objective power of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the objective power is calculated.
- Return type:
float
- get_size(wsi, level)[source]#
Returns the size (height, width) of the whole slide image at a given level.
- Parameters:
wsi – a whole slide image object loaded from a file.
level (
int) – the level number where the size is calculated.
- Return type:
tuple[int,int]
- read(data, **kwargs)[source]#
Read whole slide image objects from given file or list of files.
- Parameters:
data (
Union[Sequence[Union[str,PathLike]],str,PathLike,ndarray]) – file name or a list of file names to read.kwargs – additional args that overrides self.kwargs for existing keys.
- Returns:
whole slide image object or list of such objects.
Whole slide image datasets#
PatchWSIDataset#
- class monai.data.PatchWSIDataset(data, patch_size=None, patch_level=None, transform=None, include_label=True, center_location=True, additional_meta_keys=None, reader='cuCIM', **kwargs)[source]#
This dataset extracts patches from whole slide images (without loading the whole image). It also reads labels for each patch and provides each patch with its associated class labels.
- Parameters:
data (
Sequence) – the list of input samples including image, location, and label (see the note below for more details).patch_size (
UnionType[int,tuple[int,int],None]) – the size of the patch to be extracted from the whole slide image.patch_level (
UnionType[int,None]) – the level at which the patches are extracted (defaults to 0).transform (
UnionType[Callable,None]) – transforms to be executed on input data.include_label (
bool) – whether to load and include labels in the output.center_location (
bool) – whether the input location information is the position of the center of the patch.additional_meta_keys (
UnionType[Sequence[str],None]) – the list of keys for items to be copied to the output metadata from the input data.reader –
the module to be used for loading whole slide imaging. If reader is
a string, it defines the backend of monai.data.WSIReader. Defaults to cuCIM.
a class (inherited from BaseWSIReader), it is initialized and set as wsi_reader.
an instance of a class inherited from BaseWSIReader, it is set as the wsi_reader.
kwargs – additional arguments to pass to WSIReader or the provided whole slide reader class.
- Returns:
a dictionary of loaded image (in MetaTensor format) along with the labels (if requested). {“image”: MetaTensor, “label”: torch.Tensor}
- Return type:
dict
Note
The input data has the following form as an example:
[ {"image": "path/to/image1.tiff", "location": [200, 500], "label": 0}, {"image": "path/to/image2.tiff", "location": [100, 700], "patch_size": [20, 20], "patch_level": 2, "label": 1} ]
MaskedPatchWSIDataset#
- class monai.data.MaskedPatchWSIDataset(data, patch_size=None, patch_level=None, mask_level=7, transform=None, include_label=False, center_location=False, additional_meta_keys=(mask_location, name), reader='cuCIM', **kwargs)[source]#
This dataset extracts patches from whole slide images at the locations where the foreground mask at a given level is non-zero.
- Parameters:
data (
Sequence) – the list of input samples including image, location, and label (see the note below for more details).patch_size (
UnionType[int,tuple[int,int],None]) – the size of the patch to be extracted from the whole slide image.patch_level (
UnionType[int,None]) – the level at which the patches are extracted (defaults to 0).mask_level (
int) – the resolution level at which the mask is created.transform (
UnionType[Callable,None]) – transforms to be executed on input data.include_label (
bool) – whether to load and include labels in the output.center_location (
bool) – whether the input location information is the position of the center of the patch.additional_meta_keys (
Sequence[str]) – the list of keys for items to be copied to the output metadata from the input data.reader –
the module to be used for loading whole slide imaging. Defaults to cuCIM. If reader is
a string, it defines the backend of monai.data.WSIReader.
a class (inherited from BaseWSIReader), it is initialized and set as wsi_reader,
an instance of a class inherited from BaseWSIReader, it is set as the wsi_reader.
kwargs – additional arguments to pass to WSIReader or provided whole slide reader class
Note
The input data has the following form as an example:
[ {"image": "path/to/image1.tiff"}, {"image": "path/to/image2.tiff", "size": [20, 20], "level": 2} ]
SlidingPatchWSIDataset#
- class monai.data.SlidingPatchWSIDataset(data, patch_size=None, patch_level=None, mask_level=0, overlap=0.0, offset=(0, 0), offset_limits=None, transform=None, include_label=False, center_location=False, additional_meta_keys=(mask_location, mask_size, num_patches), reader='cuCIM', seed=0, **kwargs)[source]#
This dataset extracts patches in a sliding-window manner from whole slide images (without loading the whole image). It also reads labels for each patch and provides each patch with its associated class labels.
- Parameters:
data (
Sequence) – the list of input samples including image, location, and label (see the note below for more details).patch_size (
UnionType[int,tuple[int,int],None]) – the size of the patch to be extracted from the whole slide image.patch_level (
UnionType[int,None]) – the level at which the patches are extracted (defaults to 0).mask_level (
int) – the resolution level at which the mask/map is created (for ProbMapProducer for instance).overlap (
UnionType[tuple[float,float],float]) – the amount of overlap of neighboring patches in each dimension (a value between 0.0 and 1.0). If only one float number is given, it will be applied to all dimensions. Defaults to 0.0.offset (
UnionType[tuple[int,int],int,str]) – the offset of image to extract patches (the starting position of the upper left patch).offset_limits (
UnionType[tuple[tuple[int,int],tuple[int,int]],tuple[int,int],None]) – if offset is set to “random”, a tuple of integers defining the lower and upper limit of the random offset for all dimensions, or a tuple of tuples that defines the limits for each dimension.transform (
UnionType[Callable,None]) – transforms to be executed on input data.include_label (
bool) – whether to load and include labels in the output.center_location (
bool) – whether the input location information is the position of the center of the patch.additional_meta_keys (
Sequence[str]) – the list of keys for items to be copied to the output metadata from the input data.reader –
the module to be used for loading whole slide imaging. Defaults to cuCIM. If reader is
a string, it defines the backend of monai.data.WSIReader.
a class (inherited from BaseWSIReader), it is initialized and set as wsi_reader,
an instance of a class inherited from BaseWSIReader, it is set as the wsi_reader.
seed (
int) – random seed to randomly generate offsets. Defaults to 0.kwargs – additional arguments to pass to WSIReader or provided whole slide reader class
Note
The input data has the following form as an example:
[ {"image": "path/to/image1.tiff"}, {"image": "path/to/image2.tiff", "patch_size": [20, 20], "patch_level": 2} ]
Unlike MaskedPatchWSIDataset, this dataset does not filter any patches.
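A minimal sketch (the image path is a placeholder): patches are generated on a regular grid covering the whole slide, with the requested overlap:
from monai.data import SlidingPatchWSIDataset

data = [{"image": "path/to/image.tiff"}]
dataset = SlidingPatchWSIDataset(data=data, patch_size=(256, 256), patch_level=1, overlap=0.25)
for sample in dataset:
    patch = sample["image"]  # channel-first MetaTensor patch
    break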
Bounding box#
This utility module mainly supports rectangular bounding boxes with a few different parameterizations and methods for converting between them. It provides reliable access to the spatial coordinates of the box vertices in the “canonical ordering”: [xmin, ymin, xmax, ymax] for 2D and [xmin, ymin, zmin, xmax, ymax, zmax] for 3D. We currently define this ordering as monai.data.box_utils.StandardMode, and the rest of the detection pipelines mainly assume boxes in StandardMode.
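For instance, a minimal sketch of the canonical ordering and a conversion out of it (values chosen for illustration):
import torch
from monai.data.box_utils import StandardMode, convert_box_mode

# one 2D box in StandardMode, i.e. [xmin, ymin, xmax, ymax]
boxes = torch.tensor([[10.0, 20.0, 50.0, 80.0]])
# convert to the center-size parameterization [xcenter, ycenter, xsize, ysize]
convert_box_mode(boxes, src_mode=StandardMode, dst_mode="ccwh")
# tensor([[30., 50., 40., 60.]])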
- class monai.data.box_utils.BoxMode[source]#
An abstract class of a
BoxMode.
A BoxMode is callable that converts the box mode of boxes, which are Nx4 (2D) or Nx6 (3D) torch tensors or ndarrays. BoxMode has several subclasses that represent different box modes, including:
CornerCornerModeTypeA: represents [xmin, ymin, xmax, ymax] for 2D and [xmin, ymin, zmin, xmax, ymax, zmax] for 3D
CornerCornerModeTypeB: represents [xmin, xmax, ymin, ymax] for 2D and [xmin, xmax, ymin, ymax, zmin, zmax] for 3D
CornerCornerModeTypeC: represents [xmin, ymin, xmax, ymax] for 2D and [xmin, ymin, xmax, ymax, zmin, zmax] for 3D
CornerSizeMode: represents [xmin, ymin, xsize, ysize] for 2D and [xmin, ymin, zmin, xsize, ysize, zsize] for 3D
CenterSizeMode: represents [xcenter, ycenter, xsize, ysize] for 2D and [xcenter, ycenter, zcenter, xsize, ysize, zsize] for 3D
We currently define
StandardMode = CornerCornerModeTypeA, and MONAI detection pipelines mainly assume boxes are in StandardMode.
The implementation should be aware of the following:
remember to define the class variable name, a dictionary that maps spatial_dims to BoxModeName.
boxes_to_corners() and corners_to_boxes() should not modify inputs in place.
- abstractmethod boxes_to_corners(boxes)[source]#
Convert the bounding boxes of the current mode to corners.
- Parameters:
boxes (
Tensor) – bounding boxes, Nx4 or Nx6 torch tensor- Returns:
corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)
- Return type:
tuple
Example
boxes = torch.ones(10, 6)
boxmode = BoxMode()
boxmode.boxes_to_corners(boxes)  # will return a 6-element tuple, each element is a 10x1 tensor
- abstractmethod corners_to_boxes(corners)[source]#
Convert the given box corners to the bounding boxes of the current mode.
- Parameters:
corners (
Sequence) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)- Returns:
bounding boxes, Nx4 or Nx6 torch tensor
- Return type:
Tensor
Example
corners = (torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1))
boxmode = BoxMode()
boxmode.corners_to_boxes(corners)  # will return a 10x4 tensor
- class monai.data.box_utils.CenterSizeMode[source]#
A subclass of
BoxMode.Also represented as “ccwh” or “cccwhd”, with format of [xcenter, ycenter, xsize, ysize] or [xcenter, ycenter, zcenter, xsize, ysize, zsize].
Example
CenterSizeMode.get_name(spatial_dims=2)  # will return "ccwh"
CenterSizeMode.get_name(spatial_dims=3)  # will return "cccwhd"
- boxes_to_corners(boxes)[source]#
Convert the bounding boxes of the current mode to corners.
- Parameters:
boxes (
Tensor) – bounding boxes, Nx4 or Nx6 torch tensor- Returns:
corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)
- Return type:
tuple
Example
boxes = torch.ones(10, 6)
boxmode = BoxMode()
boxmode.boxes_to_corners(boxes)  # will return a 6-element tuple, each element is a 10x1 tensor
- corners_to_boxes(corners)[source]#
Convert the given box corners to the bounding boxes of the current mode.
- Parameters:
corners (
Sequence) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)- Returns:
bounding boxes, Nx4 or Nx6 torch tensor
- Return type:
Tensor
Example
corners = (torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1))
boxmode = BoxMode()
boxmode.corners_to_boxes(corners)  # will return a 10x4 tensor
- class monai.data.box_utils.CornerCornerModeTypeA[source]#
A subclass of
BoxMode.Also represented as “xyxy” or “xyzxyz”, with format of [xmin, ymin, xmax, ymax] or [xmin, ymin, zmin, xmax, ymax, zmax].
Example
CornerCornerModeTypeA.get_name(spatial_dims=2)  # will return "xyxy"
CornerCornerModeTypeA.get_name(spatial_dims=3)  # will return "xyzxyz"
- boxes_to_corners(boxes)[source]#
Convert the bounding boxes of the current mode to corners.
- Parameters:
boxes (
Tensor) – bounding boxes, Nx4 or Nx6 torch tensor- Returns:
corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)
- Return type:
tuple
Example
boxes = torch.ones(10, 6)
boxmode = BoxMode()
boxmode.boxes_to_corners(boxes)  # will return a 6-element tuple, each element is a 10x1 tensor
- corners_to_boxes(corners)[source]#
Convert the given box corners to the bounding boxes of the current mode.
- Parameters:
corners (
Sequence) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)- Returns:
bounding boxes, Nx4 or Nx6 torch tensor
- Return type:
Tensor
Example
corners = (torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1))
boxmode = BoxMode()
boxmode.corners_to_boxes(corners)  # will return a 10x4 tensor
- class monai.data.box_utils.CornerCornerModeTypeB[source]#
A subclass of
BoxMode.Also represented as “xxyy” or “xxyyzz”, with format of [xmin, xmax, ymin, ymax] or [xmin, xmax, ymin, ymax, zmin, zmax].
Example
CornerCornerModeTypeB.get_name(spatial_dims=2)  # will return "xxyy"
CornerCornerModeTypeB.get_name(spatial_dims=3)  # will return "xxyyzz"
- boxes_to_corners(boxes)[source]#
Convert the bounding boxes of the current mode to corners.
- Parameters:
boxes (
Tensor) – bounding boxes, Nx4 or Nx6 torch tensor- Returns:
corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)
- Return type:
tuple
Example
boxes = torch.ones(10, 6)
boxmode = BoxMode()
boxmode.boxes_to_corners(boxes)  # will return a 6-element tuple, each element is a 10x1 tensor
- corners_to_boxes(corners)[source]#
Convert the given box corners to the bounding boxes of the current mode.
- Parameters:
corners (
Sequence) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)- Returns:
bounding boxes, Nx4 or Nx6 torch tensor
- Return type:
Tensor
Example
corners = (torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1))
boxmode = BoxMode()
boxmode.corners_to_boxes(corners)  # will return a 10x4 tensor
- class monai.data.box_utils.CornerCornerModeTypeC[source]#
A subclass of
BoxMode.Also represented as “xyxy” or “xyxyzz”, with format of [xmin, ymin, xmax, ymax] or [xmin, ymin, xmax, ymax, zmin, zmax].
Example
CornerCornerModeTypeC.get_name(spatial_dims=2)  # will return "xyxy"
CornerCornerModeTypeC.get_name(spatial_dims=3)  # will return "xyxyzz"
- boxes_to_corners(boxes)[source]#
Convert the bounding boxes of the current mode to corners.
- Parameters:
boxes (
Tensor) – bounding boxes, Nx4 or Nx6 torch tensor- Returns:
corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)
- Return type:
tuple
Example
boxes = torch.ones(10, 6)
boxmode = BoxMode()
boxmode.boxes_to_corners(boxes)  # will return a 6-element tuple, each element is a 10x1 tensor
- corners_to_boxes(corners)[source]#
Convert the given box corners to the bounding boxes of the current mode.
- Parameters:
corners (
Sequence) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)- Returns:
bounding boxes, Nx4 or Nx6 torch tensor
- Return type:
Tensor
Example
corners = (torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1))
boxmode = BoxMode()
boxmode.corners_to_boxes(corners)  # will return a 10x4 tensor
- class monai.data.box_utils.CornerSizeMode[source]#
A subclass of
BoxMode.Also represented as “xywh” or “xyzwhd”, with format of [xmin, ymin, xsize, ysize] or [xmin, ymin, zmin, xsize, ysize, zsize].
Example
CornerSizeMode.get_name(spatial_dims=2)  # will return "xywh"
CornerSizeMode.get_name(spatial_dims=3)  # will return "xyzwhd"
- boxes_to_corners(boxes)[source]#
Convert the bounding boxes of the current mode to corners.
- Parameters:
boxes (
Tensor) – bounding boxes, Nx4 or Nx6 torch tensor- Returns:
corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)
- Return type:
tuple
Example
boxes = torch.ones(10, 6)
boxmode = BoxMode()
boxmode.boxes_to_corners(boxes)  # will return a 6-element tuple, each element is a 10x1 tensor
- corners_to_boxes(corners)[source]#
Convert the given box corners to the bounding boxes of the current mode.
- Parameters:
corners (
Sequence) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor. It represents (xmin, ymin, xmax, ymax) or (xmin, ymin, zmin, xmax, ymax, zmax)- Returns:
bounding boxes, Nx4 or Nx6 torch tensor
- Return type:
Tensor
Example
corners = (torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1), torch.ones(10, 1))
boxmode = BoxMode()
boxmode.corners_to_boxes(corners)  # will return a 10x4 tensor
- monai.data.box_utils.StandardMode[source]#
alias of
CornerCornerModeTypeA
- monai.data.box_utils.batched_nms(boxes, scores, labels, nms_thresh, max_proposals=-1, box_overlap_metric=<function box_iou>)[source]#
Performs non-maximum suppression in a batched fashion. Each labels value corresponds to a category, and NMS will not be applied between elements of different categories.
Adapted from MIC-DKFZ/nnDetection
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModescores (
Union[ndarray,Tensor]) – prediction scores of the boxes, sized (N,). This function keeps boxes with higher scores.labels (
Union[ndarray,Tensor]) – indices of the categories for each one of the boxes. sized(N,), value range is (0, num_classes)nms_thresh (
float) – threshold of NMS. Discards all overlapping boxes with box_overlap > nms_thresh.max_proposals (
int) – maximum number of boxes it keeps. Ifmax_proposals= -1, there is no limit on the number of boxes that are kept.box_overlap_metric (
Callable) – the metric to compute overlap between boxes.
- Return type:
Union[ndarray,Tensor]- Returns:
Indexes of
boxesthat are kept after NMS.
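A small sketch of the per-category behavior (values chosen for illustration):
import torch
from monai.data.box_utils import batched_nms, box_iou

boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0], [0.0, 0.0, 10.0, 10.0]])
scores = torch.tensor([0.9, 0.8, 0.7])
labels = torch.tensor([0, 0, 1])  # the third box duplicates the first but belongs to another category
keep = batched_nms(boxes, scores, labels, nms_thresh=0.5, box_overlap_metric=box_iou)
# the second box is suppressed by the first (same category, high overlap);
# the third box is kept because NMS is not applied across categories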
- monai.data.box_utils.box_area(boxes)[source]#
This function computes the area (2D) or volume (3D) of each box. Half precision is not recommended for this function as it may cause overflow, especially for 3D images.
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardMode- Return type:
Union[ndarray,Tensor]- Returns:
area (2D) or volume (3D) of boxes, with size of (N,).
Example
boxes = torch.ones(10, 6)
# we do computation with torch.float32 to avoid overflow
compute_dtype = torch.float32
area = box_area(boxes=boxes.to(dtype=compute_dtype))  # torch.float32, size of (10,)
- monai.data.box_utils.box_centers(boxes)[source]#
Compute the center points of boxes.
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardMode- Return type:
Union[ndarray,Tensor]- Returns:
center points with size of (N, spatial_dims)
- monai.data.box_utils.box_giou(boxes1, boxes2)[source]#
Compute the generalized intersection over union (GIoU) of two sets of boxes. The two inputs can have different shapes and the function returns an NxM matrix (in contrast to
box_pair_giou(), which requires the inputs to have the same shape and returnsNvalues).- Parameters:
boxes1 (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModeboxes2 (
Union[ndarray,Tensor]) – bounding boxes, Mx4 or Mx6 torch tensor or ndarray. The box mode is assumed to beStandardMode
- Return type:
Union[ndarray,Tensor]- Returns:
GIoU, with size of (N,M) and same data type as
boxes1
- monai.data.box_utils.box_iou(boxes1, boxes2)[source]#
Compute the intersection over union (IoU) of two sets of boxes.
- Parameters:
boxes1 (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModeboxes2 (
Union[ndarray,Tensor]) – bounding boxes, Mx4 or Mx6 torch tensor or ndarray. The box mode is assumed to beStandardMode
- Return type:
Union[ndarray,Tensor]- Returns:
IoU, with size of (N,M) and same data type as
boxes1
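For example (a small sketch; values chosen for illustration):
import torch
from monai.data.box_utils import box_iou

boxes1 = torch.tensor([[0.0, 0.0, 10.0, 10.0]])
boxes2 = torch.tensor([[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 15.0, 15.0]])
iou = box_iou(boxes1, boxes2)  # shape (1, 2)
# iou[0, 0] == 1.0 (identical boxes); iou[0, 1] == 25 / 175, i.e. ~0.143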
- monai.data.box_utils.box_pair_giou(boxes1, boxes2)[source]#
Compute the generalized intersection over union (GIoU) of a pair of boxes. The two inputs should have the same shape and the function returns an (N,) array (in contrast to
box_giou(), which does not require the inputs to have the same shape and returnsNxMmatrix).- Parameters:
boxes1 (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModeboxes2 (
Union[ndarray,Tensor]) – bounding boxes, same shape with boxes1. The box mode is assumed to beStandardMode
- Return type:
Union[ndarray,Tensor]- Returns:
paired GIoU, with size of (N,) and same data type as
boxes1
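A sketch contrasting the two GIoU functions (the output shapes are the point here):
import torch
from monai.data.box_utils import box_giou, box_pair_giou

boxes1 = torch.tensor([[0.0, 0.0, 10.0, 10.0], [2.0, 2.0, 6.0, 8.0]])
boxes2 = torch.tensor([[1.0, 1.0, 11.0, 11.0], [3.0, 0.0, 5.0, 4.0]])
box_giou(boxes1, boxes2).shape  # torch.Size([2, 2]): all pairs across the two sets
box_pair_giou(boxes1, boxes2).shape  # torch.Size([2]): element-wise pairs, same-shape inputs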
- monai.data.box_utils.boxes_center_distance(boxes1, boxes2, euclidean=True)[source]#
Compute the distance between the center points of two sets of boxes.
- Parameters:
boxes1 (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModeboxes2 (
Union[ndarray,Tensor]) – bounding boxes, Mx4 or Mx6 torch tensor or ndarray. The box mode is assumed to beStandardModeeuclidean (
bool) – whether to compute the Euclidean distance; otherwise the L1 distance is used.
- Return type:
tuple[Union[ndarray,Tensor],Union[ndarray,Tensor],Union[ndarray,Tensor]]- Returns:
The pairwise distances for every element in boxes1 and boxes2, with size of (N,M) and same data type as
boxes1.Center points of boxes1, with size of (N,spatial_dims) and same data type as
boxes1.Center points of boxes2, with size of (M,spatial_dims) and same data type as
boxes1.
- monai.data.box_utils.centers_in_boxes(centers, boxes, eps=0.01)[source]#
Checks which center points are within the boxes.
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardMode.centers (
Union[ndarray,Tensor]) – center points, Nx2 or Nx3 torch tensor or ndarray.eps (
float) – minimum distance to border of boxes.
- Return type:
Union[ndarray,Tensor]- Returns:
boolean array indicating which center points are within the boxes, sized (N,).
- monai.data.box_utils.clip_boxes_to_image(boxes, spatial_size, remove_empty=True)[source]#
This function clips the
boxesto make sure the bounding boxes are within the image.- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModespatial_size (
Union[Sequence[int],ndarray,Tensor]) – The spatial size of the image where the boxes are attached. len(spatial_size) should be in [2, 3].remove_empty (
bool) – whether to remove the boxes that are actually empty
- Return type:
tuple[Union[ndarray,Tensor],Union[ndarray,Tensor]]- Returns:
clipped boxes, boxes[keep], does not share memory with original boxes
keep, which indicates whether each box inboxesis kept whenremove_empty=True.
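For example (a small sketch; values chosen for illustration):
import torch
from monai.data.box_utils import clip_boxes_to_image

boxes = torch.tensor([[-5.0, -5.0, 10.0, 10.0], [200.0, 200.0, 210.0, 210.0]])
clipped, keep = clip_boxes_to_image(boxes, spatial_size=(100, 100), remove_empty=True)
# the first box is clipped to [0, 0, 10, 10]; the second lies entirely outside
# the image, becomes empty after clipping, and is removed (keep = [True, False])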
- monai.data.box_utils.convert_box_mode(boxes, src_mode=None, dst_mode=None)[source]#
This function converts the boxes in src_mode to the dst_mode.
- Parameters:
boxes (
Union[ndarray,Tensor]) – source bounding boxes, Nx4 or Nx6 torch tensor or ndarray.src_mode (
UnionType[str,BoxMode,type[BoxMode],None]) – source box mode. If it is not given, this func will assume it isStandardMode(). It follows the same format withmodeinget_boxmode().dst_mode (
UnionType[str,BoxMode,type[BoxMode],None]) – target box mode. If it is not given, this func will assume it isStandardMode(). It follows the same format withmodeinget_boxmode().
- Return type:
Union[ndarray,Tensor]- Returns:
bounding boxes with target mode, with same data type as
boxes, does not share memory withboxes
Example
boxes = torch.ones(10, 4)
# The following three lines are equivalent.
# They convert boxes with format [xmin, ymin, xmax, ymax] to [xcenter, ycenter, xsize, ysize].
convert_box_mode(boxes=boxes, src_mode="xyxy", dst_mode="ccwh")
convert_box_mode(boxes=boxes, src_mode="xyxy", dst_mode=monai.data.box_utils.CenterSizeMode)
convert_box_mode(boxes=boxes, src_mode="xyxy", dst_mode=monai.data.box_utils.CenterSizeMode())
- monai.data.box_utils.convert_box_to_standard_mode(boxes, mode=None)[source]#
Convert given boxes to standard mode. Standard mode is “xyxy” or “xyzxyz”, representing box format of [xmin, ymin, xmax, ymax] or [xmin, ymin, zmin, xmax, ymax, zmax].
- Parameters:
boxes (
Union[ndarray,Tensor]) – source bounding boxes, Nx4 or Nx6 torch tensor or ndarray.mode (
UnionType[str,BoxMode,type[BoxMode],None]) – source box mode. If it is not given, this func will assume it isStandardMode(). It follows the same format withmodeinget_boxmode().
- Return type:
Union[ndarray,Tensor]- Returns:
bounding boxes with standard mode, with same data type as
boxes, does not share memory withboxes
Example
boxes = torch.ones(10, 6)
# The following two lines are equivalent.
# They convert boxes with format [xmin, xmax, ymin, ymax, zmin, zmax] to [xmin, ymin, zmin, xmax, ymax, zmax].
convert_box_to_standard_mode(boxes=boxes, mode="xxyyzz")
convert_box_mode(boxes=boxes, src_mode="xxyyzz", dst_mode="xyzxyz")
- monai.data.box_utils.get_boxmode(mode=None, *args, **kwargs)[source]#
This function returns a
BoxModeobject giving a representation of the box mode.
mode (
UnionType[str,BoxMode,type[BoxMode],None]) – a representation of box mode. If it is not given, this func will assume it isStandardMode().
Note
StandardMode=CornerCornerModeTypeA, also represented as “xyxy” for 2D and “xyzxyz” for 3D.- mode can be:
- str: choose from
BoxModeName, for example, “xyxy”: boxes has format [xmin, ymin, xmax, ymax]
“xyzxyz”: boxes has format [xmin, ymin, zmin, xmax, ymax, zmax]
“xxyy”: boxes has format [xmin, xmax, ymin, ymax]
“xxyyzz”: boxes has format [xmin, xmax, ymin, ymax, zmin, zmax]
“xyxyzz”: boxes has format [xmin, ymin, xmax, ymax, zmin, zmax]
“xywh”: boxes has format [xmin, ymin, xsize, ysize]
“xyzwhd”: boxes has format [xmin, ymin, zmin, xsize, ysize, zsize]
“ccwh”: boxes has format [xcenter, ycenter, xsize, ysize]
“cccwhd”: boxes has format [xcenter, ycenter, zcenter, xsize, ysize, zsize]
- BoxMode class: choose from the subclasses of
BoxMode, for example, CornerCornerModeTypeA: equivalent to “xyxy” or “xyzxyz”
CornerCornerModeTypeB: equivalent to “xxyy” or “xxyyzz”
CornerCornerModeTypeC: equivalent to “xyxy” or “xyxyzz”
CornerSizeMode: equivalent to “xywh” or “xyzwhd”
CenterSizeMode: equivalent to “ccwh” or “cccwhd”
- BoxMode object: choose from the subclasses of
BoxMode, for example, CornerCornerModeTypeA(): equivalent to “xyxy” or “xyzxyz”
CornerCornerModeTypeB(): equivalent to “xxyy” or “xxyyzz”
CornerCornerModeTypeC(): equivalent to “xyxy” or “xyxyzz”
CornerSizeMode(): equivalent to “xywh” or “xyzwhd”
CenterSizeMode(): equivalent to “ccwh” or “cccwhd”
None: will assume mode is
StandardMode()
- Return type:
BoxMode
- Returns:
BoxMode object
Example
mode = "xyzxyz" get_boxmode(mode) # will return CornerCornerModeTypeA()
- monai.data.box_utils.get_spatial_dims(boxes=None, points=None, corners=None, spatial_size=None)[source]#
Get the spatial dimension for the given inputs and check their validity. Missing inputs are allowed, but at least one of the input values must be given. It raises ValueError if the dimensions of multiple inputs do not match each other.
- Parameters:
boxes (
UnionType[Tensor,ndarray,None]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarraypoints (
UnionType[Tensor,ndarray,None]) – point coordinates, [x, y] or [x, y, z], Nx2 or Nx3 torch tensor or ndarraycorners (
UnionType[Sequence,None]) – corners of boxes, 4-element or 6-element tuple, each element is a Nx1 torch tensor or ndarrayspatial_size (
UnionType[Sequence[int],Tensor,ndarray,None]) – The spatial size of the image where the boxes are attached. len(spatial_size) should be in [2, 3].
- Returns:
spatial_dims, number of spatial dimensions of the bounding boxes.
- Return type:
int
Example
boxes = torch.ones(10, 6)
get_spatial_dims(boxes, spatial_size=[100, 200, 200])  # will return 3
get_spatial_dims(boxes, spatial_size=[100, 200])  # will raise ValueError
get_spatial_dims(boxes)  # will return 3
- monai.data.box_utils.is_valid_box_values(boxes)[source]#
This function checks whether the box size is non-negative.
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardMode- Return type:
bool- Returns:
whether
boxesis valid
- monai.data.box_utils.non_max_suppression(boxes, scores, nms_thresh, max_proposals=-1, box_overlap_metric=<function box_iou>)[source]#
Non-maximum suppression (NMS).
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to beStandardModescores (
Union[ndarray,Tensor]) – prediction scores of the boxes, sized (N,). This function keeps boxes with higher scores.nms_thresh (
float) – threshold of NMS. Discards all overlapping boxes with box_overlap > nms_thresh.max_proposals (
int) – maximum number of boxes it keeps. Ifmax_proposals= -1, there is no limit on the number of boxes that are kept.box_overlap_metric (
Callable) – the metric to compute overlap between boxes.
- Return type:
Union[ndarray,Tensor]- Returns:
Indexes of
boxesthat are kept after NMS.
Example
boxes = torch.ones(10, 6)
scores = torch.ones(10)
keep = non_max_suppression(boxes, scores, nms_thresh=0.1)
boxes_after_nms = boxes[keep]
- monai.data.box_utils.spatial_crop_boxes(boxes, roi_start, roi_end, remove_empty=True)[source]#
This function generates the new boxes when the corresponding image is cropped to the given ROI. When
remove_empty=True, it makes sure the bounding boxes are within the new cropped image.- Parameters:
boxes (~NdarrayTensor) – bounding boxes, Nx4 or Nx6 torch tensor or ndarray. The box mode is assumed to be
StandardModeroi_start (
Union[Sequence[int],ndarray,Tensor]) – voxel coordinates for start of the crop ROI, negative values allowed.roi_end (
Union[Sequence[int],ndarray,Tensor]) – voxel coordinates for end of the crop ROI, negative values allowed.remove_empty (
bool) – whether to remove the boxes that are actually empty
- Return type:
tuple[~NdarrayTensor,Union[ndarray,Tensor]]- Returns:
cropped boxes, boxes[keep], does not share memory with original boxes
keep, which indicates whether each box inboxesis kept whenremove_empty=True.
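For example (a small sketch; values chosen for illustration):
import torch
from monai.data.box_utils import spatial_crop_boxes

boxes = torch.tensor([[10.0, 10.0, 30.0, 30.0], [70.0, 70.0, 90.0, 90.0]])
cropped, keep = spatial_crop_boxes(boxes, roi_start=(20, 20), roi_end=(60, 60), remove_empty=True)
# the first box shifts into the ROI frame and is clipped to [0, 0, 10, 10];
# the second falls outside the ROI and is removed (keep = [True, False])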
- monai.data.box_utils.standardize_empty_box(boxes, spatial_dims)[source]#
When boxes are empty, this function standardizes them to a shape of (0,4) or (0,6).
- Parameters:
boxes (
Union[ndarray,Tensor]) – bounding boxes, Nx4 or Nx6 or empty torch tensor or ndarrayspatial_dims (
int) – number of spatial dimensions of the bounding boxes.
- Return type:
Union[ndarray,Tensor]- Returns:
bounding boxes with shape (N,4) or (N,6), N can be 0.
Example
boxes = torch.ones(0,)
standardize_empty_box(boxes, 3)
Video datasets#
VideoDataset#
VideoFileDataset#
CameraDataset#
- class monai.data.video_dataset.CameraDataset(video_source, transform=None, max_num_frames=None, color_order=RGB, multiprocessing=False, channel_dim=0)[source]#
Video dataset from a capture device (e.g., webcam).
This class requires that OpenCV be installed.
- Parameters:
video_source (
UnionType[str,int]) – index of capture device. get_num_devices can be used to determine possible devices.transform (
UnionType[Callable,None]) – transform to be applied to each frame.max_num_frames (
UnionType[int,None]) – Max number of frames to iterate across. If None is passed, then the dataset will iterate infinitely.
- Raises:
RuntimeError – OpenCV not installed.
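A minimal usage sketch, assuming OpenCV is installed and a capture device is available at index 0; per the parameters above, iteration stops after max_num_frames frames:
from monai.data.video_dataset import CameraDataset

dataset = CameraDataset(video_source=0, max_num_frames=10)
for frame in dataset:
    print(frame.shape)  # channel-first frame, e.g., (3, H, W)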