Delve Documentation

Delve is a library for analyzing eigenspaces of neural networks.

Delve was developed to help researchers identify information flow through a layer. Specifically, it computes layer saturation, a measure of the covariance of features in a layer (see Saturation).

It is useful for optimizing neural network topology, in particular for identifying over- or under-saturated layers.

If you use Delve in your publication, please cite:

@software{delve,
author       = {Justin Shenk and
                  Mats L. Richter and
                  Wolf Byttner and
                  Michał Marcinkiewicz},
title        = {delve-team/delve: v0.1.45},
month        = aug,
year         = 2021,
publisher    = {Zenodo},
version      = {v0.1.45},
doi          = {10.5281/zenodo.5233860},
url          = {https://doi.org/10.5281/zenodo.5233860}
}

Delve allows extracting features from neural network layers and computing the eigenspace of several layers.

Supported Layers

  • Dense/Linear

  • LSTM

  • Convolutional

Statistics

Computing the layer eigenspace allows measuring information flow between layers, including:

  • feature variance

  • feature covariance

  • layer feature intrinsic dimensionality

Several statistics are supported:

idim        : intrinsic dimensionality
lsat        : layer saturation (intrinsic dimensionality divided by feature space dimensionality)
cov         : the covariance matrix (only saveable using the 'npy' save strategy)
det         : the determinant of the covariance matrix (also known as the generalized variance)
trc         : the trace of the covariance matrix; generally more useful than det for determining
            the total variance of the data. Note that the trace does not take correlation between
            features into account. On the other hand, the determinant will be zero in most cases,
            since there will be very strongly correlated features, so the trace is often the better option.
dtrc        : the trace of the diagonal matrix, another way of measuring the dispersion of the data.
embed       : samples embedded in the eigenspace of dimension 2
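For example, several of these statistics can be requested at once when constructing the CheckLayerSat tracker (see Usage below). This is a minimal sketch with placeholder names (the model and savefile prefix are arbitrary); note that 'cov' requires the 'npy' save strategy:

from torch import nn
from delve import CheckLayerSat

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))

tracker = CheckLayerSat('experiment',      # savefile prefix (placeholder)
                        save_to='npy',     # 'cov' is only saveable with 'npy'
                        modules=model,
                        stats=['idim', 'lsat', 'trc', 'cov'],
                        device='cpu')      # default is 'cuda:0'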

To support researchers, it allows saving plots at various intervals through the CheckLayerSat class.

Installation

Installing Delve

Delve requires Python 3.6+ to be installed.

To install via pip:

pip install delve

To install the latest development version, clone the GitHub repository and use the setup script:

git clone https://github.com/delve-team/delve.git
cd delve
pip install .

Dependencies

Installation with pip should also include all dependencies; a complete list can be found in the repository's requirements file.

Usage

Instantiate the CheckLayerSat class where you define your PyTorch training loop, as in the example:

from torch import nn
from delve import CheckLayerSat

...

model = nn.ModuleDict({
    'conv1': nn.Conv2d(1, 8, 3, padding=1),
    'linear1': nn.Linear(3, 1),
})

layers = [model.conv1, model.linear1]
stats = CheckLayerSat('regression/h{}'.format(h),  # h is defined in the elided setup code
                      save_to="plotcsv",
                      modules=layers,
                      stats=["lsat"])

...

for _ in range(10):
    y_pred = model(x)  # x, y, loss_fn and optimizer come from the elided setup;
                       # define a forward pass over the ModuleDict in real code
    loss = loss_fn(y_pred, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    stats.add_saturations()

stats.close()

This will hook into the layers listed in layers and log the statistics, in this case lsat (layer saturation). Images will be saved under the regression folder.

Saturation

Saturation is a metric used for identifying the intrinsic dimensionality of features in a layer.

A visualization of how saturation changes over training and can be used to optimize network topology is provided at https://github.com/justinshenk/playground:

[Animation: _images/saturation_demo.gif]

Covariance matrix of features is computed online:

\[Q(Z_l, Z_l) = \frac{\sum^{B}_{b=0}A_{l,b}^T A_{l,b}}{n} -(\bar{A}_l \bigotimes \bar{A}_l)\]

for \(B\) batches of layer output matrix \(A_l\) and \(n\) number of samples.
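A minimal sketch of this online estimate (illustrative only, not Delve's internal implementation):

import torch

def running_covariance(batches):
    """Accumulate sum(A_b^T A_b) and the feature sums batch by batch."""
    sum_sq, sum_feat, n = None, None, 0
    for A in batches:                 # A has shape (batch_size, num_features)
        A = A.double()                # float64, as Delve enforces
        sum_sq = A.T @ A if sum_sq is None else sum_sq + A.T @ A
        sum_feat = A.sum(0) if sum_feat is None else sum_feat + A.sum(0)
        n += A.shape[0]
    mean = sum_feat / n
    return sum_sq / n - torch.outer(mean, mean)   # Q = sum/n - mean (x) mean

Q = running_covariance(torch.randn(10, 32, 8))    # 10 batches of 32 samples, 8 features
print(Q.shape)                                     # torch.Size([8, 8])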

Note

For more information about how saturation is computed, read “Feature Space Saturation during Training”.
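From the covariance matrix, saturation can then be sketched as the fraction of eigendirections needed to reach the variance threshold (a simplified reconstruction using the 0.99 default; Delve's actual implementation may differ in details):

import torch

def saturation(Q: torch.Tensor, sat_threshold: float = 0.99) -> float:
    eigvals = torch.linalg.eigvalsh(Q).flip(0)     # eigenvalues, largest first
    explained = eigvals.cumsum(0) / eigvals.sum()  # cumulative explained variance
    idim = int((explained < sat_threshold).sum().item()) + 1
    return idim / Q.shape[0]                       # lsat = idim / feature space dim

print(saturation(Q))   # Q from the covariance sketch above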


Examples

Delve allows the user to create plots and log records in various formats.

Plotting

Delve allows plotting results every epoch using save_to="csvplot", which will create automated plots from the metrics recorded in the stats argument. The plots depict the layers generally in order of the forward pass.

[Figure: _images/VGG16-Cifar10-r32-bs256-e90idim_epoch_88.png]

Automatically generated plot of intrinsic dimensionality computed on the training set of CIFAR-10 on VGG16 at the 88th epoch of a 90-epoch training run.

[Figure: _images/VGG16-Cifar10-r32-bs256-e90lsat_epoch_88.png]

Automatically generated plot of saturation computed on the training set of CIFAR-10 on VGG16 at the 88th epoch of a 90-epoch training run.

Logging

Delve logs results with the logging package and shows progress with tqdm.

[Figure: _images/logging.JPG]

A simple example generated from a two-layer network trained on randomly generated data is provided in Gallery.

Reference

PCA Layer Methods

The following methods are available via the delve.pca_layers module:

delve.pca_layers.rvs(dim=3) → numpy.ndarray

Create random orthonormal matrix of size dim.

Note

Yanked from hpaulj’s implementation of SciPy’s scipy.stats.special_ortho_group() in Numpy at https://stackoverflow.com/questions/38426349/how-to-create-random-orthonormal-matrix-in-python-numpy which is from the paper:

Stewart, G.W., “The efficient generation of random orthogonal matrices with an application to condition estimators”, SIAM Journal on Numerical Analysis, 17(3), pp. 403-409, 1980.
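For example, the returned matrix is orthonormal, so R @ R.T recovers the identity:

import numpy as np
from delve.pca_layers import rvs

R = rvs(dim=3)   # random 3x3 orthonormal matrix
np.testing.assert_allclose(R @ R.T, np.eye(3), atol=1e-10)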

delve.pca_layers.change_all_pca_layer_thresholds_and_inject_random_directions(threshold: float, network: torch.nn.modules.module.Module, verbose: bool = False, device='cpu', include_names: bool = False) → Tuple[list, list, list]
delve.pca_layers.change_all_pca_layer_thresholds(threshold: float, network: torch.nn.modules.module.Module, verbose: bool = False)
delve.pca_layers.change_all_pca_layer_centering(centering: bool, network: torch.nn.modules.module.Module, verbose: bool = False, downsampling=None)
class delve.pca_layers.LinearPCALayer(in_features: int, threshold: float = 0.99, keepdim: bool = True, verbose: bool = False, gradient_epoch_start: int = 20, centering: bool = True)

Eigenspace of the covariance matrix generated in TorchCovarianceMatrix with equation (1).

num = 0
keepdim: bool
verbose: bool
pca_computed: bool
is_floating_point()
property threshold: float
property centering
forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
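A minimal placement sketch (the layer widths here are arbitrary; in_features must match the width of the incoming activations):

import torch
from torch import nn
from delve.pca_layers import LinearPCALayer

model = nn.Sequential(
    nn.Linear(32, 16),
    LinearPCALayer(in_features=16, threshold=0.99),  # eigenspace of the 16-dim features
    nn.ReLU(),
    nn.Linear(16, 2),
)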
class delve.pca_layers.Conv2DPCALayer(in_filters, threshold: float = 0.99, verbose: bool = True, gradient_epoch_start: int = 20, centering: bool = False, downsampling: Optional[int] = None)

Compute PCA of Conv2D layer

training: bool
keepdim: bool
verbose: bool
pca_computed: bool
forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

CheckLayerSat

CheckLayerSat provides a hook for PyTorch and extracts metrics during model training.

class delve.CheckLayerSat(savefile: str, save_to: Union[str, delve.writers.AbstractWriter], modules: torch.nn.modules.module.Module, writer_args: Optional[Dict[str, Any]] = None, log_interval=1, max_samples=None, stats: list = ['lsat'], layerwise_sat: bool = True, reset_covariance: bool = True, average_sat: bool = False, ignore_layer_names: List[str] = [], include_conv: bool = True, conv_method: str = 'channelwise', timeseries_method: str = 'last_timestep', sat_threshold: str = 0.99, nosave=False, verbose: bool = False, device='cuda:0', initial_epoch: int = 0, interpolation_strategy: Optional[str] = None, interpolation_downsampling: int = 32)
Takes a PyTorch module and records layer saturation, intrinsic dimensionality and other scalars.

Parameters
  • savefile (str) – destination for summaries

  • save_to (Union[str, List[Union[str, delve.writers.AbstractWriter]]]) –

    Specify one or multiple save strategies.

    You can use pre-implemented save strategies or inherit from the AbstractWriter in order to implement your own preferred saving strategy.

    Pre-existing saving strategies are:

    csv : stores all stats in a csv-file with one row for each epoch.

    plot : produces plots from intrinsic dimensionality and / or layer saturation.

    tensorboard : saves all stats to tensorboard.

    print : prints all metrics on the console as soon as they are logged.

    npy : creates a folder structure with npy-files containing the logged values. This is the only save strategy that can save the full covariance matrix. This strategy is useful if you want to reproduce intrinsic dimensionality and saturation values with other thresholds without re-evaluating model checkpoints.

  • modules (torch module or list of modules) – layer-containing object. By default, only Conv2D, Linear and LSTM cells are recorded.

  • writer_args (dict) – contains additional arguments passed to the writers. This is only used when a writer is initialized through a string key.

  • log_interval (int) – number of batches between two updates of the covariance matrix. The default value is 1, meaning all data is used for computing intrinsic dimensionality and saturation. Increasing the log interval is useful on very large datasets to reduce numeric instability.

  • max_samples (int) – (optional) the covariance matrix in each layer will stop updating once max_samples have been reached. The use case is similar to log_interval, for very large datasets.

  • stats (list of str) –

    list of stats to compute

    supported stats are:

    idim : intrinsic dimensionality

    lsat : layer saturation (intrinsic dimensionality divided by feature space dimensionality)

    cov : the covariance matrix (only saveable using the 'npy' save strategy)

    det : the determinant of the covariance matrix (also known as the generalized variance)

    trc : the trace of the covariance matrix; generally more useful than det for determining the total variance of the data. Note that the trace does not take correlation between features into account. On the other hand, the determinant will be zero in most cases, since there will be very strongly correlated features, so the trace is often the better option.

    dtrc : the trace of the diagonal matrix, another way of measuring the dispersion of the data.

    embed : samples embedded in the eigenspace of dimension 2

  • layerwise_sat (bool) – whether or not to include layerwise saturation when saving

  • reset_covariance (bool) – True by default; resets the covariance every time the stats are computed. Disabling this option will strongly bias the covariance, since gradient updates will influence the model between computations. We recommend computing saturation only at the end of training and during testing.

  • include_conv – setting to False includes only linear layers

  • conv_method (str) –

    how to subsample convolutional layers. Default is channelwise, which means that each position of the filter tensor is considered a datapoint, effectively yielding a data matrix of shape (height*width*batch_size, num_filters). A minimal reshape sketch follows this parameter list.

    supported methods are:

    channelwise : treats every depth vector of the tensor as a datapoint, effectively reshaping the data tensor from shape (batch_size, height, width, channel) into (batch_size*height*width, channel).

    mean : applies global average pooling on each feature map.

    max : applies global max pooling on each feature map.

    median : applies global median pooling on each feature map.

    flatten : flattens the entire feature map to a vector, reshaping the data tensor into a data matrix of shape (batch_size, height*width*channel). This strategy for dealing with convolutions is extremely memory intensive and will likely cause memory and performance problems for anything but toy problems.

  • timeseries_method (str) –

    how to subsample timeseries data. Default is last_timestep.

    supported methods are:

    timestepwise : stacks each sample timestep-by-timestep.

    last_timestep : selects the last timestep's output.

  • nosave (bool) – If True, disables saving artifacts (images), default is False

  • verbose (bool) – print saturation for every layer during training

  • sat_threshold (float) – threshold used to determine the number of eigendirections belonging to the latent space. In effect, this is the threshold determining the intrinsic dimensionality. The default value is 0.99 (99% of the explained variance), which is a compromise between a good approximation and interpretability. From experience, the threshold should be between 0.97 and 0.9995 for meaningful results.


  • device (str) – Device to run the computations on. Default is cuda:0. It is generally recommended to run the computations on the GPU for maximum performance; the CPU is slower but lets Delve use regular RAM instead of the generally more limited VRAM of the GPU. Not having Delve run on the same device as the network causes a slight performance decrease, due to copying memory between devices during each forward pass. Delve can handle models distributed over multiple GPUs, however Delve itself will always run on a single device.

  • initial_epoch (int) – The initial epoch to start with. Default is 0, which corresponds to a new run. If initial_epoch != 0, the writers will look for saved states to resume. If set to zero, all existing states will be overwritten. If set to a lower epoch than actually recorded, the behavior of the writers is undefined and may result in crashes or in lost or corrupted data.

  • interpolation_strategy (str) – Default is None (disabled). If set to a string key accepted by the mode argument of torch.nn.functional.interpolate, the feature map will be resized to the interpolated size before computing statistics. This is useful if you work with large resolutions and want to save computation time. No interpolation is done if the resolution is already smaller than the target.

  • interpolation_downsampling (int) – Default is 32. The target resolution if downsampling is enabled.
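As referenced in the conv_method description above, here is a minimal illustration of the channelwise reshaping (shapes are arbitrary):

import torch

fmap = torch.randn(256, 64, 8, 8)                # (batch, channel, height, width)
data = fmap.permute(0, 2, 3, 1).reshape(-1, 64)  # -> (batch*height*width, channel)
print(data.shape)                                 # torch.Size([16384, 64])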

TorchCovarianceMatrix

class delve.torch_utils.TorchCovarianceMatrix(bias: bool = False, device: str = 'cuda:0', save_data: bool = False)

Computes covariance matrix of features as described in https://arxiv.org/pdf/2006.08679.pdf:

\[Q(Z_l, Z_l) = \frac{\sum^{B}_{b=0}A_{l,b}^T A_{l,b}}{n} -(\bar{A}_l \bigotimes \bar{A}_l)\]

for \(B\) batches of layer output matrix \(A_l\) and \(n\) number of samples.

Note

Method enforces float-64 precision, which may cause numerical instability in some cases.
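A small illustration of why this matters (synthetic data, not Delve's code): with a large feature mean, the same covariance computed naively in float32 suffers visible cancellation error.

import torch

torch.manual_seed(0)
A = torch.randn(100_000, 4) + 1000.0   # unit variance around a large mean

def naive_cov(A):
    mean = A.mean(0)
    return (A.T @ A) / A.shape[0] - torch.outer(mean, mean)

err = (naive_cov(A.float()).double() - naive_cov(A.double())).abs().max()
print(err)   # float32 error is sizeable relative to the ~unit true covariance entries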

API Pages

CheckLayerSat(savefile, save_to, modules[, ...]) – Takes a PyTorch module and records layer saturation, intrinsic dimensionality and other scalars.

TorchCovarianceMatrix([bias, device, save_data]) – Computes the covariance matrix of features as described in https://arxiv.org/pdf/2006.08679.pdf.

Support for Delve

Bugs

Bugs, issues, and improvement requests can be logged in GitHub Issues.

Community

Community support is provided via Gitter. Just ask a question there.

Contributing to Delve

(Contribution guidelines largely copied from geopandas)

Overview

Contributions to Delve are very welcome. They are likely to be accepted more quickly if they follow these guidelines.

At this stage of Delve development, the priorities are to define a simple, usable, and stable API and to have clean, maintainable, readable code. Performance matters, but not at the expense of those goals.

In general, Delve follows the conventions of the pandas project where applicable.

In particular, when submitting a pull request:

  • All existing tests should pass. Please make sure that the test suite passes, both locally and on GitHub Actions. Status on GitHub Actions will be visible on a pull request.

  • New functionality should include tests. Please write reasonable tests for your code and make sure that they pass on your pull request.

  • Classes, methods, functions, etc. should have docstrings. The first line of a docstring should be a standalone summary. Parameters and return values should be documented explicitly.

  • Delve supports python 3 (3.6+). Use modern python idioms when possible.

  • Follow PEP 8 when possible.

  • Imports should be grouped with standard library imports first, 3rd-party libraries next, and Delve imports third. Within each grouping, imports should be alphabetized. Always use absolute imports when possible, and explicit relative imports for local imports when necessary in tests.

Seven Steps for Contributing

There are seven basic steps to contributing to Delve:

  1. Fork the Delve git repository

  2. Create a development environment

  3. Install Delve dependencies

  4. Make a development build of Delve

  5. Make changes to code and add tests

  6. Update the documentation

  7. Submit a Pull Request

Each of these 7 steps is detailed below.

1) Forking the Delve repository using Git

To the new user, working with Git is one of the more daunting aspects of contributing to Delve. It can very quickly become overwhelming, but sticking to the guidelines below will help keep the process straightforward and mostly trouble free. As always, if you are having difficulties please feel free to ask for help.

The code is hosted on GitHub. To contribute you will need to sign up for a free GitHub account. We use Git for version control to allow many people to work together on the project.

Many great resources for learning Git are available online.

Getting started with Git

GitHub has instructions for installing git, setting up your SSH key, and configuring git. All these steps need to be completed before you can work seamlessly between your local repository and GitHub.

Forking

You will need your own fork to work on the code. Go to the Delve project page and hit the Fork button. You will want to clone your fork to your machine:

git clone git@github.com:your-user-name/delve.git delve-yourname
cd delve-yourname
git remote add upstream https://github.com/delve-team/delve.git

This creates the directory delve-yourname and connects your repository to the upstream (main project) Delve repository.

The testing suite will run automatically on GitHub Actions once your pull request is submitted. However, if you wish to run the test suite on a branch prior to submitting the pull request, GitHub Actions needs to be enabled for your fork; this can be done from the Actions tab of your forked repository.

Creating a branch

You want your master branch to reflect only production-ready code, so create a feature branch for making your changes. For example:

git branch shiny-new-feature
git checkout shiny-new-feature

The above can be simplified to:

git checkout -b shiny-new-feature

This changes your working directory to the shiny-new-feature branch. Keep any changes in this branch specific to one bug or feature so it is clear what the branch brings to delve. You can have many shiny-new-features and switch in between them using the git checkout command.

To update this branch, you need to retrieve the changes from the master branch:

git fetch upstream
git rebase upstream/master

This will replay your commits on top of the latest Delve git master. If this leads to merge conflicts, you must resolve these before submitting your pull request. If you have uncommitted changes, you will need to stash them prior to updating. This will effectively store your changes and they can be reapplied after updating.

2) Creating a development environment

A development environment is a virtual space where you can keep an independent installation of Delve. This makes it easy to keep a stable installation you use for work in one place, and a development version (which you may break while playing with code) in another.

An easy way to create a Delve development environment is as follows:

Tell conda to create a new environment, named delve_dev, or any other name you would like for this environment, by running:

conda create -n delve_dev

For a python 3 environment:

conda create -n delve_dev python=3.8

This will create the new environment, and not touch any of your existing environments, nor any existing python installation.

To work in this environment, Windows users should activate it as follows:

activate delve_dev

Mac OSX and Linux users should use:

source activate delve_dev

You will then see a confirmation message to indicate you are in the new development environment.

To view your environments:

conda info -e

To return to your root environment:

deactivate

See the full conda documentation for more details.

At this point you can easily do a development install, as detailed in the next sections.

3) Installing Dependencies

To run Delve in a development environment, you must first install Delve's dependencies. We suggest doing so using the following commands (executed after your development environment has been activated):

pip install -r requirements/requirements.txt

This should install all necessary dependencies.

Next activate pre-commit hooks by running:

pre-commit install

4) Making a development build

Once dependencies are in place, make an in-place build by navigating to the git clone of the delve repository and running:

python setup.py develop

5) Making changes and writing tests

Delve is serious about testing and strongly encourages contributors to embrace test-driven development (TDD). This development process “relies on the repetition of a very short development cycle: first the developer writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test.” So, before actually writing any code, you should write your tests. Often the test can be taken from the original GitHub issue. However, it is always worth considering additional use cases and writing corresponding tests.

Adding tests is one of the most common requests after code is pushed to delve. Therefore, it is worth getting in the habit of writing tests ahead of time so this is never an issue.

delve uses the pytest testing system and the convenient extensions in numpy.testing.

Writing tests

All tests should go into the tests directory. This folder contains many current examples of tests, and we suggest looking to these for inspiration.
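For instance, a minimal test in this style might look as follows (the file name and test case are hypothetical):

# tests/test_pca_layers.py
import numpy as np
from delve.pca_layers import rvs

def test_rvs_determinant_is_unit():
    # an orthonormal matrix has determinant +1 or -1
    R = rvs(dim=4)
    assert np.isclose(abs(np.linalg.det(R)), 1.0)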

Running the test suite

The tests can then be run directly inside your Git clone (without having to install Delve) by typing:

pytest

6) Updating the Documentation

Delve documentation resides in the doc folder. Changes to the docs are made by modifying the appropriate file in the source folder within doc. Delve docs use reStructuredText syntax, which is explained in the Sphinx documentation, and the docstrings follow the Numpy Docstring standard.

Once you have made your changes, you can build the docs by navigating to the doc folder and typing:

make html

The resulting html pages will be located in doc/build/html.

7) Submitting a Pull Request

Once you’ve made changes and pushed them to your forked repository, you then submit a pull request to have them integrated into the Delve code base.

You can find a pull request (or PR) tutorial in the GitHub’s Help Docs.
