igc.base.DataManager
- class igc.base.DataManager(attr, y_required=True)
Bases: object
Helps set up the appropriate dataloaders for each context.
Initialized dataloaders iterate over inputs x, true outputs y, baselines x_0, output component indices y_idx, or hidden layer component indices z_idx.
- Parameters:
attr (AbstractAttributionMethod) – Attribution method.
y_required (bool) – Whether true outputs y are required by x_dtld.
Attributes
- n_x (int) – Number of x samples.
- x_bsz (int) – Batch size of x_dtld.
- x_nb (int) – Number of batches of x_dtld.
- x_dtld (torch.utils.data.DataLoader | tuple(ArrayLike)) – Dataloader iterating over inputs x (and true outputs y if y_required).
- n_x_0 (int) – Number of x_0 baselines.
- x_0_bsz (int) – Batch size of x_0_dtld.
- x_0_nb (int) – Number of batches of x_0_dtld.
- x_0_dtld (torch.utils.data.DataLoader | tuple(ArrayLike)) – Dataloader iterating over baselines x_0.
- n_z_idx (int) – Number of z_idx component indices.
- z_idx_bsz (int) – Batch size of z_idx_dtld.
- z_idx_nb (int) – Number of batches of z_idx_dtld.
- z_idx_dtld (torch.utils.data.DataLoader | tuple(ArrayLike)) – Dataloader iterating over hidden layer component indices z_idx.
- n_y_idx (int) – Number of y_idx component indices.
- y_idx_bsz (int) – Batch size of y_idx_dtld.
- y_idx_nb (int) – Number of batches of y_idx_dtld.
- y_idx_dtld (torch.utils.data.DataLoader | tuple(ArrayLike)) – Dataloader iterating over output component indices y_idx.
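The batch-count attributes (x_nb, x_0_nb, z_idx_nb, y_idx_nb) follow from the sample counts and batch sizes listed above. A minimal sketch of that relationship, assuming a partial last batch is kept (the default behaviour of torch.utils.data.DataLoader with drop_last=False). The helper name n_batches is hypothetical and not part of igc.

```python
import math


def n_batches(n_samples: int, batch_size: int) -> int:
    """Number of batches needed to cover n_samples, keeping a partial last batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(n_samples / batch_size)


# Example: 10 inputs with a batch size of 4 give batches of 4, 4, and 2.
x_nb = n_batches(n_samples=10, batch_size=4)
print(x_nb)  # 3
```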
- add_data(x, x_0, y_idx, n_steps, batch_size, x_seed, x_0_seed)
Set up a data manager iterating over inputs x, baselines x_0, and output component indices y_idx (or z_idx).
- Parameters:
x (None | int | ArrayLike | tuple(ArrayLike)) –
None : x_dtld iterates over the whole dataset.
int : Number of x inputs sampled from the dataset.
ArrayLike | tuple(ArrayLike) : Set new x used by x_dtld.
x_0 (None | int | float | ArrayLike | tuple(ArrayLike)) –
None : Zero baseline x_0.
int : Number of x_0 baselines sampled from the dataset.
float : Constant baseline x_0.
ArrayLike | tuple(ArrayLike) : Set x_0 baselines used by x_0_dtld.
y_idx (None | int | ArrayLike) –
None : y_idx_dtld iterates over all output component indices y_idx.
int : Select a specific output component index y_idx.
ArrayLike : Select multiple output component indices y_idx.
n_steps (int) – Number of steps in the Riemann approximation of the underlying Integrated Gradients (IG) (see [STY17] for details).
batch_size (None | int | tuple(int)) –
None : Set x_bsz = 1, x_0_bsz = n_x_0, and y_idx_bsz = n_y_idx (or z_idx_bsz = n_z_idx).
int : Total batch size budget automatically distributed between x_bsz, x_0_bsz, and y_idx_bsz (or z_idx_bsz).
tuple(int) : Set x_bsz, x_0_bsz, and y_idx_bsz (or z_idx_bsz) individually.
x_seed (None | int) – Seed associated with x_dtld.
x_0_seed (None | int) – Seed associated with x_0_dtld.
- Returns:
Resolved y_idx if it was None.
- Return type:
torch.Tensor
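The n_steps parameter sets how finely the path integral behind Integrated Gradients is discretized. The sketch below is not igc's implementation. It is a self-contained illustration of a midpoint Riemann sum for IG on a toy function with a hand-written gradient, and the names f, grad_f, and integrated_gradients are assumptions made for this example.

```python
from typing import Callable, List


def f(x: List[float]) -> float:
    """Toy model: sum of squares."""
    return sum(v * v for v in x)


def grad_f(x: List[float]) -> List[float]:
    """Analytic gradient of the toy model."""
    return [2.0 * v for v in x]


def integrated_gradients(
    grad: Callable[[List[float]], List[float]],
    x: List[float],
    x_0: List[float],
    n_steps: int,
) -> List[float]:
    """Midpoint Riemann approximation of IG along the straight path x_0 -> x."""
    total = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (v - b) for v, b in zip(x, x_0)]
        g = grad(point)
        total = [t + gi for t, gi in zip(total, g)]
    # Average gradient along the path, scaled by the input-baseline difference.
    return [(v - b) * t / n_steps for v, b, t in zip(x, x_0, total)]


x = [1.0, -2.0, 3.0]
x_0 = [0.0, 0.0, 0.0]  # zero baseline, as when x_0 is None
attributions = integrated_gradients(grad_f, x, x_0, n_steps=32)

# Completeness: attributions sum to f(x) - f(x_0), up to approximation error.
print(sum(attributions), f(x) - f(x_0))
```

For models whose gradient is not linear along the path, a larger n_steps generally reduces the approximation error at the cost of more gradient evaluations.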
- add_data_bsc(x, x_0, y_idx, n_iter, x_0_batch_size, x_seed, x_0_seed)
Set up a data manager dedicated to the Baseline Shapley and Baseline Shapley Correlation attribution methods (igc.bsc.BaselineShapley and igc.bsc.BslShapCorr).
- Parameters:
x (None | int | ArrayLike) –
None : x_dtld iterates over the whole dataset.
int : Number of x inputs sampled from the dataset.
ArrayLike : Set new x used by x_dtld.
x_0 (None | int | float | ArrayLike) –
None : Zero baseline x_0.
int : Number of x_0 baselines sampled from the dataset.
float : Constant baseline x_0.
ArrayLike : Set x_0 baselines used by x_0_dtld.
y_idx (None | int | ArrayLike) –
None : y_idx_dtld iterates over all output component indices y_idx.
int : Select a specific output component index y_idx.
ArrayLike : Select multiple output component indices y_idx.
n_iter (int) – Number of iterations, i.e. the number of random sequences of input component indices enabled one after the other.
x_0_batch_size (None | int) –
None : Set x_0_bsz = n_x_0.
int : Set x_0_bsz.
x_seed (None | int) – Seed associated with x_dtld.
x_0_seed (None | int) – Seed associated with x_0_dtld.
- Returns:
Resolved y_idx if it was None.
- Return type:
torch.Tensor
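The n_iter parameter counts random orderings of input components. As a rough illustration only (this is not the code of igc.bsc.BaselineShapley), the sketch below estimates Baseline Shapley values by starting from the baseline and switching components to their input values one after the other, in n_iter random orders. The function names and the toy model are assumptions.

```python
import random
from typing import Callable, List


def baseline_shapley(
    f: Callable[[List[float]], float],
    x: List[float],
    x_0: List[float],
    n_iter: int,
    seed: int = 0,
) -> List[float]:
    """Monte Carlo estimate of Baseline Shapley values over n_iter random orderings."""
    rng = random.Random(seed)
    n = len(x)
    totals = [0.0] * n
    for _ in range(n_iter):
        order = list(range(n))
        rng.shuffle(order)  # random sequence of input component indices
        current = list(x_0)  # start from the baseline
        prev = f(current)
        for i in order:
            current[i] = x[i]  # enable component i
            new = f(current)
            totals[i] += new - prev  # marginal contribution of component i
            prev = new
    return [t / n_iter for t in totals]


def toy_model(v: List[float]) -> float:
    """Toy model with an interaction term between components 0 and 1."""
    return v[0] * v[1] + 2.0 * v[2]


x = [1.0, 2.0, 3.0]
x_0 = [0.0, 0.0, 0.0]
values = baseline_shapley(toy_model, x, x_0, n_iter=200)

# Efficiency: the values sum to f(x) - f(x_0) for any number of iterations.
print(sum(values), toy_model(x) - toy_model(x_0))
```

Each ordering costs one model evaluation per input component, so the total cost grows with n_iter times the number of components.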
- add_data_iter_x(x, y_idx, batch_size, x_seed)
Set up a data manager iterating over inputs x.
- Parameters:
x (None | int | ArrayLike | tuple(ArrayLike)) –
None : x_dtld iterates over the whole dataset.
int : Number of x inputs sampled from the dataset.
ArrayLike | tuple(ArrayLike) : Set new x used by x_dtld.
y_idx (None | int | ArrayLike) – Selected output component indices. If None, y_idx is resolved to all output component indices.
batch_size (None | int) –
None : Set x_bsz = 1.
int : Set x_bsz.
x_seed (None | int) – Seed associated with x_dtld.
- Returns:
Resolved y_idx if it was None.
- Return type:
torch.Tensor
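Each add_data* method accepts y_idx as None, an int, or an array-like. A minimal sketch of how such a value could be resolved into explicit indices, assuming the number of output components n_y is known. It returns a plain list rather than the documented torch.Tensor so that it runs without dependencies, and resolve_y_idx is a hypothetical name, not an igc function.

```python
from typing import List, Sequence, Union


def resolve_y_idx(y_idx: Union[None, int, Sequence[int]], n_y: int) -> List[int]:
    """Turn the documented y_idx options into an explicit list of indices."""
    if y_idx is None:
        return list(range(n_y))  # all output component indices
    if isinstance(y_idx, int):
        indices = [y_idx]  # a single output component index
    else:
        indices = [int(i) for i in y_idx]  # multiple output component indices
    for i in indices:
        if not 0 <= i < n_y:
            raise IndexError(f"y_idx {i} is out of range for {n_y} outputs")
    return indices


print(resolve_y_idx(None, n_y=4))    # [0, 1, 2, 3]
print(resolve_y_idx(2, n_y=4))       # [2]
print(resolve_y_idx([0, 3], n_y=4))  # [0, 3]
```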
- add_data_iter_x_y_idx(x, y_idx, batch_size, x_seed)
Set up a data manager iterating over inputs x and output component indices y_idx.
- Parameters:
x (None | int | ArrayLike | tuple(ArrayLike)) –
None : x_dtld iterates over the whole dataset.
int : Number of x inputs sampled from the dataset.
ArrayLike | tuple(ArrayLike) : Set new x used by x_dtld.
y_idx (None | int | ArrayLike) –
None : y_idx_dtld iterates over all output component indices y_idx.
int : Select a specific output component index y_idx.
ArrayLike : Select multiple output component indices y_idx.
batch_size (None | int | tuple(int)) –
None : Set x_bsz = 1 and y_idx_bsz = n_y_idx.
int : Total batch size budget automatically distributed between x_bsz and y_idx_bsz.
tuple(int) : Set x_bsz and y_idx_bsz individually.
x_seed (None | int) – Seed associated with x_dtld.
- Returns:
Resolved y_idx if it was None.
- Return type:
torch.Tensor
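The three accepted forms of batch_size can be illustrated with a small resolver. The None and tuple cases follow the parameter description. The int case is documented only as a budget that is "automatically distributed", so the policy used here (fill y_idx_bsz first, then give what remains to x_bsz) is purely an assumption for illustration and may differ from what igc does. resolve_batch_size is a hypothetical helper.

```python
from typing import Tuple, Union


def resolve_batch_size(
    batch_size: Union[None, int, Tuple[int, int]], n_x: int, n_y_idx: int
) -> Tuple[int, int]:
    """Return (x_bsz, y_idx_bsz) for the documented batch_size options."""
    if batch_size is None:
        return 1, n_y_idx  # documented default
    if isinstance(batch_size, tuple):
        x_bsz, y_idx_bsz = batch_size  # set individually
        return x_bsz, y_idx_bsz
    # int: total budget. ASSUMED policy, not taken from igc:
    # spend the budget on y_idx first, then on x with what is left.
    y_idx_bsz = max(1, min(n_y_idx, batch_size))
    x_bsz = max(1, min(n_x, batch_size // y_idx_bsz))
    return x_bsz, y_idx_bsz


print(resolve_batch_size(None, n_x=100, n_y_idx=10))    # (1, 10)
print(resolve_batch_size((8, 5), n_x=100, n_y_idx=10))  # (8, 5)
print(resolve_batch_size(64, n_x=100, n_y_idx=10))      # (6, 10)
```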
- add_data_naive(x, y_idx, batch_size, x_seed)
Set up a data manager dedicated to naive attribution methods (igc.naive.NaiveCorrelation and igc.naive.NaiveTTest).
- Parameters:
x (None | int | ArrayLike | tuple(ArrayLike)) –
None : x_dtld iterates over the whole dataset.
int : Number of x inputs sampled from the dataset.
ArrayLike | tuple(ArrayLike) : Set new x used by x_dtld.
y_idx (None | int | ArrayLike) –
None : y_idx_dtld iterates over all output component indices y_idx.
int : Select a specific output component index y_idx.
ArrayLike : Select multiple output component indices y_idx.
batch_size (None | int | tuple(int)) –
None : Set x_bsz = 1 and y_idx_bsz = n_y_idx.
int : Total batch size budget automatically distributed between x_bsz and y_idx_bsz.
tuple(int) : Set x_bsz and y_idx_bsz individually.
x_seed (None | int) – Seed associated with x_dtld.
- Returns:
Resolved y_idx if it was None.
- Return type:
torch.Tensor
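When x is an int, that many inputs are sampled from the dataset, and x_seed makes the draw reproducible. A sketch of the idea using only the standard library. How igc actually samples (with or without replacement, and with which generator) is not stated here, so sampling without replacement via random.Random is an assumption, and sample_indices is a hypothetical name.

```python
import random
from typing import List, Optional


def sample_indices(n_dataset: int, n_x: int, seed: Optional[int] = None) -> List[int]:
    """Draw n_x distinct dataset indices; the same seed gives the same draw."""
    rng = random.Random(seed)
    return rng.sample(range(n_dataset), n_x)


first = sample_indices(n_dataset=1000, n_x=5, seed=42)
second = sample_indices(n_dataset=1000, n_x=5, seed=42)
print(first == second)  # True
```

Passing seed=None leaves the generator seeded from system entropy, so repeated calls are then expected to differ.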