graph4nlp.loss

Losses

class graph4nlp.loss.CoverageLoss(cover_loss)

The loss function for the coverage mechanism.

Parameters: cover_loss (float) – The weight for coverage loss.

Methods

`add_module`(name, module)	Adds a child module to the current module.
`apply`(fn)	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`bfloat16`()	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`([recurse])	Returns an iterator over module buffers.
`children`()	Returns an iterator over immediate children modules.
`cpu`()	Moves all model parameters and buffers to the CPU.
`cuda`([device])	Moves all model parameters and buffers to the GPU.
`double`()	Casts all floating point parameters and buffers to `double` datatype.
`eval`()	Sets the module in evaluation mode.
`extra_repr`()	Set the extra representation of the module
`float`()	Casts all floating point parameters and buffers to `float` datatype.
`forward`(enc_attn_weights, coverage_vectors)	The calculation function.
`get_buffer`(target)	Returns the buffer given by `target` if it exists, otherwise throws an error.
`get_extra_state`()	Returns any extra state to include in the module’s state_dict.
`get_parameter`(target)	Returns the parameter given by `target` if it exists, otherwise throws an error.
`get_submodule`(target)	Returns the submodule given by `target` if it exists, otherwise throws an error.
`half`()	Casts all floating point parameters and buffers to `half` datatype.
`load_state_dict`(state_dict[, strict])	Copies parameters and buffers from `state_dict` into this module and its descendants.
`modules`()	Returns an iterator over all modules in the network.
`named_buffers`([prefix, recurse])	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix, remove_duplicate])	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse])	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`([recurse])	Returns an iterator over module parameters.
`register_backward_hook`(hook)	Registers a backward hook on the module.
`register_buffer`(name, tensor[, persistent])	Adds a buffer to the module.
`register_forward_hook`(hook)	Registers a forward hook on the module.
`register_forward_pre_hook`(hook)	Registers a forward pre-hook on the module.
`register_full_backward_hook`(hook)	Registers a backward hook on the module.
`register_parameter`(name, param)	Adds a parameter to the module.
`requires_grad_`([requires_grad])	Change if autograd should record operations on parameters in this module.
`set_extra_state`(state)	This function is called from `load_state_dict()` to handle any extra state found within the state_dict.
`share_memory`()	See `torch.Tensor.share_memory_()`
`state_dict`([destination, prefix, keep_vars])	Returns a dictionary containing a whole state of the module.
`to`(*args, **kwargs)	Moves and/or casts the parameters and buffers.
`to_empty`(*, device)	Moves the parameters and buffers to the specified device without copying storage.
`train`([mode])	Sets the module in training mode.
`type`(dst_type)	Casts all parameters and buffers to `dst_type`.
`xpu`([device])	Moves all model parameters and buffers to the XPU.
`zero_grad`([set_to_none])	Sets gradients of all model parameters to zero.

__call__

forward(enc_attn_weights, coverage_vectors)

Computes the coverage loss.

Parameters

enc_attn_weights (list[torch.Tensor]) – The attention weights of all decoding steps. The list length should equal the number of decoding steps, and each element should be a tensor.
coverage_vectors (list[torch.Tensor]) – The coverage vectors produced by the decoding module, one per decoding step.

Returns

coverage_loss – The loss.

Return type

torch.Tensor
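This page does not state the formula. The following is a minimal sketch, assuming the common pointer-generator-style coverage penalty: at each decoding step, sum over source positions the minimum of the attention weight and the accumulated coverage, then average over steps and scale by `cover_loss`. The function name `coverage_loss_sketch` is hypothetical, plain Python lists stand in for tensors, and the batch dimension is omitted; the library's actual normalization may differ.

```python
def coverage_loss_sketch(enc_attn_weights, coverage_vectors, cover_loss=1.0):
    """Hypothetical re-implementation, for illustration only.

    enc_attn_weights: one attention distribution (list of floats) per decoding step.
    coverage_vectors: one coverage vector per decoding step; entry t holds the
        attention accumulated before step t.
    """
    assert len(enc_attn_weights) == len(coverage_vectors)
    total = 0.0
    for attn, cov in zip(enc_attn_weights, coverage_vectors):
        # Penalise attention mass that lands on positions already covered.
        total += sum(min(a, c) for a, c in zip(attn, cov))
    # Assumed normalization: average over decoding steps, then apply the weight.
    return cover_loss * total / len(enc_attn_weights)


attn_steps = [[0.5, 0.5, 0.0], [0.5, 0.25, 0.25]]
cov_steps = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.0]]  # coverage = sum of earlier attention
loss = coverage_loss_sketch(attn_steps, cov_steps, cover_loss=1.0)
```

Under this assumption the first step contributes nothing (nothing has been covered yet), while the second step is penalised for re-attending to the first two source positions.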

class graph4nlp.loss.SeqGenerationLoss(ignore_index, use_coverage=False, coverage_weight=0.3)

The general loss for the Graph2Seq model.

Parameters

ignore_index (int) – The token index to ignore during calculation. Usually it is the padding index.
use_coverage (bool, default=False) – Whether to use the coverage mechanism. If set to True, the coverage loss is added.
coverage_weight (float, default=0.3) – The weight of the coverage loss.

Methods

`add_module`(name, module)	Adds a child module to the current module.
`apply`(fn)	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`bfloat16`()	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`([recurse])	Returns an iterator over module buffers.
`children`()	Returns an iterator over immediate children modules.
`cpu`()	Moves all model parameters and buffers to the CPU.
`cuda`([device])	Moves all model parameters and buffers to the GPU.
`double`()	Casts all floating point parameters and buffers to `double` datatype.
`eval`()	Sets the module in evaluation mode.
`extra_repr`()	Set the extra representation of the module
`float`()	Casts all floating point parameters and buffers to `float` datatype.
`forward`(logits, label[, enc_attn_weights, …])	The calculation method.
`get_buffer`(target)	Returns the buffer given by `target` if it exists, otherwise throws an error.
`get_extra_state`()	Returns any extra state to include in the module’s state_dict.
`get_parameter`(target)	Returns the parameter given by `target` if it exists, otherwise throws an error.
`get_submodule`(target)	Returns the submodule given by `target` if it exists, otherwise throws an error.
`half`()	Casts all floating point parameters and buffers to `half` datatype.
`load_state_dict`(state_dict[, strict])	Copies parameters and buffers from `state_dict` into this module and its descendants.
`modules`()	Returns an iterator over all modules in the network.
`named_buffers`([prefix, recurse])	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix, remove_duplicate])	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse])	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`([recurse])	Returns an iterator over module parameters.
`register_backward_hook`(hook)	Registers a backward hook on the module.
`register_buffer`(name, tensor[, persistent])	Adds a buffer to the module.
`register_forward_hook`(hook)	Registers a forward hook on the module.
`register_forward_pre_hook`(hook)	Registers a forward pre-hook on the module.
`register_full_backward_hook`(hook)	Registers a backward hook on the module.
`register_parameter`(name, param)	Adds a parameter to the module.
`requires_grad_`([requires_grad])	Change if autograd should record operations on parameters in this module.
`set_extra_state`(state)	This function is called from `load_state_dict()` to handle any extra state found within the state_dict.
`share_memory`()	See `torch.Tensor.share_memory_()`
`state_dict`([destination, prefix, keep_vars])	Returns a dictionary containing a whole state of the module.
`to`(*args, **kwargs)	Moves and/or casts the parameters and buffers.
`to_empty`(*, device)	Moves the parameters and buffers to the specified device without copying storage.
`train`([mode])	Sets the module in training mode.
`type`(dst_type)	Casts all parameters and buffers to `dst_type`.
`xpu`([device])	Moves all model parameters and buffers to the XPU.
`zero_grad`([set_to_none])	Sets gradients of all model parameters to zero.

__call__

forward(logits, label, enc_attn_weights=None, coverage_vectors=None)

Computes the sequence generation loss.

Parameters

logits (torch.Tensor) – The predicted probabilities, with shape [batch_size, max_decoder_step, vocab_size]. Despite the parameter name, these are probabilities produced by softmax, not raw scores.
label (torch.Tensor) – The ground truth, with shape [batch_size, max_decoder_step].
enc_attn_weights (list[torch.Tensor], default=None) – The attention weights of all decoding steps. The list length should equal the number of decoding steps, and each element should be a tensor.
coverage_vectors (list[torch.Tensor], default=None) – The coverage vectors produced by the decoding module, one per decoding step.

Returns

graph2seq_loss: torch.Tensor
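As a rough illustration of how `ignore_index` interacts with the inputs described above, here is a sketch that assumes the token-level term is the negative log-likelihood of the supplied probabilities, averaged over positions whose label is not `ignore_index`. The coverage term is left out, nested lists stand in for tensors, and `seq_nll_sketch` and `PAD` are hypothetical names.

```python
import math


def seq_nll_sketch(probs, labels, ignore_index):
    """probs: [batch][step][vocab] probabilities; labels: [batch][step] token ids."""
    total, count = 0.0, 0
    for prob_seq, label_seq in zip(probs, labels):
        for step_probs, label in zip(prob_seq, label_seq):
            if label == ignore_index:
                continue  # padding positions contribute nothing
            total += -math.log(step_probs[label])
            count += 1
    return total / count if count else 0.0


PAD = 0  # assumed padding index for this example
probs = [[[0.1, 0.8, 0.1], [0.2, 0.3, 0.5], [1.0, 0.0, 0.0]]]  # batch_size=1, 3 steps, vocab_size=3
labels = [[1, 2, PAD]]
seq_loss = seq_nll_sketch(probs, labels, ignore_index=PAD)
```

Only the first two decoding steps enter the average here; the padded third step is skipped entirely.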

class graph4nlp.loss.GeneralLoss(loss_type, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', pos_weight=None)

This general loss is built on the PyTorch loss functions. A detailed description of each loss function can be found at:

PyTorch loss functions <https://pytorch.org/docs/stable/nn.html#loss-functions>

Parameters

loss_type: str

The loss function to select (`NLL`, `BCEWithLogits`, `MultiLabelMargin`, `SoftMargin`, `CrossEntropy`).

NLL loss <https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#NLLLoss> measures the negative log likelihood loss. It is useful for training a classification problem with C classes.

BCEWithLogits loss <https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#BCEWithLogitsLoss> combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain `Sigmoid` followed by a `BCELoss` because, by combining the operations into one layer, it takes advantage of the log-sum-exp trick.

BCE loss <https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#BCELoss> creates a criterion that measures the Binary Cross Entropy between the target and the output.

MultiLabelMargin loss <https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#MultiLabelMarginLoss> creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input \(x\) (a 2D mini-batch Tensor) and output \(y\) (which is a 2D Tensor of target class indices).

SoftMargin loss <https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#SoftMarginLoss> creates a criterion that optimizes a two-class classification logistic loss between input tensor \(x\) and target tensor \(y\) (containing 1 or -1).

CrossEntropy loss <https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#CrossEntropyLoss> combines the PyTorch functions `nn.LogSoftmax` and `nn.NLLLoss` in one single class. It is useful when training a classification problem with C classes.
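To make the numerical-stability remark about BCEWithLogits concrete, the sketch below compares a two-step "sigmoid then binary cross entropy" computation with the fused form max(x, 0) - x*y + log(1 + exp(-|x|)). It uses only the Python standard library on single scalars; the function names are hypothetical and this is not the PyTorch implementation itself.

```python
import math


def bce_naive(logit, target):
    """Sigmoid followed by binary cross entropy, computed in two steps."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def bce_with_logits(logit, target):
    """Fused, stable form: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


# The two agree for moderate logits ...
moderate_gap = abs(bce_naive(2.0, 1.0) - bce_with_logits(2.0, 1.0))

# ... but the two-step version breaks for a large logit with target 0:
# sigmoid(40.0) rounds to exactly 1.0 in float64, so log(1 - p) becomes log(0).
try:
    bce_naive(40.0, 0.0)
    naive_failed = False
except ValueError:
    naive_failed = True

stable_value = bce_with_logits(40.0, 0.0)  # stays finite, close to 40.0
```

The fused form never evaluates the logarithm of a rounded probability, which is why it remains finite for confidently wrong predictions.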

weight: Tensor, optional

A manual rescaling weight given to the loss of each batch element. If given, it has to be a Tensor of size nbatch. This parameter is not suitable for the SoftMargin loss function.

size_average: bool, optional

By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True.

reduce: bool, optional

By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, a loss per batch element is returned instead and size_average is ignored. Default: True.

reduction: string, optional

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output; 'sum': the output will be summed.

Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'
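The three `reduction` modes can be sketched on a flat list of per-element losses. `reduce_losses` is a hypothetical helper written for this illustration, not part of graph4nlp or PyTorch.

```python
def reduce_losses(per_element, reduction="mean"):
    """Apply the documented reduction to a flat list of per-element losses."""
    if reduction == "none":
        return list(per_element)  # no reduction: one value per element
    if reduction == "sum":
        return sum(per_element)
    if reduction == "mean":
        return sum(per_element) / len(per_element)
    raise ValueError(f"unknown reduction: {reduction!r}")


per_element = [0.5, 1.5, 4.0]
unreduced = reduce_losses(per_element, "none")
summed = reduce_losses(per_element, "sum")
averaged = reduce_losses(per_element, "mean")
```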

pos_weight: Tensor, optional

A weight of positive examples. Must be a vector with length equal to the number of classes. This parameter is only suitable for the BCEWithLogits loss function.

ignore_index: int, optional

Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. This parameter is only suitable for the CrossEntropy loss function.

Methods

`add_module`(name, module)	Adds a child module to the current module.
`apply`(fn)	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`bfloat16`()	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`([recurse])	Returns an iterator over module buffers.
`children`()	Returns an iterator over immediate children modules.
`cpu`()	Moves all model parameters and buffers to the CPU.
`cuda`([device])	Moves all model parameters and buffers to the GPU.
`double`()	Casts all floating point parameters and buffers to `double` datatype.
`eval`()	Sets the module in evaluation mode.
`extra_repr`()	Set the extra representation of the module
`float`()	Casts all floating point parameters and buffers to `float` datatype.
`forward`(input, target)	Compute the loss.
`get_buffer`(target)	Returns the buffer given by `target` if it exists, otherwise throws an error.
`get_extra_state`()	Returns any extra state to include in the module’s state_dict.
`get_parameter`(target)	Returns the parameter given by `target` if it exists, otherwise throws an error.
`get_submodule`(target)	Returns the submodule given by `target` if it exists, otherwise throws an error.
`half`()	Casts all floating point parameters and buffers to `half` datatype.
`load_state_dict`(state_dict[, strict])	Copies parameters and buffers from `state_dict` into this module and its descendants.
`modules`()	Returns an iterator over all modules in the network.
`named_buffers`([prefix, recurse])	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix, remove_duplicate])	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse])	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`([recurse])	Returns an iterator over module parameters.
`register_backward_hook`(hook)	Registers a backward hook on the module.
`register_buffer`(name, tensor[, persistent])	Adds a buffer to the module.
`register_forward_hook`(hook)	Registers a forward hook on the module.
`register_forward_pre_hook`(hook)	Registers a forward pre-hook on the module.
`register_full_backward_hook`(hook)	Registers a backward hook on the module.
`register_parameter`(name, param)	Adds a parameter to the module.
`requires_grad_`([requires_grad])	Change if autograd should record operations on parameters in this module.
`set_extra_state`(state)	This function is called from `load_state_dict()` to handle any extra state found within the state_dict.
`share_memory`()	See `torch.Tensor.share_memory_()`
`state_dict`([destination, prefix, keep_vars])	Returns a dictionary containing a whole state of the module.
`to`(*args, **kwargs)	Moves and/or casts the parameters and buffers.
`to_empty`(*, device)	Moves the parameters and buffers to the specified device without copying storage.
`train`([mode])	Sets the module in training mode.
`type`(dst_type)	Casts all parameters and buffers to `dst_type`.
`xpu`([device])	Moves all model parameters and buffers to the XPU.
`zero_grad`([set_to_none])	Sets gradients of all model parameters to zero.

__call__

forward(input, target)

Compute the loss.

Parameters

NLL loss:

Input: Tensor of shape \((N, C)\), where C is the number of classes, or \((N, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
Target: Tensor of shape \((N)\), where each value satisfies \(0 \leq \text{targets}[i] \leq C-1\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
Output: scalar by default. If reduction is 'none', the same shape as the target: \((N)\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.

BCE/BCEWithLogits loss:

Input: Tensor of shape \((N, *)\), where \(*\) means any number of additional dimensions.
Target: Tensor of shape \((N, *)\), the same shape as the input.
Output: scalar by default. If reduction is 'none', then \((N, *)\), the same shape as the input.

MultiLabelMargin loss:

Input: Tensor of shape \((C)\) or \((N, C)\), where N is the batch size and C is the number of classes.
Target: Tensor of shape \((C)\) or \((N, C)\), with label targets padded by -1 so that the shape matches the input.
Output: scalar by default. If reduction is 'none', then \((N)\).

SoftMargin loss:

Input: Tensor of shape \((*)\), where \(*\) means any number of dimensions.
Target: Tensor of shape \((*)\), the same shape as the input.
Output: scalar by default. If reduction is 'none', the same shape as the input.

CrossEntropy:

Input: Tensor of shape \((N, C)\), where C is the number of classes, or \((N, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
Target: Tensor of shape \((N)\), where each value satisfies \(0 \leq \text{targets}[i] \leq C-1\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
Output: scalar by default. If reduction is 'none', the same shape as the target: \((N)\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
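The CrossEntropy shapes above can be traced with a small sketch that composes a log-softmax with a negative log-likelihood, as the class description says. It assumes an \((N, C)\) input of raw scores and an \((N)\) target, uses nested lists instead of tensors, and all function names are hypothetical. Ignored targets are assumed to contribute zero loss and to be excluded from the mean.

```python
import math


def log_softmax(row):
    m = max(row)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]


def cross_entropy_sketch(inputs, targets, ignore_index=-100, reduction="mean"):
    """inputs: N rows of C raw scores; targets: N class indices."""
    losses, kept = [], 0
    for row, t in zip(inputs, targets):
        if t == ignore_index:
            losses.append(0.0)  # ignored targets contribute zero loss
            continue
        losses.append(-log_softmax(row)[t])  # NLL applied to the log-softmax output
        kept += 1
    if reduction == "none":
        return losses  # shape (N), same as the target
    if reduction == "sum":
        return sum(losses)
    # 'mean': average over non-ignored targets only
    return sum(losses) / kept if kept else float("nan")


inputs = [[0.0, 0.0], [0.0, math.log(3.0)], [5.0, 1.0]]  # N=3, C=2
targets = [0, 1, -100]                                    # last target is ignored
per_item = cross_entropy_sketch(inputs, targets, reduction="none")
mean_loss = cross_entropy_sketch(inputs, targets, reduction="mean")
```

With reduction 'none' the result has one entry per target, and with 'mean' it collapses to a single scalar averaged over the two non-ignored rows.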

class graph4nlp.loss.KGLoss(loss_type, size_average=None, reduce=None, reduction='mean', adv_temperature=None, weight=None)

In state-of-the-art knowledge graph embedding (KGE) models, loss functions are designed according to pointwise, pairwise, and multi-class approaches. For details, refer to the paper "Loss Functions in Knowledge Graph Embedding Models".

Pointwise Loss Function

MSELoss Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\).

SoftMarginLoss Creates a criterion that optimizes a two-class classification logistic loss between input tensor \(x\) and target tensor \(y\) (containing 1 or -1). Tip: the number of positive and negative samples should be about the same; otherwise it is easy to overfit.

\[\text{loss}(x, y) = \sum_i \frac{\log(1 + \exp(-y[i]*x[i]))}{\text{x.nelement}()}\]
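The soft margin formula above can be evaluated directly. This is a plain-Python sketch on flat lists, with `soft_margin_loss` as a hypothetical name; it corresponds to the default mean over all elements of \(x\).

```python
import math


def soft_margin_loss(x, y):
    """Mean over elements of log(1 + exp(-y[i] * x[i])), with y[i] in {1, -1}."""
    return sum(math.log1p(math.exp(-yi * xi)) for xi, yi in zip(x, y)) / len(x)


scores = [2.0, -1.0, 0.0]  # model outputs x
labels = [1, -1, 1]        # targets y
sm_loss = soft_margin_loss(scores, labels)
```

A score of 0 contributes log(2) regardless of its label, and the contribution shrinks toward 0 as the score moves further in the direction of its label.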

Pairwise Loss Function

SoftplusLoss: see the paper "OpenKE: An Open Toolkit for Knowledge Embedding".

SigmoidLoss: see the paper "OpenKE: An Open Toolkit for Knowledge Embedding".

Multi-Class Loss Function

Binary Cross Entropy Loss Creates a criterion that measures the Binary Cross Entropy between the target and the output. Note that the targets \(y\) should be numbers between 0 and 1.

Methods

`add_module`(name, module)	Adds a child module to the current module.
`apply`(fn)	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`bfloat16`()	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`([recurse])	Returns an iterator over module buffers.
`children`()	Returns an iterator over immediate children modules.
`cpu`()	Moves all model parameters and buffers to the CPU.
`cuda`([device])	Moves all model parameters and buffers to the GPU.
`double`()	Casts all floating point parameters and buffers to `double` datatype.
`eval`()	Sets the module in evaluation mode.
`extra_repr`()	Set the extra representation of the module
`float`()	Casts all floating point parameters and buffers to `float` datatype.
`forward`([input, target, p_score, n_score])	Computes the loss for the selected `loss_type`.
`get_buffer`(target)	Returns the buffer given by `target` if it exists, otherwise throws an error.
`get_extra_state`()	Returns any extra state to include in the module’s state_dict.
`get_parameter`(target)	Returns the parameter given by `target` if it exists, otherwise throws an error.
`get_submodule`(target)	Returns the submodule given by `target` if it exists, otherwise throws an error.
`half`()	Casts all floating point parameters and buffers to `half` datatype.
`load_state_dict`(state_dict[, strict])	Copies parameters and buffers from `state_dict` into this module and its descendants.
`modules`()	Returns an iterator over all modules in the network.
`named_buffers`([prefix, recurse])	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix, remove_duplicate])	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse])	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`([recurse])	Returns an iterator over module parameters.
`register_backward_hook`(hook)	Registers a backward hook on the module.
`register_buffer`(name, tensor[, persistent])	Adds a buffer to the module.
`register_forward_hook`(hook)	Registers a forward hook on the module.
`register_forward_pre_hook`(hook)	Registers a forward pre-hook on the module.
`register_full_backward_hook`(hook)	Registers a backward hook on the module.
`register_parameter`(name, param)	Adds a parameter to the module.
`requires_grad_`([requires_grad])	Change if autograd should record operations on parameters in this module.
`set_extra_state`(state)	This function is called from `load_state_dict()` to handle any extra state found within the state_dict.
`share_memory`()	See `torch.Tensor.share_memory_()`
`state_dict`([destination, prefix, keep_vars])	Returns a dictionary containing a whole state of the module.
`to`(*args, **kwargs)	Moves and/or casts the parameters and buffers.
`to_empty`(*, device)	Moves the parameters and buffers to the specified device without copying storage.
`train`([mode])	Sets the module in training mode.
`type`(dst_type)	Casts all parameters and buffers to `dst_type`.
`xpu`([device])	Moves all model parameters and buffers to the XPU.
`zero_grad`([set_to_none])	Sets gradients of all model parameters to zero.

__call__

forward(input=None, target=None, p_score=None, n_score=None)

Parameters

MSELoss

input: Tensor of shape \((N,*)\), where \(*\) means any number of additional dimensions.
target: Tensor of shape \((N,*)\), the same shape as the input.
output: scalar by default. If reduction is 'none', the same shape as the input.

SoftMarginLoss

input: Tensor of shape \((*)\), where \(*\) means any number of dimensions.
target: Tensor with the same shape as the input.
output: scalar by default. If reduction is 'none', the same shape as the input.

SoftplusLoss

p_score: Tensor of shape \((*)\), where \(*\) means any number of dimensions.
n_score: Tensor of shape \((*)\), where \(*\) means any number of dimensions. Its shape may differ from that of p_score.

output: scalar.

SigmoidLoss

p_score: Tensor of shape \((*)\), where \(*\) means any number of dimensions.
n_score: Tensor of shape \((*)\), where \(*\) means any number of dimensions. Its shape may differ from that of p_score.

output: scalar.
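This page gives no formulas for the two pairwise losses. The sketch below assumes the forms used in the OpenKE toolkit without adversarial weighting: the softplus loss averages softplus(-p) over positive scores and softplus(n) over negative scores, and the sigmoid loss is the negative mean log-sigmoid of p and of -n. All function names are hypothetical and flat lists stand in for tensors; whether graph4nlp matches these forms exactly is an assumption.

```python
import math


def softplus(v):
    # log(1 + exp(v)), written to avoid overflow for large v
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def log_sigmoid(v):
    # log(sigmoid(v)) = -softplus(-v)
    return -softplus(-v)


def mean(values):
    return sum(values) / len(values)


def softplus_loss_sketch(p_score, n_score):
    """Assumed OpenKE-style form, no adversarial weighting."""
    return (mean([softplus(-p) for p in p_score]) + mean([softplus(n) for n in n_score])) / 2


def sigmoid_loss_sketch(p_score, n_score):
    """Assumed OpenKE-style form, no adversarial weighting."""
    return -(mean([log_sigmoid(p) for p in p_score]) + mean([log_sigmoid(-n) for n in n_score])) / 2


p_score = [3.0, 1.0]               # scores of positive triples
n_score = [-2.0, 0.5, -1.0, 0.0]   # scores of negative triples; a different length is fine
pairwise_softplus = softplus_loss_sketch(p_score, n_score)
pairwise_sigmoid = sigmoid_loss_sketch(p_score, n_score)
```

Because each side is averaged separately, p_score and n_score do not need the same number of elements. Under these assumed forms the two losses coincide numerically, since softplus(-v) equals -log(sigmoid(v)).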

BCELoss:

Input: Tensor of shape \((N, *)\), where \(*\) means any number of additional dimensions.
Target: Tensor of shape \((N, *)\), the same shape as the input.
Output: scalar by default. If reduction is 'none', then \((N, *)\), the same shape as the input.