graph4nlp.loss

Losses
class graph4nlp.loss.CoverageLoss(cover_loss)

The loss function for the coverage mechanism.

Parameters
- cover_loss (float) – The weight for the coverage loss.
Methods

- `add_module(name, module)` – Adds a child module to the current module.
- `apply(fn)` – Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
- `bfloat16()` – Casts all floating point parameters and buffers to the `bfloat16` datatype.
- `buffers([recurse])` – Returns an iterator over module buffers.
- `children()` – Returns an iterator over immediate children modules.
- `cpu()` – Moves all model parameters and buffers to the CPU.
- `cuda([device])` – Moves all model parameters and buffers to the GPU.
- `double()` – Casts all floating point parameters and buffers to the `double` datatype.
- `eval()` – Sets the module in evaluation mode.
- `extra_repr()` – Sets the extra representation of the module.
- `float()` – Casts all floating point parameters and buffers to the `float` datatype.
- `forward(enc_attn_weights, coverage_vectors)` – The calculation function.
- `get_buffer(target)` – Returns the buffer given by `target` if it exists, otherwise throws an error.
- `get_extra_state()` – Returns any extra state to include in the module's state_dict.
- `get_parameter(target)` – Returns the parameter given by `target` if it exists, otherwise throws an error.
- `get_submodule(target)` – Returns the submodule given by `target` if it exists, otherwise throws an error.
- `half()` – Casts all floating point parameters and buffers to the `half` datatype.
- `load_state_dict(state_dict[, strict])` – Copies parameters and buffers from `state_dict` into this module and its descendants.
- `modules()` – Returns an iterator over all modules in the network.
- `named_buffers([prefix, recurse])` – Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- `named_children()` – Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- `named_modules([memo, prefix, remove_duplicate])` – Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- `named_parameters([prefix, recurse])` – Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- `parameters([recurse])` – Returns an iterator over module parameters.
- `register_backward_hook(hook)` – Registers a backward hook on the module.
- `register_buffer(name, tensor[, persistent])` – Adds a buffer to the module.
- `register_forward_hook(hook)` – Registers a forward hook on the module.
- `register_forward_pre_hook(hook)` – Registers a forward pre-hook on the module.
- `register_full_backward_hook(hook)` – Registers a full backward hook on the module.
- `register_parameter(name, param)` – Adds a parameter to the module.
- `requires_grad_([requires_grad])` – Changes whether autograd should record operations on parameters in this module.
- `set_extra_state(state)` – Called from `load_state_dict()` to handle any extra state found within the state_dict.
- `share_memory()` – See `torch.Tensor.share_memory_()`.
- `state_dict([destination, prefix, keep_vars])` – Returns a dictionary containing the whole state of the module.
- `to(*args, **kwargs)` – Moves and/or casts the parameters and buffers.
- `to_empty(*, device)` – Moves the parameters and buffers to the specified device without copying storage.
- `train([mode])` – Sets the module in training mode.
- `type(dst_type)` – Casts all parameters and buffers to `dst_type`.
- `xpu([device])` – Moves all model parameters and buffers to the XPU.
- `zero_grad([set_to_none])` – Sets gradients of all model parameters to zero.
forward(enc_attn_weights, coverage_vectors)

The calculation function.

Parameters
- enc_attn_weights (list[torch.Tensor]) – The attention weights for all decoding steps. The list length equals the number of decoding steps; each element is a tensor.
- coverage_vectors (list[torch.Tensor]) – The coverage vectors for all decoding steps in the decoding module.

Returns
- coverage_loss – The loss.

Return type
- torch.Tensor
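A minimal usage sketch follows. The shapes are illustrative, and the way the coverage vectors are accumulated (running sum of earlier attention distributions) is an assumption based on the standard coverage formulation, not something this page specifies:

```python
import torch
from graph4nlp.loss import CoverageLoss

# Illustrative setup: 3 decoding steps, batch of 2, source length 5.
num_steps, batch_size, src_len = 3, 2, 5
enc_attn_weights = [torch.softmax(torch.randn(batch_size, src_len), dim=-1)
                    for _ in range(num_steps)]

# Assumed coverage accumulation: the coverage vector at step t is the sum
# of the attention distributions of steps 0..t-1 (zeros at the first step).
coverage_vectors = [torch.zeros(batch_size, src_len)]
for t in range(1, num_steps):
    coverage_vectors.append(coverage_vectors[-1] + enc_attn_weights[t - 1])

loss_fn = CoverageLoss(cover_loss=0.3)
coverage_loss = loss_fn(enc_attn_weights, coverage_vectors)  # torch.Tensor
```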
class graph4nlp.loss.SeqGenerationLoss(ignore_index, use_coverage=False, coverage_weight=0.3)

The general loss for the Graph2Seq model.

Parameters
- ignore_index (int) – The token index that will be ignored during calculation. Usually it is the padding index.
- use_coverage (bool, default=False) – Whether to use the coverage mechanism. If set to True, the coverage loss is added.
- coverage_weight (float, default=0.3) – The weight of the coverage loss.
Methods

- `add_module(name, module)` – Adds a child module to the current module.
- `apply(fn)` – Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
- `bfloat16()` – Casts all floating point parameters and buffers to the `bfloat16` datatype.
- `buffers([recurse])` – Returns an iterator over module buffers.
- `children()` – Returns an iterator over immediate children modules.
- `cpu()` – Moves all model parameters and buffers to the CPU.
- `cuda([device])` – Moves all model parameters and buffers to the GPU.
- `double()` – Casts all floating point parameters and buffers to the `double` datatype.
- `eval()` – Sets the module in evaluation mode.
- `extra_repr()` – Sets the extra representation of the module.
- `float()` – Casts all floating point parameters and buffers to the `float` datatype.
- `forward(logits, label[, enc_attn_weights, …])` – The calculation method.
- `get_buffer(target)` – Returns the buffer given by `target` if it exists, otherwise throws an error.
- `get_extra_state()` – Returns any extra state to include in the module's state_dict.
- `get_parameter(target)` – Returns the parameter given by `target` if it exists, otherwise throws an error.
- `get_submodule(target)` – Returns the submodule given by `target` if it exists, otherwise throws an error.
- `half()` – Casts all floating point parameters and buffers to the `half` datatype.
- `load_state_dict(state_dict[, strict])` – Copies parameters and buffers from `state_dict` into this module and its descendants.
- `modules()` – Returns an iterator over all modules in the network.
- `named_buffers([prefix, recurse])` – Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- `named_children()` – Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- `named_modules([memo, prefix, remove_duplicate])` – Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- `named_parameters([prefix, recurse])` – Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- `parameters([recurse])` – Returns an iterator over module parameters.
- `register_backward_hook(hook)` – Registers a backward hook on the module.
- `register_buffer(name, tensor[, persistent])` – Adds a buffer to the module.
- `register_forward_hook(hook)` – Registers a forward hook on the module.
- `register_forward_pre_hook(hook)` – Registers a forward pre-hook on the module.
- `register_full_backward_hook(hook)` – Registers a full backward hook on the module.
- `register_parameter(name, param)` – Adds a parameter to the module.
- `requires_grad_([requires_grad])` – Changes whether autograd should record operations on parameters in this module.
- `set_extra_state(state)` – Called from `load_state_dict()` to handle any extra state found within the state_dict.
- `share_memory()` – See `torch.Tensor.share_memory_()`.
- `state_dict([destination, prefix, keep_vars])` – Returns a dictionary containing the whole state of the module.
- `to(*args, **kwargs)` – Moves and/or casts the parameters and buffers.
- `to_empty(*, device)` – Moves the parameters and buffers to the specified device without copying storage.
- `train([mode])` – Sets the module in training mode.
- `type(dst_type)` – Casts all parameters and buffers to `dst_type`.
- `xpu([device])` – Moves all model parameters and buffers to the XPU.
- `zero_grad([set_to_none])` – Sets gradients of all model parameters to zero.
forward(logits, label, enc_attn_weights=None, coverage_vectors=None)

The calculation method.

Parameters
- logits (torch.Tensor) – The probabilities with shape [batch_size, max_decoder_step, vocab_size]. Note that they are calculated by softmax.
- label (torch.Tensor) – The ground truth with shape [batch_size, max_decoder_step].
- enc_attn_weights (list[torch.Tensor], default=None) – The attention weights for all decoding steps. The list length equals the number of decoding steps; each element is a tensor.
- coverage_vectors (list[torch.Tensor], default=None) – The coverage vectors for all decoding steps in the decoding module.

Returns
- graph2seq_loss (torch.Tensor)
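A minimal sketch without the coverage term (shapes are illustrative; `pad_idx` is a hypothetical padding index chosen for the example):

```python
import torch
from graph4nlp.loss import SeqGenerationLoss

batch_size, max_decoder_step, vocab_size, pad_idx = 2, 4, 10, 0

# logits must be post-softmax probabilities, as the docstring requires.
logits = torch.softmax(torch.randn(batch_size, max_decoder_step, vocab_size), dim=-1)
label = torch.randint(1, vocab_size, (batch_size, max_decoder_step))

loss_fn = SeqGenerationLoss(ignore_index=pad_idx)  # use_coverage=False by default
graph2seq_loss = loss_fn(logits, label)
```

With `use_coverage=True`, `enc_attn_weights` and `coverage_vectors` would also be passed, as in the CoverageLoss sketch above.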
class graph4nlp.loss.GeneralLoss(loss_type, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', pos_weight=None)

This general loss is backed by the PyTorch loss functions. A detailed description of each loss function can be found at: https://pytorch.org/docs/stable/nn.html#loss-functions

Parameters
- loss_type (str) – The loss function to select (`NLL`, `BCEWithLogits`, `MultiLabelMargin`, `SoftMargin`, `CrossEntropy`).
  - NLL loss (https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#NLLLoss) measures the negative log likelihood loss. It is useful to train a classification problem with C classes.
  - BCEWithLogits loss (https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#BCEWithLogitsLoss) combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss because, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.
  - BCE loss (https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#BCELoss) creates a criterion that measures the binary cross entropy between the target and the output.
  - MultiLabelMargin loss (https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#MultiLabelMarginLoss) creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input \(x\) (a 2D mini-batch Tensor) and output \(y\) (a 2D Tensor of target class indices).
  - SoftMargin loss (https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#SoftMarginLoss) creates a criterion that optimizes a two-class classification logistic loss between input tensor \(x\) and target tensor \(y\) (containing 1 or -1).
  - CrossEntropy loss (https://pytorch.org/docs/stable/_modules/torch/nn/modules/loss.html#CrossEntropyLoss) combines `nn.LogSoftmax` and `nn.NLLLoss` in one single class. It is useful when training a classification problem with C classes.
- weight (Tensor, optional) – A manual rescaling weight given to the loss of each batch element. If given, it has to be a Tensor of size nbatch. This parameter is not suitable for the SoftMargin loss function.
- size_average (bool, optional) – By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True.
- reduce (bool, optional) – By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, returns a loss per batch element instead and ignores size_average. Default: True.
- reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output; 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated; in the meantime, specifying either of those two args will override reduction. Default: 'mean'.
- pos_weight (Tensor, optional) – A weight of positive examples. Must be a vector with length equal to the number of classes. This parameter is only suitable for the BCEWithLogits loss function.
- ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. This parameter is only suitable for the CrossEntropy loss function.
Methods

- `add_module(name, module)` – Adds a child module to the current module.
- `apply(fn)` – Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
- `bfloat16()` – Casts all floating point parameters and buffers to the `bfloat16` datatype.
- `buffers([recurse])` – Returns an iterator over module buffers.
- `children()` – Returns an iterator over immediate children modules.
- `cpu()` – Moves all model parameters and buffers to the CPU.
- `cuda([device])` – Moves all model parameters and buffers to the GPU.
- `double()` – Casts all floating point parameters and buffers to the `double` datatype.
- `eval()` – Sets the module in evaluation mode.
- `extra_repr()` – Sets the extra representation of the module.
- `float()` – Casts all floating point parameters and buffers to the `float` datatype.
- `forward(input, target)` – Compute the loss.
- `get_buffer(target)` – Returns the buffer given by `target` if it exists, otherwise throws an error.
- `get_extra_state()` – Returns any extra state to include in the module's state_dict.
- `get_parameter(target)` – Returns the parameter given by `target` if it exists, otherwise throws an error.
- `get_submodule(target)` – Returns the submodule given by `target` if it exists, otherwise throws an error.
- `half()` – Casts all floating point parameters and buffers to the `half` datatype.
- `load_state_dict(state_dict[, strict])` – Copies parameters and buffers from `state_dict` into this module and its descendants.
- `modules()` – Returns an iterator over all modules in the network.
- `named_buffers([prefix, recurse])` – Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- `named_children()` – Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- `named_modules([memo, prefix, remove_duplicate])` – Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- `named_parameters([prefix, recurse])` – Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- `parameters([recurse])` – Returns an iterator over module parameters.
- `register_backward_hook(hook)` – Registers a backward hook on the module.
- `register_buffer(name, tensor[, persistent])` – Adds a buffer to the module.
- `register_forward_hook(hook)` – Registers a forward hook on the module.
- `register_forward_pre_hook(hook)` – Registers a forward pre-hook on the module.
- `register_full_backward_hook(hook)` – Registers a full backward hook on the module.
- `register_parameter(name, param)` – Adds a parameter to the module.
- `requires_grad_([requires_grad])` – Changes whether autograd should record operations on parameters in this module.
- `set_extra_state(state)` – Called from `load_state_dict()` to handle any extra state found within the state_dict.
- `share_memory()` – See `torch.Tensor.share_memory_()`.
- `state_dict([destination, prefix, keep_vars])` – Returns a dictionary containing the whole state of the module.
- `to(*args, **kwargs)` – Moves and/or casts the parameters and buffers.
- `to_empty(*, device)` – Moves the parameters and buffers to the specified device without copying storage.
- `train([mode])` – Sets the module in training mode.
- `type(dst_type)` – Casts all parameters and buffers to `dst_type`.
- `xpu([device])` – Moves all model parameters and buffers to the XPU.
- `zero_grad([set_to_none])` – Sets gradients of all model parameters to zero.
forward(input, target)

Compute the loss.

Parameters
- NLL loss:
  - Input: tensor of shape \((N, C)\) where C = number of classes, or \((N, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
  - Target: tensor of shape \((N)\) where each value is \(0 \leq \text{targets}[i] \leq C-1\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
  - Output: scalar. If reduction is 'none', then the same size as the target: \((N)\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
- BCE/BCEWithLogits loss:
  - Input: tensor of shape \((N, *)\) where \(*\) means any number of additional dimensions.
  - Target: tensor of shape \((N, *)\), same shape as the input.
  - Output: scalar. If reduction is 'none', then \((N, *)\), same shape as the input.
- MultiLabelMargin loss:
  - Input: tensor of shape \((C)\) or \((N, C)\) where N is the batch size and C is the number of classes.
  - Target: tensor of shape \((C)\) or \((N, C)\), label targets padded by -1, ensuring the same shape as the input.
  - Output: scalar. If reduction is 'none', then \((N)\).
- SoftMargin loss:
  - Input: tensor of shape \((*)\) where \(*\) means any number of additional dimensions.
  - Target: tensor of shape \((*)\), same shape as the input.
  - Output: scalar. If reduction is 'none', then the same shape as the input.
- CrossEntropy loss:
  - Input: tensor of shape \((N, C)\) where C = number of classes, or \((N, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
  - Target: tensor of shape \((N)\) where each value is \(0 \leq \text{targets}[i] \leq C-1\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
  - Output: scalar. If reduction is 'none', then the same size as the target: \((N)\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss.
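A minimal sketch using the CrossEntropy option. The shapes follow the CrossEntropy entry above; the exact `loss_type` string is assumed to match the name listed in the Parameters section:

```python
import torch
from graph4nlp.loss import GeneralLoss

batch_size, num_classes = 4, 3
scores = torch.randn(batch_size, num_classes)          # Input: (N, C) scores
target = torch.randint(0, num_classes, (batch_size,))  # Target: (N) class indices

loss_fn = GeneralLoss('CrossEntropy', reduction='mean')
loss = loss_fn(scores, target)  # scalar with reduction='mean'
```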
class graph4nlp.loss.KGLoss(loss_type, size_average=None, reduce=None, reduction='mean', adv_temperature=None, weight=None)

In state-of-the-art KGE models, loss functions are designed according to various pointwise, pairwise and multi-class approaches. Refer to Loss Functions in Knowledge Graph Embedding Models.

Pointwise loss functions
- MSELoss creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\).
- SoftMarginLoss creates a criterion that optimizes a two-class classification logistic loss between input tensor \(x\) and target tensor \(y\) (containing 1 or -1). Tip: the numbers of positive and negative samples should be about the same; otherwise it is easy to overfit.

\[\text{loss}(x, y) = \sum_i \frac{\log(1 + \exp(-y[i] \cdot x[i]))}{\text{x.nelement}()}\]

Pairwise loss functions
- SoftplusLoss refers to the paper OpenKE: An Open Toolkit for Knowledge Embedding.
- SigmoidLoss refers to the paper OpenKE: An Open Toolkit for Knowledge Embedding.

Multi-class loss function
- Binary cross entropy loss creates a criterion that measures the binary cross entropy between the target and the output. Note that the targets \(y\) should be numbers between 0 and 1.
Methods

- `add_module(name, module)` – Adds a child module to the current module.
- `apply(fn)` – Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
- `bfloat16()` – Casts all floating point parameters and buffers to the `bfloat16` datatype.
- `buffers([recurse])` – Returns an iterator over module buffers.
- `children()` – Returns an iterator over immediate children modules.
- `cpu()` – Moves all model parameters and buffers to the CPU.
- `cuda([device])` – Moves all model parameters and buffers to the GPU.
- `double()` – Casts all floating point parameters and buffers to the `double` datatype.
- `eval()` – Sets the module in evaluation mode.
- `extra_repr()` – Sets the extra representation of the module.
- `float()` – Casts all floating point parameters and buffers to the `float` datatype.
- `forward([input, target, p_score, n_score])`
- `get_buffer(target)` – Returns the buffer given by `target` if it exists, otherwise throws an error.
- `get_extra_state()` – Returns any extra state to include in the module's state_dict.
- `get_parameter(target)` – Returns the parameter given by `target` if it exists, otherwise throws an error.
- `get_submodule(target)` – Returns the submodule given by `target` if it exists, otherwise throws an error.
- `half()` – Casts all floating point parameters and buffers to the `half` datatype.
- `load_state_dict(state_dict[, strict])` – Copies parameters and buffers from `state_dict` into this module and its descendants.
- `modules()` – Returns an iterator over all modules in the network.
- `named_buffers([prefix, recurse])` – Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- `named_children()` – Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- `named_modules([memo, prefix, remove_duplicate])` – Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- `named_parameters([prefix, recurse])` – Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- `parameters([recurse])` – Returns an iterator over module parameters.
- `register_backward_hook(hook)` – Registers a backward hook on the module.
- `register_buffer(name, tensor[, persistent])` – Adds a buffer to the module.
- `register_forward_hook(hook)` – Registers a forward hook on the module.
- `register_forward_pre_hook(hook)` – Registers a forward pre-hook on the module.
- `register_full_backward_hook(hook)` – Registers a full backward hook on the module.
- `register_parameter(name, param)` – Adds a parameter to the module.
- `requires_grad_([requires_grad])` – Changes whether autograd should record operations on parameters in this module.
- `set_extra_state(state)` – Called from `load_state_dict()` to handle any extra state found within the state_dict.
- `share_memory()` – See `torch.Tensor.share_memory_()`.
- `state_dict([destination, prefix, keep_vars])` – Returns a dictionary containing the whole state of the module.
- `to(*args, **kwargs)` – Moves and/or casts the parameters and buffers.
- `to_empty(*, device)` – Moves the parameters and buffers to the specified device without copying storage.
- `train([mode])` – Sets the module in training mode.
- `type(dst_type)` – Casts all parameters and buffers to `dst_type`.
- `xpu([device])` – Moves all model parameters and buffers to the XPU.
- `zero_grad([set_to_none])` – Sets gradients of all model parameters to zero.
forward(input=None, target=None, p_score=None, n_score=None)

Parameters
- MSELoss:
  - input: tensor of shape \((N, *)\) where \(*\) means any number of additional dimensions.
  - target: tensor of shape \((N, *)\), same shape as the input.
  - output: scalar. If reduction is 'none', then the same shape as the input.
- SoftMarginLoss:
  - input: tensor of shape \((*)\) where \(*\) means any number of additional dimensions.
  - target: tensor of the same shape as the input.
  - output: scalar. If reduction is 'none', then the same shape as the input.
- SoftplusLoss:
  - p_score: tensor of shape \((*)\) where \(*\) means any number of additional dimensions.
  - n_score: tensor of shape \((*)\) where \(*\) means any number of additional dimensions. The dimension can be different from the p_score dimension.
  - output: scalar.
- SigmoidLoss:
  - p_score: tensor of shape \((*)\) where \(*\) means any number of additional dimensions.
  - n_score: tensor of shape \((*)\) where \(*\) means any number of additional dimensions. The dimension can be different from the p_score dimension.
  - output: scalar.
- BCELoss:
  - input: tensor of shape \((N, *)\) where \(*\) means any number of additional dimensions.
  - target: tensor of shape \((N, *)\), same shape as the input.
  - output: scalar. If reduction is 'none', then \((N, *)\), same shape as the input.
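A minimal sketch with the pairwise SoftplusLoss option. The exact `loss_type` string is assumed to match the name listed above, and the score tensor sizes are illustrative (as documented, p_score and n_score may have different sizes):

```python
import torch
from graph4nlp.loss import KGLoss

p_score = torch.randn(8)   # scores of positive triples
n_score = torch.randn(32)  # scores of negative triples; size may differ

loss_fn = KGLoss('SoftplusLoss')
loss = loss_fn(p_score=p_score, n_score=n_score)  # scalar
```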