4.1 Graph Convolutional Networks

Graph Convolutional Networks (GCN) are a typical example of spectral-based graph filters. A multi-layer GCN uses the following layer-wise propagation rule, derived from spectral graph theory:

\[\mathbf{H}^{(l)} = \sigma( {\tilde{D}}^{-\frac{1}{2}}{\tilde{A}}{\tilde{D}}^{-\frac{1}{2}} \mathbf{H}^{(l-1)} \mathbf{W}^{(l-1)})\]

Here, \(\tilde{A} = A + I_n\) is the adjacency matrix with added self-loops and \(\tilde{D}\) is its diagonal degree matrix. \(\mathbf{W}^{(l-1)}\) is a layer-specific trainable weight matrix and \(\sigma(\cdot)\) denotes an activation function. \(\mathbf{H}^{(l)} \in \mathbb{R}^{n \times d}\) is the matrix of activated node embeddings at the \(l\)-th layer, with \(\mathbf{H}^{(0)}\) being the input node features.
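
As a concrete illustration, the following minimal sketch applies this propagation rule to a toy graph using plain (dense) PyTorch tensors. It is purely illustrative and independent of the library code discussed below; the adjacency matrix, feature sizes and ReLU activation are arbitrary choices.

import torch

# Toy graph: 4 nodes, symmetric adjacency matrix A (no self-loops yet).
A = torch.tensor([[0., 1., 0., 0.],
                  [1., 0., 1., 1.],
                  [0., 1., 0., 1.],
                  [0., 1., 1., 0.]])
H = torch.randn(4, 8)       # node embeddings H^{(l-1)}, d = 8
W = torch.randn(8, 16)      # trainable weight W^{(l-1)}

A_tilde = A + torch.eye(4)                    # add self-loops
d_tilde = A_tilde.sum(dim=1)                  # degrees of the self-loop graph
D_inv_sqrt = torch.diag(d_tilde.pow(-0.5))    # D~^{-1/2}

# H^{(l)} = sigma(D~^{-1/2} A~ D~^{-1/2} H^{(l-1)} W^{(l-1)}), with sigma = ReLU here
H_next = torch.relu(D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W)
print(H_next.shape)   # torch.Size([4, 16])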

4.1.1 GCN Module Construction Function

The construction function performs the following steps:

  1. Set options.

  2. Register learnable parameters or submodules (GCNLayer).

class GCN(GNNBase):
        def __init__(self,
                     num_layers,
                     input_size,
                     hidden_size,
                     output_size,
                     direction_option='bi_sep',
                     feat_drop=0.,
                     gcn_norm='both',
                     weight=True,
                     bias=True,
                     activation=None,
                     allow_zero_in_degree=False,
                     use_edge_weight=False,
                     residual=True):
            super(GCN, self).__init__()
            self.num_layers = num_layers
            self.direction_option = direction_option
            self.gcn_layers = nn.ModuleList()
            assert self.num_layers > 0
            self.use_edge_weight = use_edge_weight

            if isinstance(hidden_size, int):
                hidden_size = [hidden_size] * (self.num_layers - 1)

            if self.num_layers > 1:
                # input projection
                self.gcn_layers.append(GCNLayer(input_size,
                                                hidden_size[0],
                                                direction_option=self.direction_option,
                                                feat_drop=feat_drop,
                                                gcn_norm=gcn_norm,
                                                weight=weight,
                                                bias=bias,
                                                activation=activation,
                                                allow_zero_in_degree=allow_zero_in_degree,
                                                residual=residual))

            # hidden layers
            for l in range(1, self.num_layers - 1):
                # chain the hidden layers: hidden_size[l - 1] -> hidden_size[l]
                self.gcn_layers.append(GCNLayer(hidden_size[l - 1],
                                                hidden_size[l],
                                                direction_option=self.direction_option,
                                                feat_drop=feat_drop,
                                                gcn_norm=gcn_norm,
                                                weight=weight,
                                                bias=bias,
                                                activation=activation,
                                                allow_zero_in_degree=allow_zero_in_degree,
                                                residual=residual))
            # output projection
            self.gcn_layers.append(GCNLayer(hidden_size[-1] if self.num_layers > 1 else input_size,
                                            output_size,
                                            direction_option=self.direction_option,
                                            feat_drop=feat_drop,
                                            gcn_norm=gcn_norm,
                                            weight=weight,
                                            bias=bias,
                                            activation=activation,
                                            allow_zero_in_degree=allow_zero_in_degree,
                                            residual=residual))

In the construction function, one first needs to set the number of GCN layers and the data dimensions. As for a general PyTorch module, the dimensions are the input dimension, the hidden dimension(s) and the output dimension.

Besides the data dimensions, a typical option for a graph neural network is the direction option (self.direction_option). It determines whether to use the unidirectional (i.e., undirected) or a bidirectional (i.e., bi_sep or bi_fuse) version of GCN.

gcn_norm specifies how the aggregated node features are normalized. It can be right, both or none, where both corresponds to the symmetric normalization \({\tilde{D}}^{-\frac{1}{2}}{\tilde{A}}{\tilde{D}}^{-\frac{1}{2}}\) used in the GCN paper.

use_edge_weight indicates whether edge weights are used when computing the node embeddings.

residual indicates whether a residual connection is added between consecutive GCN layers.
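
For example, a 3-layer bidirectional GCN with fused direction handling could be constructed as follows (a usage sketch; all argument values are illustrative):

import torch.nn.functional as F

gcn_encoder = GCN(num_layers=3,
                  input_size=300,
                  hidden_size=[128, 128],   # one entry per hidden layer; a plain int is broadcast
                  output_size=64,
                  direction_option='bi_fuse',
                  feat_drop=0.2,
                  gcn_norm='both',
                  activation=F.relu,
                  use_edge_weight=False,
                  residual=True)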

4.1.2 GCNLayer Construction Function

GCNLayer is a single-layer GCN and its initialization options are the same as those of the GCN class. This module registers a different GCNLayerConv submodule depending on direction_option, as the short usage sketch after the code below illustrates.

class GCNLayer(GNNLayerBase):
    def __init__(self,
                 input_size,
                 output_size,
                 direction_option='bi_sep',
                 feat_drop=0.,
                 gcn_norm='both',
                 weight=True,
                 bias=True,
                 activation=None,
                 allow_zero_in_degree=False,
                 residual=True):
        super(GCNLayer, self).__init__()
        if direction_option == 'undirected':
            self.model = UndirectedGCNLayerConv(input_size,
                                                output_size,
                                                 feat_drop=feat_drop,
                                                 gcn_norm=gcn_norm,
                                                 weight=weight,
                                                 bias=bias,
                                                 activation=activation,
                                                 allow_zero_in_degree=allow_zero_in_degree,
                                                 residual=residual)
        elif direction_option == 'bi_sep':
            self.model = BiSepGCNLayerConv(input_size,
                                             output_size,
                                             feat_drop=feat_drop,
                                             gcn_norm=gcn_norm,
                                             weight=weight,
                                             bias=bias,
                                             activation=activation,
                                             allow_zero_in_degree=allow_zero_in_degree,
                                             residual=residual)
        elif direction_option == 'bi_fuse':
            self.model = BiFuseGCNLayerConv(input_size,
                                             output_size,
                                             feat_drop=feat_drop,
                                             gcn_norm=gcn_norm,
                                             weight=weight,
                                             bias=bias,
                                             activation=activation,
                                             allow_zero_in_degree=allow_zero_in_degree,
                                             residual=residual)
        else:
            raise RuntimeError('Unknown `direction_option` value: {}'.format(direction_option))
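
A short usage sketch of this dispatch (illustrative values only): the submodule actually registered depends on direction_option.

layer = GCNLayer(input_size=128, output_size=64, direction_option='bi_fuse')
print(type(layer.model).__name__)   # BiFuseGCNLayerConv

layer = GCNLayer(input_size=128, output_size=64, direction_option='undirected')
print(type(layer.model).__name__)   # UndirectedGCNLayerConv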

4.1.3 GCNLayerConv Construction Function

We will take BiSepGCNLayerConv as an example. The construction function performs the following steps:

  1. Set options.

  2. Register learnable parameters.

  3. Reset parameters.

The aggregation and update functions are formulated as:

\[\begin{aligned}
h_{i, \vdash}^{(l+1)} &= \sigma\Big(b^{(l)}_{\vdash} + \sum_{j\in\mathcal{N}_{\vdash}(i)}\frac{1}{c_{ij}}h_{j, \vdash}^{(l)}W^{(l)}_{\vdash}\Big)\\
h_{i, \dashv}^{(l+1)} &= \sigma\Big(b^{(l)}_{\dashv} + \sum_{j\in\mathcal{N}_{\dashv}(i)}\frac{1}{c_{ij}}h_{j, \dashv}^{(l)}W^{(l)}_{\dashv}\Big)
\end{aligned}\]

Here, \(\vdash\) and \(\dashv\) denote the forward and backward edge directions, and \(c_{ij}\) is a normalization constant determined by gcn_norm. As the equations show, the node embeddings of the two directions are propagated separately.

class BiSepGCNLayerConv(GNNLayerBase):
    def __init__(self,
                 input_size,
                 output_size,
                 feat_drop=0.,
                 gcn_norm='both',
                 weight=True,
                 bias=True,
                 activation=None,
                 allow_zero_in_degree=False,
                 residual=True):
        super(BiSepGCNLayerConv, self).__init__()
        if gcn_norm not in ('none', 'both', 'right'):
            raise RuntimeError('Invalid gcn_norm value. Must be either "none", "both" or "right".'
                               ' But got "{}".'.format(gcn_norm))
        self._input_size = input_size
        self._output_size = output_size
        self._gcn_norm = gcn_norm
        self._allow_zero_in_degree = allow_zero_in_degree
        self._feat_drop = nn.Dropout(feat_drop)

        if weight:
            self.weight_fw = nn.Parameter(torch.Tensor(input_size, output_size))
            self.weight_bw = nn.Parameter(torch.Tensor(input_size, output_size))
        else:
            self.register_parameter('weight_fw', None)
            self.register_parameter('weight_bw', None)

        if bias:
            self.bias_fw = nn.Parameter(torch.Tensor(output_size))
            self.bias_bw = nn.Parameter(torch.Tensor(output_size))
        else:
            self.register_parameter('bias_fw', None)
            self.register_parameter('bias_bw', None)

        if residual:
            if self._input_size != output_size:
                self.res_fc_fw = nn.Linear(
                    self._input_size, output_size, bias=True)
                self.res_fc_bw = nn.Linear(
                    self._input_size, output_size, bias=True)
            else:
                self.res_fc_fw = self.res_fc_bw = nn.Identity()
        else:
            self.register_buffer('res_fc_fw', None)
            self.register_buffer('res_fc_bw', None)

        self.reset_parameters()

        self._activation = activation

All learnable parameters and layers defined in this module come in per-direction pairs, e.g., self.weight_fw for the forward direction and self.weight_bw for the backward direction.
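
The construction function also calls self.reset_parameters(), whose body is not shown above. A plausible sketch is given below, assuming Xavier (Glorot) initialization for the direction-specific weight matrices and zero initialization for the biases; the library's actual implementation may differ.

    def reset_parameters(self):
        # Re-initialize the forward/backward weights (if present) with Xavier
        # uniform initialization and reset the biases to zero.
        if self.weight_fw is not None:
            nn.init.xavier_uniform_(self.weight_fw)
            nn.init.xavier_uniform_(self.weight_bw)
        if self.bias_fw is not None:
            nn.init.zeros_(self.bias_fw)
            nn.init.zeros_(self.bias_bw)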

Similarly, the aggregation and update functions of BiFuseGCNLayerConv are formulated as:

\[\begin{aligned}
h_{i, \vdash}^{(l+1)} &= \sigma\Big(b^{(l)}_{\vdash} + \sum_{j\in\mathcal{N}_{\vdash}(i)}\frac{1}{c_{ij}}h_{j}^{(l)}W^{(l)}_{\vdash}\Big)\\
h_{i, \dashv}^{(l+1)} &= \sigma\Big(b^{(l)}_{\dashv} + \sum_{j\in\mathcal{N}_{\dashv}(i)}\frac{1}{c_{ij}}h_{j}^{(l)}W^{(l)}_{\dashv}\Big)\\
r_{i}^{l} &= \sigma\big(W_{f}\,[h_{i, \vdash}^{l};\, h_{i, \dashv}^{l};\, h_{i, \vdash}^{l} * h_{i, \dashv}^{l};\, h_{i, \vdash}^{l} - h_{i, \dashv}^{l}]\big)
\end{aligned}\]

Node embeddings in both directions are fused in every layer. The construction code of BiFuseGCNLayerConv is as follows:

class BiFuseGCNLayerConv(GNNLayerBase):

    def __init__(self,
                 input_size,
                 output_size,
                 feat_drop=0.,
                 gcn_norm='both',
                 weight=True,
                 bias=True,
                 activation=None,
                 allow_zero_in_degree=False,
                 residual=True):
        super(BiFuseGCNLayerConv, self).__init__()
        if gcn_norm not in ('none', 'both', 'right'):
            raise RuntimeError('Invalid gcn_norm value. Must be either "none", "both" or "right".'
                               ' But got "{}".'.format(gcn_norm))
        self._input_size = input_size
        self._output_size = output_size
        self._gcn_norm = gcn_norm
        self._allow_zero_in_degree = allow_zero_in_degree
        self._feat_drop = nn.Dropout(feat_drop)

        if weight:
            self.weight_fw = nn.Parameter(torch.Tensor(input_size, output_size))
            self.weight_bw = nn.Parameter(torch.Tensor(input_size, output_size))
        else:
            self.register_parameter('weight_fw', None)
            self.register_parameter('weight_bw', None)

        if bias:
            self.bias_fw = nn.Parameter(torch.Tensor(output_size))
            self.bias_bw = nn.Parameter(torch.Tensor(output_size))
        else:
            self.register_parameter('bias_fw', None)
            self.register_parameter('bias_bw', None)

        self.reset_parameters()

        self._activation = activation

        self.fuse_linear = nn.Linear(4 * output_size, output_size, bias=True)

        if residual:
            if self._input_size != output_size:
                self.res_fc = nn.Linear(
                    self._input_size, output_size, bias=True)
            else:
                self.res_fc = nn.Identity()
        else:
            self.register_buffer('res_fc', None)
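
In the layer's forward computation (not shown here), self.fuse_linear combines the two direction-wise embeddings. A minimal sketch of such a gated fusion is given below; the sigmoid gate over the concatenation [h_fw; h_bw; h_fw * h_bw; h_fw - h_bw] matches the 4 * output_size input of self.fuse_linear, but the sketch is an assumption rather than the library's exact code.

import torch

def fuse_bidirectional(fuse_linear, h_fw, h_bw):
    # Gate r_i computed from the concatenation of both embeddings,
    # their element-wise product and their difference.
    fuse_input = torch.cat([h_fw, h_bw, h_fw * h_bw, h_fw - h_bw], dim=-1)
    gate = torch.sigmoid(fuse_linear(fuse_input))
    # Gated (convex) combination of the two directions.
    return gate * h_fw + (1 - gate) * h_bw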

4.1.4 GCN Forward Function

In an NN module, the forward() function does the actual message passing and computation. Here, forward() takes a GraphData object as input.

The rest of the section takes a deep dive into the forward() function.

We first need to obtain the input graph node features and convert the GraphData to a dgl.DGLGraph. Then, feat is duplicated for the two directions when self.direction_option is bi_sep, and the edge weights (and reverse edge weights) are fetched when self.use_edge_weight is set.

feat = graph.node_features['node_feat']
dgl_graph = graph.to_dgl()

if self.direction_option == 'bi_sep':
    h = [feat, feat]
else:
    h = feat

if self.use_edge_weight:
    edge_weight = graph.edge_features['edge_weight']
    if self.direction_option != 'undirected':
        reverse_edge_weight = graph.edge_features['reverse_edge_weight']
    else:
        reverse_edge_weight = None
else:
    edge_weight = None
    reverse_edge_weight = None

The following code actually performs message passing and feature updating.

for l in range(self.num_layers - 1):
    h = self.gcn_layers[l](dgl_graph, h, edge_weight=edge_weight, reverse_edge_weight=reverse_edge_weight)
    if self.direction_option == 'bi_sep':
        h = [each.flatten(1) for each in h]
    else:
        h = h.flatten(1)

logits = self.gcn_layers[-1](dgl_graph, h)

if self.direction_option == 'bi_sep':
    logits = torch.cat(logits, -1)

graph.node_features['node_emb'] = logits
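
Putting the pieces together, a typical call looks like the sketch below. It assumes a GraphData object named graph whose node_features['node_feat'] has already been populated (e.g., by an upstream embedding layer); the argument values are illustrative.

gcn_encoder = GCN(num_layers=2,
                  input_size=300,
                  hidden_size=128,
                  output_size=128,
                  direction_option='bi_sep')

gcn_encoder(graph)                            # forward() stores the result on the graph
node_emb = graph.node_features['node_emb']    # for 'bi_sep', both directions are concatenated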