TransformerEncoder¶
- class torch.nn.TransformerEncoder(encoder_layer, num_layers, norm=None, enable_nested_tensor=True, mask_check=True)[source]¶
TransformerEncoder is a stack of N encoder layers. Users can build the BERT(https://arxiv.org/abs/1810.04805) model with corresponding parameters.
- Parameters:
encoder_layer – an instance of the TransformerEncoderLayer() class (required).
num_layers – the number of sub-encoder-layers in the encoder (required).
norm – the layer normalization component (optional).
enable_nested_tensor – if True, input will automatically convert to nested tensor (and convert back on output). This will improve the overall performance of TransformerEncoder when padding rate is high. Default:
True
(enabled).
- Examples::
>>> encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8) >>> transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6) >>> src = torch.rand(10, 32, 512) >>> out = transformer_encoder(src)
- forward(src, mask=None, src_key_padding_mask=None, is_causal=None)[source]¶
Pass the input through the encoder layers in turn.
- Parameters:
src (Tensor) – the sequence to the encoder (required).
mask (Optional[Tensor]) – the mask for the src sequence (optional).
is_causal (Optional[bool]) – If specified, applies a causal mask as mask (optional) and ignores attn_mask for computing scaled dot product attention. Default:
False
.src_key_padding_mask (Optional[Tensor]) – the mask for the src keys per batch (optional).
- Return type:
- Shape:
see the docs in Transformer class.