
Understanding Seq2Seq Neural Networks – Part 8: When Does the Decoder Stop?
In the previous article, we saw the translation being produced. But there is an issue: the decoder does not stop until it outputs an EOS token. So we plug the word "Vamos" into the decoder's embedding layer and unroll the two LSTM cells in each layer. Then we run the output values (the short-term memories, or hidden states) through the same fully connected layer. The next predicted token is EOS.

How the Decoder Works

This means we have translated the English sentence "let's go" into the correct Spanish sentence. For the decoder, the context vector, which is created by both layers of unrolled encoder LSTM cells, is used to initialize the LSTMs in the decoder. The input to the LSTMs comes from the output word embedding layer, which starts with EOS; after that, it uses whatever word was predicted by the output layer. In practice, the decoder keeps predicting words until it either predicts the EOS token or reaches some maximum output length. All of these weights and biases are trained using backpropagation through time.
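The stopping behavior described above can be sketched as a simple greedy decoding loop. This is a minimal illustration, not the article's actual model: the names (greedy_decode, predict_next) are hypothetical, and a toy lookup table stands in for the trained embedding, LSTM, and fully connected layers.

```python
EOS = "<EOS>"

def greedy_decode(predict_next, context, max_len=10):
    """Feed EOS in first, then keep feeding back each predicted word
    until the model emits EOS or max_len is reached."""
    output = []
    token, state = EOS, context  # decoding starts with the EOS token
    for _ in range(max_len):
        token, state = predict_next(token, state)
        if token == EOS:  # the decoder stops itself here
            break
        output.append(token)
    return output

# Toy stand-in for the trained decoder: "let's go" -> "vamos", then EOS.
def toy_predict(token, state):
    table = {EOS: "vamos", "vamos": EOS}
    return table[token], state

print(greedy_decode(toy_predict, None))  # ['vamos']
```

The max_len guard matters in practice: an undertrained decoder may never emit EOS, and without the cap the loop would run forever.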




