
Understanding Seq2Seq Neural Networks – Part 3: Stacking LSTMs in the Encoder
In the previous article, we created an embedding layer for the input vocabulary. In this article, we will continue by using it in an LSTM.

We place the embedding layer in front of the LSTM. Now, when the input sentence is "Let's go," we put a 1 in the input position for "Let's" and a 0 everywhere else. Then we unroll the LSTM and the embedding layer, place a 1 in the input position for "go," and a 0 everywhere else. Whenever we unroll the LSTM and the embedding layer, we reuse the exact same weights and biases, no matter how many times we unroll them.

In theory, this is all we need to encode the input sentence "Let's go." In practice, however, to give the model more weights and biases to fit to the data, people often add additional LSTM cells to the input stage. To keep things simple, we will add just one additional LSTM cell at this stage. This means the two embedding values for the word "Let's" are used as the input values for two different LSTM cells.
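The unrolling and stacking described above can be sketched in code. This is a minimal toy sketch, not the article's actual implementation: the vocabulary, the random weights, and the two-unit embedding and hidden sizes are assumptions, and it follows the common stacking convention in which the second (stacked) LSTM cell consumes the first cell's output at each unrolled step. Note how the same cell objects, and therefore the exact same weights and biases, are reused for every word.

```python
import numpy as np

np.random.seed(0)

# Hypothetical toy vocabulary for the example sentence.
vocab = ["Let's", "go", "<EOS>"]
embed_dim = 2  # two embedding values per word, as in the article

# Embedding layer: one row of weights per vocabulary word.
# Multiplying a one-hot input vector by this matrix just selects a row.
embedding = np.random.randn(len(vocab), embed_dim)

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """A single LSTM cell; the same weights are reused at every unroll step."""
    def __init__(self, input_dim, hidden_dim):
        d = input_dim + hidden_dim
        # One weight matrix and bias per gate: forget, input, candidate, output.
        self.Wf = np.random.randn(d, hidden_dim) * 0.1
        self.Wi = np.random.randn(d, hidden_dim) * 0.1
        self.Wc = np.random.randn(d, hidden_dim) * 0.1
        self.Wo = np.random.randn(d, hidden_dim) * 0.1
        self.bf = np.zeros(hidden_dim)
        self.bi = np.zeros(hidden_dim)
        self.bc = np.zeros(hidden_dim)
        self.bo = np.zeros(hidden_dim)

    def step(self, x, h, c):
        z = np.concatenate([x, h])
        f = sigmoid(z @ self.Wf + self.bf)      # forget gate
        i = sigmoid(z @ self.Wi + self.bi)      # input gate
        c_hat = np.tanh(z @ self.Wc + self.bc)  # candidate memory
        o = sigmoid(z @ self.Wo + self.bo)      # output gate
        c = f * c + i * c_hat                   # new long-term memory
        h = o * np.tanh(c)                      # new short-term memory (output)
        return h, c

hidden_dim = 2
layer1 = LSTMCell(embed_dim, hidden_dim)   # bottom LSTM cell
layer2 = LSTMCell(hidden_dim, hidden_dim)  # the one additional, stacked cell

h1 = c1 = np.zeros(hidden_dim)
h2 = c2 = np.zeros(hidden_dim)

# Unroll over the input sentence "Let's go": at each step the one-hot
# vector picks out that word's embedding values, which feed the bottom
# cell; the bottom cell's output feeds the stacked cell.
for word in ["Let's", "go"]:
    x = one_hot(vocab.index(word), len(vocab)) @ embedding  # embedding lookup
    h1, c1 = layer1.step(x, h1, c1)
    h2, c2 = layer2.step(h1, h2, c2)

print("final hidden state of stacked cell:", h2)
```

The final `h2` and `c2` are the encoding of the input sentence that a decoder would start from; exact values depend on the (here random) weights.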
Continue reading on Dev.to



