
Understanding LSTMs – Part 5: The Input Gate Explained
In the previous article, we briefly went through the second and third components of the LSTM. In this article, we will look at them more closely.

Starting with the block furthest to the right, we multiply the short-term memory and the input by their respective weights and combine the results. The resulting value, 2.03, becomes the input to the tanh activation function. Plugging 2.03 into tanh gives approximately 0.97.

The tanh activation function maps any input to a value between −1 and 1. When the input to the LSTM is 1, after calculating the x-axis value, the tanh activation function produces an output close to 1. In contrast, if the input to the LSTM were −10, then after calculating the x-axis value, the tanh activation function would produce an output close to −1.

So, based on the short-term memory and the input, we now have a potential memory of 0.97. Next, the LSTM must decide how much of this potential memory to retain. This is done using the same method as before. This value, 4.27, is the x-axis inpu
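
The tanh step described above can be checked numerically. The following is a minimal sketch, not the article's own code: it starts from the quoted x-axis value of 2.03 rather than from the individual weights, which are not listed here, and the variable names `pre_activation` and `potential_memory` are made up for illustration. The loop feeds a few x-axis values directly into tanh to show the squashing behaviour; these are x-axis values, not raw LSTM inputs.

```python
import math

# x-axis value quoted in the text (weighted short-term memory and input, combined).
# Assumption: we start from this sum because the weights themselves are not given.
pre_activation = 2.03

# tanh turns the x-axis value into the potential memory.
potential_memory = math.tanh(pre_activation)
print(f"tanh({pre_activation}) = {potential_memory:.2f}")

# tanh squashes any real number into the open interval (-1, 1):
# large negative values land near -1, large positive values near 1.
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"tanh({x:6.1f}) = {math.tanh(x): .4f}")
```

Rounded to two decimals, the first line reproduces the 0.97 potential memory from the text.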




