Which of the following is true?
- Clipping the norm of the gradient is the easiest to compute form of gradient clipping under all performance measures.
- LSTMs have forget components/gates.
- Increasing the learning rate of your neural net always increases its effective capacity.