Long short-term memory (LSTM)[1] is a type of recurrent neural network (RNN) designed to mitigate the vanishing gradient problem[2] present in traditional RNNs. Its relative insensitivity to gap length is its advantage over other RNNs, hidden Markov models, and other sequence learning methods. It aims to provide a short-term memory for RNNs that can last thousands of timesteps, hence "long short-term memory".[1] The name is an analogy to long-term memory and short-term memory and their relationship, studied by cognitive psychologists since the early 20th century.
It is applicable to classifying, processing, and predicting time-series data, such as in handwriting recognition,[3] speech recognition,[4][5] machine translation,[6][7] speech activity detection,[8] robot control,[9][10] video games,[11][12] and healthcare.[13]
A common LSTM unit is composed of a cell, an input gate, an output gate[14] and a forget gate.[15] The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell. Forget gates decide what information to discard from the previous state by mapping the previous state and the current input to a value between 0 and 1. A (rounded) value of 1 means to keep the information, and a value of 0 means to discard it. Input gates decide which pieces of new information to store in the current cell state, using the same system as forget gates. Output gates control which pieces of information in the current cell state to output by assigning a value from 0 to 1 to the information, considering the previous and current states. Selectively outputting relevant information from the current state allows the LSTM network to maintain useful, long-term dependencies to make predictions, both in current and future time-steps.
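The gating scheme described above can be sketched in code. The following is a minimal NumPy sketch of a single LSTM time step, not a production implementation: the function name `lstm_step` and the convention of stacking the four gate pre-activations into one weight matrix `W` are illustrative choices, and the tiny random weights exist only to make the example runnable.

```python
import numpy as np

def sigmoid(z):
    # Squashes pre-activations into (0, 1): the "keep vs. discard" range
    # used by the forget, input, and output gates.
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (a common formulation, for illustration).

    W maps the concatenated [h_prev, x] to the four stacked
    pre-activations: forget gate, input gate, candidate update,
    and output gate.
    """
    z = W @ np.concatenate([h_prev, x]) + b
    n = h_prev.shape[0]
    f = sigmoid(z[0:n])        # forget gate: 1 = keep old cell value, 0 = discard
    i = sigmoid(z[n:2*n])      # input gate: which new information to store
    g = np.tanh(z[2*n:3*n])    # candidate values for the cell update
    o = sigmoid(z[3*n:4*n])    # output gate: which cell contents to expose
    c = f * c_prev + i * g     # new cell state: kept memory plus gated input
    h = o * np.tanh(c)         # new hidden state, the unit's output
    return h, c

# Toy usage: hidden size 3, input size 2, a short 5-step sequence.
rng = np.random.default_rng(0)
n, m = 3, 2
W = rng.standard_normal((4 * n, n + m)) * 0.1
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.standard_normal((5, m)):
    h, c = lstm_step(x, h, c, W, b)
```

Because the hidden state is the product of a sigmoid output gate and a tanh of the cell state, each component of `h` always lies strictly between -1 and 1, while the cell state `c` itself is unbounded and can carry information across many timesteps.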