To solve this problem, Google has introduced a new model called Reformer, which can handle context windows of up to 1 million words using just 16GB of memory. The company built it to address the shortcomings of its older model, Transformer — a neural network that compares every word in a passage with every other word to understand the relationships between them. Current models support understanding only a few lines or paragraphs before and after the text in focus. Because Transformer relies on this all-pairs matching, its memory use balloons once the text grows beyond a few thousand words, making it impractical for processing a long article or a book.

Google designed Reformer to fix the old model's short 'attention span' and heavy memory consumption. To solve the first problem, the new model uses locality-sensitive hashing (LSH). What does that mean? Instead of comparing all words with each other, the model uses a hash function to group similar words together into buckets, then compares words only within the same or neighboring buckets, sharply reducing the processing load.

To solve the memory problem, the researchers use reversible residual layers: rather than storing every layer's activations (outputs) in memory, the network recomputes each layer's inputs from the next layer's outputs during training.

To test the model, Google fed Reformer fragments of images, and it generated full-frame images from them. Google's engineers say the new model can easily process whole books, which opens up huge potential for processing text in bulk. You can read more about Reformer in a paper here, and play with its code here.
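To make the bucketing idea concrete, here is a minimal sketch of random-projection LSH in NumPy. This is a toy illustration, not Reformer's actual implementation: the function name `lsh_buckets` and the example vectors are hypothetical, and real LSH attention hashes high-dimensional query/key vectors rather than 2-D toy points.

```python
import numpy as np

def lsh_buckets(vectors, n_hashes=4, seed=0):
    """Assign each vector to a bucket via random-projection LSH.

    Each random hyperplane contributes one bit (which side the vector
    falls on); the bit pattern, read as an integer, is the bucket id.
    Similar vectors tend to land on the same side of every hyperplane,
    so they tend to collide into the same bucket.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((vectors.shape[1], n_hashes))
    bits = (vectors @ planes) > 0          # sign pattern per vector
    return bits.dot(1 << np.arange(n_hashes))  # pack bits into an id

# Toy "word vectors": the first pair point in the same direction, the
# second pair point the opposite way.
vecs = np.array([
    [1.0, 0.1], [2.0, 0.2],
    [-1.0, -0.1], [-2.0, -0.2],
])
buckets = lsh_buckets(vecs)
print(buckets)
```

Attention then only needs to compare words whose bucket ids match (or sit in neighboring buckets), instead of comparing every pair of words in the text.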
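The memory trick can be sketched as well. In a reversible residual block, the input is split into two halves, and the outputs alone are enough to reconstruct the inputs exactly — so activations can be recomputed during backpropagation instead of being stored. The sublayer functions `F` and `G` below are hypothetical stand-ins (in Reformer they would be the attention and feed-forward sublayers).

```python
import numpy as np

def F(x):
    return np.tanh(x)        # stand-in for the attention sublayer

def G(x):
    return np.maximum(x, 0)  # stand-in for the feed-forward sublayer

def forward(x1, x2):
    # Standard reversible-residual coupling.
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def inverse(y1, y2):
    # Recover the inputs from the outputs by running the same
    # computations in reverse — no stored activations needed.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1 = np.array([0.5, -1.0])
x2 = np.array([2.0, 0.3])
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
assert np.allclose(r1, x1) and np.allclose(r2, x2)
```

Because each layer's inputs are recoverable this way, memory use no longer grows with the number of layers whose activations must be cached for training.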