Meta's New Method Speeds Up LLM Long-Context Processing by Up to 30.8 Times
【Tech 24H】Meta has introduced REFRAG, an efficient decoding framework that compresses non-critical information in long contexts and processes it selectively. The approach cuts the time to generate the first token in long-text processing by up to 30.8 times and expands the effective context length by 16 times, delivering improvements in both speed and accuracy across multiple tasks.
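The report does not spell out REFRAG's exact architecture, but the general idea of compressing retrieved context and keeping only the most relevant pieces at full resolution can be sketched roughly as below. All names here (ChunkCompressor, select_chunks_to_expand, chunk_size, keep_ratio) are hypothetical illustrations, not Meta's published API.

```python
import torch
import torch.nn as nn


class ChunkCompressor(nn.Module):
    """Compresses each fixed-size chunk of context token embeddings into a
    single vector, so the decoder attends to far fewer positions."""

    def __init__(self, d_model: int, chunk_size: int = 16):
        super().__init__()
        self.chunk_size = chunk_size
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, token_embs: torch.Tensor) -> torch.Tensor:
        # token_embs: (num_tokens, d_model); pad to a multiple of chunk_size
        n, d = token_embs.shape
        pad = (-n) % self.chunk_size
        if pad:
            token_embs = torch.cat([token_embs, token_embs.new_zeros(pad, d)])
        chunks = token_embs.view(-1, self.chunk_size, d)  # (num_chunks, chunk_size, d)
        return self.proj(chunks.mean(dim=1))              # one embedding per chunk


def select_chunks_to_expand(chunk_embs: torch.Tensor,
                            query_emb: torch.Tensor,
                            keep_ratio: float = 0.25) -> torch.Tensor:
    """Scores compressed chunks against the query and returns indices of the
    top fraction to keep at full token resolution; the rest stay compressed."""
    scores = chunk_embs @ query_emb                  # (num_chunks,)
    k = max(1, int(keep_ratio * chunk_embs.size(0)))
    return torch.topk(scores, k).indices
```

In a setup like this, the decoder would see one vector per compressed chunk plus the full tokens of the few expanded chunks, which is what shrinks the attention cost and the time to the first generated token.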
Editor: Zhang Liyan