A technical paper titled “Efficient Streaming Language Models with Attention Sinks” was published by researchers at Massachusetts Institute of Technology (MIT), Meta AI, Carnegie Mellon University ...