Project Atom
A compression framework for the Transformers Key-Value Cache that enables large-scale AI models to run efficiently on consumer hardware with negligible precision loss.
Available for licensing

From the foundation to the evolution: the systems we're building to make powerful AI radically more efficient.
A weight compression technique that takes cues from our Key-Value Cache compression system, allowing significant compression with minimal fidelity loss.
Currently under development