
Huawei's Zurich Lab unveils SINQ, an open-source quantization method that it claims can reduce LLM memory use by 60-70% without significant quality loss (Carl Franzen/VentureBeat)

Carl Franzen / VentureBeat:
Dual-Axis Scaling: Instead of using a single scale factor for quantizing a matrix, SINQ uses separate scaling vectors for rows and columns.
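
To make the row-and-column idea concrete, here is a minimal sketch in plain Python. It is not Huawei's SINQ code. It assumes a simple scheme in which each row and each column is repeatedly divided by its largest magnitude, and the normalized matrix is then rounded to signed integers. The function names, the `bits` and `iters` parameters, and the max-magnitude normalization are all illustrative assumptions; the excerpt above only states that separate scaling vectors are used for rows and columns.

```python
def dual_axis_quantize(w, bits=4, iters=8):
    """Illustrative only: quantize w with one scale per row and one per column."""
    n_rows, n_cols = len(w), len(w[0])
    row_scale = [1.0] * n_rows
    col_scale = [1.0] * n_cols
    m = [list(row) for row in w]  # working copy that gets normalized in place

    for _ in range(iters):
        # Fold each row's largest magnitude into that row's scale.
        for i in range(n_rows):
            peak = max(abs(v) for v in m[i]) or 1.0  # all-zero row: scale stays put
            row_scale[i] *= peak
            m[i] = [v / peak for v in m[i]]
        # Do the same per column, so both axes carry part of the range.
        for j in range(n_cols):
            peak = max(abs(m[i][j]) for i in range(n_rows)) or 1.0
            col_scale[j] *= peak
            for i in range(n_rows):
                m[i][j] /= peak

    # Every entry now lies in [-1, 1]; map it onto a signed integer grid.
    qmax = 2 ** (bits - 1) - 1
    q = [[round(v * qmax) for v in row] for row in m]
    return q, row_scale, col_scale, qmax


def dequantize(q, row_scale, col_scale, qmax):
    """Rebuild an approximation of w from the integers and the two scale vectors."""
    return [
        [q[i][j] / qmax * row_scale[i] * col_scale[j] for j in range(len(q[0]))]
        for i in range(len(q))
    ]


# Example: the second column holds much larger values than the others.
w = [
    [0.02, 5.0, -0.01],
    [0.03, -4.0, 0.02],
    [-0.01, 6.0, 0.04],
]
q, row_scale, col_scale, qmax = dual_axis_quantize(w)
w_hat = dequantize(q, row_scale, col_scale, qmax)
```

In this toy scheme, a single matrix-wide scale would be set by the large second column, which would round most of the small entries to zero. With a scale per column and per row, the large column's magnitude is absorbed into its own scale factor, so the small entries keep some resolution on the integer grid.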



from Techmeme https://ift.tt/RkEPbHs
