GritLM-7B | 7B parameter model that uses bidirectional attention for embedding and causal attention for generation. It is finetuned from Mistral-7B | 66.8 | 55.5
GritLM-8x7B | 8x7B parameter model that uses ...
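As a rough illustration of the difference between the two attention modes named above, the sketch below builds the corresponding masks for a short sequence. It is a toy in plain Python and not GritLM's own code; the function names `causal_mask`, `bidirectional_mask`, and `format_mask` are hypothetical and chosen only for this example.

```python
def causal_mask(n):
    """Row i may attend to column j only when j <= i (generation-style)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Every position may attend to every other position (embedding-style)."""
    return [[True] * n for _ in range(n)]


def format_mask(mask):
    """Render a mask as rows of 1s and 0s for quick inspection."""
    return "\n".join(" ".join("1" if allowed else "0" for allowed in row) for row in mask)


if __name__ == "__main__":
    print("causal:")
    print(format_mask(causal_mask(4)))
    print("bidirectional:")
    print(format_mask(bidirectional_mask(4)))
```

In the causal case each token sees only itself and earlier tokens, which is what next-token generation requires; in the bidirectional case every token sees the whole input, which suits producing a single embedding for it.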
Or, if you prefer, you can use the "Download Zip" button available through the main repository page. Downloading the project as a .ZIP file will keep the size of the ...