Baka4LLM. New horizons
Maykeye
Posted on May 16, 2024
Hello fairy dairy diary~!
Shelved the causal LLM for a while, moved to a new horizon...
...the masked LLM...
The reasoning is simple: the hypothesis is that it's easier to train an acceptable MLM on 16GB of VRAM, which can then be chained to itself.
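For reference, "MLM" here means the BERT-style masked language modeling objective. A minimal sketch of the usual recipe (15% of tokens become targets; of those, 80% are replaced with [MASK], 10% with a random token, 10% left alone), assuming PyTorch; `mask_token_id` and `vocab_size` are placeholders for whatever tokenizer ends up being used:

```python
import torch

def mask_tokens(input_ids, mask_token_id, vocab_size, mask_prob=0.15):
    """BERT-style masking: pick ~15% of positions as prediction targets,
    then corrupt them with the 80/10/10 rule."""
    labels = input_ids.clone()
    # Choose which positions become prediction targets.
    target = torch.rand(input_ids.shape) < mask_prob
    labels[~target] = -100  # ignored by PyTorch cross-entropy loss

    corrupted = input_ids.clone()
    # 80% of targets -> [MASK]
    to_mask = target & (torch.rand(input_ids.shape) < 0.8)
    corrupted[to_mask] = mask_token_id
    # Half of the remaining 20% of targets -> random token
    to_random = target & ~to_mask & (torch.rand(input_ids.shape) < 0.5)
    corrupted[to_random] = torch.randint(vocab_size, input_ids.shape)[to_random]
    # The final 10% of targets stay unchanged.
    return corrupted, labels
```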
For now, the points of reference are:
https://github.com/samvher/bert-for-laptops/blob/main/BERT_for_laptops.ipynb
https://arxiv.org/abs/2212.14034
There was also a paper that explored whether a decoder can be trained from an encoder-only model; IIRC they found it can, but I can't find it now and might be hallucinating it, so I'll look for it later.
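To unpack "chained to itself" from above: one way to use an MLM as a crude decoder is to append a [MASK] slot, fill it, and repeat. A hypothetical sketch, not actual Baka4LLM code, assuming `mlm` is a model that maps token IDs to logits over the vocabulary:

```python
import torch

@torch.no_grad()
def generate_left_to_right(mlm, input_ids, mask_token_id, n_new_tokens):
    """Chain an MLM to itself: append one [MASK], predict it,
    commit the prediction, and repeat. Greedy for simplicity."""
    ids = input_ids
    for _ in range(n_new_tokens):
        mask = torch.tensor([[mask_token_id]], dtype=ids.dtype)
        ids = torch.cat([ids, mask], dim=1)   # (1, seq_len + 1)
        logits = mlm(ids)                     # (1, seq_len + 1, vocab)
        ids[0, -1] = logits[0, -1].argmax()   # fill the mask greedily
    return ids
```

This is one token per forward pass, so it's slow; fancier schedules (filling several masks per step) exist, but the greedy loop shows the idea.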
For now, some pretraining is in order; then I'll decide what to do with it and how to update it!
Chill!
💖 💪 🙅 🚩