Researcher releases small Dutch language model based on Microsoft Phi – IT Pro – News

Researcher releases small Dutch language model based on Microsoft Phi – IT Pro – News
Researcher releases small Dutch language model based on Microsoft Phi – IT Pro – News
--

Bram Vanroy here, the creator of Fietje. Also the maker of the much more powerful GOAT Ultra.

To avoid a lot of negative comments: I feared that users would make the comparison with other, more powerful models (such as GOAT Ultra or Mixtrals or new Llamas or even ChatGPT). That’s understandable, but not Fietje’s goal at all! Fietje is 2.5x smaller than Ultra, and then you quickly reach the limits of what knowledge and skills you can cram into one model! I have tried to emphasize this in various descriptions, but I will say it again: the intention is NOT to capture a new state-of-the-art, that is simply not possible with this size. The goal has always been to have a small model that strikes a balance between going as small as possible and still being useful for some tasks, e.g. in edge devices and research in a resource-constrained setting where deploying a larger model is simply doesn’t work. Instead of building larger models (which is certainly useful), I have now focused first on improving accessibility to language technology. Of course, this is just a single step in the process, and we will continue to move forward with new projects and ideas. Maybe that will be a finetune of Mixtral, maybe of Llama 3, or maybe just another small phi-3!

I also asked the author to remove this sentence “but be just as good as a larger model” and add the following in the text for clarification: “Although Fietje performs almost as well as GEItje 7B Ultra in benchmarks, it is still less powerful in practice. It is therefore intended as a step towards running local LLMs, also on small devices such as a Raspberry PI or a telephone.” I also asked the author to replace the screenshot. It’s not surprising that Fietje doesn’t know what to do: there is no system message, so she doesn’t know that I made her, she doesn’t know how she was made/trained, so she thinks she has to pretend to be a journalist who likes to train (in the gym, for example). I think a nicer screenshot is this one, for example, which shows that Fietje can create a DnD character for you in JSON!

Finally: what I find very important is community building and transparency. That’s why my datasets, models, training code, training log are all public and available to everyone in the hope of motivating other researchers to be equally transparent, and work together on better language technology for Dutch.

If you have any questions about LLMs, you can always ask them below. Let’s make it a fun Q&A! I am also very interested in what you as a Tweakers community want: the bigger the better, or living on the edge? Let me know!

[Reactie gewijzigd door BramVroy op 1 mei 2024 00:50]

The article is in Dutch

Tags: Researcher releases small Dutch language model based Microsoft Phi Pro News

-

NEXT Children’s tablets Round-up – Tweakers