Hi all,
thanks for the great models, especially the German one!
I tried to fine-tune it, but it is pretty memory-hungry.
I think you could release truly SOTA models if you used, e.g., the GTE implementation of BERT:
https://huggingface.co/Alibaba-NLP/new-impl
This implementation is quite good in terms of learning, but above all in terms of memory efficiency! What do you think?
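Just to sketch what I mean: as far as I know, the new-impl code is loaded via `trust_remote_code` in checkpoints like `Alibaba-NLP/gte-multilingual-base` (that model name is just an example I picked, not yours), roughly like this:

```python
# Minimal sketch, only to illustrate the trust_remote_code loading path.
# The checkpoint name is an example; any model built on the new-impl code should work similarly.
from transformers import AutoModel, AutoTokenizer

model_name = "Alibaba-NLP/gte-multilingual-base"  # example checkpoint, not your model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)

texts = ["Das ist ein Beispielsatz.", "Noch ein Satz."]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

outputs = model(**batch)
embeddings = outputs.last_hidden_state[:, 0]  # CLS pooling, as shown in the GTE model cards (if I remember right)
```

So the integration on the user side stays simple; the gains I noticed were mainly in training memory.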
(Even training just a German model for testing purposes would be great, and I could give you feedback, haha! :-)
All the best
Aaron