Feat/954 llama cpp #1000
base: develop
Conversation
@davidberenstein1957: I have added a working version of llama-cpp support to generate embeddings. Will add more parameters as we have here.
…beddings should be normalized
- Added testcases to test normalize embeddings
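As an illustration of what the normalization option might do, here is a minimal sketch of L2-normalizing an embedding; the helper name and parameter are hypothetical, not the PR's actual API:

```python
# Hypothetical sketch: L2-normalize an embedding vector when requested.
# `maybe_normalize` and `normalize_embeddings` are illustrative names,
# not necessarily those used in the PR.
import numpy as np

def maybe_normalize(embedding: list[float], normalize_embeddings: bool) -> list[float]:
    if not normalize_embeddings:
        return embedding
    vec = np.asarray(embedding, dtype=np.float32)
    norm = np.linalg.norm(vec)
    if norm == 0.0:
        return embedding  # avoid dividing an all-zero vector by zero
    return (vec / norm).tolist()
```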
@davidberenstein1957: May I request you to review the changes? In addition, I couldn't add the documentation to
Looking good already. I left some initial comments. Have you already seen #999?
Accept recommended suggestion Co-authored-by: David Berenstein <[email protected]>
- Incorporated changes suggested in review comments.
- Use atexit to forcefully invoke cleanup (see the sketch below)
- Add test_encode_batch_consistency to ensure consistent results
- Test large batch processing capability
- Verify embedding dimensions and count for different batch sizes
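The atexit-based cleanup could look roughly like this; a minimal sketch assuming `load`/`unload` methods, which may not match the PR's exact names:

```python
# Sketch: register unload() with atexit so the llama.cpp model is
# released even if the caller never cleans up explicitly.
# Method names are assumptions based on the commit message.
import atexit

class LlamaCppEmbeddings:
    def load(self) -> None:
        self._model = ...  # load the llama.cpp model here
        atexit.register(self.unload)

    def unload(self) -> None:
        # Idempotent, so the atexit hook is safe after a manual unload.
        self._model = None
```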
Hi Bikash. That would be great. Do you think we can inherit the same / similar logic using a mixin? Perhaps we can take a similar approach for vLLM too? Not entirely sure about potential edge cases, but feel free to explore it. Also, it might be cleaner to create a separate PR/issue and tackle them there. Regards, David

On 1 Oct 2024, at 00:29, bikash119 wrote:
@bikash119 commented on this pull request.
In src/distilabel/embeddings/llamacpp.py:
+        if self.hub_repository_id is not None:
+            try:
+                from huggingface_hub.utils import validate_repo_id
+
+                validate_repo_id(self.hub_repository_id)
+            except ImportError as ie:
+                raise ImportError(
+                    "Llama.from_pretrained requires the huggingface-hub package. "
+                    "You can install it with `pip install huggingface-hub`."
+                ) from ie
+            try:
+                self._logger.info(
+                    f"Attempting to load model from Hugging Face Hub: {self.hub_repository_id}"
+                )
+                self._model = _LlamaCpp.from_pretrained(
+                    repo_id=self.hub_repository_id,
+                    filename=self.model,
+                    verbose=self.verbose,
+                    embedding=True,
+                    **self.extra_kwargs,
+                )
+                self._logger.info("Model loaded successfully from Hugging Face Hub")
+            except Exception as e:
+                self._logger.error(
+                    f"Failed to load model from Hugging Face Hub: {str(e)}"
+                )
+                raise
@davidberenstein1957: Should I capture this logic in the original LLM class too?
cc : @gabrielmbmb
@davidberenstein1957: Ok, let me create a different PR to handle the scenario. I will keep this PR untouched, and once we have the scenario handled using mixins, I will remove the token handling code from
Hi @davidberenstein1957: I have made the changes to make model downloads from the Hub reusable through a mixin. Please share your feedback. If this looks good, I will create another PR to handle the model download in the llamacpp LLM class too.
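A rough sketch of how such a mixin could factor out the Hub download logic (class and method names here are assumptions, not the PR's actual implementation):

```python
# Hypothetical mixin that centralizes downloading a GGUF model file
# from the Hugging Face Hub; names are illustrative only.
from huggingface_hub import hf_hub_download

class HubModelDownloadMixin:
    hub_repository_id: str | None = None
    model: str = ""

    def download_model(self) -> str:
        """Return a local path to the model, downloading from the Hub if needed."""
        if self.hub_repository_id is None:
            return self.model  # assume `model` is already a local path
        return hf_hub_download(repo_id=self.hub_repository_id, filename=self.model)
```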
Don't forget to update the docs and API references :)
I have included the LlamaCppEmbeddings class to
…r --cpu-only. `pytest tests/unit/embeddings/test_llamacpp.py --cpu-only` will generate embeddings using CPU; `pytest tests/unit/embeddings/test_llamacpp.py` will generate embeddings using GPU.
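A common way to support such a `--cpu-only` switch is a custom pytest option in `conftest.py`; the following is a sketch of that general pattern, not necessarily the PR's exact implementation:

```python
# conftest.py -- sketch of a custom --cpu-only pytest option.
# The `cpu_only` fixture name is an assumption.
import pytest

def pytest_addoption(parser):
    parser.addoption(
        "--cpu-only",
        action="store_true",
        default=False,
        help="Generate embeddings on CPU instead of GPU.",
    )

@pytest.fixture
def cpu_only(request) -> bool:
    return request.config.getoption("--cpu-only")
```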
Hi @davidberenstein1957: Thank you for taking the time to review my work. I have tried to incorporate your feedback and made a few improvements.
Thanks for the updates. Left some minor comments. :)
- model (name of the model): the model used to generate embeddings.
from llama_cpp import Llama

class LlamaCppEmbeddings(Embeddings, CudaDevicePlacementMixin):
aren't some of the attributes already present in the parent class?
try except block is not needed. Co-authored-by: David Berenstein <[email protected]>
Co-authored-by: David Berenstein <[email protected]>
hidden attributes shouldn't be documented. Co-authored-by: David Berenstein <[email protected]>
…into feat/954_llama-cpp
This PR adds llama-cpp support to create embeddings.
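Based on the snippets above, usage might look roughly like this; the repo id, filename, and the `load`/`encode`/`unload` calls are assumptions following distilabel's general Embeddings interface:

```python
# Hypothetical usage sketch; the model file and repository are placeholders.
from distilabel.embeddings import LlamaCppEmbeddings

embeddings = LlamaCppEmbeddings(
    model="all-MiniLM-L6-v2.Q2_K.gguf",  # GGUF filename (placeholder)
    hub_repository_id="example-org/example-embedding-gguf",  # placeholder repo
)
embeddings.load()
results = embeddings.encode(inputs=["distilabel is awesome!"])
print(len(results), len(results[0]))  # number of inputs, embedding dimension
embeddings.unload()
```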