Update model hub #829

Merged
hahuyhoang411 merged 24 commits into main from update-model-hub
Dec 4, 2023

hahuyhoang411
Contributor

@hahuyhoang411 hahuyhoang411 commented Dec 4, 2023

Motivation:
I want to select only the best of the best models, instead of the 4-5 best models for each hardware range -> optimize for a plug-and-play experience.

TODOs:

GOAL:

  • Update model.json
    • update models
    • update parameters

@hahuyhoang411
Contributor Author

Moved the HackMD evaluation questions here for ease of access: https://hackmd.io/hhn9VFhGSD27_8LWMgAi6w

@tikikun
Contributor

tikikun commented Dec 4, 2023

Current feedback on the nightly models:

  • Too many 33b and 70b models; these models also need to be at the bottom of the list
  • Too few 7-13b models, which is the most usable range for most systems (for the desktop app)
  • We need a "pinned" flag or equivalent in the metadata to highlight the best models; right now the first page shows models that are not desirable to use (1.5B, 3B, and 70b on the first page)
  • Model names must be descriptive because we list one entry per quantization per model: NeuralHermes 7b, but at which size?
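
To make the last two points concrete, here is a minimal Python sketch of how a hub entry could carry a pinned flag and produce a descriptive per-quantization name. Everything in it is an assumption for illustration: the `ModelEntry` fields, the `Q4_K_M` label, and the ordering rule (pinned first, then the 7-13b range, then the rest by size) are hypothetical and are not taken from the actual model.json schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical hub entry; field names are illustrative only."""
    family: str          # e.g. "NeuralHermes"
    params_b: float      # parameter count in billions
    quantization: str    # illustrative quantization label
    pinned: bool = False  # hypothetical "pinned" flag from the feedback


def display_name(entry: ModelEntry) -> str:
    """Build a name that states both the model size and the quantization."""
    return f"{entry.family} {entry.params_b:g}B {entry.quantization}"


def first_page(entries: list[ModelEntry], page_size: int = 5) -> list[ModelEntry]:
    """Order pinned entries first, then the 7-13B range, then the rest by size."""
    def key(entry: ModelEntry) -> tuple[bool, bool, float]:
        in_usable_range = 7 <= entry.params_b <= 13
        return (not entry.pinned, not in_usable_range, entry.params_b)

    return sorted(entries, key=key)[:page_size]
```

With this ordering, a 70b or 1.5b entry only reaches the first page once the pinned and 7-13b entries are exhausted.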

@tikikun
Contributor

tikikun commented Dec 4, 2023

We need to add the following tags:
Recommended -> Small -> Medium -> Tiny -> Large

The tags are listed in the order in which models should appear on the download page.
Ref:
#839
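
The tag ordering above could be applied roughly as follows. This is only a sketch under assumptions: it supposes each model is a dict with a `tags` list, which is a hypothetical shape rather than the confirmed model.json layout, and it places untagged models last.

```python
# Tag order as proposed in the comment above.
TAG_ORDER = ["Recommended", "Small", "Medium", "Tiny", "Large"]
TAG_RANK = {tag: index for index, tag in enumerate(TAG_ORDER)}


def sort_for_download_page(models: list[dict]) -> list[dict]:
    """Sort models by their best-ranked tag; untagged models go last.

    Assumes each model is a dict with an optional "tags" list (hypothetical).
    sorted() is stable, so models sharing a rank keep their input order.
    """
    def rank(model: dict) -> int:
        ranks = [TAG_RANK[tag] for tag in model.get("tags", []) if tag in TAG_RANK]
        return min(ranks, default=len(TAG_ORDER))

    return sorted(models, key=rank)
```

A model tagged both Recommended and Large is ranked by Recommended, so it stays at the top of the page.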

@tikikun
Contributor

tikikun commented Dec 4, 2023

@hahuyhoang411
Contributor Author

hahuyhoang411 commented Dec 4, 2023

On Nitro, Deepseek 1.3b still works. I have no idea why it is not working on Jan.

Example result when using Deepseek 1.3b in Nitro:

{
  "choices": [
    {
      "finish_reason": null,
      "index": 0,
      "message": {
        "content": "<jupyter_code>\n# Fibonacci series implementation using recursion\ndef fib(n):\n    if n <= 1:\n        return n\n    else:\n        return fib(n-1) + fib(n-2)\n\n# Driver code\nprint(fib(8))\n<jupyter_output>\n<empty_output>\n",
        "role": "assistant"
      }
    }
  ],
  "created": 1701687606,
  "id": "qeKmj6lya0rTugXDbfxt",
  "model": "_",
  "object": "chat.completion",
  "system_fingerprint": "_",
  "usage": {
    "completion_tokens": 86,
    "prompt_tokens": 81,
    "total_tokens": 167
  }
}
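
When comparing Nitro and Jan output, a response of this shape can be checked quickly with a few lines of Python. This is a sketch only: `summarize_completion` is a hypothetical helper name, and it assumes the payload has exactly the fields shown above.

```python
import json


def summarize_completion(raw: str) -> dict:
    """Extract the fields worth comparing from a chat.completion payload."""
    data = json.loads(raw)
    choice = data["choices"][0]
    usage = data["usage"]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # Sanity check: prompt and completion tokens should add up to the total.
        "usage_consistent": (
            usage["prompt_tokens"] + usage["completion_tokens"]
            == usage["total_tokens"]
        ),
    }
```

For the response above, the usage figures are consistent (81 + 86 = 167) and `finish_reason` is null, which `json.loads` maps to `None`.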

Image:
image

Contributor

@louis-jan louis-jan left a comment

LGTM

@hahuyhoang411 hahuyhoang411 merged commit b475a6f into main Dec 4, 2023
@hahuyhoang411 hahuyhoang411 deleted the update-model-hub branch December 4, 2023 14:57
3 participants