fix: 4171 - Model loading gets stuck on stop #4177
Merged
Describe Your Changes
This PR updates the cortex.cpp version to address a couple of issues, including rounded float values and stopping the model after loading.
It also cancels the pending model start request, since there is a known server-side issue where the model cannot be stopped mid-load.
I've noticed that enabling cont_batching and increasing the number of parallel operations can slightly improve the user experience, especially when the model is still processing generation requests in the background (there is an issue where the server can't stop a running inference). It is therefore better to expose these settings in cortex for advanced usage by LLM enthusiasts. More options and settings UX enhancements will follow; this is the first step of mapping engine parameters into Jan Settings.
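For readers unfamiliar with the pattern, cancelling a pending model start request can be sketched roughly as below. This is only an illustration under assumptions, not the code in this PR: `ModelLoader`, `FetchLike`, the `loadUrl` parameter, and the request body shape are hypothetical names introduced here. The only real API relied on is the standard `AbortController` / `AbortSignal`.

```typescript
// Hypothetical shape of the function that sends the load request.
// In a real extension this would typically be fetch or a wrapper around it.
type FetchLike = (
  url: string,
  init: { method: string; body: string; signal: AbortSignal }
) => Promise<unknown>;

// Hypothetical helper: tracks one AbortController per in-flight model load.
class ModelLoader {
  private pending = new Map<string, AbortController>();
  private fetchImpl: FetchLike;
  private loadUrl: string;

  constructor(fetchImpl: FetchLike, loadUrl: string) {
    this.fetchImpl = fetchImpl;
    this.loadUrl = loadUrl; // assumed endpoint, supplied by the caller
  }

  // Starts a load and remembers its controller so it can be cancelled later.
  async load(modelId: string): Promise<unknown> {
    const controller = new AbortController();
    this.pending.set(modelId, controller);
    try {
      return await this.fetchImpl(this.loadUrl, {
        method: 'POST',
        body: JSON.stringify({ model: modelId }),
        signal: controller.signal,
      });
    } finally {
      // Clear the entry only if it still belongs to this call.
      if (this.pending.get(modelId) === controller) {
        this.pending.delete(modelId);
      }
    }
  }

  // Aborts the pending load, if any. Returns true when something was cancelled.
  cancel(modelId: string): boolean {
    const controller = this.pending.get(modelId);
    if (!controller) return false;
    controller.abort();
    this.pending.delete(modelId);
    return true;
  }
}
```

With this shape, a "stop" action during loading would call `cancel(modelId)` on the client side instead of waiting for the server to accept a stop request mid-load.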
Fixes Issues
Changes made
The code changes include:
Version Update: version.txt is updated from 1.0.4-rc4 to 1.0.4-rc5.
Settings Update: default_settings.json has been modified to include new settings (cont_batching, caching_enabled, cache_type, use_mmap) and updated descriptions and default values for various settings. The placeholder and value fields are also adjusted.
Refactoring Constants: In rollup.config.ts and global.d.ts, DEFAULT_SETTINGS is renamed to SETTINGS.
New Enum for Settings: Settings is created in index.ts to manage different available settings.
Settings Handling: changes to JanInferenceCortexExtension for default settings values, to onSettingUpdate to handle updates to settings, and to loadModel.
Abort Controller:
Enhancements in UI: In the ErrorMessage, LoadModelError, and TextMessage components, CSS classes are adjusted for capitalization and overflow behavior to improve UI consistency and presentation.
These changes encompass enhancements in functionality (settings management, abort controllers) and user interface adjustments.
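As a rough illustration of the settings handling described above, the sketch below shows one way an enum of setting keys and an `onSettingUpdate` handler could fit together. It is not the actual extension code: the `InferenceSettings` class, the `toLoadParams` method, the default values, and the type checks are assumptions made for this example. Only the setting names and the `Settings` / `onSettingUpdate` identifiers come from the description.

```typescript
// Setting keys taken from the names listed in this PR description.
enum Settings {
  cont_batching = 'cont_batching',
  caching_enabled = 'caching_enabled',
  cache_type = 'cache_type',
  use_mmap = 'use_mmap',
}

// Keep the previous value when the incoming one has the wrong type.
function asBool(value: unknown, fallback: boolean): boolean {
  return typeof value === 'boolean' ? value : fallback;
}

// Hypothetical stand-in for the extension class. Defaults are assumed values.
class InferenceSettings {
  cont_batching = true;
  caching_enabled = true;
  cache_type = 'f16';
  use_mmap = true;

  // Called whenever a single setting changes in the UI.
  onSettingUpdate(key: string, value: unknown): void {
    switch (key) {
      case Settings.cont_batching:
        this.cont_batching = asBool(value, this.cont_batching);
        break;
      case Settings.caching_enabled:
        this.caching_enabled = asBool(value, this.caching_enabled);
        break;
      case Settings.cache_type:
        if (typeof value === 'string') this.cache_type = value;
        break;
      case Settings.use_mmap:
        this.use_mmap = asBool(value, this.use_mmap);
        break;
      default:
        // Unknown keys are ignored in this sketch.
        break;
    }
  }

  // Hypothetical helper: the values that would accompany a model load request.
  toLoadParams(): Record<string, boolean | string> {
    return {
      cont_batching: this.cont_batching,
      caching_enabled: this.caching_enabled,
      cache_type: this.cache_type,
      use_mmap: this.use_mmap,
    };
  }
}
```

Keeping the keys in one enum means the settings file, the update handler, and the load parameters all refer to the same names, so a typo in a key shows up at compile time rather than as a silently ignored setting.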