Git repositories added to config.toml do not appear to be indexed #3692

Open
Jreuningschererhubbell opened this issue Jan 14, 2025 · 5 comments
Labels
bug Something isn't working fixed-in-next-release

Comments

@Jreuningschererhubbell

Overview

Describe the bug
Git repositories added to config.toml do not appear to be indexed. They are read properly from the config file, but they are not added to the database or indexed. This seems related to #3163.

Information about your version
Most testing was done with a local build of branch r0.22

target/debug/tabby --version
tabby 0.22.0

I also tested other versions. I built locally using commit 5aa27b5

target/debug/tabby --version
tabby 0.24.0-dev.0

I also built locally using branch r0.23

target/debug/tabby --version
tabby 0.23.0

Information about your GPU
Apple M2 Max

Details

Detailed description
When repositories are added to config.toml, they are not added to the database. They do not seem to be indexed either. When I went through the source, I could not find where the repositories from the config are added.

Looking at previous versions of Tabby (like 0.11), it seems this process was handled by the scheduler in the past (see commit da02d47). However, I can't find where that functionality went in newer releases.

It also seems like names in the config.toml are not parsed anywhere. The RepositoryConfig struct only has one field:

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct RepositoryConfig {
    git_url: String,
}

Steps to reproduce
The outputs shown here were produced with version 0.24.0-dev.0, but the results are similar on other versions.

  1. Pull a clean instance of Tabby and delete the entire ~/.tabby directory
  2. Build with cargo build
  3. Start Tabby: target/debug/tabby serve
  4. Perform basic setup of admin user
  5. Stop Tabby with Ctrl+c
  6. Populate config.toml, as listed below
  7. Restart Tabby: target/debug/tabby serve
  8. Wait 5 minutes. Click around the UI, try to access the code repositories, check the context providers. I'm not sure exactly what I did, but eventually...
  9. Check for error messages in the console. This did not occur every time.
2025-01-14T21:00:01.113684Z  WARN tabby_webserver::service::background_job: ee/tabby-webserver/src/service/background_job/mod.rs:133: Database maintainance failed: Other(Failed to run db maintenance job:
Failed to read active sources: failed to resolve path '/Users/ME/.tabby/repositories/https_github.com_TabbyML_tabby': No such file or directory; class=Os (2); code=NotFound (-3))
2025-01-14T21:00:01.114081Z DEBUG tabby_webserver::service::background_job::third_party_integration: ee/tabby-webserver/src/service/background_job/third_party_integration.rs:57: Syncing all github and gitlab repositories
2025-01-14T21:00:01.119021Z  WARN tabby_webserver::service::background_job: ee/tabby-webserver/src/service/background_job/mod.rs:149: Index garbage collection job failed: Other(failed to resolve path '/Users/ME/.tabby/repositories/https_github.com_TabbyML_tabby': No such file or directory; class=Os (2); code=NotFound (-3))
  10. Confirm the following:
    10.1. Use a SQL viewer to confirm that there are no entries in the repositories table of dev-db.sqlite
    10.2. Use a SQL viewer to confirm that there are no relevant entries in the job_runs table of the database
    10.3. Check that there are no repositories or index data stored in ~/.tabby
  11. Add another repository using the UI, to confirm the system works.
  12. Confirm the following:
    12.1. Use a SQL viewer to confirm the second repo was added to the database.
    12.2. Check that the console messages confirm the job was successful
2025-01-14T21:20:06.763234Z DEBUG tabby_webserver::service::background_job: ee/tabby-webserver/src/service/background_job/mod.rs:97: Background job 1 started, command: {"SchedulerGitRepository":{"git_url":"https://github.com/TabbyML/interview-questions.git","source_id":"git:E16n1q"}}
Cloning into '/Users/ME/.tabby/repositories/https_github.com_TabbyML_interview-questions'
remote: Enumerating objects: 77, done.
remote: Counting objects: 100% (77/77), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 77 (delta 27), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (77/77), 18.95 KiB | 3.79 MiB/s, done.
Resolving deltas: 100% (27/27), done.
2025-01-14T21:20:16.077941Z DEBUG tabby::services::tantivy: crates/tabby/src/services/tantivy.rs:33: Index is ready, enabling search...
2025-01-14T21:20:18.292882Z DEBUG tabby_webserver::service::background_job: ee/tabby-webserver/src/service/background_job/mod.rs:129: Background job 1 completed
  13. For some reason, the second repo does not appear in the code viewer UI, but I think that is a separate issue.
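
The clone paths in the logs above suggest how a git_url maps to a directory under ~/.tabby/repositories, which helps when checking by hand whether a configured repository was ever cloned. The following is a minimal sketch that reproduces the two mappings visible in the logs. The function name repo_dir_name is hypothetical, and the rule itself (drop a trailing ".git", keep letters, digits, "." and "-", collapse everything else into a single "_") is inferred from those two paths only; it is not taken from Tabby's source and may differ from the real implementation.

```rust
/// Hypothetical helper (not Tabby's actual code): derive the clone directory
/// name for a git URL, inferred from the paths that appear in the logs.
fn repo_dir_name(git_url: &str) -> String {
    // The logged paths have no ".git" suffix, so drop it if present.
    let trimmed = git_url.strip_suffix(".git").unwrap_or(git_url);
    let mut out = String::with_capacity(trimmed.len());
    for c in trimmed.chars() {
        if c.is_ascii_alphanumeric() || c == '.' || c == '-' {
            out.push(c);
        } else if !out.ends_with('_') {
            // Collapse runs such as "://" into a single underscore.
            out.push('_');
        }
    }
    out
}
```

With this rule, "https://github.com/TabbyML/tabby.git" maps to "https_github.com_TabbyML_tabby", the directory the maintenance job failed to resolve in step 9.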

config.toml
This is the entire config.toml I used.

[[repositories]]
name = "tabby"
git_url = "https://github.com/TabbyML/tabby.git"
@Jreuningschererhubbell
Author

I did some more digging and found that ignoring the names is intentional (commit 71a7230). Disregard my previous note that the names are not parsed.

@wsxiaoys
Member

wsxiaoys commented Jan 15, 2025

It's likely that the URL has not been indexed yet due to a cron job that synchronizes the config.toml repositories at hourly intervals. This could be the source of the issue.
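
To judge how long to wait before concluding that the sync never ran, it helps to know when the next hourly run is due. The sketch below assumes the job fires at the top of each hour; that is only a guess based on the 21:00:01 timestamps in the reported logs, and the real schedule may be different. The function name secs_until_next_hour is hypothetical.

```rust
/// Hypothetical helper: seconds from `now_unix_secs` until the next
/// top-of-the-hour boundary, assuming the sync job runs exactly on the hour.
fn secs_until_next_hour(now_unix_secs: u64) -> u64 {
    const HOUR: u64 = 3600;
    // Exactly on the hour, the next run is a full hour away.
    HOUR - (now_unix_secs % HOUR)
}
```

Under that assumption, a repository added to config.toml should be picked up within at most one hour of the server restart, so a wait of several hours with no clone points to a bug rather than scheduling.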

To quickly test this, you can set up a local git repository and examine it using the code browser. For example, configure your config.toml like this:

[[repositories]]
git_url = "file:///Users/meng/Projects/tabby"

@Jreuningschererhubbell
Author

Jreuningschererhubbell commented Jan 15, 2025

Hi, thanks for the quick response!

I tried your suggestion of adding a local repository and that worked properly: I was able to add the repo to the config and access it in the code browser. However, when I added a remote repository, the issue reappeared.

To check whether the issue was related to the hourly interval, I left the server running for about 9 hours. At the end of that period, there were no files (including those from the local repo) in the code browser. The console logs are in the attached file and my config.toml is shown below.

tabby_path_failure.log

[model.completion.local]
model_id = "StarCoder2-3B"
parallelism = 4

[model.chat.local]
model_id = "Qwen2.5-Coder-1.5B-Instruct"
parallelism = 4

[model.embedding.local]
model_id = "Nomic-Embed-Text"
parallelism = 4

[[repositories]]
git_url = "file:///Users/ME/dump/tabbyDump"

[[repositories]]
git_url = "https://github.com/TabbyML/tabby.git"

@wsxiaoys
Member

Thanks for the detailed information. This is indeed a bug; fixing it in #3703.

@Jreuningschererhubbell
Author

Just pulled the branch associated with the bug-fix, and it works on my machine!
