Hi @mufeedvh, thank you for a very nice application.
I'm running into a problem with large code repositories. For small repositories, code2prompt works great, but for large ones the generated prompt overflows the LLM's token limit.
So how should we deal with large code repositories? Sending only part of the source code loses context, and right now it seems only Gemini 1.5 Pro can handle around 2M tokens, which is the upper limit.
Can code2prompt be tuned for large code repositories, or do you have any good suggestions?
Hi @LZING
I don't really have a solution for your problem, but I have a couple of observations from my experience:

- Including a large code repository makes the resulting prompt very large, which many LLMs do not support. Even when they do, the quality of the output may not meet your expectations.
- Full context of the code is not needed for every use case. For example, templates like find-security-vulnerabilities, or those related to GitHub commits and GitHub pull requests, need only a part of your code.
- You can use arguments like `--exclude` and `--include` to reduce the amount of code you send to code2prompt. If your requirements are more complex than that, you can write a pre-processing script that fetches the required files/folders into a temp folder, which in turn can be passed to code2prompt.
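The pre-processing idea above could be sketched roughly like this. This is a hypothetical helper (not part of code2prompt): `collect_subset` and its glob patterns are assumptions for illustration, and the final code2prompt invocation is shown only as a comment.

```python
# Hypothetical pre-processing sketch: copy only the files relevant to your
# question into a fresh temp folder, then point code2prompt at that folder.
import shutil
import tempfile
from pathlib import Path

def collect_subset(repo_root, patterns):
    """Copy files matching any of the glob patterns into a new temp dir,
    preserving their relative paths, and return that directory."""
    dest = Path(tempfile.mkdtemp(prefix="c2p_subset_"))
    root = Path(repo_root)
    for pattern in patterns:
        for src in root.rglob(pattern):
            if src.is_file():
                target = dest / src.relative_to(root)
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, target)
    return dest

# Example (assumed file layout):
#   subset = collect_subset("my-repo", ["*.rs", "Cargo.toml"])
#   then run: code2prompt <subset-dir>
```

For simpler cases, `--include`/`--exclude` alone should be enough; a script like this only pays off when the selection logic (e.g. following module dependencies) is too complex for glob patterns.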