Finetuning LLM models with PDF documents in h2o-llmstudio #719
Hi, Thanks.
Replies: 1 comment
Hi,
you would first need to generate input/output pairs from your documents. I already answered a similar question here, which might be helpful:
#522
If you just want to do next-token training on the text of your PDFs, you would need to convert it to a CSV file with the raw text and follow this: https://docs.h2o.ai/h2o-llmstudio/faqs#what-if-my-data-is-not-in-question-and-answer-form-and-i-just-have-documents-how-can-i-fine-tune-the-llm-model
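
For the raw-text route, here is a rough sketch of how the PDF-to-CSV step could look in Python. It is not taken from the LLM Studio docs: the function names, the `text` column name, and the 2000-character chunk size are all placeholders I made up for illustration, and the extraction helper assumes the third-party `pypdf` package is installed. Check the linked FAQ for which column to select when you create the experiment.

```python
import csv
from pathlib import Path
from typing import Iterable, List


def extract_text_with_pypdf(pdf_path: Path) -> str:
    """Extract the text of one PDF (assumes the third-party pypdf package)."""
    from pypdf import PdfReader  # lazy import: the rest works without pypdf

    reader = PdfReader(str(pdf_path))
    # extract_text() may return an empty string, e.g. for scanned pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def split_into_chunks(text: str, max_chars: int = 2000) -> List[str]:
    """Greedily pack paragraphs into chunks of about max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        if current and len(current) + len(paragraph) + 2 > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)
    return chunks


def write_raw_text_csv(
    texts: Iterable[str],
    csv_path: Path,
    column: str = "text",  # hypothetical column name, pick your own
    max_chars: int = 2000,  # arbitrary chunk size, tune for your model
) -> int:
    """Write one CSV row per text chunk and return the number of rows written."""
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([column])
        for text in texts:
            for chunk in split_into_chunks(text, max_chars):
                writer.writerow([chunk])
                rows += 1
    return rows
```

You could then call something like `write_raw_text_csv((extract_text_with_pypdf(p) for p in Path("pdfs").glob("*.pdf")), Path("train.csv"))`, where `pdfs` and `train.csv` are example paths, and import the resulting CSV as your dataset. Text extracted from PDFs is often noisy (headers, page numbers, broken line wraps), so it is worth looking at a few rows before training.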