OpenAI Finetuning error

Question

Hello, I want to fine-tune GPT-3 on my own data. After preprocessing the data I tried to train the model (i.e. start the fine-tuning job), but I get an error. The error message is: 'Training data validation failed: each of the classes must start with a different token. You can view your class tokenizations at https://platform.openai.com/tokenizer?view=bpe.' The data format is as follows: {"prompt":"Big: 382 Mid: 1039 Small: 3417 Problem: blahblah blahblah ### ","completion":" 5032"} The example I used for reference is:
https://github.com/openai/openai-cookbook/blob/main/examples/azure/finetuning.ipynb Please help me.

Answer

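The error message points to a problem with the completions in your training data rather than with the prompts. The validator is treating each distinct completion as a class, and it requires every class to start with a different token. This most likely matters because, in a classification-style fine-tune, classes are told apart by the first token of the completion. A numeric label such as " 5032" is usually split into several tokens, so two different labels (for example " 5032" and " 5033") can begin with the same token even though the full strings differ.

To fix this, make sure each class starts with a different token. You can paste your completions into the tokenizer linked in the error message (https://platform.openai.com/tokenizer?view=bpe) to see how each one is split and which labels share a first token. If some do, one option is to map the classes to labels that each begin with a distinct token, ideally labels that are a single token, and convert back to the original IDs after prediction.

It is also worth checking that your preprocessing is consistent: every prompt should end with the same separator (" ### " in your example) and every completion should start with a single space, as yours does. If the error persists, the OpenAI documentation and community forums may have more specific guidance for your setup. Good luck!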
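As a rough illustration of the first-token check, here is a minimal Python sketch that groups the distinct completions in a JSONL training file by their first token and reports any groups containing more than one class. The function name `first_token_collisions` and the `toy_encode` tokenizer are hypothetical and exist only for this example; `toy_encode` is NOT the real GPT-3 tokenizer, it just cuts text into three-character chunks so the example runs without extra dependencies. The real validator may behave differently.

```python
import json
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def first_token_collisions(
    jsonl_lines: Iterable[str],
    encode: Callable[[str], List[int]],
) -> Dict[int, List[str]]:
    """Return {first_token_id: [completions]} for tokens shared by 2+ classes."""
    groups = defaultdict(set)
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        completion = json.loads(raw)["completion"]
        token_ids = encode(completion)
        if token_ids:
            groups[token_ids[0]].add(completion)
    # Keep only first tokens that more than one distinct class starts with.
    return {tok: sorted(labels) for tok, labels in groups.items() if len(labels) > 1}


def toy_encode(text: str) -> List[int]:
    """Stand-in tokenizer (assumption, for demonstration only):
    splits text into 3-character chunks and gives each chunk a numeric ID."""
    chunks = [text[i:i + 3] for i in range(0, len(text), 3)]
    return [int.from_bytes(chunk.encode("utf-8"), "big") for chunk in chunks]


# Hypothetical training rows in the same shape as the question's data.
sample_lines = [
    json.dumps({"prompt": "Problem: example ### ", "completion": label})
    for label in [" 5032", " 5033", " 617"]
]

collisions = first_token_collisions(sample_lines, toy_encode)
for labels in collisions.values():
    print("Classes sharing a first token:", labels)
```

To check real data, you would pass an actual tokenizer in place of `toy_encode`. For example, assuming the `tiktoken` package is installed and that `r50k_base` is the right encoding for your base model, `tiktoken.get_encoding("r50k_base").encode` could be used as the `encode` argument, with the lines of your training file as `jsonl_lines`.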
