Leaderboard README improvements #217
Conversation
Suggested change:
- gen_suffix=generations_$task\_$model.json
+ gen_suffix=generations_$task\_$model\_$task.json
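For context, the two `gen_suffix` forms expand to different filenames. A minimal Python sketch (the task and model values are illustrative, not from the PR):

```python
# Illustrative values; any task/model run through the harness would do.
task = "humaneval"
model = "Artigenz-Coder-DS-6.7B"

# Original gen_suffix from the README:
original = f"generations_{task}_{model}.json"
# Proposed gen_suffix, with the task name repeated at the end:
proposed = f"generations_{task}_{model}_{task}.json"

print(original)  # generations_humaneval_Artigenz-Coder-DS-6.7B.json
print(proposed)  # generations_humaneval_Artigenz-Coder-DS-6.7B_humaneval.json
```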
we don't need to have the same path format as here
bigcode-evaluation-harness/main.py
Line 387 in 094c7cc
because during evaluation we call --load_generations_path, which can be anything. So let's maybe keep the original path, to not have the task twice?
Right, however the current README steps for Evaluation pass the $gen_suffix variable to the --load_generations_path argument:
bigcode-evaluation-harness/leaderboard/README.md
Lines 114 to 121 in 642c57f
Since $gen_suffix is missing the _$task suffix, running evaluations results in an error.
After adding the _$task suffix to $gen_suffix, evaluations run successfully.
I was able to run evaluations for Artigenz-Coder-DS-6.7B here after these changes.
It shouldn't throw an error if you used save_generations_path=generations_$task\_$model.json in the generations step.
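The disagreement above comes down to where the extra _$task comes from. A sketch of the suffixing behaviour being discussed (the exact code in main.py may differ; `append_task_suffix` is a hypothetical helper, not a harness function):

```python
import os

def append_task_suffix(save_generations_path: str, task: str) -> str:
    """Sketch: insert the task name before the file extension,
    mirroring how the harness names the per-task generations file."""
    base, ext = os.path.splitext(save_generations_path)
    return f"{base}_{task}{ext}"

# If the path passed via save_generations_path already contains the
# task once, the file on disk ends up with the task name twice:
saved = append_task_suffix("generations_humaneval_mymodel.json", "humaneval")
print(saved)  # generations_humaneval_mymodel_humaneval.json
```

Under this assumption, both comments are consistent: the saved file carries a doubled task name, so a load path built for evaluation has to include that trailing _$task.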
While trying to run the steps given in the leaderboard README, I found the following improvements:

1 - Setup
- The model variable needs to be initialised before creating the generations and metrics directories.

2 - Generations
- The save_generations flag is missing while running generations.
- max_length should be 1024 for some tasks, depending on your tokeniser (Fix for max_length_generation parameter #207).

3 - Evaluations
- The generations file is saved at save_generations_path with _$task appended, so evaluations should load from that path (_$task is missing in the path in the README).
bigcode-evaluation-harness/main.py
Line 387 in 094c7cc
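Point 1 above can be sketched as follows: the directory names depend on the model variable, so it must be set first (the directory naming scheme here is illustrative, not copied from the README):

```python
import os
import tempfile

model = "mymodel"  # must be initialised before the directories are created

# Create per-model generations and metrics directories (under a temp
# root here, so the sketch has no side effects on the working tree).
root = tempfile.mkdtemp()
for name in (f"generations_{model}", f"metrics_{model}"):
    os.makedirs(os.path.join(root, name), exist_ok=True)

print(sorted(os.listdir(root)))  # ['generations_mymodel', 'metrics_mymodel']
```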