Natural language generation evaluation metrics

A collection of methods for measuring the quality of generated text.

Arrange the file to be evaluated in the following format:

[
  {"ref": str, "hyps": [str, str, ...]},
  {...},
  ...
]
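
As a rough sketch, a file in this format could be produced with Python's standard json module. The sample sentences and the helper names write_input and validate are illustrative placeholders, not part of this project.

```python
import json

# Each record pairs one reference with one or more hypotheses.
# The sentences below are illustrative placeholders.
records = [
    {"ref": "the cat sat on the mat",
     "hyps": ["a cat sat on the mat", "the cat is on the mat"]},
    {"ref": "it is raining today", "hyps": ["it rains today"]},
]

def validate(data):
    """Check that every record has a string 'ref' and a list of string 'hyps'."""
    return all(
        isinstance(r.get("ref"), str)
        and isinstance(r.get("hyps"), list)
        and all(isinstance(h, str) for h in r["hyps"])
        for r in data
    )

def write_input(path, data):
    """Write records as a JSON list; ensure_ascii=False keeps Chinese text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
```

Calling write_input("input.json", records) would then create a file that can be passed as --input.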

Usage (see also the example in run.sh):

python run.py --input=input_path --output=output_path --metrics="['rouge-1', 'bleu', 'self-bleu']"
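
The --metrics value above looks like a Python list literal passed as a string. How run.py actually parses it is not shown here; the following is only a sketch of one way such an argument could be handled, with parse_metrics and build_parser as hypothetical names.

```python
import argparse
import ast

def parse_metrics(value):
    """Parse a string such as "['rouge-1', 'bleu']" into a list of metric names."""
    parsed = ast.literal_eval(value)  # evaluates literals only, never arbitrary code
    if not isinstance(parsed, list) or not all(isinstance(m, str) for m in parsed):
        raise argparse.ArgumentTypeError("--metrics must be a list of strings")
    return parsed

def build_parser():
    # Hypothetical parser mirroring the flags used in the command above.
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--metrics", type=parse_metrics, required=True)
    return parser
```

Quoting the whole list in the shell, as in the command above, keeps it a single argument.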

Currently supported metrics: rouge-1, rouge-2, rouge-l, bleu, self-bleu, meteor, ppl.
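
To give a feel for what a unigram-overlap metric of the ROUGE family computes, here is a minimal sketch. It is not this project's implementation: the function name rouge_1 and the whitespace tokenization are assumptions, and Chinese text would need word segmentation first.

```python
from collections import Counter

def rouge_1(ref, hyp):
    """Unigram-overlap precision, recall and F1 on whitespace tokens."""
    ref_counts = Counter(ref.split())
    hyp_counts = Counter(hyp.split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & hyp_counts).values())
    precision = overlap / max(sum(hyp_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"p": precision, "r": recall, "f": f1}
```

For a record in the input format, this could be applied to the ref and each entry of hyps in turn.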

If ppl is selected, add the command-line argument --ppl_model_path=model_path, where model_path points to the model file (a BERT model).

When using meteor for the first time, download the Open Multilingual Wordnet (omw) data, which includes Chinese, together with wordnet from nltk, and unzip both into /root/nltk_data/corpora/.
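
As an alternative to downloading by hand, the NLTK downloader can probably fetch the same data. This is a sketch under assumptions: the package ids wordnet and omw are taken from the NLTK data index (newer NLTK releases may expect omw-1.4 instead), and fetch_meteor_data is a hypothetical helper name.

```python
import os

# Package ids on the NLTK data server; "omw" is Open Multilingual Wordnet.
# Assumption: newer NLTK releases may need "omw-1.4" instead of "omw".
NLTK_PACKAGES = ["wordnet", "omw"]

def fetch_meteor_data(download_dir="/root/nltk_data"):
    """Download the corpora; NLTK places them under <download_dir>/corpora/."""
    import nltk  # imported lazily so this module loads without NLTK installed
    os.makedirs(download_dir, exist_ok=True)
    # nltk.download returns True on success and False otherwise.
    return {name: nltk.download(name, download_dir=download_dir)
            for name in NLTK_PACKAGES}
```

Running fetch_meteor_data() once requires network access and write permission for the target directory.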

import json
from metrics import Metrics
inputs = json.load(...)
model = Metrics(metrics
 