Interestingly, directly applying ConveRT to AmazonQA without any fine-tuning also yields a reasonably high score of 67.0%. Moreover, learning the mapping function between inputs and responses for ConveRT (again without any fine-tuning), in the same way as is done for USE-QA-MAP, results in a score of 71.6%, outperforming USE-QA-MAP (70.7%). The gap to the fine-tuned model’s performance, however, indicates the importance of in-domain fine-tuning.
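The mapping variant described above keeps the pretrained encoders frozen and only learns a transformation from input encodings to response encodings on in-domain data. A minimal sketch of this idea, under illustrative assumptions (a ridge-regression linear map over fixed encoding matrices and cosine-similarity ranking; the actual mapping function and training objective used in the paper may differ):

```python
import numpy as np

def learn_mapping(X, Y, reg=1e-3):
    """Learn a linear map W so that X @ W approximates the response
    encodings Y. Encoders stay frozen; only W is trained, here with
    a closed-form ridge-regression solution."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)

def rank_responses(W, query_vec, response_matrix):
    """Rank candidate responses by cosine similarity between the
    mapped query encoding and each candidate response encoding."""
    R = response_matrix / np.linalg.norm(response_matrix, axis=1, keepdims=True)
    q = query_vec @ W
    q = q / np.linalg.norm(q)
    return np.argsort(-(R @ q))

# Toy demonstration with synthetic "encodings": responses are an exact
# linear transform of inputs, so the learned map should recover the pairing.
rng = np.random.default_rng(0)
A = rng.normal(size=(16, 16))
X = rng.normal(size=(100, 16))   # stand-in for fixed input encodings
Y = X @ A                        # stand-in for fixed response encodings
W = learn_mapping(X, Y, reg=1e-6)
ranking = rank_responses(W, X[0], Y)
```

Because the encoders are never updated, this variant is cheap to train, which makes the 71.6% result notable even though full in-domain fine-tuning still performs better.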