11 Note that use-qa encodes inputs/contexts and responses using separate sub-networks, while ConveRT (Figure 1) relies on full parameter sharing in the Transformer layers.