9Combining multiple objectives in a dual-encoder framework has also been done by Al-Rfou et al. (2016) and Henderson et al. (2017). Note that more sophisticated solutions to fusing dialog history are possible such as using attention over older contexts as done by Vlasov et al. (2019) on the much smaller MultiWOZ 2.1 dataset (Eric et al.2019), but we have opted for simple concatenation as an efficient solution for training on the large Reddit data. The multiple objectives result in quicker learning, and also give useful diagnostic probes into the performance of each feature throughout training.