3 Hyperparameters for pretraining and finetuning are in Appendix A.4.