arXiv: Pre-training Small Base LMs with Fewer Tokens