arXiv ViNT: A Foundation Model for Visual Navigation