Traditional Large Language Models (LLMs) rely on a tokenizer (like BPE or SentencePiece) to convert text into subword tokens before feeding them to the transformer. The Byte Latent Transformer ...