What are tokens, how does tokenization work, and what is a context window in LLMs?
1 Answer
Token
Tokens are the basic units that Large Language Models (LLMs) work with. An LLM cannot process human language directly as sentences or words. Before processing, a tokenizer splits the text into chunks called tokens and assigns each one a unique numerical value (a token ID).
Example:
I love artificial intelligence
A simple tokenizer converts the sentence into:
["I", "love", "artificial", "intelligence"]
How is a token created?
Step 1: Input sentence
What is AI?
Step 2: Tokenizer
["What", "is", "AI", "?"]
Step 3: Vocabulary lookup (the IDs below are illustrative)
What → 1578
is → 425
AI → 9821
? → 63
The model now receives these token IDs instead of the text:
[1578,425,9821,63]
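The three steps above can be sketched in a few lines of Python. This is only a toy illustration: the `VOCAB` dictionary reuses the illustrative IDs from the example, `UNK_ID` is a hypothetical ID for unknown tokens, and the regex split is a stand-in for a real tokenizer, which would behave differently.

```python
import re

# Toy vocabulary using the illustrative IDs from the example above.
VOCAB = {"What": 1578, "is": 425, "AI": 9821, "?": 63}
UNK_ID = 0  # hypothetical ID for anything missing from the vocabulary


def tokenize(text):
    # Step 2: split into runs of word characters or single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def encode(text):
    # Step 3: look each token up in the vocabulary.
    return [VOCAB.get(token, UNK_ID) for token in tokenize(text)]


print(tokenize("What is AI?"))  # ['What', 'is', 'AI', '?']
print(encode("What is AI?"))    # [1578, 425, 9821, 63]
```

Any word that is not in the toy vocabulary maps to `UNK_ID`, which is exactly the weakness that sub-word tokenization is meant to reduce.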
Tokenization
Tokenization means dividing text into meaningful pieces called tokens. It is required because computers cannot work with words directly; they operate on numerical values. The main problem arises if we try to store every word form as a separate vocabulary entry, for example:
low
lower
lowest
Then the vocabulary becomes very large. With millions of word forms, storage becomes huge and training becomes difficult, so we divide words into smaller sub-word tokens.
Example:
unhappiness
is split into:
un
happi
ness
Because of this, the model can also handle new words it has never seen, by composing them from known pieces.
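As a rough sketch of how a word can be broken into known pieces, here is a greedy longest-match splitter. The `SUBWORDS` set is a made-up toy vocabulary, and the function is a simplified, WordPiece-like illustration, not the exact algorithm any particular model uses. Note that the literal piece inside "unhappiness" is "happi".

```python
# Hypothetical toy sub-word vocabulary, for illustration only.
SUBWORDS = {"un", "happi", "ness", "low", "er", "est", "kind"}


def split_into_subwords(word, vocab=SUBWORDS):
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate piece first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces


print(split_into_subwords("unhappiness"))  # ['un', 'happi', 'ness']
print(split_into_subwords("unkindness"))   # ['un', 'kind', 'ness']
print(split_into_subwords("lowest"))       # ['low', 'est']
```

"unkindness" never appears in the vocabulary as a whole word, yet it is still covered by pieces the vocabulary already contains.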
Some well-known tokenization algorithms are:
- Byte Pair Encoding (BPE)
- WordPiece
- SentencePiece (a toolkit that implements BPE and Unigram)
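To give a feel for how Byte Pair Encoding builds its vocabulary, here is a minimal training sketch on the low / lower / lowest example. It starts from single characters and repeatedly merges the most frequent adjacent pair. The function name `learn_bpe` and its parameters are illustrative; production implementations add byte-level handling, word-boundary markers, and much faster data structures.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    # Represent each word as a tuple of symbols, starting from characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Pick the most frequent pair (ties go to the pair seen first).
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite the corpus with that pair fused into one symbol.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus
    return merges, corpus


merges, corpus = learn_bpe(["low", "lower", "lowest"], num_merges=2)
print(merges)        # [('l', 'o'), ('lo', 'w')]
print(list(corpus))  # [('low',), ('low', 'e', 'r'), ('low', 'e', 's', 't')]
```

After two merges the shared stem "low" has become a single token, so all three words reuse it instead of each needing its own vocabulary entry.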
Besides whole-word tokenization, there are two main types of tokenization:
- Character Based
- Sub-word Based
Context window
The context window is the maximum number of tokens a model can process at one time.
Example: a model with a context window of 128K tokens.
This means the model can process up to 128,000 tokens in a single request. The context also includes earlier messages in the conversation. If the context limit is exceeded, the oldest messages are typically dropped (the exact behavior depends on the application). The context window is the model's temporary working memory, not permanent memory.
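Dropping older messages to stay within the limit can be sketched as follows. This is an assumption about how a chat application might trim history, not how any specific product does it. `count_tokens` is a crude whitespace-based stand-in, since real token counts come from the model's own tokenizer, and `fit_to_context` is a hypothetical helper name.

```python
def count_tokens(text):
    # Crude stand-in: one token per whitespace-separated word.
    # A real tokenizer would usually give a different count.
    return len(text.split())


def fit_to_context(messages, max_tokens):
    kept = []
    total = 0
    # Walk from newest to oldest, keeping messages while they still fit.
    for message in reversed(messages):
        cost = count_tokens(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["hello there", "how are you today", "tell me about tokens"]
print(fit_to_context(history, max_tokens=8))
# ['how are you today', 'tell me about tokens']
```

With a budget of 8 tokens, the oldest message no longer fits and is dropped, so the model would not see it when generating the next reply.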
Some models and their approximate context sizes (limits vary by model version):
GPT-3.5 Turbo - 16K
GPT-4 - 8K/32K
GPT-4 Turbo - 128K
GPT-4o - 128K
GPT-4.1 - 1M
Claude Sonnet 4.x - up to 1M
Gemini 2.5 - 1M-2M