What are tokens, how does tokenization work, and what is the context window in LLMs?

Token

Tokens are the basic units that Large Language Models (LLMs) work with. An LLM cannot process human language directly as sentences or words. Before processing, the text is split into chunks called tokens, and each token is assigned a unique numerical value (its token ID).

Example:

I love artificial intelligence

The tokenizer converts the sentence into:

["I", "love", "artificial", "intelligence"]

How is a token created?

Step 1: Input sentence

What is AI?

Step 2: Tokenizer

["What", "is", "AI", "?"]

Step 3: Vocabulary lookup

What → 1578
is → 425
AI → 9821
? → 63

The model now receives these numbers instead of the text:

[1578,425,9821,63]
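
The three steps above can be sketched in Python. This is only a minimal illustration, not how a production tokenizer works: the regular expression, the function names and the fallback ID of 0 for unknown pieces are assumptions, and the vocabulary contains only the four IDs from the example.

```python
import re

# Toy vocabulary using the IDs from the example above. A real vocabulary
# holds tens of thousands of entries.
VOCAB = {"What": 1578, "is": 425, "AI": 9821, "?": 63}
UNK_ID = 0  # assumed ID for pieces that are not in the vocabulary


def tokenize(text):
    """Step 2: split the text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def encode(text):
    """Step 3: look up the ID of each token in the vocabulary."""
    return [VOCAB.get(token, UNK_ID) for token in tokenize(text)]


print(tokenize("What is AI?"))  # ['What', 'is', 'AI', '?']
print(encode("What is AI?"))    # [1578, 425, 9821, 63]
```

Any piece missing from this toy vocabulary maps to the assumed unknown ID, which is one reason real tokenizers use sub-word pieces instead of whole words.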

Tokenization

Tokenization means splitting text into meaningful pieces called tokens. It is required because computers do not understand words; they work on numerical values. The main problem arises if we try to store every word in the vocabulary as a separate entry, for example:

low
lower
lowest

Then the vocabulary becomes very large. With millions of words, storage becomes huge and training becomes difficult, so we split words into smaller sub-word tokens instead.

Example:

unhappiness

can be split into pieces that join back into the original word:

un
happi
ness

Because of this, the model can also handle new words it has never seen, by building them from sub-word pieces it already knows.
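
A sub-word split like the one above can be sketched with a greedy longest-match search. This is a simplified illustration, not the exact procedure of any particular tokenizer: the vocabulary and the function name are hypothetical, and the middle piece is spelled `happi` because the pieces must join back into the original word.

```python
# Hypothetical sub-word vocabulary, chosen only for this illustration.
SUBWORDS = {"un", "happi", "ness", "low", "er", "est"}


def split_into_subwords(word, vocab=SUBWORDS):
    """Greedy longest-match split; unknown characters become single pieces."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest possible piece first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No vocabulary entry matches here: fall back to one character.
            pieces.append(word[start])
            start += 1
    return pieces


print(split_into_subwords("unhappiness"))  # ['un', 'happi', 'ness']
print(split_into_subwords("lowest"))       # ['low', 'est']
print(split_into_subwords("unlower"))      # ['un', 'low', 'er']
```

The last call shows the benefit: "unlower" is not a word the vocabulary was built for, yet it can still be represented with known pieces.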

Some well-known tokenization algorithms and tools are:

Byte Pair Encoding (BPE)
WordPiece
SentencePiece
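
To give a feel for how Byte Pair Encoding builds its vocabulary, here is a simplified sketch of its training loop: start from single characters and repeatedly merge the most frequent adjacent pair. The function name and the tiny word list are assumptions made for illustration, and real implementations differ in details such as tie-breaking, word-boundary markers and byte-level handling.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn BPE merge rules from a list of words (simplified sketch)."""
    # Start with every word as a tuple of single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent pair of symbols occurs.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Pick the most frequent pair (ties go to the first pair seen).
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Replace every occurrence of that pair with one merged symbol.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus
    return merges, corpus


merges, corpus = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
```

After two merges the frequent stem "low" has become a single symbol, while the rarer endings "er" and "est" are still made of individual characters.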

Besides splitting text into whole words, there are two other common types of tokenization:

Character Based
Sub-word Based
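
The trade-off between the two types can be seen in a small sketch. Character-based tokenization needs only a tiny vocabulary but produces long sequences, while sub-word tokenization produces fewer tokens at the cost of a larger vocabulary. The sub-word pieces below are written by hand as an assumption for illustration, not produced by a real tokenizer.

```python
text = "unhappiness"

# Character-based: each character is one token, so the vocabulary is tiny.
char_tokens = list(text)
char_vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
char_ids = [char_vocab[ch] for ch in char_tokens]

# Sub-word-based: pieces listed by hand here (an assumption for illustration).
subword_tokens = ["un", "happi", "ness"]

print(len(char_tokens), "character tokens, vocabulary size", len(char_vocab))
print(len(subword_tokens), "sub-word tokens")
```

For this word the character-based split needs 11 tokens from a vocabulary of 8 characters, while the sub-word split needs only 3 tokens.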

Context window

The context window is the maximum number of tokens a model can process at one time.

Example: a model with a context window of 128K tokens

This means the model can process up to 128,000 tokens in a single request. The context includes the earlier messages of the conversation as well as the current prompt. If the conversation exceeds the limit, the oldest messages are typically dropped. The context window is the model's temporary working memory, not permanent memory.
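
Dropping older messages once the limit is reached can be sketched as follows. This is a hypothetical strategy written for illustration, not the behaviour of any specific model or API: the function names are assumptions, and `count_tokens` is a rough stand-in that counts whitespace-separated words instead of using a real tokenizer.

```python
def count_tokens(text):
    """Rough stand-in for a real tokenizer: one token per word."""
    return len(text.split())


def fit_to_context(messages, max_tokens):
    """Keep the most recent messages whose total token count fits the window."""
    kept = []
    used = 0
    # Walk from newest to oldest so the oldest messages are dropped first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "Hello, who are you?",        # 4 tokens
    "I am an assistant.",         # 4 tokens
    "What is a context window?",  # 5 tokens
]
print(fit_to_context(history, max_tokens=10))
# ['I am an assistant.', 'What is a context window?']
```

With a budget of 10 tokens the first message no longer fits, so the model would never see it in that request.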

Some models and their context sizes:

GPT-3.5 - 16K
GPT-4 - 8K/32K
GPT-4 Turbo - 128K
GPT-4o - 128K
GPT-4.1 - 1M
Claude Sonnet 4.x - 1M
Gemini 2.5 - 1M-2M