
What is a Transformer?

The neural network architecture that powers modern AI language models.

Definition

The Transformer is a neural network architecture introduced in 2017 that revolutionized natural language processing. It uses a mechanism called "attention" that allows the model to weigh the importance of different parts of the input when generating each output token. Almost all modern LLMs, including GPT, Claude, Gemini, and Llama, are built on the transformer architecture.
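The attention mechanism described above is, in its most common form, "scaled dot-product attention": each token's query is compared against every token's key, the scores are normalized with a softmax, and the result is used to take a weighted average of the values. The following is a minimal NumPy sketch of that idea, not a full transformer; the shapes and variable names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Each output row is a weighted average of the value rows,
    # weighted by how strongly that query matches each key.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of every query to every key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

# Toy input: 3 tokens with 4-dimensional embeddings (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)  # self-attention: Q = K = V = X
```

In a real transformer, Q, K, and V are produced from the token embeddings by learned linear projections, and many such attention "heads" run in parallel; this sketch omits those details to show only the core weighting step.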

💡 Example

When a transformer model reads "The cat sat on the ___", the attention mechanism helps it focus on "cat" and "sat" to predict that the next word is likely "mat" or "chair", rather than being distracted by less relevant words.
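The intuition in this example can be made concrete with a toy calculation: given raw relevance scores for each earlier word, a softmax turns them into weights that concentrate on the most relevant words. The scores below are invented purely for illustration, not produced by a real model.

```python
import numpy as np

# Hypothetical raw relevance scores of each earlier word to the blank;
# the numbers are invented for illustration only.
tokens = ["The", "cat", "sat", "on", "the"]
scores = np.array([0.5, 3.0, 2.5, 0.8, 0.5])

weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the scores
for tok, wgt in zip(tokens, weights):
    print(f"{tok:>4}: {wgt:.2f}")
```

After the softmax, "cat" and "sat" receive most of the total weight, which is the sense in which attention lets the model "focus" on them when predicting the next word.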

Related concepts

LLM (Large Language Model)

A type of AI trained on massive text datasets to understand and generate human language.

GPT (Generative Pre-trained Transformer)

OpenAI's family of language models that power ChatGPT.

