BLEU (bilingual evaluation understudy) is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. Quality is considered to be the correspondence between a machine's output and that of a human: "the closer a machine translation is to a professional human translation, the better it is" – this is the central idea behind BLEU. BLEU was one of the first metrics to achieve a high correlation with human judgements of quality, and it remains one of the most popular automated and inexpensive metrics.
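The core of BLEU is clipped n-gram precision (each candidate n-gram is credited at most as many times as it appears in the reference), combined as a geometric mean and scaled by a brevity penalty that punishes candidates shorter than the reference. The following is a minimal illustrative sketch for a single candidate and a single reference with whitespace tokenization; the full algorithm also handles multiple references and corpus-level aggregation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified BLEU: clipped n-gram precisions (1..max_n),
    geometric mean, and a brevity penalty. Single reference only."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip: a candidate n-gram counts at most as often as in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # geometric mean is zero if any precision is zero
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: only applied when the candidate is shorter.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_avg)

reference = "the cat is on the mat".split()
print(bleu(reference, reference))  # identical sentences score 1.0
print(bleu("the cat sat on the mat".split(), reference, max_n=2))
```

Note that with `max_n=4` short sentences often score zero whenever no 4-gram matches, which is why real implementations use corpus-level counts or smoothing.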