This article introduces ‘Evaluate’, a function that takes an input context and a list of outputs, all in natural language, and returns numeric scores for those outputs. Scoring text-based data is a foundational capability for tasks such as supervised prompt optimization, unsupervised text upscaling, guided reasoning, and synthetic data labeling.
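To make the shape of such a function concrete, here is a minimal sketch of the interface only. The name `evaluate`, its signature, the one-score-per-output return value, and the [0, 1] score range are all assumptions made for illustration. The scorer inside is a deliberately simple placeholder (word-overlap similarity between the context and each output) and is not the scoring method this article describes.

```python
import re


def _words(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def evaluate(context: str, outputs: list[str]) -> list[float]:
    """Return one score in [0, 1] per output (assumed interface shape).

    Placeholder scorer: Jaccard word overlap with the context.
    This is a stand-in for illustration, NOT the article's method.
    """
    context_words = _words(context)
    scores = []
    for output in outputs:
        output_words = _words(output)
        union = context_words | output_words
        # Jaccard similarity; an empty union is scored 0.0 to avoid dividing by zero.
        scores.append(len(context_words & output_words) / len(union) if union else 0.0)
    return scores


if __name__ == "__main__":
    context = "Summarize: the cat sat on the mat."
    candidates = ["The cat sat on the mat.", "Stock prices rose sharply today."]
    print(evaluate(context, candidates))
```

The point of the sketch is the contract: text in, a list of numbers out, aligned by position with the list of outputs. Any downstream task that ranks, filters, or labels text could consume scores of this form, whatever scorer actually produces them.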