Scoring TOEFL® Speaking responses - Overview

Modified on Sun, 07 Aug 2022 at 01:22 PM

TOEFL Speaking Scoring Rubrics

Copyright © 2019 by Educational Testing Service. All rights reserved.

Each TOEFL iBT Speaking response is scored holistically on a 4-band scale, with 4 being the highest score and 1 the lowest. A score of 0 is assigned to responses in which the speaker is unwilling or unable to provide a response to the question. The score on a speaking task represents an overall judgment of how well the response communicates the intended message. Raters evaluate the overall quality of the language and discourse features of the responses.

Two scoring rubrics are used to guide raters in evaluating the responses. The Independent Scoring Rubric is used to evaluate responses to the independent task. The Integrated Scoring Rubric is used to evaluate responses to the 3 integrated tasks.

Key Features of the Scoring Rubrics

Both scoring rubrics, the Independent Scoring Rubric and the Integrated Scoring Rubric, define the key characteristics of responses in terms of 3 important dimensions: delivery, language use, and topic development. When raters evaluate responses, they consider all 3 dimensions equally. No one dimension is weighted more heavily than another.


Delivery

The 2 key features that characterize delivery are clarity of speech and pace. Pronunciation, stress, and intonation most often determine the clarity of speech in a response. The rate of speech, length of utterances, and degree of hesitancy or choppiness all factor into the pace of a response. For example, level 4 delivery is characterized by speech that is generally clear and uses stress and intonation patterns effectively. Mostly fluid speech is sustained for the required time. There may be some minor problems, but they do not cause difficulty for the listener. There is a certain ease of presentation in responses that show level 4 delivery. At the lowest band level for delivery, speakers generally have problems maintaining the flow of speech for the time allotted. The speech in low-level responses tends to be fragmented and to contain long pauses, or it may be mostly unintelligible due to serious pronunciation difficulties.

Language Use 

The most salient features pertaining to language use at the highest band level are the efficiency of word choice and grammatical structures to convey meaning and the automaticity with which they are created. What stands out in a response that shows level 4 language use is control of a wide range of vocabulary and comfort with a variety of structures. Moving down the scale, word choice becomes less efficient and vaguer. It takes more words to convey the same information. At the lower band levels, a smaller range of vocabulary and structures is evident. Responses at the lower band levels for language use are therefore marked by frequent repetitions and greater difficulty expressing meaning clearly.

Topic Development 

The topic development demands of the independent task differ to some extent from those of the integrated tasks.

For the independent task, in which test takers speak about their own experiences, preferences, or opinions, topic development is characterized by the fullness of the content provided in the response and its overall coherence. At the highest level, responses address the task by clearly communicating a point of view and providing well-supported reasons or explanations with some elaboration. Responses do not need to be tightly organized, with a clear beginning, middle, and end, but in responses at the highest band level, ideas progress smoothly and cohesively, making the response easy for the listener to follow. Moving down the scale, fewer supporting details and less elaboration are evident. Ideas are not as well connected, and their progression is not as cohesive. At the lowest band level, very little relevant content is expressed; ideas are very general and vague.

In integrated tasks, test takers speak about information that they have listened to and/or read. For these tasks, topic development is characterized by the accuracy and completeness of the content provided in the response as well as its overall coherence. At the highest level, responses address the tasks by presenting relevant information from the reading and listening material and organizing it in a way that makes it easy for the listener to follow the progression of ideas. Responses in the highest band provide the major information requested as well as some supporting detail or elaboration on the topic. Responses at this level may contain minor inaccuracies or omissions, as long as they do not impede the overall coherence and completeness. Moving down the scale, fewer supporting details and less elaboration are evident, generally with more omissions and inaccuracies. The progression of ideas becomes less fluid, and the connection between pieces of information becomes less clear. The amount of content decreases, and the response may become somewhat repetitive. At the lowest band level, very little relevant content is expressed; ideas presented are very general or inaccurate.
