In the present study, aspects of the measurement of writing are disentangled in order to investigate the validity of inferences made on the basis of writing performance and to describe implications for the assessment of writing. To include genre as a facet in the measurement, we obtained writing scores for 12 texts in four different genres from each participating student. Results indicate that across raters, tasks, and genres, only 10% of the variance in writing scores is related to individual writing skill. In order to draw conclusions about writing proficiency, students should therefore write at least three texts in each of four genres, each rated by at least two raters. Moreover, when writing scores are obtained through highly similar tasks, generalization across genres is not warranted; inferences based on text quality scores should, in this case, be limited to genre-specific writing. These findings replicate the large task variance consistently found in earlier research on writing assessment and emphasize the effect of genre on the generalizability of writing scores. They have important implications for writing research and writing education, where writing proficiency is quite often assessed with a single task rated by a single rater.