RTRL.47: Band Performance Ratings and Director Gender (Shouldice & Eastridge, 2020)

Shouldice, H. N., & Eastridge, J. L. (2020). A comparison of Virginia band performance assessments in relation to director gender. Journal of Research in Music Education, 68(2), 125-137.

What did the researchers want to know?

Given that existing research shows many women perceive gender-related discrimination in their jobs as secondary band teachers, is there a significant association between band festival ratings and director gender at the middle school or high school level?

What did the researchers do?

Shouldice and Eastridge accessed six years (2013-2018) of District Concert Assessment ratings sponsored by the Virginia Band and Orchestra Directors Association (VBODA), which are publicly available on the VBODA website. The data included ensembles’ overall performance ratings and director names, which Shouldice and Eastridge used to code each ensemble performance according to the assumed gender (male/female) of the director(s). If a director’s first name was ambiguous or gender-neutral (e.g., Robin, Jamie), they performed an internet search to ascertain gender (via pronouns, title, and/or photograph). Shouldice and Eastridge then conducted statistical analyses to determine the percentages of ensembles receiving each rating (by level and director gender) and whether there was a statistically significant relationship between director gender and overall performance rating.

What did the researchers find?

At both the middle school and high school levels, male-directed ensembles were more likely to receive a I rating, while female-directed ensembles were more likely to receive a II rating. The table below shows the percentages of ensembles receiving each rating by director gender, and the bar chart compares I and II ratings by gender and level.

Table 1. Percentages of Ensembles Receiving Each Rating, by Director Gender.*
Figure 1. Number/Comparison of I and II Ratings Included in Chi-Square Analyses.*

*Because the chi-square test assumes independent observations, multiple ensemble performances by the same director were randomly removed so that only one listing remained for each director; this is why 3,229 performances are reflected in Table 1 but only 730 in Figure 1.
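For readers curious about the statistics, the chi-square test of independence used by Shouldice and Eastridge can be sketched in a few lines of Python. The counts below are hypothetical, not the study's actual figures; the expected-count and statistic formulas are the standard ones for a 2×2 contingency table, and with one degree of freedom the p-value can be computed from the complementary error function.

```python
import math

def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 contingency table.

    table: [[a, b], [c, d]] observed counts, e.g. rows = director gender
    (male, female), columns = overall rating (I, II).
    Returns (statistic, p_value) for df = (2-1)*(2-1) = 1.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts (NOT the study's data): I and II ratings by gender.
stat, p = chi_square_2x2([[300, 100], [150, 120]])
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

Note that the test's independence assumption is exactly why the researchers trimmed the data from 3,229 performances to 730, keeping one performance per director.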

What does this mean for my classroom?

We cannot definitively infer the cause behind the association between ensemble ratings and director gender discovered by Shouldice and Eastridge. However, it is worth reflecting on possible explanations for this finding. One explanation might be that societal norms expect and permit men to display the behaviors and characteristics that are associated with being a successful band director (such as assertiveness and competitiveness) whereas these traits may be less expected and/or acceptable in women. Another potential explanation could be that women are more likely to be hired for band teaching jobs in smaller and/or rural schools/districts, which may be more likely to earn lower ratings than ensembles from larger school districts. Finally, female-directed bands may receive lower ratings than male-directed bands as a result of gender bias, either explicit or implicit.

One possibility is that gender bias might influence judges to rate female-directed groups differently than male-directed groups. Given previous findings suggesting that larger ensembles, or those performing music in more difficult classifications, tend to receive higher ratings than smaller ensembles or those performing easier music, it is also possible that gender bias influences the hiring practices that lead to greater numbers of men than women securing jobs in larger, more prestigious programs that are likely to perform more advanced repertoire.

The results of this study indicate that although improving, a gender imbalance still exists in the secondary band teaching profession. Discrimination in the hiring process may be one possible explanation for this persistent gender imbalance, as suggested by findings of existing research. It is critical that those involved in the hiring process examine their own biases and actively work toward more equitable hiring of men and women. It is also crucial to strive for more equitable representation of women in the field of secondary band teaching. Rather than reinforcing the common image of conductor as male, music educators and music teacher educators might actively work to provide students with images of women in these teaching roles.

Related Research Summarized by “Research to Real Life”:

Career Intentions and Experiences of Pre- and In-service Female Band Teachers (Fischer-Croneis, 2016)

Male and Female Photographic Representation in 50 Years of Music Educators Journal (Kruse, Giebelhausen, Shouldice, & Ramsey, 2015)

RTRL.06: “Investigating Adjudicator Bias in Concert Band Evaluations” (Springer & Bradley, 2018)

Springer, D. G., & Bradley, K. D. (2018). Investigating adjudicator bias in concert band evaluations: An application of the Many-Facet Rasch Model. Musicae Scientiae, 22(3), 377-393.

What did the researchers want to know?

What is the potential influence of adjudicators on performance ratings at a live large ensemble festival?

What did the researchers do?

Springer and Bradley (2018) collected evaluation forms from a concert band festival in the Pacific Northwest region of the U.S. Each of the 31 middle school/junior high school bands performed three pieces and was rated by three expert judges on a scale of 5 (superior) to 1 (poor). Judges were also allowed to award “half points,” and they rated each group on eight criteria: tone quality, intonation, rhythm, balance/blend, technique, interpretation/musicianship, articulation, and “other performance factors” (such as appearance, posture, and general conduct). The researchers analyzed the data using the Many-Facet Rasch Model, a statistical technique that places performer ability, task difficulty, and judge severity on a common measurement scale.

What did the researchers find?

The use of half-points resulted in less clear/precise measurement than if half-points had not been allowed. All but one of the performance criteria “did not effectively distinguish among the highest-performing ensembles or the lowest-performing ensembles” (p. 385), which could indicate a halo effect, in which judgements of certain criteria positively or negatively influence judgements of other criteria. Examination of judge severity revealed that one judge was more severe in their ratings than the other two, though all three more heavily utilized the higher end of the rating scale, indicating “leniency or generosity error” (p. 386). Finally, there were numerous instances in which some bands were rated unexpectedly higher or lower by one judge than by the other two, suggesting “evidence of bias” (p. 386).

What does this mean for my classroom?

Adjudicator training and calibration—ensuring judges rate in similar manners—are critical. Adjudicator training for the band festival studied by Springer and Bradley involved only a 30-minute session in which the adjudicator instructions and evaluation form were discussed and adjudicators were allowed to ask questions. A more in-depth and ongoing adjudicator training process may help improve the validity and reliability of the ratings given. For example, Springer and Bradley suggest that adjudicators might participate in an “anchoring technique”—a process in which judges rate sample recordings and then discuss the specific “aural qualities necessary for rating each performance criterion on the scales provided on the evaluation form” (p. 389). Festival coordinators might also attempt to hire adjudicators from other geographic regions in order to reduce bias due to prior familiarity with bands or directors.