Quantifying Award Network and Career Development in the Movie Industry

In show business, awards are conferred on individuals and films through periodic film festivals and events, providing incentives for performers’ future career development. In this work, we explore the growth and dynamics of the film award system, the structure of the award network, and the relationships between historical performance, collaborations, and future career success of performers in the movie industry. We collected data from IMDb covering more than 3.5K movie events and 520K individuals, with their award-winning and career records, over more than 90 years. Using network analysis and regression models, we report several findings. First, we found an exponential proliferation of awards across all film genres and all professions, and an uneven distribution of the number of awards per career over time. More than 30% of the performers have won multiple awards. Second, we built an award network to reveal the interlocks between awards that arise from multiple award winning. For prestigious awards, 47% of the links were over-represented relative to the expectation from the null model. Furthermore, the performers’ collaboration network was highly clustered, exhibiting a high propensity of links between awarded performers. Lastly, our regression models revealed that multiple factors were related to performers’ early career success and award winning. Specifically, along with the performers’ historical achievements, their collaborators play an important role in award winning after nomination, with the scope and depth of the impact differing with the awards’ prestige. This work has strong implications for the harmonious dynamics of the movie industry and the career development of performers.


INTRODUCTION
Awards are conferred to recognize individuals' merits and performance across various domains. In science, award winning is a complementary measure of scientific excellence, and awards are often associated with funding, financial incentives, promotion, and prestige. For instance, awarded scientists attract collaborators, become nominators, and advance the development of their research fields [1][2][3][4][5][6]. A typical example in show business is the Oscar award: its winners attract tremendous attention all over the world and gain substantial benefits in their future careers [7,8]. A large body of work has tried to quantitatively understand the system dynamics and predict the behavior of performers in the movie industry. Previous studies have explored the dynamics of the movie industry [7,9,10], measured the significance of creative work [11], predicted movie ratings and success [12,13], analyzed the collaboration network between performers [14,15], examined gender disparities and age [16][17][18][19][20], and quantified and predicted the career development of performers [15,21,22]. These works give a comprehensive picture of the film industry; however, studies focusing on film awards are still lacking, especially ones that consider all the famous film awards together. In science, established studies found that awards play a positive role in inspiring new achievements [2,4,5,23], allocating scientific contribution [24], and signaling scientific credibility [25] and field advancement [6]. Motivated by these results in science, we collected a large dataset covering all the famous awards in the movie industry, explored the dynamics of the award system, and quantified the performers' career success for future award winning.
Career development has received considerable attention, and several insights deserve mention here. Studies have shown that factors including talent/creativity [26,27], personal characteristics such as gender and age [16,17,19,28], social networks such as collaborations [29], cumulative advantages [30,31], and social media [32] all affect the productivity and quality of individuals' future performance. Specifically, studies have shown that creativity in a career is random regardless of career stage, while a hot-streak phenomenon exists in career development [21,22,33,34,35]. In addition, awards play an important role in a performer's career development. Borjas et al. found that mathematicians who win the Fields Medal show a decline in productivity after the prizewinning year, which indicates the potential cost of exploring new research areas after prizewinning [23], and Reschke et al. found that the winner's "neighbors" may lose attention after the winner's prizewinning time [36]. Thus, by analogy between career development in science and in show business, the future award winning of performers may be determined by multiple factors, including their self-attributes, the collaboration network, and their performance.
In this work, we systematically study the dynamics of the award system in the movie industry and investigate what leads to the early career success of award-winning actors and actresses. For policymakers in the movie industry, the work can help maintain a harmonious award system, and for actors and actresses, it provides insight for career management. We collected and integrated large-scale data, including movies, actors and actresses, and award events. We focused on film awards that are conferred on individuals for their excellence in filming, together with their interactions with other individuals in filming and directing. First, we explored the dynamics of the award system and the increase of prizewinning films by genre and of prizewinning individuals by profession. Second, we derived the statistical properties of the award network and the collaboration network to further understand the award-winning propensity. Third, we modeled what led to early career success when actors and actresses were first nominated for an award. In particular, we asked how the historical performance of actors' and actresses' filming careers and the quantity and impact of their collaborators relate to winning at an early career stage.

DATA COLLECTION AND DESCRIPTION
The data we used from IMDb contain over a hundred years of movie records, including films, television cast, and production. For films and actors, they include the names, principals, artwork information, and movie ratings. We collected additional data from IMDb and Wikipedia (for instance, the wiki Category: Lists of awards received by an actor), which contain the award records for performers. Across Wikipedia and IMDb, actors/actresses are linked by their IMDb IDs starting with "nm," and movies are linked by their IMDb IDs starting with "tt." The events include almost all the world's biggest awards, including the Academy Award, the Golden Globe Award, and the Primetime Emmy. We focused on film awards that are conferred on individuals for their excellence in filming and for their interactions with other individuals in the movie industry.

H-Index for Awards
Here we developed the h-index for awards by analogy to the h-index in science. We used it because we found that the h-index ranking is close to reality and easy to calculate. The h-index was first proposed by Hirsch to quantify an individual's research performance [37]. By definition, a scientist has an h-index of h if at least h of her/his papers have at least h citations each. In this work, an award has an h-index of h if it has been conferred on at least h films that each have at least h votes in the IMDb database. To verify its validity, we found that the h-index for awards is highly correlated with a network-based measure, PageRank [38], with a Pearson correlation ρ of 0.73 (see Figure 1 for the correlation plot). This consistency in evaluating awards' prestige further supports the robustness of using the h-index to measure the prestige of awards. Finally, we checked the top five awards ranked by h-index, i.e., BAFTA, Oscar, Emmys, OFTA, and Golden Globes, all of which are famous awards and are highlighted on the IMDb homepage. Network-based measures need global information and have high computational costs. Based on consistency and simplicity, we used the h-index in this work.
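The definition above can be illustrated with a minimal sketch. The function name `award_h_index` and the vote counts are hypothetical and chosen only for illustration; they are not taken from the IMDb data used in this work.

```python
def award_h_index(votes_per_film):
    """Return the largest h such that at least h films have at least h votes each."""
    ranked = sorted(votes_per_film, reverse=True)
    h = 0
    for rank, votes in enumerate(ranked, start=1):
        if votes >= rank:
            h = rank  # the top `rank` films all have at least `rank` votes
        else:
            break
    return h


# Hypothetical IMDb vote counts for the films that received one award.
example_votes = [1200, 450, 30, 8, 3, 2]
example_h = award_h_index(example_votes)
print(example_h)
```

Because the computation only needs the vote counts of the films attached to one award, it can be run award by award without any global network information, which is the simplicity argument made above.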

The Growth of Awards and Winners
In the movie industry, events and festivals are held regularly; for example, the Oscars are held each year. In each event year, there are several awards for individuals, such as best actor/actress, best director, and best writer, and several awards for films, such as best action film and best documentary. Each award may have more than one nominee in each event, and usually only one nominee wins. We found that among the nomination records of all professions, actors and actresses play leading roles in prizewinning, as shown in Figure 2. Actor and actress nominees account for more than 20% of the total nomination records. Further, we found that more than half of the nominees play multiple roles in the movie industry; for example, 7% of the individual awards (8207 awards) are taken by performers with a composite profession of director, writer, and producer.
The award system has experienced an exponential expansion during the last three decades. First, as the grey bars in Figure 3A show, across all prizewinning events the number of awardees increases exponentially with time: in the 1950s there were fewer than 200 awardees each year, whereas in the 2000s the number reached 3000. This increase is probably due to the growing number of unique awards as film events multiply, as shown by the blue line in Figure 3A. The growth speed also changes with time and roughly follows three periods: a flat period with gradual increase from 1950 to 1994, a medium-growth period from 1994 to 2009, and a fast-growth period from 2010 to 2019. Second, during the expansion, the number of awardees in each genre grew exponentially at different rates. As shown in Figure 3B, for awards conferred on movies, i.e., when the awardees are films, films of different genres follow an exponential trend; drama has won the largest number of awards, and short drama has the most abrupt upward trend, which reflects the fashion of the last decade. Lastly, when we identify the primary professions of individual award winners, actors receive the largest number of awards, but all professions have a similar exponential growth rate, as shown in Figure 3C.
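One common way to quantify an exponential trend such as the one described above is an ordinary least-squares fit of the logarithm of the yearly counts against the year. The sketch below assumes this approach, which the text does not confirm as the fitting procedure actually used, and runs on synthetic counts rather than the paper's data. The function name `fit_exponential_growth` is hypothetical.

```python
import math


def fit_exponential_growth(years, counts):
    """Least-squares fit of log(count) = a + r * year; returns (a, r)."""
    logs = [math.log(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    r = sxy / sxx            # growth rate per year
    a = mean_y - r * mean_x  # intercept on the log scale
    return a, r


# Synthetic yearly awardee counts growing by 5% per year (not the paper's data).
years = list(range(1950, 1960))
counts = [200 * 1.05 ** (y - 1950) for y in years]
intercept, growth_rate = fit_exponential_growth(years, counts)
doubling_time = math.log(2) / growth_rate
print(round(growth_rate, 4), round(doubling_time, 1))
```

Fitting the same model separately within each of the three periods, or within each genre or profession, would give growth rates that can be compared directly.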
With the increase of winners in the system, performers may win multiple awards in their careers, as in the cases studied in science [5]. In Figure 4, we show the distribution of the total number of awards a performer won during her/his career. We found that about 70% of the winners receive only one award during their careers. We further split our data into three periods, set according to the three growth periods in Figure 3A, and show the distribution for each period in Figure 4. Multiple prizewinning is increasing slightly: 33% of winners won more than one prize in 2010-2019, compared with 31% in 1994-2009 and 30% in 1950-1994. Furthermore, we found that during their careers, actors often receive more awards than actresses (inset of Figure 4).
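The share of multiple winners in a period can be computed as in the following sketch. It assumes that a performer's awards are counted only within the period considered, which the text does not specify. The record layout `(performer, award, year)` and the function name `multiple_winner_share` are hypothetical.

```python
from collections import Counter


def multiple_winner_share(win_records, start_year=None, end_year=None):
    """Share of winners with more than one award among records in [start_year, end_year]."""
    awards_per_performer = Counter(
        performer
        for performer, _award, year in win_records
        if (start_year is None or year >= start_year)
        and (end_year is None or year <= end_year)
    )
    winners = len(awards_per_performer)
    if winners == 0:
        return 0.0
    multiple = sum(1 for n in awards_per_performer.values() if n > 1)
    return multiple / winners


# Toy records (performer id, award, year); not taken from the dataset.
records = [
    ("p1", "A", 1990), ("p1", "B", 2012),
    ("p2", "A", 2011),
    ("p3", "C", 2015), ("p3", "D", 2016),
]
share_all = multiple_winner_share(records)
share_recent = multiple_winner_share(records, 2010, 2019)
print(share_all, share_recent)
```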

THE AWARD AND COLLABORATION NETWORK
The roughly 30% share of multiple prizewinning across the three periods prompts us to ask how other factors influence a performer's prizewinning, for instance, the potential correlations between awards and the collaborations between performers. We therefore constructed two networks: the award network and the performers' collaboration network. Their definitions and analyses are as follows.

Award Network
The award network contains 3634 awards in total, together with the weighted links between them. In the award network, awards are nodes, and the weight of the link between two awards i and j is the number of performers who won both awards i and j.
To quantify the connection strength in the award network, we built a degree-preserving null model. In the null model, we randomly reassign all the link weights while keeping every node's degree constant. The relative link weight is then calculated as the real link weight divided by the link weight from the null model. If the relative weight is greater than 1, the link is over-represented (the real weight is larger than expected); otherwise, it is under-represented.
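The construction of the award network and of the relative link weights can be sketched as follows. Note one substitution: instead of the randomized degree-preserving null model described above, the sketch uses the closed-form expectation of a weighted configuration model, s_i * s_j / (2W), where s_i is the total link weight attached to award i and W is the total link weight. This is an assumption made for brevity, not the authors' procedure, and the function names and toy records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_award_network(wins):
    """wins: iterable of (performer, award). Returns {(award_i, award_j): weight}."""
    awards_by_performer = defaultdict(set)
    for performer, award in wins:
        awards_by_performer[performer].add(award)
    weights = defaultdict(int)
    for awards in awards_by_performer.values():
        # Every pair of awards won by the same performer adds 1 to that link.
        for a, b in combinations(sorted(awards), 2):
            weights[(a, b)] += 1
    return dict(weights)


def relative_weights(weights):
    """Observed weight divided by the assumed expectation s_i * s_j / (2W)."""
    strength = defaultdict(int)
    for (a, b), w in weights.items():
        strength[a] += w
        strength[b] += w
    total = sum(weights.values())
    return {
        (a, b): w / (strength[a] * strength[b] / (2 * total))
        for (a, b), w in weights.items()
    }


# Toy (performer, award) records; not taken from the dataset.
wins = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "A"), ("p2", "B"),
    ("p3", "A"), ("p3", "C"),
    ("p4", "C"), ("p4", "D"),
]
network = build_award_network(wins)
relative = relative_weights(network)
for link, value in sorted(relative.items()):
    print(link, round(value, 2), "over" if value > 1 else "under")
```

A randomized null model, as described in the text, would instead shuffle the links many times while preserving each award's degree and average the resulting weights; the ratio of observed to expected weight is then read in the same way.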
The award network shows that there are potentially strong pipelines between different awards. Because the award network is densely connected, we filtered a subgraph using the award h-index (see the data description for details). In Figure 5, for ease of visualization, we kept awards with an h-index greater than 250 and show the relative link weights as stripe widths. Compared with the null model, 47% of the links are stronger than expected (over-represented). For example, the link weight between the César Award and the Silver Ribbon, both famous events in Europe, is seven times larger than expected.

Performers' Collaboration Network
The collaboration network contains 52,531 performers (nominees) as nodes, with links representing collaboration in casts from 1950 to 2021. Because the collaboration network between performers is very dense, we extracted the backbone of the network [39] and focused on the most prestigious prizes according to the award h-index, including the Oscar, BAFTA, Primetime Emmy, Golden Globes, and Saturn Award. We found that prizewinners tend to be hubs in the network and have, on average, more connections than nominees who did not eventually win (K-S test statistic 0.031, p-value 0.024). Furthermore, we found that winning a prize has become more difficult in the recent decade: as shown in Figure 6 for the Saturn Award, the proportion of winners among all nominees decreased from 22% in the 1990s to 18% in the 2010s. The connections between winners and nominees indicate that prizewinning actors may affect a performer's prizewinning propensity after nomination.
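The K-S statistic reported above compares two distributions, here the numbers of connections of winners and of non-winning nominees. A minimal pure-Python sketch of the two-sample statistic is given below. The degree values are invented for illustration, and the sketch does not compute a p-value; in practice a library routine such as `scipy.stats.ks_2samp` returns both the statistic and the p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    for x in sorted(set(a) | set(b)):
        # Advance each pointer past all values <= x to get the empirical CDF at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Invented degrees in the collaboration network (not the paper's data).
winner_degrees = [5, 8, 9, 12]
nominee_degrees = [2, 3, 5, 6]
gap = ks_statistic(winner_degrees, nominee_degrees)
print(gap)
```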

PROPENSITY OF PRIZEWINNING
To further study the role of awards and collaboration in a performer's early career award winning, we developed regression models. In the models, we incorporated potential factors of interest from our data that could influence the chance of prizewinning, for example, the acting performance of the individual, the collaboration network, the award h-index (prestige), and the number of competing nominees prior to prizewinning. The historical performance of actors/actresses is related to career length, productivity, quality of work, and influence of works. We also controlled for the award types and the periods in which the awards were conferred. We used a logistic regression model to predict the probability that a performer wins at the first nomination. The independent variables are: 1) number of films the performer participated in before the nomination; 2) average movie rating of historical works; 3) average number of votes for historical works; 4) length of career prior to nomination; 5) gender; 6) number of awarded films prior to nomination; 7) number of collaborated directors; 8) number of awarded collaborated directors; 9) number of collaborated actors; 10) number of awarded collaborated actors; 11) competitive pool size (number of potential awards which will be conferred in the [...]).
We fitted multiple regression models. In Table 1, models M1, M3, and M4 indicate that the performer's career length and number of awarded films prior to nomination have a significant positive impact on early career prizewinning. Models M2, M3, and M4 all indicate that by collaborating with more prizewinning actors before nomination, the performer is more likely to win. In model M4, the number of a performer's collaborated directors has a negative impact, indicating that a performer who collaborates with a smaller number of directors has a higher probability of winning. This highlights the importance of long-term and high-quality collaboration with directors.
Because winners form a small sample compared with non-winning nominees, we further used a complementary log-log model, which is commonly used for asymmetric outcomes. The result is very similar, with significant effects detected for historical performance and collaborators, as shown in Supplementary Table S1 in the supplemental information. We further controlled for the award's prestige using the h-index, and the consistent results are shown in model M5 in Supplementary Table S1, which confirms the robustness of our study.
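The difference between the logistic and the complementary log-log models lies in how a linear combination of the predictors is mapped to a winning probability. The sketch below shows both mappings. The coefficient values and feature names are made up for illustration: only their signs follow the findings described above, and they are not the estimates of Table 1 or Supplementary Table S1.

```python
import math


def linear_predictor(intercept, coefficients, features):
    """eta = intercept + sum of coefficient * feature value."""
    return intercept + sum(coefficients[name] * value for name, value in features.items())


def logistic_probability(eta):
    """Inverse logit link: symmetric around eta = 0."""
    return 1.0 / (1.0 + math.exp(-eta))


def cloglog_probability(eta):
    """Inverse complementary log-log link: asymmetric, suited to rare outcomes."""
    return 1.0 - math.exp(-math.exp(eta))


# Hypothetical coefficients (signs only mirror the text; values are invented).
coefficients = {
    "career_length": 0.05,
    "awarded_films": 0.3,
    "awarded_actor_collaborators": 0.02,
    "director_collaborators": -0.01,
}
# Hypothetical nominee.
nominee = {
    "career_length": 8,
    "awarded_films": 1,
    "awarded_actor_collaborators": 10,
    "director_collaborators": 20,
}
eta = linear_predictor(-1.0, coefficients, nominee)
print(round(logistic_probability(eta), 3), round(cloglog_probability(eta), 3))
```

The logistic mapping satisfies p(eta) + p(-eta) = 1, whereas the complementary log-log mapping does not, which is why the latter is often preferred when one outcome is much rarer than the other.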

Historical Performance
The models show that historical performance plays a positive role in a performer's future prizewinning. The predictive impact is shown in Figure 7. Based on model M4, the predicted probability of winning after nomination is more than 35% for performers with career lengths above 8 years, while it drops to 30% for performers with career lengths below 2 years. The longer the career, the more likely the nominee is to win the award. Likewise, a performer who participated in a larger number of awarded films before being nominated has a larger probability of prizewinning, as shown in Figure 7B.

Collaborators' Influence
For the collaborators of a performer, we found that awarded actor collaborators had a positive influence on the performer's prizewinning, whereas director collaborators showed a negative effect, as shown in Figure 8. This suggests that, in the movie industry, a performer who collaborates with more prizewinning actors benefits from them and gains acting experience, which has a positive impact on her/his career, as shown in Figure 8A. However, collaborating with more directors, which means the performer has many short collaborations, is harmful to future prizewinning, as shown in Figure 8B.

DISCUSSION
In this work, we explored the dynamics of the award system and award winning in performers' careers in the movie industry, which have not been well studied before. We found that, as expected, the award system experienced exponential growth during the last 60 years and showed different levels of stratification across genres and professions. We used network science tools to analyze nominees and awards. Comparison with the null model shows that the award network contains potential connections between different awards, which may affect multiple prizewinning in a performer's career. Our regression models identified multiple factors related to a performer's award winning, including career length, the number of awarded films participated in, and former collaborators.
Our work provides strong implications for all actors and actresses in the movie industry, especially in their early careers. Based on our regression models, we found that the factors affecting a performer's first award win after being nominated are multifaceted. First, career length and the number of awarded films participated in play positive roles, which probably indicates that in the movie industry, "accumulated experience in filming," i.e., participating in excellent (awarded) films and keeping the career active (career length), is the key to future performance recognition (award winning) for a performer. Moreover, it is worth noting that collaborating with more famous (prizewinning) actors, sticking to a few directors, and building solid relationships help the performer increase her/his award-winning probability in the future. Our results also imply a "success breeds success" phenomenon in the movie industry. If a performer collaborates with more awarded actors and participates in more awarded films, her/his award-winning probability improves considerably. This indicates that the success of awarded films and awarded actors influences and "breeds" the success of the performer (future award-winning actors); the detailed mechanism behind this is an open question.
Despite the rigorous analysis and modeling in this work, several directions remain for future work. On the one hand, the award network shows strong correlations between different awards, which points to potential causal pipelines between awards. However, the underlying mechanism by which winning an award a influences winning an award b for a performer is not clear. Clarifying it needs more quantitative exploration and tools from causal inference, which is beyond the scope of this work. On the other hand, although we showed strong correlations between a performer's historical performance and award winning and highlighted the importance of career length and collaborations, the causal relationship is still unknown. We cannot resolve it in this work due to the lack of high-resolution data, for example, the performer's characteristics (nationality, age, social networks, etc.).

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
YL analyzed data, designed the research, and wrote the manuscript, and YM collected data, designed the research, and wrote the manuscript.