Science Communication Desperately Needs More Aligned Recommendation Algorithms

With 1 billion watch-time hours per day, YouTube now plays a major role in communication. Unfortunately, a large amount of misinformation is produced and widely shared on this platform (Donzelli et al., 2018; Allgaier, 2019; Loeb et al., 2019). In this paper, after providing a brief overview of the creation of science content on YouTube, we particularly emphasize the importance of YouTube's automated recommendations. We then discuss the main challenges of making such recommendations aligned with quality science communication.


INTRODUCTION
"Without our science communicators to publicly inform, explain, teach, decode, counter misinformation, and debate science matters many would remain in a space where they don't have [the] information they need, leading to poor choices being made at really crucial times, " New Zealand Prime Minister Jacinda Arden asserted in July 2020 (LeBard, 2020). The COVID-19 pandemic is surely testing the importance of science communicators and, in many cases, the lack thereof (Yong, 2020).
Our societies arguably face challenges of increasing complexity, from pandemic mitigation to climate change, social inequalities, and mass surveillance. Despite this, views opposing the scientific consensus are proliferating at a concerning rate on critical topics such as climate change (Allgaier, 2019), cancer (Loeb et al., 2019), and vaccination (Donzelli et al., 2018). Disturbingly, Johnson et al. (2020) fit a model that "reproduces the recent explosive growth in anti-vaccination views [on social media] and predicts that these views will dominate in a decade." The rise of misinformation, or simply the lack of quality information, seems to be a major risk factor for our societies, in addition to numerous other social-media-related issues such as online polarization (Tucker et al., 2018), anger pandemics (Berger and Milkman, 2012; Fan et al., 2014), and loneliness (Hunt et al., 2018), to name a few. In this context, quality science communication 1 has arguably become critical for the future of humanity.
The diffusion of information is currently undergoing a major shift with the rise of online platforms. According to a survey by Shearer and Gottfried (2017), around two out of three Americans report that "they get at least some of their news on social media." In this paper, we will focus on the particular case of YouTube, mostly because of its scale. YouTube claims 2 to have 2 billion users, each with an average of 30 min of daily watch time. According to Lewis (2020), this adds up to more daily views on YouTube (6.8 billion views) than searches on Google (5.9 billion queries). Through repeated exposure (see Zajonc and Rajecki, 1969; Kahneman, 2011; Kramer et al., 2014; Jackson, 2019), a vast fraction of the population seems to be greatly influenced by what YouTube exposes them to on a daily basis (Lewis, 2018). To understand today's science communication, it thus seems critical to have at least an overview of the science communication produced and shared on YouTube. This will be the topic of section 2.
An important feature of YouTube is the central role played by its recommendation algorithm, which suggests home page videos, lists videos to watch next, and responds to YouTube search queries. According to YouTube's Chief Product Officer (see Solsman, 2018), 70% of the views on YouTube result from recommendations by the YouTube algorithm. This suggests that this algorithm is particularly critical for the future of communication on any topic. If the YouTube algorithm were somehow tweaked to recommend quality science content twice as often as it currently does, then we might expect, to a first-order approximation, that quality science YouTube videos would increase their reach by 70%; it is not clear, however, whether second-order effects would decrease or increase this figure.
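The first-order estimate can be spelled out as follows. This is only a sketch under two assumptions that the text does not state explicitly: that quality science videos receive the same share r = 0.7 of their views from recommendations as YouTube videos overall, and that views from other sources are unaffected by the change. The symbols V (current views), V' (views after the change), and r are introduced here for illustration only.

```latex
\begin{align*}
V  &= \underbrace{rV}_{\text{recommended}} + \underbrace{(1-r)V}_{\text{other sources}}, \qquad r = 0.7,\\
V' &= 2rV + (1-r)V = (1+r)\,V = 1.7\,V .
\end{align*}
```

Under these assumptions, the relative gain (V' - V)/V equals the recommendation share r, which is where the 70% figure comes from.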
Interestingly, a 2019 survey by Shearer and Grieco (2019) suggests that a large fraction of Americans are aware of this large-scale impact of recommendation algorithms. Around two thirds of surveyed individuals believe that social media companies have "too much control over the news people see." Given the stakes of science literacy in the twenty-first century, as will be argued in section 3, it might be urgent to demand that the algorithms that control so much of the flow of information on social media be aligned with quality science communication, as argued by Hoang (2020a).
Unfortunately, as discussed at length by El-Mhamdi and Hoang (2019), such an alignment of recommendation systems is challenging for both technical and non-technical reasons. In section 4, we discuss several of these challenges, as well as proposals to surmount them. In particular, we will defend the need for expert-driven content recommendation, such as the proposal of Hoang (2020c), which builds upon the collaborative ethical design framework proposed by Noothigattu et al. (2018) and Lee et al. (2019).

A BRIEF OVERVIEW OF SCIENCE ON YOUTUBE
The early 2010s saw the rise of numerous science YouTube channels, such as Veritasium, Smarter Every Day, Numberphile, CGP Grey, Minute Physics, and ASAP Science among numerous others. Since then, thousands of channels have grown, some of which now have over 1 million subscribers. A few of these channels, such as SciShow or Mark Rober, have received over 1 billion views in total. However, most science YouTube channels remain small, with apparently a heavy tail of very small channels (Blanchard et al., 2018). Most successful channels seem mostly supported by in-video advertisement, YouTube's added advertisement, crowd-funding, and derived products such as books or goodies.
However, several channels have been supported or launched by organizations. The Public Broadcasting Service (PBS) Digital Studio hosts Crash Course, Physics Girl, It's Okay to Be Smart, PBS Space Time, PBS Infinite Series, and PBS Hot Mess, while specific series within channels, like Mind Field, Planet Slow Mo, The School of..., Could You Survive the Movies?, and Sleeping with Friends, are supported directly by YouTube itself. A few journalistic organizations also produce consistently successful science videos, like Vox, Wired, and Seeker, while other institutions like NASA, the World Health Organization, and the National Science Foundation have less consistent success. Recordings of lectures or talks, published by foundations or institutes like TED or MIT OpenCourseWare, also have a large variance in terms of success. A few videos exceed 1 million views, but most have fewer than 1,000 views.
Successful videos cover a wide range of topics from theoretical physics to social sciences. Perhaps more surprisingly, they also cover a wide range of technical levels. Remarkably, The hardest problem on the hardest test, a video by 3Blue1Brown, has received over 7 million views, while The Banach-Tarski Paradox by VSauce has received over 29 million views, even though these videos present proofs from high-level mathematics. In fact, some channels like Two Minute Papers (with nearly 700,000 subscribers) are devoted to research publications.
A survey by Beautemps and Bresges (submitted) yields greater insight into the audience of science YouTube videos. According to the survey, most viewers of a selection of German science YouTube channels are young, between 13 and 24, and overwhelmingly male (88%). Around 60% of viewers are not studying natural sciences (10% provided no answer), but around 85% of them have an interest or a strong interest in natural sciences. Science videos seem to currently fail to attract viewers with little prior interest in science.
Arguably, in terms of views, subscribers, and engagement, the most successful format consists of a presentation by a host, with multiple illustrative images accompanying the host's explanations. The host often speaks directly on camera, though they are sometimes completely off camera. There is often a single host, or at least a main host, who may then feature extracts of interviews of experts (Welbourne and Grant, 2016). One notable exception is the format of Periodic Videos, Objectivity, and Computerphile, among others, where the host is secondary, and the explanations are essentially provided by a given expert.
Having said this, many other formats can be successful as well, including lectures, media conferences, and interviews. In particular, in more recent years, podcasts in the form of discussions between hosts, with or without guests, have gained importance, on channels like Hello Internet, Mindscape, and Lex Fridman. Finally, there are interesting artistic takes on sometimes very rigorous science, for instance on Epic Rap Battle or acapellascience. Perhaps most iconic in this regard is a collaboration between Vietnamese health authorities and Vietnamese artists to alert the population of the COVID-19 risks. This resulted in the song Ghen Cô Vy that had over 67 million views on YouTube alone (the song has been viral on TikTok too, and it has been remixed in many ways on numerous channels).

THE NEW BOTTLENECK FOR SCIENCE COMMUNICATION
On March 11, 2020, the World Health Organization characterized COVID-19 as a pandemic. Shortly afterwards, an extraordinary collaboration of 39 French science YouTubers collectively produced and published the same Creative-Commons video, entitled "Coronavirus: Chaque JOUR compte" 3 and released on March 14, which urged viewers to physically distance themselves from one another and, if possible, to stay home. The video obtained a total of half a million views on at least 28 channels 4.
It is quite remarkable that the equivalent of nearly 1% of the French population watched this video, which was produced for free by a large group of volunteers. Nevertheless, the fact that this video did not reach a lot more users can be seen as a missed opportunity. Given that the video was released at the heart of the exponential spread of COVID-19 in France, Hoang (2020b) estimated, based on very rough calculations 5, that the video might have saved around 10 lives. But if the video had obtained 10 million views, then perhaps hundreds of lives would have been saved.
This example highlights a critical feature of science communication. It is not sufficient for quality content to be accessible. For the content to actually be impactful (in this case, to save many lives), it also, most importantly, needs to be accessed. In fact, nowadays, the bottleneck of science communication is arguably no longer the production of quality content, at least on widely covered topics such as vaccination, climate change, and scientific methods. More often than not, the bottleneck has become the large-scale promotion of top-quality content.
In this context, especially on YouTube, one entity is overwhelmingly more influential than any other: YouTube's recommendation algorithm. Recall that around 70% of the views on YouTube result from this algorithm's choice of which video to recommend. The algorithm could easily be designed so that a particular video obtains 10 million views instead of half a million. Unfortunately, thus far, especially for outsiders, not only is the algorithm largely a black box, but so is much of what is happening on the platform. More transparency seems critical to better understand the impacts of YouTube on society, and what can be done to avoid the nasty side effects of the platform (Taylor, 2020).
More generally, the flow of information is perhaps what shapes our societies the most in terms of economy, science, public health, politics, activism, daily habits, and beliefs. What entity controls the flow of information the most? Arguably, this entity is no longer a human; YouTube's algorithm is arguably the entity that controls the flow of information in the world the most. As a result, the future of science communication seems to be, by and large, in its hands.

3 Coronavirus: Each DAY counts.
4 The author of the present paper is one of the participants of this massive collaboration.
5 Essentially, the model assumed that only half of the population was prudent, that prudence reduced R from 2 to 0.8, that 80% of the viewers were prudent, and that the video convinced half of imprudent viewers to be prudent. It then projected an exponential growth over 8 weeks. The projections should be taken with a huge grain of salt, as the results are unfortunately very sensitive to the parameters.

THE CHALLENGES FOR ROBUSTLY BENEFICIAL RECOMMENDATIONS
Following the COVID-19 crisis, on June 11, 2020, YouTube's CEO Wojcicki (2020) announced on YouTube's official blog that, among other things, YouTube is consulting the World Health Organization and local health organizations on a regular basis to combat "harmful medical misinformation". Wojcicki (2020) claims that, as a result of this, 200,000 videos were removed from YouTube.
While this cooperation is arguably great news, it is noteworthy that the focus seems to be mostly on removing misleading, abusive, and hateful content. Unfortunately, such removals are often described as censorship by critics. In fact, there may be a reasonable fear that such removal decisions may fuel some conspiracy theories that already contest current authorities. This may be all the more the case when the author of the removed content is a major political figure, as in the case of the removal of President Trump's claim that children are "almost immune" to COVID-19 (see Culliford, 2020). More generally, content removal seems to be associated with a high risk of backfire effects (Nyhan and Reifler, 2010; Trujillo et al., 2020).
Perhaps rather than a spectacular binary decision to remove misleading content, a more nuanced solution could be to downgrade the recommendation rate of more problematic content. Interestingly, in an interview by Sandlers (2019), Tessa Lyons, Facebook's Product Manager in charge of news feeds, says that Facebook used to only remove content that violated the platform policies, such as pornography, hate speech, and graphic violence. However, they found out that users were then seeking to post the most extreme content just below the removal threshold because such content was a lot more likely to go viral. Now, Lyons says, Facebook instead makes sure, using its recommendation algorithm, that content that approaches the removal threshold is widely de-recommended. Interestingly, this also incentivizes users to produce less extreme content. Arguably, more research into the effects of de-recommendation, as opposed to content removal, is needed to better understand which strategy is preferable.
But perhaps Facebook's approach does not go far enough. Recommendation algorithms have the potential to shift the battlefield of the economy of attention (Franck, 2019), where every user, influencer, advertiser, activist, web platform, and politician competes for the attention of their audience. These days, this battlefield is arguably mostly dominated by those who have invested the most in hacking social media algorithms and user attention, often with clickbait, divisive, and addictive content (Tufekci, 2017), with the aim of financial or political gain. Robustly beneficial recommendation algorithms could not only help to fight harmful content; they could also make sure that top-quality content is not drowned in an ocean of sensationalism (Jackson, 2019). In fact, these days, the main form of misinformation may not be misleading information but rather the prevalence of unimportant information.
"In the past, censorship worked by blocking the flow of information, " Harari, 2016 argues. "In the twenty-first century, censorship works by flooding people with irrelevant information. People just don't know what to pay attention to, and they often spend their time investigating and debating side issues." The more important news that fail to reach a large audience are sometimes known as mute news. The lack of attention to such mute news may be a greater concern than the presence of fake news (Rosling et al., 2018). Yet, to solve the problem of mute news, removing bad apples will be useless. What should perhaps be done, instead, is to identify top-quality content and flood YouTube with a stream of this content.
Evidently, one critical challenge will be to convince YouTube and other social media platforms to adopt such a strategy. This will not be easy. After all, it will likely at least partly conflict with companies' current desire to maximize user retention (Franck, 2019). Social pressure, and probably regulation, will likely be critical to get there. Interestingly, however, such companies seem to be giving increasing attention to the ethics of their recommendations, even though this attention arguably remains largely insufficient (Nicas, 2020). Perhaps more importantly, reliable and scalable solutions to identify quality content are still lacking, which makes advocating for their implementation very difficult.
In fact, identifying top-quality content is a challenging endeavor in itself. For one thing, it seems important to acknowledge that even science communicators will disagree on the definition of what "quality content" is and thus on what makes content worth sharing.
Clearly, the reliability of the information presented by the content is a key feature. Surely, like quality scientific publications, quality science communication should be accompanied with reliable sources and should present the scientific consensus if it exists. Quality content should be transparent about the methods used and should perhaps also share the data it has relied on. But this is arguably far from being the only thing that matters.
As discussed earlier, it seems at least as critical to prioritize content that addresses important topics. Reliable content on some anecdotal event is usually not the most urgent content to be shared on a large scale. Perhaps more interestingly, the effect size of an idea discussed in a piece of content should also be taken into account. For instance, in terms of environmental impact, content arguing for local food consumption may not be as important to promote as content arguing for the reduction of meat consumption (see Weber and Matthews, 2008; Ritchie and Roser, 2020). Unfortunately, fully agreeing on what is important is likely hopeless. Voting methods are probably critical to reach any agreement.
But this is not all. Arguably, quality content should also be extremely pedagogical rather than superficial or dogmatic. In fact, Muller (2008) showed that even correct and seemingly pedagogical videos in physics can fail to reliably teach concepts to viewers. Worse, such content may increase the viewer's self-confidence even when the viewer's prejudices are contradicted by the video. Overall, it seems critical to further investigate what makes pedagogical videos effective and to make sure that the results of this research are taken into account to promote videos that really have a strong positive impact.
There are still other features that may also seem critical to identifying the videos worth sharing. A top-quality video should arguably also be engaging. In fact, it seems desirable that it presents science enthusiastically, raises numerous questions, points to further contents, promotes intellectual humility and triggers genuine curiosity (Davies, 2019). In fact, the evidence collected by Kahan et al. (2017) suggests that scientific curiosity is a critical trait to fight politically motivated reasoning. Additionally, quality content should probably also minimize the risks of backfire effects, such as viewers increasing their confidence in their biased views (Taber and Lodge, 2006).
At this point, it seems clear that the content produced by recognized authorities, like the World Health Organization, is not always the content that should be promoted in priority. The YouTube ecosystem hosts millions of videos designed by science communication talents. It would probably be greatly suboptimal not to recommend these videos.
Unfortunately, identifying these videos in the ocean of YouTube content is an extremely challenging task. Recently, the Tournesol framework has been proposed by Hoang (2020c). Tournesol aims to query experts to collect data on what experts regard as quality content according to the different features we discussed. More precisely, it asks any user from a trusted institution (universities, health agencies, NGOs, etc.) to register on the platform by confirming their certified email address. The user is then regarded as an expert by the platform. The expert is then asked to select any two videos of their choice and to say which video is more reliable, which is more pedagogical, which is more important, which is more engaging, and which is more resilient to backfire risks. Note that since Tournesol only aims at identifying quality content, it need not be exhaustive about its video reviewing process.
Tournesol then leverages a machine learning model inspired by Bradley and Terry (1952) to infer what scores the expert would assign to the videos they rated for different quality features. Tournesol then aggregates the scores from different experts using a median-like operator, akin to majority judgment (Balinski and Laraki, 2011). The global scheme is a collaborative computable ethics design inspired by Noothigattu et al. (2018) and Lee et al. (2019). Users who search for content can then adjust the importance they give to the different quality features, to obtain personalized recommendations. Perhaps this framework, which needs further research to optimize, may pave the way toward more robustly aligned recommendation algorithms 6.
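To make the pipeline described above more concrete, here is a minimal Python sketch of its three steps: inferring per-expert scores from pairwise comparisons with a Bradley-Terry-style model, aggregating scores across experts, and ranking videos with user-chosen feature weights. It is an illustration under stated assumptions, not Tournesol's actual implementation. The function names, the gradient-ascent fit, the L2 penalty, the learning rate, and the use of a plain median (standing in for the "median-like operator") are all choices made for this sketch.

```python
import math
from statistics import median


def bradley_terry_scores(comparisons, steps=500, lr=0.1, l2=0.1):
    """Infer one expert's scores for one quality feature.

    comparisons: list of (preferred_video, other_video) pairs.
    Model: P(i preferred to j) = sigmoid(score_i - score_j).
    The L2 penalty (an assumption of this sketch) keeps scores finite
    when a video wins or loses all of its comparisons.
    """
    videos = sorted({v for pair in comparisons for v in pair})
    score = {v: 0.0 for v in videos}
    for _ in range(steps):
        # Gradient of the penalized log-likelihood.
        grad = {v: -l2 * score[v] for v in videos}
        for preferred, other in comparisons:
            p = 1.0 / (1.0 + math.exp(score[other] - score[preferred]))
            grad[preferred] += 1.0 - p
            grad[other] -= 1.0 - p
        for v in videos:
            score[v] += lr * grad[v]
    return score


def aggregate(expert_scores):
    """Median of each video's score over the experts who rated it.

    expert_scores: list of {video: score} dicts, one per expert.
    A plain median is used here as a stand-in for a median-like operator.
    """
    videos = {v for scores in expert_scores for v in scores}
    return {
        v: median(scores[v] for scores in expert_scores if v in scores)
        for v in videos
    }


def rank(feature_scores, weights):
    """Rank videos by a weighted sum of aggregated feature scores.

    feature_scores: {feature: {video: aggregated score}}.
    weights: {feature: importance chosen by the user}; missing features
    count as zero, and unrated videos get a score of zero on a feature.
    """
    videos = set()
    for scores in feature_scores.values():
        videos.update(scores)
    total = {
        v: sum(
            weights.get(feature, 0.0) * scores.get(v, 0.0)
            for feature, scores in feature_scores.items()
        )
        for v in videos
    }
    return sorted(videos, key=lambda v: total[v], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: two experts compare videos "A", "B", "C" on reliability.
    expert_1 = bradley_terry_scores([("A", "B"), ("B", "C"), ("A", "C")])
    expert_2 = bradley_terry_scores([("A", "C"), ("B", "C")])
    reliable = aggregate([expert_1, expert_2])
    print(rank({"reliable": reliable}, {"reliable": 1.0}))
```

A user who cares mostly about pedagogy would pass a larger weight for that feature to `rank`, which is one simple way to realize the personalized recommendations mentioned above. Experts need not rate every video, since the median is taken only over the experts who rated a given video.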

CONCLUSION
In this paper, we stressed the importance of science communication for the future of our societies. We also argued that today's main bottlenecks to science communication are the social media platforms' recommendation algorithms and YouTube's in particular. We discussed the challenges posed in trying to make the YouTube recommendation algorithm robustly beneficial, and we also touched on a currently investigated path to partially solve these challenges. We are facing a challenging endeavor. But, as argued by El-Mhamdi and Hoang (2019), this great endeavor may be viewed above all as a fabulous endeavor. After all, it boils down to answering what is arguably the most central question of science communication: What is quality science communication?

AUTHOR CONTRIBUTIONS
The author confirms being the sole contributor of this work and has approved it for publication.