Wikipedia:Wikipedia Signpost/2020-08-02/Special report

Special report

Wikipedia and the End of Open Collaboration?

This article was originally published by Wikipedia@20. The authors have generously allowed The Signpost, and others, to publish it under a CC-BY license.—S

The conditions that enabled Wikipedia to grow and become an invaluable free and open information resource have changed, both within Wikipedia itself and in the wider world. Hill and Shaw ask if there is room for optimism about the future of Wikipedia or open collaboration more broadly.
— The editors of Wikipedia@20

Was Wikipedia a fluke? Just luck? Some freak accident of timing, technology, and vision? Since the project began in January 2001, no attempt at collaborative knowledge production has produced as large, widely used, or valuable a public resource. Given its exceptional character, any attempt to explain Wikipedia’s growth and impact, much less draw general insights on how to replicate it, can seem like a fool’s errand.

Figure 1: Number of active contributors to the eight largest language editions of Wikipedia measured by the number of total editors as of January 2019[1]

Wikipedia’s incredible success masks a more complex story. In fact, not even Wikipedia has been able to maintain a stable community of volunteers over the past two decades. Figure 1 shows the number of “active” contributors to eight of the largest language versions of Wikipedia over time. The top left panel shows English Wikipedia’s explosive contributor growth through March 2007 and its transition into a long, slow period of decline. The other panels show similar patterns across the seven next-largest Wikipedia language versions measured by contributor base. Readership and other uses of Wikipedia have increased steadily over the period shown. As scholars of open collaboration and as concerned contributors to, and users of, Wikipedia, we have made these dynamics the center of much of our research over the last decade.

Although the death of Wikipedia has been foretold many times, the dynamics playing out in the graphs above imply that long-term decline in contributors may be undermining the project from within in important ways. As the contributor bases of most of the large Wikipedia language versions shrink over time, fewer editors means reduced capacity to cover new topics and to maintain high-quality content. What future do these projects have? What explains the patterns illustrated in Figure 1? What should Wikipedia and proponents of open collaboration and knowledge do?

Lifecycles of peer production projects

Although no other attempt at collaborative production has produced a knowledge base exactly like Wikipedia, Wikipedia is far from unique. At the heart of Wikipedia lies a model of technology-mediated collaboration called "peer production." Wikipedia did not invent peer production and thousands of other efforts at peer production have occurred. Many of these have used Wikipedia as a model or a source of inspiration. For example, thousands of other wikis (collaboratively written knowledge bases that can be edited and accessed online) have used the same software and organizing principles as Wikipedia to create public information resources on diverse topics. Six such projects hosted by the commercial firm Fandom (formerly known as Wikia) cover such topics as the city of Seattle, crocheting, the book and film series Hunger Games, academic jobs, Star Wars (Wookieepedia), and Starbucks. Each of these projects shares a common software infrastructure with Wikipedia and with each other, but they vary in size, goals, contributors, and the sets of rules and norms governing participation. Further afield, other peer production projects that create free and open source software, news communities, and technical support vary even more.

Despite their diversity, some shared patterns also characterize peer production projects. For example, most attempts to create new projects do not amount to much in terms of output or collaborative activity. For every wiki that attracts multiple contributors to create and edit content, many others never attract a single follow-on contributor. This is true of other kinds of peer production projects as well. The rare, successful attempts to build large-scale collaborations also share a lot in common with Wikipedia in terms of how they organize the production of knowledge. For instance, clear founding visions, a core of committed early-stage participants, and strong cultural norms about what constitutes high quality outputs (a good encyclopedia article, for example) help differentiate the projects that grow from those that do not.

Figure 2: Number of active editors with at least 5 edits per month in standard deviation units for 740 of the largest wikis from Fandom/Wikia. The dashed lines represent the results of a LOESS regression. The error bars represent bootstrap 95% confidence intervals.

Strikingly, many larger and longer-lasting peer production projects seem to unfold over time in ways that are similar to Wikipedia. Figure 1 shows how the patterns of growth and decline seen in the contributor base of English Wikipedia are surprisingly general across different Wikipedia language versions. A similar dynamic is evident in Figure 2, which shows the number of active contributors to the largest 740 wikis hosted by Fandom/Wikia over the first five years of each wiki's history.[1] To allow for comparison across communities of different sizes, the y-axis shows a standardized measure of the number of active contributors calculated as a proportion of the monthly average for each community.
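To make the idea of a standardized measure concrete, here is a minimal sketch of two common ways to put wikis of very different sizes on a shared scale. It is an illustration only, not the authors' analysis code: the function names and the monthly counts are hypothetical, and the sketch assumes each wiki has a nonzero average and some month-to-month variation. The first function follows the description in the text (each month as a proportion of the wiki's own monthly average); the second follows the Figure 2 caption (standard deviation units).

```python
from statistics import mean, pstdev


def as_proportion_of_mean(monthly_counts):
    """Scale each month's count by the wiki's own monthly average."""
    avg = mean(monthly_counts)
    return [count / avg for count in monthly_counts]


def as_std_units(monthly_counts):
    """Express each month's count in standard deviations from the wiki's own mean."""
    avg = mean(monthly_counts)
    sd = pstdev(monthly_counts)  # population standard deviation of this wiki's series
    return [(count - avg) / sd for count in monthly_counts]


# Hypothetical active-editor counts: same shape, very different sizes.
small_wiki = [2, 4, 6, 4, 4]
large_wiki = [200, 400, 600, 400, 400]

print(as_proportion_of_mean(small_wiki))
print(as_proportion_of_mean(large_wiki))
```

Under either scaling, the two hypothetical wikis trace the same curve, which is what makes it possible to compare the shape of growth and decline across communities regardless of their absolute size.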

Figures 1 and 2 provide evidence of general lifecycle dynamics and undermine two common explanations for English Wikipedia's decline in contributor base. First, wikis with vastly different numbers of speakers/readers around the world all seem to "peak" after a few years. This implies that Wikipedia’s rise and decline did not occur simply because the community has already written everything or exhausted the pool of potential contributors. Compared to English Wikipedia, Spanish Wikipedia and Hindi Wikipedia contain a small fraction of the number of articles and are edited by a fraction of the number of speakers of each language. Still, they exhibit similar patterns of growth and decline. Second, the fact that the communities seem to peak at different points in time makes it unlikely that some single external factor can explain the dynamic. For example, English Wikipedia happened to enter its decline around the time that Facebook grew to massive popularity. But other communities peaked at later dates.

Lifecycle dynamics driven by forces operating within communities offer a simple explanation for the similar patterns across diverse projects, languages, communities, cultures, times, and places. However, if communities have lifecycles, what forces drive them? What determines whether a community finds itself in a growth phase or a period of decline or stabilization? A partial explanation derives from the reasons why Wikipedia and other projects grew in the first place.

Some early scholars and proponents of peer production have suggested that peer production projects like Wikipedia grew because they were organized around deeply "open" institutions. Institutions, according to social scientists, are the written and unwritten rules that govern how people interact in the world. From an institutional perspective, Wikipedia was incredibly open in its early days in that it had porous boundaries and very little in the way of formal rules. Even today, almost anybody can join Wikipedia with no more than a click of the edit button. This openness—a lack of what you might call “organizational structure”—made it much simpler to join the projects and contribute. Many other large peer production projects began as similarly open institutions. By lowering the transaction costs associated with collaborative production, open institutions may have catalyzed the growth of peer production communities, leading to the initial phase of accelerating contributions in projects that manage to attract a critical mass of participants.[2]

Over time, the initial openness of these communities seems to shift. A body of recent research—including a number of studies that we have conducted—suggests that the decline in contributors in both English Wikipedia and a range of other peer production projects has been driven by an increase in newcomer rejection that results in a decrease in newcomer retention.[3] In general, it appears that peer production projects' decline is not primarily a function of a decrease in potential contributors, but of existing community members turning away newcomers at an increasing rate. In other words, Wikipedia and its cousins are declining precisely because they seem to have moved from a more open model of collaboration to one that is more "closed".

Why are peer production communities becoming more closed in ways that cut off the contributions that sustain them? In our own research over the last several years, we have demonstrated that organizational closure typically emerges in reaction to real threats that communities face. In English Wikipedia, these threats include surges in vandalism.[4] Vandalism of Wikipedia articles can be relatively innocent and goofy, like a campaign to edit the actor Jeremy Renner’s biography so that it claimed that he is a velociraptor. It can also be profound in its consequences, like the vicious hoax made by editing the journalist and statesman John Seigenthaler’s biography to suggest he had been implicated in RFK’s assassination.[5] Other types of edits may damage the reputation of the community and degrade the trustworthiness of the content it produces. For example, the outdoor apparel brand The North Face worked with an advertising firm to replace images on Wikipedia of notable parks, waterfalls, and other outdoor sites with pictures that prominently displayed the company’s gear.[6] Multiple public relations firms offer services that involve surreptitiously scrubbing Wikipedia pages of true information a client does not want mentioned in public.[7]

Figure 3: Proportion of damaging edits to English Wikipedia based on a random sample of edits drawn from each half-year period from 2001 through mid-2010.

The threat of vandalism on Wikipedia has increased enormously over time. Figure 3 shows estimates of the percentage of “damaging” edits to English Wikipedia over a period of 9 years.[8] The figure paints a stark picture: the proportion of damaging edits rose from only 5% in 2001 through 2004 to around 30% (between 20,000 and 40,000 damaging edits per day) in 2007, around the time that Wikipedia’s active contributor base peaked. These data suggest that Wikipedia’s increased rates of newcomer rejection should be understood, at least in part, as reactions to very real increases in both the amount and proportion of bad stuff coming in the door.
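As a rough check on how the percentage and the daily counts above relate, the following sketch back-calculates the total daily edit volume they imply. It takes the essay's rounded figures (about 30%, and 20,000 to 40,000 damaging edits per day) at face value; the function name is hypothetical, and the resulting totals are an arithmetic implication of those figures rather than numbers reported in the source.

```python
def implied_total_edits(damaging_per_day, damaging_share):
    """Total daily edits implied by a damaging-edit count and its share of all edits."""
    return damaging_per_day / damaging_share


# Figures as rounded in the text: ~30% damaging, 20,000 to 40,000 damaging edits per day.
low = implied_total_edits(20_000, 0.30)
high = implied_total_edits(40_000, 0.30)

print(round(low), round(high))  # 66667 133333
```

Read this way, the 2007 estimate corresponds to somewhere between roughly 67,000 and 133,000 edits per day overall, which gives a sense of the review workload that existing contributors faced.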

Again, this pattern is repeated in a range of other wikis.[9] This suggests that the rise and decline pattern reflects a durable feature of peer production organizations rather than a pathology of Wikipedia. Across many peer production projects, we find early periods of growth followed by increases in attacks and low-quality contributions followed by increases in rejections of good-faith and newcomer contributions in response. Declining contributor bases appear as an indirect effect of increases in the need to create, monitor, and maintain quality. Although communities can respond to threats in ways that incur more or less collateral damage, the success of a collaborative project predicated on openness leads to an influx of damaging contributions which leads to difficult decisions that communities must make. Institutional closure provides a way to protect the resources that communities have built.

We believe that the dynamics of growth and decline in peer production suggest a trade-off at the heart of Wikipedia’s experience over the last twenty years. When Wikipedia was an obscure hobbyist encyclopedia project, there was little incentive to vandalize it. In the early days, there was little incentive for firms like The North Face to launch a surreptitious product placement campaign in its illustrations. In both cases, few people would see any given change and few people would care. Wikipedia’s enormous success created the incentives for vandals and firms like The North Face that feel they can co-opt Wikipedia to their own benefit.

Over the last twenty years, Wikipedia has shifted away from a model where it was exclusively focused on building a knowledge base through widespread engagement. Today, that goal remains but it must be balanced against a second goal of maintaining the quality of the knowledge base that Wikipedia has amassed against a set of increasingly determined attacks caused by its growing importance. Where Wikipedia could previously rely on policies of openness, the need to maintain quality means that it has to turn toward new policies of closure and the formalization of boundaries, rules, and routines.

Lifecycle of the peer production model

Peer production includes much more than wikis. OpenStreetMap provides detailed, high-quality maps and StackExchange provides hundreds of question and answer sites that resolve thousands of inquiries every day. The products of free/libre open source software (FLOSS) communities, such as GNU/Linux operating systems, run most servers and mobile telephones. In citizen science, the collaborative bird-monitoring database eBird includes contributions from a network of nearly half a million birders.[10] Although we have not studied these communities in depth, some evidence indicates that the lifecycle dynamics we’ve found in wikis extend to many of these other peer production projects as well. For example, in FLOSS communities that predate Wikipedia by nearly a decade, patterns of growth, maturity, and decline are common.[11] This wider ecosystem has shifted away from the sort of open organizations that characterized early-stage peer production.

Until 2009 or so, much of mass collaboration online occurred in peer production communities. Closed alternatives were attempted too, but they rarely succeeded during this period. For example, Wikipedia was preceded by a series of less open forms of encyclopedia projects. They were also much less successful. Wikipedia’s two founders, Jimmy Wales and Larry Sanger, famously created Wikipedia as an experiment that would produce content to feed into a more tightly managed and expert-authored encyclopedia project called Nupedia. With a body of rules designed to ensure high quality articles vetted by experts, Nupedia famously failed to attract more than two dozen articles before folding. Wikipedia succeeded because it was structured openly.[12]

Just as community-based forms of collaborative production took off, experimentation also led to the creation of new organizational forms for building and sustaining information goods in other ways. Many of these forms were inspired by peer production, but are closed in ways that allow firms to maintain control and extract value. For example, Airbnb built the core of its enormous marketplace for housing along very similar lines to a set of “network hospitality” communities that predated it. Both CouchSurfing and Hospitality Club were online communities that predated Airbnb and provided ways to connect people who needed a place to stay with strangers who had an extra bedroom or a couch. Both expressly prohibited monetary exchange.

CouchSurfing and Hospitality Club worked in part because they used a series of techniques like peer-to-peer reputation systems based on interconnected networks of reviews and attestations. Intentionally or not, the creators of Airbnb were able to adapt many of these tools to the context of their venture-funded marketplace for “sharing” residential spaces around the world. Ironically, the communities built on non-monetary sharing began to decline right around the same time that the use of Airbnb and the “sharing economy” took off.

Figure 4: Comparison of yearly sign-ups of trusted hosts on CouchSurfing and Airbnb. Hosts are “trusted” when they have any form of references or verification in CouchSurfing and at least one review in Airbnb.

Taken from a paper published in 2017, Figure 4 uses data on all verified hosts on Airbnb and CouchSurfing based on when they signed up.[13] The two curves show that the number of Airbnb hosts eclipsed that of CouchSurfing at about the time that CouchSurfing plateaued and began adding new hosts at a decreased rate. While some portion of CouchSurfing hosts departed for Airbnb, we see this as two distinct processes. CouchSurfing appears to have been on the type of rise and decline trajectory described in the previous section. Airbnb supported a market-based form of network hospitality with a much higher ceiling. Airbnb and other market-based players have, in effect, adapted the tools of mass collaboration from peer production. In many cases, they have done so with much more success than their peer production predecessors enjoyed.

The iPhone App Store provides another powerful example. When Apple launched the iPhone in 2007, users could not install applications. The system built by Apple to keep the iPhone clean of non-Apple software was referred to as a “jail” by early iPhone users. Because iPhone users wanted to write and run custom applications on their smartphones, a large portion of them (a minimum of 25% according to one analyst) would “jailbreak” their phones and install a range of custom applications.[14] Many of these applications were released under free licenses and developed in peer production communities. After failing to prevent jailbreaking through a cat-and-mouse game, Apple eventually created its own App Store. The Apple App Store ensures that anybody can create and disseminate apps in a decentralized way similar to what occurred with the peer-produced software for jailbroken phones.

In the App Store, however, Apple put itself in the middle of every transaction. The company sets policies and decides what software will and will not go into the store. Apple also taxes any financial transactions between iPhone users and app developers. Sure, the software development ecosystem of the Apple App Store is “open” insofar as it creates a porous boundary of software development that extends beyond Apple, but it also remains closed in terms of critical questions of control over the ecosystem. Like Airbnb and Apple, firms have learned to enact strategic closure around critical parts of the online communities producing valuable goods and services collaboratively. They harness decentralized creativity, just like peer production communities do, but manage it to ensure that they preserve control and the ability to extract payments.

At a certain point, the growth of firm-controlled models of distributed collaboration and information exchange comes at the expense of peer production. The peer production model that created Wikipedia was the product of a moment where working openly in commons was the only available technique for building the kind of massive, public knowledge repository that is Wikipedia’s goal. In the ways we have detailed, large firms have found ways of harnessing the kind of open collaboration and community-based organization that made Wikipedia successful without placing the products of this collaboration in open commons or distributing decision-making authority over these communities to participants.

The rise and fall of the organizational form of peer production reflects a second lifecycle dynamic. Although the evidence for this second lifecycle is sketchier, we offer one more piece of anecdotal support in favor of it: the emblematic peer production communities were nearly all created before 2010. Linux was created in 1991, Apache in 1995, Wikipedia in 2001, OpenStreetMap in 2004, and StackExchange in 2009.

Mass collaboration and distributed knowledge sharing on the Internet have hardly slowed down. What has changed is the way that they are occurring. If Wikipedia were created today, we think it much more likely that it would have happened in a market. Which is to say that it would not have been Wikipedia at all.

The future

We could conclude our essay focusing exclusively on the gloom-and-doom side of our story. The largest, most impactful, and most storied peer production communities are past their peaks, at best. Some of the most deeply innovative, public-spirited, and transformative parts of the web that helped build invaluable digital infrastructure relied upon by billions cannot hope to compete with venture-funded and for-profit alternatives that have found ways to enclose and extract resources that might have been shared far more widely. These are fair, accurate interpretations that make us fearful for the future and our collective ability to build a world in which everyone enjoys free access to the sum of all knowledge.

At the same time, we find that thinking about peer production lifecycles in the context of Wikipedia’s first twenty years opens up several strategic opportunities. First, a deeper understanding of the patterns of community development that impact Wikipedia can inform policy interventions aimed at sustaining both the encyclopedic resources it has created and the communities of volunteers who have built them. Second, and in an analogous way, thinking about the past and present trajectories of peer production communities more broadly can justify opportunistic investments in the peer-produced resources that currently exist as well as those that might be created. Open access, peer-produced encyclopedias and other forms of public knowledge have enabled follow-on innovations, wealth, and social benefits that Wikipedia’s founders never saw coming.

The lifecycle dynamics we have described do not mean that Wikipedia is destined to die and disappear or that future efforts like Wikipedia cannot thrive. But because the conditions that allowed Wikipedia to emerge and grow have shifted, reproducing Wikipedia’s past successes will likely require additional resources and different tactics. With knowledge of the value that Wikipedia has produced in hand, it falls on us to tackle the new challenges of sustaining this value. We must develop new ways of balancing this goal with the goal of continued production. It falls on us to preserve the opportunity for similarly extraordinary collaborative efforts of the future.

Acknowledgments: This work was supported by the National Science Foundation (awards IIS-1617129 and IIS-1617468).

Notes

  1. ^ a b The figure is reproduced from Nathan TeBlunthuis, Aaron Shaw, and Benjamin Mako Hill, “Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects,” in Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI ’18) (New York, NY: ACM, 2018), 355:1–355:7, https://doi.org/10.1145/3173574.3173929.
  2. ^ Yochai Benkler, “Coase’s Penguin, or, Linux and ‘The Nature of the Firm’,” The Yale Law Journal 112, no. 3 (December 2002): 369, https://doi.org/10.2307/1562247; Yochai Benkler, The Wealth of Networks: How Social Production Transforms Markets and Freedom (New Haven, CT: Yale University Press, 2006).
  3. ^ Aaron Halfaker et al., “The Rise and Decline of an Open Collaboration System: How Wikipedia’s Reaction to Popularity Is Causing Its Decline,” American Behavioral Scientist 57, no. 5 (May 1, 2013): 664–88, https://doi.org/10.1177/0002764212469365.
  4. ^ TeBlunthuis, Shaw, and Hill, “Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects.”
  5. ^ John Seigenthaler, “A False Wikipedia ‘Biography’,” USA Today, November 30, 2005, https://usatoday30.usatoday.com/news/opinion/editorials/2005-11-29-wikipedia-edit_x.htm; Katherine Q. Seelye, “Snared in the Web of a Wikipedia Liar,” New York Times: 4, December 4, 2005, https://www.nytimes.com/2005/12/04/weekinreview/snared-in-the-web-of-a-wikipedia-liar.html.
  6. ^ Sarah Mervosh, “North Face Edited Wikipedia’s Photos. Wikipedia Wasn’t Happy,” The New York Times: Business, May 30, 2019, https://www.nytimes.com/2019/05/30/business/north-face-wikipedia-leo-burnett.html.
  7. ^ Rebecca Lefort and Ben Leapman, “MPs Accused of Wikipedia Expenses ‘Cover-up’,” The Telegraph, May 8, 2010, https://www.telegraph.co.uk/news/newstopics/mps-expenses/7696484/MPs-accused-of-Wikipedia-expenses-cover-up.html; Michael Cieply, “Wikipedia Pages of Star Clients Altered by P.R. Firm,” The New York Times: Business, June 22, 2015, https://www.nytimes.com/2015/06/23/business/media/a-pr-firm-alters-the-wiki-reality-of-its-star-clients.html; Liz Alderman, “Bell Pottinger, British P.R. Firm for Questionable Clients, Collapses,” The New York Times: Business, September 12, 2017, https://www.nytimes.com/2017/09/12/business/bell-pottinger-administration.html.
  8. ^ Visualization and new analysis of data shared by Halfaker et al., “The Rise and Decline of an Open Collaboration System.”
  9. ^ TeBlunthuis, Shaw, and Hill, “Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects.”
  10. ^ Team eBird, “eBird 2018—Year in Review,” eBird, December 21, 2018, https://ebird.org/news/ebird-2018-year-in-review.
  11. ^ Charles M. Schweik and Robert C. English, Internet Success: A Study of Open-Source Software Commons (Cambridge, MA: MIT Press, 2012).
  12. ^ Benjamin Mako Hill, “Almost Wikipedia: What Eight Early Online Collaborative Encyclopedia Projects Reveal About the Mechanisms of Collective Action,” in Essays on Volunteer Mobilization in Peer Production (Cambridge, MA: Massachusetts Institute of Technology, 2013).
  13. ^ Maximilian Klein, Jinhao Zhao, Jiajun Ni, Isaac Johnson, Benjamin Mako Hill, and Haiyi Zhu, “Quality Standards, Service Orientation, and Power in Airbnb and CouchSurfing,” Proceedings of the ACM on Human-Computer Interaction 1, no. CSCW (2017): 58:1–58:21, https://doi.org/10.1145/3134693.
  14. ^ Tom Krazit, “Apple: 250,000 iPhones Bought to Unlock,” CNET, October 23, 2007, https://www.cnet.com/news/apple-250000-iphones-bought-to-unlock/; Ethan Mollick, “Filthy Lucre? Innovative Communities, Identity, and Commercialization,” Organization Science 27, no. 6 (December 1, 2016): 1472–87, https://doi.org/10.1287/orsc.2016.1100.