16/04/2022
Over the past 30 years, the democratization of the internet has made it possible for researchers, journals, and publishers to provide free online access to scholarly papers. This practice, also known as open access (OA), allows anyone with an internet connection to access, read, distribute, and download scientific publications for free with no legal or technical barriers [1]. OA publishing is no longer a marginal phenomenon, thanks to a massive rise in OA mandates [2], the introduction of several new OA publishers and OA options for legacy publishers [3], the creation of open-source software that facilitates the production of publications (such as the Public Knowledge Project), and the rise of OA mega-journals such as PLOS ONE and Scientific Reports [4].
The advantages of OA have been well-documented: increased global visibility [5], higher citation rates [6, 7], and a better use of taxpayers’ money [8]. Several studies have attempted to assess the overall share of OA publications in the scientific literature, with results ranging from 27.9% to 53.7%, depending on the data source and period of investigation [6, 7, 9, 10]. The width of this range demonstrates the uncertainty and variability in these estimates. This study aims to compare the proportion of OA as represented in two prominent bibliometric databases, Web of Science (WoS) and Dimensions, and to assess how the different coverage of these two databases may affect the measurement of OA across countries.
Data sources
The Science Citation Index (SCI) was originally developed by Eugene Garfield [11] to help librarians and researchers find articles and journals relevant to their work through citation indexing. Since it was impossible to manually index the entire range of journals (~50,000 at the time [11]), only the most cited periodicals were indexed. For decades, WoS remained the main, if not the only, source of large-scale bibliometric data. However, over the past 15 years, there has been a proliferation of new data sources such as Scopus (2004), Google Scholar (2004), Microsoft Academic (2016), and more recently, Dimensions (2018). Their different approaches to indexing lead to inevitable differences in coverage, which have been well-studied in several previous investigations [12–17].
For instance, Mongeon and Paul-Hus [13] have shown that, compared to Scopus, WoS has significantly lower coverage of research in all fields and is also much less likely to index journals from non-English-speaking and developing countries [13, 18]. Dimensions has much broader coverage than both WoS and Scopus [16, 19, 20]. This is largely explained by the fact that Dimensions uses Crossref (among other sources) to populate the database and relies on a single inclusion criterion (i.e., the presence of a Digital Object Identifier (DOI)) rather than on selective criteria (e.g., citations or reputation). Despite this lack of selectivity, some journal articles indexed by Scopus are not indexed by Dimensions, because not all publications have a DOI [16]. Nevertheless, Dimensions remains by far the largest and broadest indexer of scientific documents. It remains to be seen whether the use of this database produces different outcomes in studies of OA.
Country differences in OA practices
Countries differ in the proportion of their publications that are OA [6, 9, 21]. One explanation is disciplinary: there are well-established differences in OA practices across disciplines [6, 22], and countries differ in their disciplinary profiles [23, 24]. Policy can also drive differences, with institutional and government mandates varying in both their scope and intensity across countries [2]. These differences often intersect, sometimes in unexpected ways, with levels of economic development. For example, Iyandemye and Thomas [25] found regional differences in OA publication in biomedicine, with a high percentage of OA publication in low-income countries and countries in sub-Saharan Africa, moderate levels in North America and Europe, and low levels in North Africa and South Asia. They suggested that a combination of article processing charge (APC) waivers, self-archiving infrastructure, and funder policies could be contributing to these differences between countries.
The approaches that developing and developed countries use for OA dissemination have historically differed [5, 10]. Developed countries tend to make use of repositories, with self-archiving mandates in place at many institutions [26] and funders [2]. These mandates may be supported by corresponding infrastructure, such as the government-funded PubMed repository or institutionally supported repositories. Repositories are less prevalent in developing countries, as reported by the Registry of Open Access Repositories (http://roar.eprints.org/). Instead, authors from developing countries tend to make use of OA journals [27], and various initiatives in these countries and regions focus specifically on supporting local journals and launching OA journals to promote regional research. Such platforms include AJOL (Africa), AmeliCA (Latin America), and SciELO (Brazil).
In addition, OA is built on the assumption that internet access is a basic public utility that is reliably and conveniently available to everyone. This flawed assumption places developing countries at a significant disadvantage in discussing OA, implementing the infrastructure to support it, and benefitting from it [28]. For example, in 2018, nearly 75% of the African population did not have access to the internet [29]. This lack of (affordable) internet access sometimes extends to researchers at African universities [30]. A similar assumption is made about the affordability of OA for researchers. APCs could make it prohibitively expensive for researchers from developing countries to make their articles OA through hybrid OA and APC-charging OA journals. Full APC waivers for researchers from low-income countries, as opposed to partial waivers for middle-income countries, could also be contributing to differences in OA publication practices [25, 31].