RESEARCH ARTICLE

Article 50 million: an estimate of the number of scholarly articles in existence


Arif E. JINHA
Faculty of Post-Doctoral and Graduate Studies, University of Ottawa

Learned Publishing, 23: 258-263. doi:10.1087/20100308
© Arif E. Jinha 2010

ABSTRACT. How many scholarly research articles are there in existence? Journal articles first appeared in 1665, and the cumulative total is estimated here to have passed 50 million in 2009. This sum was arrived at based on published figures for global annual output for 2006, and analyses of annual output and growth rates published in the last decade.

Introduction

From the first model of the modern journal, Le Journal des Sçavans, published in France in 1665, followed by Philosophical Transactions, published by the Royal Society in London later that year [1], the number of active scholarly journal titles has increased steadily. In 2006 there were roughly 23,750 titles [2]. There are direct correlations between the numbers of researchers, journals, and articles [3]. Björk et al. [2] have argued that changes in the dynamics of literature-based research, provoked by the communications revolution, have made the article itself relevant today as the basic molecular unit of research communication.

The correlations are revealed by studies in the past decade on global research output that have reported the growth rate and annual figures for researchers, journals, and articles [3-6]. Researchers retire, but more new researchers emerge. Journals fold, but a higher number are launched. Changes over time in the number of active researchers and journals describe the dynamics of both publishing and research, and the increase in absolute size of active production [5].

However, the article has a static nature that makes it unique as a metric. Articles, once created and published, are rarely destroyed. They can always be reactivated, and through citation each article occupies a position in the architecture that researchers can continue to build upon. The article is born essentially through the efforts of journals and their publishers, but articles survive the death of journal titles. Although disciplines develop distinct fields of inquiry, there are ultimately no fixed boundaries in scholarship: this is a single system of documented written knowledge. Therefore a metric that describes the quantitative whole of this system (the global total of all modern scholarly journal articles in existence at the present moment or at any point in time) can be useful as a starting point for research into the structure of the system itself. Further, getting better estimates of the global volume of research can enable information scientists to achieve a great deal. Such estimates allow them to map the geography of knowledge production, identify routes to retrieval of articles and extract content, while ensuring its preservation and its availability for use.

This paper presents an estimate for the total number of all peer-reviewed articles published worldwide since 1665. Included is a replication of earlier studies showing the current numbers of active journal titles, reported here for the year 2009.

Literature review

Inquiry into the scope of production of scholarly articles through peer-reviewed journals and the universe of journal titles and articles has never been precise. However, several works exist that attempt to quantify global output of scholarship dating to the post-war Big Science period, as well as more recent works from the 1990s until the present. In 1963, Derek de Solla Price plotted the growth of journal titles from 1665 to 2000 and predicted that an astronomical 1 million journal titles would exist by 2000. Price also identified key relationships between research investment, the numbers of researchers, and the numbers of journal titles, abstracts, and articles. These relationships have been carried forward in more recent research. Estimates of the numbers of journal titles worldwide were made by King et al. in 1977 at 57,400 and in 1995 by Meadows and Singleton at 70,000-80,000 [7]. More recent research reports figures that are far more modest than these earlier estimates and predictions [7].

Mabe and Amin explained in the introduction to their 2001 paper that improvements to Ulrich's [8] system of classification allow for more realistic estimates [5], and Mabe followed up in a 2003 article with an argument for a novel approach based upon this [3]. Earlier estimates are considered high because researchers were unable to differentiate peer-reviewed journals from other periodicals and could not differentiate active journal titles from those that had closed. Significantly, the growth rate cannot be taken as exponential or cumulative as Price had assumed, and this can explain why we do not have anything like 1 million journal titles today [3,7].

Mabe [3] used search terms in Ulrich's classification system to filter in scholarly, refereed and active journal publications, as well as the AND NOT function for several terms that disqualify a database resource from being included in the definition of a scholarly/scientific journal. From this, Mabe estimated the global number of journal titles to be 14,694 in 2001. Mabe also followed up on Price's identification of the relations between the numbers of researchers, journal titles, and articles to test how reasonable the estimate happened to be. This was done by identifying the number of titles indexed in the ISI database and applying Bradford's law to estimate the quantity of non-ISI indexed journals. The second method produced an estimate of 16,000 titles [3]. Therefore the first estimate that defined the parameters more accurately and included only active, refereed, and scholarly journal titles can be traced to this study.

Tenopir and King [6] estimated a global annual output of 1 million articles at the turn of the twenty-first century, an estimate based on empirical data on the number of active researchers and the average research output per research author.

Björk et al. produced an estimate of 23,750 journal titles for 2006 using the method introduced by Mabe [3,5], although Björk et al. did not include the AND NOT filter in their study (this does not appear to be necessary in any case). The authors were then able to make the first estimate of global annual output of articles. In order to do this, they distinguished ISI-indexed titles, which as a rule produce more articles than non-ISI titles, and then determined the average number of articles per title for each category (by an indirect method for ISI described in the study, and by statistical sample of non-ISI titles). Calculating the sum of titles multiplied by the average number of articles per title for each category gives an estimate of 1,346,000 (rounded) articles for 2006. Incidentally, although ISI titles represent 36% of the total number of active journals, ISI articles represent 70% of the total number of articles [2].

At the time of revising this article, a lively discussion occurred on the American Scientist Open Access forum regarding the wide variance in estimates, demonstrating the ongoing difficulty in getting precise numbers [9]. Morris [10] discusses the limitations of relying on Ulrich's database, stating:

"[T]he directory's publishers are entirely reliant on the information supplied by the publishers of the journals listed therein. New journals are often not listed immediately. There can therefore be no hard-and-fast guarantees as to the completeness, currency, or accuracy of that information." (p. 299)

Concern was expressed on the AMSCI forum that a greater number of smaller journals, particularly those published in languages other than English and those published in developing countries, would be more likely to be excluded, leading to both a skewed view of the universe of academic publishing and an underestimation of its size [9]. This issue was discussed further by Tenopir and King [7] in their recent chapter in The Future of the Academic Journal. However, aside from embarking on a manual method of counting titles, Ulrich's remains the most comprehensive database for determining worldwide totals and the most sensitive to filtering for key distinctions such as active, refereed and scholarly titles [7,10]. Moreover, results from Ulrich's have been consistent with what we understand about the relationships between the numbers of researchers, titles, and articles as well as the growth rates [2,3].

Included in this study is a replication of the basic method of searching Ulrich's to determine the number of active journal titles in 2009.
From this figure, we can produce an estimate for global annual article output, assuming no great change in the average number of articles per title for ISI and non-ISI titles since 2006.

Methods

The estimate is based on the measurement of global scholarly output in 2006 reported by Björk et al. [2], and rests on the assumption that Mabe [3], Ware [4], Mabe and Amin [5], and Tenopir and King [6] are correct in reporting a steady increase in the number of researchers, journals, and articles over three centuries. While the average rate of increase in the number of journals is reported by these authors to be 3.26%, Ware reports a growth in article output of roughly 3% per year [4]. This produces a doubling time of just under 24 years. We chose 1726 as the initial year for our calculations, because it corresponds to the beginning of the line of steady growth of journals shown in Figure 1 (reproduced from Ulrich's Periodicals Directory 2001 in Mabe [3]).

Figure 1. Number of journals launched per year. Source: Ulrich's International Periodicals Directory, reproduced with permission from Mabe [3].
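The doubling time quoted in the Methods can be checked with a few lines of Python. This is only a sketch: it assumes growth compounds once per year at a constant rate, and the function name `doubling_time` is introduced here for illustration rather than taken from the article.

```python
import math


def doubling_time(annual_growth_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate.

    Assumes annual compounding: value(t) = value(0) * (1 + rate) ** t.
    """
    return math.log(2) / math.log(1 + annual_growth_rate)


# Growth rates quoted in the text: about 3% for articles, 3.26% for journal titles.
article_doubling = doubling_time(0.03)
journal_doubling = doubling_time(0.0326)

print(f"Article output doubles in about {article_doubling:.1f} years")
print(f"Journal titles double in about {journal_doubling:.1f} years")
```

At 3% the result falls between 23 and 24 years, consistent with the "just under 24 years" stated above.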

The literature reports steady growth for over two to three centuries [3,4]. The earliest period of publishing, from 1665 to the middle of the 18th century, shows less predictable growth. Ulrich's does not produce results for the number of journals as far back as 1726, so this method cannot be applied to determine the start figure. It appears reasonable to start 1726 with a number greater than zero but negligible compared with the global quantity today. Additionally, the number of articles can be set to a figure that produces closely matched results to estimates for global annual output in the past decade when the 3% growth curve is applied, a form of backward mapping. When the number of articles for 1726 is set to 344, the curve corresponds closely to Tenopir and King's [6] estimates of annual output at the turn of the millennium as well as to the estimate by Björk et al. [2] for the number of articles in 2006. This was done using an Excel spreadsheet. The estimated annual output for each year from 1726, and the cumulative total, are given in the online Appendix.

Mabe [3] reports that journal growth experienced its largest year-to-year increase during the Big Science period from 1946 to 1976, with lower-than-average rates before the Second World War and after 1976. The author used the same multipliers for the changes in the journal growth rate to adjust the article growth rate and calculated a sum taking into account this variability. Since the results for both calculations were almost identical, the more straightforward calculation based on the average growth rate is reported here.

Results

This method yielded an estimate of nearly 50 million articles by the end of 2008, with the figure expected to pass 50 million in 2009.
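The backward-mapped growth curve described in the Methods can be sketched in Python. The spreadsheet itself is not reproduced in the text, so the compounding convention used here is an assumption: output in year y is taken as 344 × 1.03^(y − 1726), and the cumulative total counts every year from 1726 up to and including y. The names `annual_output` and `cumulative_total` are illustrative.

```python
START_YEAR = 1726        # first year of the steady-growth line (from the text)
START_ARTICLES = 344     # backward-mapped output for 1726 (from the text)
GROWTH_RATE = 0.03       # roughly 3% annual growth in article output


def annual_output(year: int) -> float:
    """Estimated number of articles published in a single year.

    Assumption: constant 3% growth compounded annually from START_YEAR.
    """
    return START_ARTICLES * (1 + GROWTH_RATE) ** (year - START_YEAR)


def cumulative_total(year: int) -> float:
    """Estimated number of articles published from START_YEAR through `year`."""
    return sum(annual_output(y) for y in range(START_YEAR, year + 1))


for year in (1999, 2006, 2008, 2009):
    print(f"{year}: annual {annual_output(year):,.0f}  "
          f"cumulative {cumulative_total(year):,.0f}")
```

Under these assumptions the curve comes out within a fraction of a percent of the year-end totals reported for 2008 and 2009, and within about half a percent of the 1,346,000 articles estimated by Björk et al. for 2006. The authoritative year-by-year figures remain those in the online Appendix.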
In actual fact, the year that the sum of all scholarly articles passes 50 million cannot be determined precisely, but we can report that this result is the first estimate to follow from the current evidence for the grand total of all scholarly articles that exist at the time of writing in 2009 and publishing in 2010. In good humour, the author can make the claim that this article could itself be the 50 millionth to be published in history!

Estimated total at 31 December 2008 = 49,234,626
Estimated total at 31 December 2009 = 50,712,009

Replicating Björk et al. [2] and using the search terms Academic/Scholarly, Refereed, and Active in Ulrich's, the total number of active journal titles for the year 2009 is 26,406. Assuming little change in the proportion of ISI titles (36%) and using the averages given by Björk et al. for articles per title in ISI and non-ISI journals, the total number of published articles for 2009 is estimated from the relation of journal titles to articles: we estimate the number of ISI titles, subtract that from the total number of titles for 2009, and then multiply each category by its respective average number of articles. The average number of articles per title reported by Björk et al. for ISI titles is 111.7; for non-ISI titles, it is 26.2.

To determine the number of ISI titles, we multiply the total number of titles by 36%.
26,406 × 0.36 = 9,506 (i.e. the estimated number of ISI titles).

To determine the number of ISI articles, we multiply the number of ISI titles by the number of articles per title for ISI journals.
9,506 × 111.7 = 1,061,820 (i.e. the estimated number of ISI articles).

To determine the number of non-ISI titles, we subtract the number of ISI titles from the total number of titles.
26,406 - 9,506 = 16,900 (i.e. the estimated number of non-ISI titles).

To determine the number of non-ISI articles, we multiply the number of non-ISI titles by the number of articles per title for non-ISI journals.
16,900 × 26.2 = 442,780 (i.e. the estimated number of non-ISI articles).

To determine the annual global output of articles for 2009, we sum the number of ISI and non-ISI articles.
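The title-based steps above can be collected into a short Python sketch. The inputs are the figures quoted in the text; rounding to whole titles and whole articles at each step is an assumption made here so that the intermediate values match the rounded figures shown, and the function name `estimate_article_output` is illustrative.

```python
TOTAL_TITLES_2009 = 26_406          # active refereed scholarly titles in Ulrich's, 2009
ISI_SHARE_OF_TITLES = 0.36          # proportion of titles indexed by ISI
ARTICLES_PER_ISI_TITLE = 111.7      # average reported by Björk et al.
ARTICLES_PER_NON_ISI_TITLE = 26.2   # average reported by Björk et al.


def estimate_article_output(total_titles: int,
                            isi_share: float,
                            per_isi_title: float,
                            per_non_isi_title: float) -> dict:
    """Estimate annual article output from journal title counts.

    Assumption: each intermediate result is rounded to a whole number,
    mirroring the rounded figures given in the text.
    """
    isi_titles = round(total_titles * isi_share)
    non_isi_titles = total_titles - isi_titles
    isi_articles = round(isi_titles * per_isi_title)
    non_isi_articles = round(non_isi_titles * per_non_isi_title)
    return {
        "isi_titles": isi_titles,
        "non_isi_titles": non_isi_titles,
        "isi_articles": isi_articles,
        "non_isi_articles": non_isi_articles,
        "total_articles": isi_articles + non_isi_articles,
    }


result = estimate_article_output(TOTAL_TITLES_2009,
                                 ISI_SHARE_OF_TITLES,
                                 ARTICLES_PER_ISI_TITLE,
                                 ARTICLES_PER_NON_ISI_TITLE)
isi_article_share = result["isi_articles"] / result["total_articles"]

for name, value in result.items():
    print(f"{name}: {value:,}")
print(f"ISI share of articles: {isi_article_share:.1%}")
```

The ISI share of articles that falls out of this calculation is close to the 70% reported by Björk et al., which serves as a quick consistency check on the inputs.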

1,061,820 + 442,780 = 1,504,600.

Summing these, the global output of articles in 2009 is 1,504,600. This data is shown in Figure 2. The total number of articles estimated for 2009 using the 3% growth curve and produced in the Excel table is 1,477,382. The difference between the estimate by the method employed by Björk et al. and that produced by the growth curve is less than 2%.

Figure 2. Estimated annual global research article output at 3% annual growth. *Year 1985 (2009 - 25 years): the doubling time for annual output for articles of just under 24 years; **1999: corresponds to estimates by Tenopir and King [6] for research output in the late 1990s, 1 million articles per year; ***2006: corresponds very closely to Björk et al.'s [2] estimate for 2006, 1.35 million articles; ****2007: corresponds closely to Ware's [4] estimate for the same period, 1.4 million articles per year.

Discussion and conclusion

The estimate of the global total of scholarly articles that exist is clearly a ballpark figure, rather than a precise number. However, the study of the size, growth, and composition of a global body of scholarship has moved forward in this decade. We can better determine global annual output of scholarship through our understanding of (i) the relationship between numbers of researchers, journals, and articles; (ii) the year-to-year growth rates for the number of active titles and the number of published articles; (iii) the relationships between ISI and non-ISI journal titles; and (iv) the improvements to Ulrich's classification system. However, further investigation is needed to test the robustness of each of the relationships and indeed the comprehensiveness of Ulrich's database. 50 million peer-reviewed journal articles is an impressive heritage, and a powerful resource for humanity.
In order to manage such a resource in a way that is equitable, useful and sustainable, we would do well to take ongoing interest in where we stand in terms of the access, digitization, search and indexation, and preservation of this global library of knowledge.

Appendix

The calculated data for annual and cumulative article totals are available online: http://dx.doi.org/10.1087/20100309

Acknowledgements

The author would like to acknowledge the assistance of Azim Jinha with the calculations, and the advice and editorial help of Robin Beecroft (Searchligher) and Moustapha Diack.

References

1. Brown, H. 1972. History and the learned journal. Journal of the History of Ideas, 33: 365-378. http://www.jstor.org/stable/2709041
2. Björk, B., Roos, A. and Lauri, M. 2008. Global annual volume of peer reviewed scholarly articles and the share available via different open access options. Proceedings of the ELPUB2008 Conference on Electronic Publishing, Toronto, Canada, June 2008. http://oacs.shh.fi/publications/elpub-2008.pdf
3. Mabe, M. 2003. The growth and number of journals. Serials, 16: 191-197.
4. Ware, M. 2006. Scientific Publishing in Transition: An Overview of Current Developments. Bristol, Mark Ware Consulting.
5. Mabe, M. and Amin, M. 2001. Growth dynamics of scholarly and scientific journals. Scientometrics, 51: 147-162. http://dx.doi.org/10.1023/a:1010520913124
6. Tenopir, C.W. and King, D.W. 2000. Towards Electronic Journals. Washington DC, Special Libraries Association.
7. Tenopir, C.W. and King, D.W. 2009. The growth of journals publishing. In Cope, B. and Phillips, A. (eds), The Future of the Academic Journal. Chandos Publishing/Woodhead Publishing Ltd. ISBN 1 84334 416 5.
8. Ulrich's Periodicals Directory (Ulrichsweb.com). Ulrich's has been a global source of periodicals information since 1932.
9. American Scientist Open Access Forum. 2009 Archives. See discussions with subject line "Number of Scholarly Journals in the World". http://listserver.sigmaxi.org/sc/wa.exe?a1=ind09&l=american-scientist-open-access-forum&f=l
10. Morris, S. 2007. Mapping the journal publishing landscape, how much do we know? Learned Publishing, 20(4): 299-310. http://dx.doi.org/10.1087/095315107x239654

Arif Jinha
179 Daly St, Apt. O
Ottawa, ON, Canada K1N 6E8
Email: arif@stratongina.net
Website: www.stratongina.net