More than 12 million books published in the US since 1923: Something to consider in Google Book Settlement debate?
By Paul Biba
More than 12 million books have rolled off the presses in the U.S. since January 1, 1923, a landmark year in the world of copyright, especially amid the controversy over the proposed Google Books settlement. Check out Lorcan Dempsey’s Weblog for more details, including the definition of “book.” An excerpt:
The proposed Google Books settlement has created a strong interest in quantifying publications and authors, to get a better sense of the scale of impact. We have been looking at Worldcat and hope to publish an analysis later this year.
Here is an issue that came up this week: how many print books were published in the US since 1923, and how many authors were associated with those books? Here are some numbers, acknowledging that they provide good indications based on the data we have and what we can do with it, not definitive answers.
Print books published in the US in 1923 or later: 12,582,962
Unique personal authors: 3,685,778
Unique corporate authors: 977,679
…Here is the ranked list of the personal authors by number of manifestations published in the US after 1923.
Shakespeare, William 1564 1616
Marsh, Carole
Twain, Mark 1835 1910
Rudman, Jack
Dickens, Charles 1812 1870
Jackson, Ronald vern
Bloom, Harold
Christie, Agatha 1890 1976
Stevenson, Robert Louis 1850 1894
Cowley, Joy
An interesting list; I have remarked on the Bloom phenomenon before.
Here is the ranked list of corporate authors:
society of automotive engineers
american national standards institute
national business institute
national learning corporation
foreign technology div wright patterson afb ohio
national bureau of economic research
sothebys firm
sotheby parke bernet inc
electric power research institute
naval postgraduate school monterey ca
It will be seen from the list of corporate authors that our working definition pulls in standards and art catalogs. Remember that we are not counting theses and government documents. This is a reminder that although we may have a common-sense notion of a ‘book’ based on an academic or trade publication, it actually requires some discretionary interpretation to bound the population of books in an operational way for this type of analysis.
And a final reminder: these lists are based on print books published in the US since 1923, not on an analysis of the whole of Worldcat.
The actual analysis was done by my colleagues Jenny Toves and Brian Lavoie.

August 17th, 2009 at 4:13 pm
I think it would be interesting to know how many of the fiction books published since 1923 are available at ANY price or in any form. Publishers are unwilling to print small runs of some of this lost literature, and the straitjacket of copyright persists even after the author has died and can produce no additional works. Both make me glad that there is SOMEONE willing to try to preserve, in electronic format, a portion of what we are losing every year due to neglect, greed, and poorly considered legislation pushed through Congress by corporate lobbyists, who do so without regard to the bigger picture and the impact they have on all of us who love to read.