Potential weeding candidates among 19th and early 20th century books

In a previous blog post, we looked at the characteristics of 19th century and early 20th century books, those published between 1800 and 1923. The inquiry was inspired by our discussions with Professor Andrew Stauffer, whose work with the Book Traces project focuses on this population of titles. From this project's perspective, the concern is how many of these titles might get weeded before there is a chance to review them for marginalia and other traces of use.

Among those titles encountered in SCS projects, the median U.S. holdings level is 18 holdings; the average U.S. holdings level is 33 holdings. Our sample included 658,224 unique titles published between 1800 and 1923, drawn from a pool of 1,764,448 title holdings.

In response to this blog post a follow-up question was posed: what percentage of these titles might be considered withdrawal candidates in a typical SCS project?

In a typical weeding project, two primary factors are looked at: 1) Has the title circulated? and 2) Is it widely held? In other words, is there a good likelihood we can get this title if it is needed at a future date? Other factors come into the mix, but to keep things simple we will focus on these two attributes.

When we looked at the circulation rate for each unique or distinct title, we found that 64% had one or fewer charges on average; 52% had zero charges on average. As for how many are commonly held, we see 20% are held by more than 50 U.S. libraries while 7% are held by more than 100 U.S. libraries.

If we combine these factors, we see that the percentage of titles that could be considered withdrawal candidates ranges from 1% on the conservative end to 12% when more liberal withdrawal criteria are used.

                      Avg Circs = 0    Avg Circs <= 1
> 100 US Holdings          1%               4%
> 50 US Holdings           4%              12%

n = 658,224 unique titles published between 1800 and 1923
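
To make the combination of the two factors concrete, here is a minimal Python sketch of how a title-level percentage like those in the table could be computed. The record layout, field names, and sample values are all hypothetical and invented for illustration; they are not SCS data, and this is not necessarily how SCS runs its analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    # Hypothetical record layout; field names are illustrative only.
    avg_circs: float   # average charges for this title across client libraries
    us_holdings: int   # number of U.S. libraries holding the title


def candidate_share(titles, min_holdings, max_avg_circs):
    """Fraction of titles held by MORE than min_holdings U.S. libraries
    whose average circulation is AT MOST max_avg_circs."""
    if not titles:
        return 0.0
    hits = sum(
        1
        for t in titles
        if t.us_holdings > min_holdings and t.avg_circs <= max_avg_circs
    )
    return hits / len(titles)


# Toy data, invented for illustration (not drawn from the sample above).
sample = [
    Title(avg_circs=0.0, us_holdings=120),
    Title(avg_circs=0.5, us_holdings=75),
    Title(avg_circs=0.0, us_holdings=18),
    Title(avg_circs=3.0, us_holdings=200),
    Title(avg_circs=1.0, us_holdings=51),
]

# One cell of the grid per (holdings threshold, circulation threshold) pair.
for min_holdings in (100, 50):
    for max_circs in (0, 1):
        share = candidate_share(sample, min_holdings, max_circs)
        print(f"> {min_holdings} US holdings, avg circs <= {max_circs}: {share:.0%}")
```

Raising the holdings threshold or lowering the circulation threshold can only shrink the candidate pool, which is why the conservative corner of the grid shows the smallest percentage.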

These are the percentages if we look at each distinct title on its own. If, however, we look at all the holdings in these client libraries, the percentages are higher, as they reflect the fact that there are more copies of widely held titles in the mix. Here the percentage of holdings that could be considered withdrawal candidates ranges from 18% on the conservative end to 39% on the aggressive end.

                      Circs = 0    Circs <= 1
> 100 US Holdings        18%          24%
> 50 US Holdings         30%          39%

n = 1,764,448 holdings of titles published between 1800 and 1923
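
The gap between the title-level and holdings-level percentages can be illustrated with a small Python sketch. It assumes, purely for illustration, that each holding is judged by its own charge count while each title is judged by the average across its holdings; the tuple layout and the toy values are hypothetical and are not SCS data.

```python
from collections import defaultdict

# Each tuple is one client-library holding: (title_id, charges, us_holdings).
# Toy data, invented for illustration.
holdings = [
    ("A", 0, 120), ("A", 0, 120), ("A", 1, 120), ("A", 0, 120),
    ("B", 1, 75), ("B", 0, 75),
    ("C", 0, 18),
    ("D", 4, 200),
    ("E", 2, 9),
]


def holding_share(holdings, min_holdings, max_circs):
    """Fraction of individual holdings that meet both criteria."""
    if not holdings:
        return 0.0
    hits = sum(
        1
        for _, circs, held in holdings
        if held > min_holdings and circs <= max_circs
    )
    return hits / len(holdings)


def title_share(holdings, min_holdings, max_avg_circs):
    """Fraction of distinct titles that meet both criteria,
    using the average charges across each title's holdings."""
    circs_by_title = defaultdict(list)
    held_by_title = {}
    for title_id, circs, held in holdings:
        circs_by_title[title_id].append(circs)
        held_by_title[title_id] = held
    if not circs_by_title:
        return 0.0
    hits = sum(
        1
        for title_id, circs in circs_by_title.items()
        if held_by_title[title_id] > min_holdings
        and sum(circs) / len(circs) <= max_avg_circs
    )
    return hits / len(circs_by_title)


print(f"titles:   {title_share(holdings, 50, 1):.0%}")
print(f"holdings: {holding_share(holdings, 50, 1):.0%}")
```

In this toy data the widely held titles A and B account for six of the nine holdings but only two of the five titles, so the holdings-level share comes out higher than the title-level share, which is the same effect described above.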

In both cases, we're talking about titles at risk of being weeded.  In the first case, we're quantifying the percentage of unique titles that might be weeded;  in the second we're quantifying the percentage of holdings that might be weeded.   

The question remains as to whether libraries will treat these 19th and early 20th century titles any differently than newer titles when it comes to deselection. For those who don't have ready access to WorldCat holding counts, I expect there would be a presumption that these titles are scarcely held and thus a reluctance to weed them. Also, to the extent that these titles are held in special collections, they are unlikely to be weeded. Finally, although 44% of the titles in our sample are freely available in the Hathi Trust Digital Library, our clients seem reluctant to use this fact to inform their weeding efforts.

A closer examination of 19th and early 20th century books

During the ALA conference this year we saw an interesting presentation about the Book Traces project, which focuses on capturing artifactual usage details from 19th century and early 20th century books. Andrew Stauffer from the University of Virginia demonstrated compelling examples of notes and inscriptions found in library books held by the university, examples which shed light on the significance of these books to those who used them.

Professor Stauffer has encouraged crowdsourced contributions to the project web site in hopes of preserving these details before they are lost. With space at a premium in campus libraries, and with weeding efforts targeting low-use titles for discard or transfer to storage, such an effort seems timely indeed. Among the questions this effort raises is that of scarcity. How widely held are these titles?

From the standpoint of the Book Traces project, titles that are flagged as widely held are more likely to be found on a weeding list and thus more at risk of not being examined for useful marginalia. This assumes that libraries are using WorldCat holdings levels to inform their weeding efforts, something that we, of course, encourage. Our clients typically regard a title as widely held when it has at least 50 U.S. holdings, though often this threshold is set at 100 holdings or higher.

To satisfy our curiosity on this matter we looked at data from 120 client projects run over the past two years.   All but 2 of them were academic libraries, ranging from community colleges up to ARL libraries, with the majority being mid-sized institutions.  Within this sample population of libraries we gathered information on 658,224 unique titles published between 1800 and 1923. Here is a graph showing the distribution of these titles by how widely held they are in the U.S. 

Nineteen percent of these titles are held by more than 50 U.S. libraries;  seven percent are held by more than 100 U.S. libraries.    The median value is 18 U.S. holdings.

This holdings level varies by decade of publication, with earlier works being less commonly held.  The following set of boxplots shows the distribution of these titles by holdings level per decade.  

The top and bottom of the blue boxes represent the 3rd and 1st quartiles respectively while the red line represents the median holdings level.  The width of the blue bars is proportional to the number of titles for that decade.  Note that the 1920s are a partial decade (1920-23).
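
For readers who want to reproduce the summary statistics behind such boxplots, here is a minimal Python sketch that groups titles by decade of publication and computes the quartiles of their holdings levels. The input layout and the toy values are hypothetical, and the standard library's default quantile method is used here only as an assumption; the method behind the original chart is not stated.

```python
from collections import defaultdict
from statistics import quantiles


def quartiles_by_decade(titles):
    """Map each decade to (1st quartile, median, 3rd quartile) of U.S. holdings.

    `titles` is an iterable of (publication_year, us_holdings) pairs.
    Decades with fewer than two titles are skipped, since quartiles
    are not meaningful for a single value.
    """
    holdings_by_decade = defaultdict(list)
    for year, us_holdings in titles:
        decade = year // 10 * 10   # e.g. 1847 -> 1840
        holdings_by_decade[decade].append(us_holdings)

    summary = {}
    for decade, values in sorted(holdings_by_decade.items()):
        if len(values) < 2:
            continue
        q1, median, q3 = quantiles(values, n=4)
        summary[decade] = (q1, median, q3)
    return summary


# Toy data, invented for illustration.
toy_titles = [
    (1805, 3), (1808, 5), (1801, 9), (1809, 12),
    (1921, 40), (1923, 60), (1920, 20),
]

for decade, (q1, median, q3) in quartiles_by_decade(toy_titles).items():
    print(f"{decade}s: Q1={q1}, median={median}, Q3={q3}")
```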

Also notable about these 19th and early 20th century titles is the fact that 44% of them are digitized as public domain titles and made available in the Hathi Trust Digital Library. As a secure, accessible digital surrogate, a Hathi Trust version can signal to a library that it is safe to deaccession that title. In practice, however, our clients rarely use this criterion to identify safe withdrawal candidates, preferring instead to rely on physical copies nearby or held by consortial partners.

The chart below shows the number and breakdown of these titles by decade of publication and Hathi Trust status.

A follow-up blog post discusses a follow-up question: What percentage of these titles would be weeding candidates in a typical SCS project?

Talking With Faculty About Library Collections (Revisited)

In the course of our work at SCS, we regularly visit campuses to talk with teaching faculty about “Rethinking Library Resources: The Role of Print Collections in a Digital Age.” I wrote about one such session more than a year ago, and have subsequently done another half-dozen. Listening to faculty views on the use and future of print book collections is invariably interesting, and vital to our thinking and actions. Not surprisingly, these discussions about the changing value of local print collections reflect a much broader dialogue about the changing nature of higher education.

Well-attended faculty session at US Naval Academy

In recent conversations at Trinity University in San Antonio and the US Naval Academy, there was strong representation from the Humanities disciplines and robust exchanges about browsing, serendipity, the limits of data, changing student behavior, and how the library is valued on campus.

In these sessions, SCS presents profession-wide data and trends, and makes the case for engaging in deeper analysis of print book collections, and for considering the full range of deselection decisions: retention, preservation, storage, sharing, and withdrawal. For these two libraries, we had already completed a preliminary analysis of their respective collections. We were able to talk specifically about circulation rates, subject dispersion, holdings among designated peers, their collection’s match rate against Hathi Trust, and other factors. We were also able to frame each library’s collection characteristics in relation to our SCS Monographs Index, which profiles aggregate and average collection attributes across all the projects SCS has completed to date.

A few themes emerged which are worth highlighting. These observations are supplemented by insights from librarians at Connecticut College and Wesleyan University, as presented in early November at the Charleston Library Conference.

  • The limits of data: Most libraries have reasonably good historical data on circulation. On average, we see 15 years’ worth of total checkouts, a significant subset of which will also include the last checkout date. Some libraries also record in-house use via re-shelving counts. Depending on how it is tracked, reserve use might also be captured. Together, these provide the best available picture of collection use. But there is a strong conviction among faculty that these measures under-count actual use. They argue that circulation data is not comprehensive. They believe that in-stacks use is much higher than browsing statistics reflect. Faculty often re-shelve the books they use, and assume others do the same. Some assert the value of “negative use”, in which titles they reject or bypass in browsing help lead them to what they actually use. And, no matter how good the use data, many faculty members believe that use is not a legitimate indicator of a title’s value. Every book has intrinsic value, irrespective of use.
  • Whose use?  Like all of us, faculty tend to view things through the prism of their own experience. They think about how they now use the collection, and more importantly, how they used the collection when completing their Ph.D.s. To some extent, they project that experience onto other users. In this view, undergraduates should be working in the stacks and using print books. But at the same time, undergraduate use (or non-use) of the collection is viewed as an unreliable indicator; “we should not be basing collections decisions on the behavior of 18-year-olds.” Some faculty have argued for weighted usage statistics, in which use by a graduate student or a faculty member counts more than use by an undergraduate. Here again, there are some valid points, and perhaps an argument for incorporating ‘patron type’ into the data, something that is not typically done.
  • The role of browsing: Again, people are often recalling their own research experiences: how they used the arrangement of books in the stacks to get an overview of a discipline. There are many stories about serendipity, and how browsing can broaden or focus an inquiry. These are certainly legitimate points, but always seem to loom disproportionately large. Physical browsing has always been a partial research strategy. It is limited to the books held (and not currently checked out) in a given library. Subject collections can be dispersed across multiple buildings, floors, or classification schemes. Books are only one format—it is still necessary to look at journals, e-resources, government documents. And a modest weeding or shared print project does not eliminate the stacks – browsing is still possible. I have taken these issues up in more detail in entries on “Browsing Now” and “Browsing Now (2)”, and “Virtual Browsing.”
  • Getting students into the stacks: Among Humanities faculty in particular, there remains a strong desire to assure that students get into the stacks and experience the riches of the print collection. Faculty acknowledge that this is an uphill battle, but continue to exhort and sometimes design assignments that require use of print books. They believe that students will produce better work if they are required to push beyond the convenience of electronic resources. Underlying this is a sense that academic standards are slipping, that student work is less substantive and nuanced than it should be, and that print books compel focus and reflection in a way that online resources do not.
  • Our undergraduates are different: There are many variations on this theme, depending on the identity of the institution. But every session includes some mention that students on this campus use more print than average. The reasons differ: they are high achievers, they are confined to campus and cannot visit other libraries, they want to use print even for last-minute work. This is impossible for an outsider to judge, but is always anecdotal, and often seen very differently by the librarians. And these behaviors are rarely put into context: what proportion of users walk through the library's front door, as compared to the proportion that enters through its many virtual front doors?
  • What users want or what is good for them?:  There is a real philosophical divide here. Should the library (and the faculty) give students what they want or what we think they need? Many faculty seek to guide or encourage students toward more thorough, reflective work--and that is often construed as toward print books. There is often a sense that we are making it too easy for students, allowing them to bypass the richness of our print book collections in favor of the convenience of Google or online resources. We need to force them to dig deeper, work harder -- not just give them what they want. Student work and learning will suffer otherwise.
  • Digital surrogates are not sufficient: Not surprisingly, many faculty members are unfamiliar with Hathi Trust, and its crucial role in securing the scholarly record. Once they learn about it, there is a tendency to construe it as an access path, rather than a preservation solution. One factor comes through loud and clear: any book-length digital surrogate, no matter how secure or accessible, pales in comparison to having a print copy nearby. There is a fertile discussion here about the limits of onscreen reading, enhancement of discoverability, and copyright distinctions.
  • Concern about the library:  Faculty often feel and act protectively toward the library. Transformation of stacks space into collaborative study areas, information commons, or teaching/learning centers is sometimes met with skepticism: what does that have to do with the “real” library function of connecting users and resources? There is often a suspicion that any space freed in the library may be redirected to non-library uses: admissions, development, welcome center, etc. On the positive side, this indicates the value they place on the library.  
  • Being informed, being heard:  Like any constituency, faculty are susceptible to sensing that things are happening that they are not privy to. They want more information and context on library (and campus) direction. They want to register their concerns and express their support. Sometimes they want to lament the pace of change and the perceived erosion of standards. They want to assert the value of traditional approaches to teaching and learning, and they want to hear and discuss the future of the library. These are difficult issues and difficult choices. It's important to take these questions seriously, and to spend the time necessary to listen and respond, even to the extent of one-on-one meetings with those with the strongest opinions. It won't always be possible to persuade or to accommodate, but it is important to have the dialogue.

Listening to Faculty: the Presenter's View

What's confusing is the mixture of rational and emotional elements in this discussion. All of us who work in this area have some uneasiness about the changes we're trying to manage. At some level, the debate around print collections is part of a conflict of values that is being played out at all levels of academic institutions: how to weigh the needs/demands of current users with the traditions and values of the academy; how to make the difficult choices around cost of and participation in services; how education should work versus how it is actually working. Teaching, learning, and scholarship are all being rocked by changes in technology, pedagogy, and publishing/access models. It’s not surprising that these fault lines also run through the print collection.