Accessing the collection of a large public library: an analysis of OPAC use

Despite widespread use of Internet search engines, the online catalogue is still the main pathway to the collection of a particular library. The use of Internet search engines does, however, have implications for user expectations around the online catalogue, and search strategies when using the online catalogue. There is much research on online catalogue use that predates search engine use, and there is a need for more up-to-date research, particularly on the use of online catalogues in public libraries. This paper reports on an analysis of transaction logs of end users of the online catalogue of a large public library in Australia, the State Library of Victoria. It compares searches over four years, taking into account the search settings and search strategies and looking at search success, including the reasons for search failure. The paper also introduces the concept of abandonment rates to online catalogue search, defining a metric that adds to the useful information that can be determined from transaction logs. The paper uses the findings as the basis for its concluding recommendations for how public library users can be assisted to find what they are looking for on the library catalogue.

Search engines such as Google enable users to find highly relevant web sites by simply typing a few relevant words into a text box. Research conducted since the rise of search engine use suggests that people use the principle of least effort in their information seeking (Bates, 2003), favouring convenience (Tenopir, 2003) and ease of use (OCLC, 2005) when choosing amongst electronic information sources. The OPAC is convenient in that it is possible for users with a connection to the Internet to access the catalogue at any time, from any place. Even so, much research shows that people start their information searches with a search engine, and in particular with Google (see Markey, 2007 for a summary). Although there are many search engines available, Google is by far the most used. In 2008, Google's market share was almost 90% in the UK (Hitwise UK, 2008) and Australia (Hitwise Australia, 2008), 80% in Europe (Comscore, 2008) and more than 70% in the United States ("Google nears 72 percent of U.S searches in November 2008", 2008). Users of search engines include searchers who are looking for information on the Internet as well as searchers wanting to find information contained in books. As of May 2008, a user who finds an item through Google Book Search can also link directly to the catalogue record of the closest Online Computer Library Center (OCLC) member libraries that have that item in their collection (OCLC, 2008). Furthermore, with many libraries making their collection metadata progressively available to Internet search engines, links to a library catalogue or collection can now appear in search engine results. This means that the user of a search engine can arrive directly at a library's digitized materials, without consciously using the catalogue.
However, there are searchers who want to access the collection of a particular library; they may want to see what the library holds on a particular topic, or they may want to obtain a physical copy of a specific book. Unless these searchers are in the library and want to browse the shelves (assuming the collection is open access), they will have to use the library catalogue. In other words, despite the predominance of search engine use, the library catalogue is still the most convenient way to access the collection of a particular library. While there are predictions that Web 2.0 technologies will render the library catalogue irrelevant (Weinberger, 2007), or at least indistinguishable from a search engine (Calhoun, 2006), there are also arguments that search engine or metasearch results will never be an adequate substitute for a structured catalogue (Mann, 2007). Whatever eventuates, the library catalogue is likely to be around for at least a few decades (Calhoun, 2006).
The rise of search engines like Google does, however, have implications for online catalogue use. Students find using Google easier than using a library catalogue (Marcum, 2006) and experience in using Google has an effect on a searcher's expectations of using the library catalogue (Novotny, 2004). It has been found that online catalogue end users have higher expectations about the relevance of catalogue search results and the display of results given their experiences with sites like Google and Amazon (Calhoun, Cantrell, Gallagher, & Hawk, 2009). It is also possible that Google might be affecting search strategies. Internet search engine users tend to enter only a single search term (Spink, Wolfram, Jansen, & Saracevic, 2001); almost two-thirds (62%) of search engine users do not go past the first page of results and less than 10% go past the first three pages of results (iProspect, 2006).

Research on OPAC use
In the first years of the OPAC, librarians were the main users. Since the 1980s, however, end-user searching has become more prevalent (Hsieh-Yee, 1993). One of the consistent findings of the literature on online catalogue use is that, despite improvements in user interfaces and the display of results, a high proportion of end users receive zero hits for their searches. Failure rates reported in the literature tend to hover either side of 50%. According to Bates, "few exact match rates top 50 percent, and many are lower. Zero match cases are high" (2003, p. 14). For example, in a visual analysis of transaction logs from a university library, Peter (1989) identified failure rates of around 40%. In a study of a university catalogue, Hunter (1991) identified failure rates of 54%. Nordlie (1999) identified failure rates of 45% and, more recently, in analysing one semester's worth of OPAC transaction logs from a university library, Lau and Goh (2006) identified failure rates of 49.5%.
Bates explains the high failure rates from the searcher's perspective: "The typical library catalog functions as a black box for the searcher. That is, the searcher has to produce a search phrase with no direct help from the system. The phrase is entered, then the delphic system responds with a match or a failure, seldom with any guidance on what to search for instead" (2003, p. 16).
In addition, few end users know that the catalogue has a controlled vocabulary. Hence, they formulate a search query in the same way that they might formulate a query for a search engine. Yu and Young (2004) analysed the reasons for zero hits in 2003 and found that using uncontrolled vocabulary in a keyword search accounted for 70% of zero hits. As Large and Beheshti have so diplomatically said, "the arcane nature of the Library of Congress Subject Headings eluded a majority" (1997, p. 111). Many public library users still find catalogues "baffling".
Certain characteristics of the end user have been shown to be relevant to the level of success when searching an OPAC. These include knowledge and experience with library catalogues in general, as well as the one under study (Borgman, 1989). More experienced users are more likely to include synonyms, and manipulate the search terms (Hsieh-Yee, 1993). The time available to search and knowledge of the subject being searched are also relevant (Borgman, 1989), while differences according to age or gender are inconclusive (Tenopir, 2003). In addition, people use a library catalogue differently according to whether their searches are work-related or private (Tenopir, 2003).
As well as the user characteristics and search strategies, other variables that affect the success of catalogue searching are library settings and the system itself (Large & Beheshti, 1997). In terms of library settings and users, most analyses of online catalogue searches tend to be undertaken in academic libraries. For example, all of the major studies and more than 95% of the 200-odd smaller-scale studies reported in Tenopir (2003) were conducted with university staff or students. As Nordlie (1999, p. 12) points out, an important difference between users in academic libraries and public libraries is that users of public libraries are more likely to use the catalogue irregularly, resulting in most users remaining "permanent novices" with regard to the online catalogue.
The ubiquity of search engine use makes it important to have up-to-date information on how end users are using online catalogues. Ironically, however, there seems to be less research on OPACs in recent years because of the focus on search engine use. As with older studies, recent studies also tend to be undertaken in academic libraries (for example, Dinet, Favart, & Passerault, 2004; Lau & Goh, 2006; Moulaison, 2008; Yu & Young, 2004). This paper addresses this gap in the literature by analysing the transaction logs of a large public library, the State Library of Victoria. While a 2007 survey of onsite users of the catalogue under study had concluded that this catalogue was an effective tool for accessing the collection (Hider, 2008), transaction log analyses measure what the end user actually does and are a useful complement to what end users say they do (Calhoun et al., 2009). While Hamilton and Thurlow (2005) report on an analysis of a large public library, the State Library of Queensland, that study is confined to comparing search type with resulting hits and it is not clear that it excludes the logs of staff at the State Library of Queensland.
The analysis reported on here excludes the searches of staff at the State Library of Victoria and provides a longitudinal perspective by comparing searches over four years. It takes into account the search settings (whether the user is using the catalogue inside the library or remotely), and search strategies (search type, limits placed, session duration and number of searches) to look at search success (failure rates, reasons for failure, abandonment rates).

Methodology
The State Library of Victoria, Australia (SLV or State Library) is the major reference and research library in Victoria, responsible for collecting and preserving Victoria's documentary heritage and making it available through a range of services and programs. Most of the SLV collection is closed access, so most items can only be accessed via the catalogue. The Main catalogue contains records for books, magazines, newspapers, electronic books and journals, websites, video recordings, music, maps and oral history. By 2008, this catalogue had more than 1.3 million items. Until late 2009, the system used by the State Library was a Voyager (Ex Libris) catalogue. It had been in use by the SLV since 2001 and was running Version 6 by May 2008.
In September-November 2008, analysis was undertaken on logs of searches on the Main catalogue of the SLV. Transaction log analysis is a well-established method for studying use of an OPAC (Bates, 2003; Large & Beheshti, 1997). As has been noted elsewhere, transaction logs have advantages over studies undertaken in experimental sessions: for example, they enable the study of entire populations of users, they are unobtrusive, and they do not affect user behaviour. The main limitation of analysing transaction logs is that it is impossible to know whether the user found what they were looking for or was satisfied with their search. Ultimately, interviews and observations of searchers also suffer this limitation, as relevance can only really be judged after the user has tried to use the retrieved document (Large & Beheshti, 1997).
The extracts analysed contained logs of all searches undertaken in August 2005, August 2006, and May 2008 and more than 75% of searches undertaken in May 2007. The data for May 2007 was incomplete as it could not be obtained for the first week of May. However, this is unlikely to affect the results reported in this analysis. A total of 599,097 transaction logs were imported into the computer program, SPSS, for analysis.
The focus of the analysis presented in this paper is on explicit use of the Main catalogue by members of the public. This includes members of the public accessing the catalogue from inside the State Library and those accessing it remotely. Although no demographic data is available, the IP address of the computer used to conduct the search enabled identification of whether the user was an SLV staff member or a member of the public. The transaction logs of SLV staff members were not included in the analysis. Distinctions could also be made between members of the public who were using a public computer in the State Library (slv public terminal), those using their own computer in the State Library (slv public wireless), and those using a computer outside the State Library, for example from home or work or a public library (outside public). All of these users are presented with exactly the same web interface.
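The classification of users by IP address described above can be illustrated with a minimal sketch. The address blocks below are hypothetical placeholders (reserved documentation ranges), as the paper does not report the library's actual network layout; only the four category labels come from the text.

```python
import ipaddress

# Hypothetical address blocks (reserved documentation ranges). The library's
# real network layout is not given in the paper.
NETWORKS = {
    "slv staff": ipaddress.ip_network("192.0.2.0/24"),
    "slv public terminal": ipaddress.ip_network("198.51.100.0/24"),
    "slv public wireless": ipaddress.ip_network("203.0.113.0/24"),
}


def classify_access(ip: str) -> str:
    """Return the access category for the IP address recorded in a log row."""
    address = ipaddress.ip_address(ip)
    for label, network in NETWORKS.items():
        if address in network:
            return label
    # Any address outside the library's own blocks counts as remote access.
    return "outside public"


def is_public(ip: str) -> bool:
    """Staff searches are excluded from the analysis; all others are kept."""
    return classify_access(ip) != "slv staff"
```

Under these assumptions, filtering the log with is_public before any further analysis would reproduce the exclusion of staff searches.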

Types of search undertaken
There are two distinctly different ways of searching the Main catalogue. One can type in a term, or click on a hyperlink in a catalogue record. The option of clicking in a catalogue record is only available once a catalogue record has been retrieved as the result of an initial typed-in search. As shown in Figure 1, when typing a search term into the Main catalogue, the user can choose from ten user options, including:
• Keyword Anywhere (Relevance Ranked)
• Title List (Omit initial article - the, a, an)
• Author's Name (Last name first)
• Subject List
Figure 2 is a snapshot of part of the record for a book called "24/7" retrieved using the search option of Author Browse and typing in the query "Hassan, R". Once this record is retrieved, the user can perform the second type of catalogue search by clicking on the hyperlinks in the results. This snapshot is annotated to illustrate how the search log will record clicks by a user on various parts of this record. For example:
• Clicking on "Stanford business books" or "Twenty-four seven" will be logged as a Left Anchored Title search.
• Clicking on any of the listed Subjects will be logged as a Subject Browse search.
• Clicking on Other authors will be logged as an "Author" search.
• Clicking on the Call Number List will be logged as a Call Number Browse search.
While the first search undertaken in a session is always typed in by the user, the log does not always allow precise identification of which subsequent searches in a session are the results of requests typed in by the user and which are the result of clicking in the resulting record. Some types of search must be typed, whereas the search type logged as "Author" is always the result of clicking inside a retrieved record. Other types of searches can be either typed in or clicked.
The first search undertaken in a session (hereafter referred to as first search) is always typed in. This is reported on separately to give a better indication of the kinds of search keyed in by members of the public using the State Library catalogue. Doing this helps avoid double-counting of what is essentially part of the one search. Hence, this analysis is mainly presented in terms of searchers rather than searches.

Limits set
When searching the catalogue, it is possible to place limits on the types of results returned, except when doing a browse type search (author browse, call number browse, or subject browse). The results can be limited by language (hundreds to choose from), location of collection (for example, browsing collection, children's collection), date, place of publication (hundreds of countries to choose from) and type of item (for example, book, serial, or manuscript).

Session Duration
Each search is time-stamped and each session has a unique id. This makes it possible to calculate the duration of a session and the number of searches in a session, although the measure is not always valid. It is possible that in some cases, what is the one session for the user will be counted as more than one session in the log files. This is because after eight minutes of idleness the system warns the user that the session is about to end, unless some user action is undertaken. It is also possible that consecutive users at a public terminal in the library can be counted as the one session if the first user does not close the screen and the next user arrives within eight minutes.

Failure rates
Tonta (1992) identifies three distinct types of search failure: retrieving non-relevant documents (precision failures), failing to retrieve relevant documents (recall failures) and retrieving too many unpromising, non-relevant documents (fallout failures). The measure of failure rates used in this paper is a measure of recall failure, that is, the proportion of searches (excluding browse searches) that got zero hits. Browse searches were excluded from the calculation of failure rates as the system records all browse searches as receiving "-1" hits. Using zero hits as a measure of failure is common in the literature (Large & Beheshti, 1997; Moulaison, 2008) but it does exclude precision failures and fallout failures. Moreover, as Tonta (1992) points out, receiving zero hits is not necessarily a failure; for example, a researcher may be delighted to find that there is nothing published in their intended research area. A further limitation of this measure of failure rates is that it does not indicate why the search failed. It could be to do with the way that the search query was formulated or it could be because there is nothing in the collection that is relevant.
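The zero-hit failure rate described above can be sketched as follows. This is an illustration only: the analysis reported here was carried out in SPSS, and the function name and the plain list of hit counts are assumptions made for the example. The one detail taken from the text is that browse searches are logged with "-1" hits and are left out of the calculation.

```python
BROWSE_HITS = -1  # the system logs every browse search as receiving -1 hits


def failure_rate(hit_counts):
    """Proportion of non-browse searches that returned zero hits.

    hit_counts is a hypothetical list with one logged hit count per search.
    Returns None when there are no non-browse searches to count.
    """
    counted = [hits for hits in hit_counts if hits != BROWSE_HITS]
    if not counted:
        return None
    failures = sum(1 for hits in counted if hits == 0)
    return failures / len(counted)
```

For instance, a log containing the hit counts 0, 5, -1, 0 and 12 has four countable searches, two of which failed, giving a failure rate of 0.5.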
In order to find out more about why a search failed, an analysis was undertaken on the reasons for failure. This drew from Connaway, Budd & Kochtanek (1995) and was based on a random sample (stratified by search type) of 729 main catalogue search queries that received zero hits. This constituted approximately one percent of all such queries. Spelling mistakes were identified and where possible, improper terms or syntax.
It was easy to identify all those title searches (Left Anchored Title) which failed because they incorrectly included an initial article. However, it was not possible to identify all of those title searches that failed because the title was typed incorrectly. For example, the title search for "CONSEQUENCE OF MODERNITY" failed. It is very likely that the person who typed in this query was searching for "The consequences of modernity", a well-known book by Anthony Giddens. The catalogue user correctly omitted the initial article, "the", but made an error in the rest of the title and hence got zero hits. Similarly, the title search for "RETHINKING PUBLIC SPHERE" was probably a search for "Rethinking the public sphere" by Nancy Fraser. This sort of mistake is easy to identify when the book is well known to the author of this article, but the occurrence of such errors cannot be quantified. Hence, the reported results on improper terms or syntax underestimate the true number of searches that were incorrectly constructed because, without knowing what the searcher was after, it is not always possible to identify this sort of error.
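The stratified random sample of zero-hit queries described above could be drawn along the following lines. This is a sketch under stated assumptions rather than the procedure actually used: the input format, the fixed seed and the rule of taking at least one query per search type are all choices made for the illustration.

```python
import random
from collections import defaultdict


def stratified_sample(queries, fraction, seed=0):
    """Sample roughly `fraction` of the queries within each search type.

    queries is a hypothetical list of (search_type, query_text) pairs for
    searches that returned zero hits.
    """
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    by_type = defaultdict(list)
    for search_type, text in queries:
        by_type[search_type].append(text)

    sample = {}
    for search_type, texts in by_type.items():
        # Assumption: keep at least one query from every stratum.
        size = max(1, round(len(texts) * fraction))
        sample[search_type] = rng.sample(texts, size)
    return sample
```

Sampling within each search type keeps the less common search types represented in the failure analysis.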

Abandonment rates
Abandonment rates are commonly used as a metric to measure call centre effectiveness (for example, Cheong, Kim & So, 2008; Morales, 2008). More recently they have been applied to the field of e-commerce, measuring the proportion of would-be buyers who exit from an e-commerce check-out process before completing the transaction (see https://www.google.com/analytics/). They are an attempt to measure the proportion of system users who "give up" before successfully completing their intended activity on the system.
The concept of abandonment rates can be applied to online catalogue use. A user who received no hits for any searches and terminated the session before getting any hits can be considered to have abandoned the search. Two measures were calculated: abandonment after one search with zero hits and abandonment after two searches with zero hits. Of course, there would also have been those who abandoned their search because they did not find the particular results to be useful, or were overwhelmed by too many hits, but these could not be identified from the data. Conversely, as already mentioned, it is possible that zero hits signified a successful search.
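A minimal sketch of the two abandonment measures follows. The denominator is an assumption, since the definition above does not state it explicitly: the rate is taken here among sessions whose first n searches all returned zero hits, and a session counts as abandoned if it ended at exactly that point. The function name and input format are likewise hypothetical.

```python
def abandonment_rate(sessions, n):
    """Share of sessions abandoned after n consecutive zero-hit searches.

    sessions is a hypothetical list of sessions, each an ordered list of
    hit counts. Browse searches (logged as -1) do not count as zero hits.
    Returns None if no session started with n zero-hit searches.
    """
    at_risk = [
        s for s in sessions
        if len(s) >= n and all(hits == 0 for hits in s[:n])
    ]
    if not at_risk:
        return None
    abandoned = [s for s in at_risk if len(s) == n]
    return len(abandoned) / len(at_risk)


# Five illustrative sessions; the values are invented for the example.
example_sessions = [[0], [0, 3], [0, 0], [0, 0, 4], [5]]
after_one = abandonment_rate(example_sessions, 1)
after_two = abandonment_rate(example_sessions, 2)
```

In the example, four sessions begin with a zero-hit search and one of them stops there; two sessions begin with two zero-hit searches and one of them stops there.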

Findings
The analysis provides a longitudinal perspective by comparing searches over four years. It reports on both search strategies and search success, taking into account whether the user is using the catalogue inside the library or remotely. The search strategies reported on are search type, limits placed, session duration and number of searches. Search success is analysed in terms of failure rates, reasons for failure, and abandonment rates. Statistical significance is not reported in the following analysis as the cell sizes were too large for tests of significance to yield meaningful results.
Use of the catalogue increased steadily between 2005 and 2008. While most of the people who accessed the catalogue in the State Library building were using a public terminal, use of the catalogue at public terminals has fallen off slightly since 2007. Wireless access was introduced to the library at the end of July 2006 and its use to access the catalogue has increased only slightly. In contrast, remote access of the catalogue has increased markedly, possibly as more members of the public become aware that they can access the catalogue remotely.

Type of search undertaken
More than 80% of searches were typed in by the user with the remainder enacted through clicking on the results. The type of search first undertaken by the user is shown according to catalogue access location in Table 1. Just under half of all users used the default search type (Keyword Anywhere) for their first search; almost one in five commenced with a Title search and one in seven commenced with an Author Browse search. Reporting on all searches rather than first searches would have underestimated the proportion of Keyword Anywhere searches and overestimated the proportion of Subject Browse searches. This suggests that most Subject Browse searches are the result of clicking on headings in a search result rather than being typed in.
While the distribution of search types was stable across the years 2005-2008, the data in Table 1 show that the type of first search varied among the different catalogue access locations. Those within the State Library were much more likely to start their session with a Keyword Anywhere search (53.7% of users within the library compared to 42.6% of those outside the library). Those using a public terminal were much less likely to start their search with a Left Anchored Title search (13.0% of those using a public terminal compared to 20.4% of those using the wireless connection and 20.6% of those outside the library).

Limits set
The extent to which the limit facility was used is shown in Table 2. It can be seen that the limit facility was used for only 5.8% of all searches for which limits could be set. (The figures in Table 2 exclude journal title searches, as these include an automatic limit on item type.)
Date was the most commonly used limit, used for 3.7% of searches. Date is the easiest limit to use as it appears on the initial screen of the catalogue interface. Users need to click on a tab to get to a screen where the other limits are available. Although retrievals can be limited to one of more than a hundred different languages, more than 90% of the language limits set were for English language items. Those using the catalogue at a public terminal in the State Library were slightly more likely to set limits (6.7% of those using a public terminal compared to 5.2% of those using a wireless connection or 5.2% of those outside the library).
Limits were most likely to be set after the first search (limits were set for 2.8% of first searches, compared to 7.2% of subsequent searches). It appears that there is a set of searchers who receive too many hits in the first search, set limits that are too restrictive and then receive zero hits. Searchers were more likely to get more than 1,000 hits in their first search (45.5% of first searches got more than 1,000 hits compared to 38.7% of subsequent searches) and much more likely to get zero results after the first search (16.2% of first searches received zero hits compared to 28.5% of subsequent searches). In addition, while those who set limits in their first search were just as likely as those who didn't set any limits to get zero hits, in the second search, those who set limits were more likely than those who didn't set limits to get zero hits (24.6% of those who set limits in the second search received zero hits compared to 19.4% of those who didn't set a limit in the second search).

Session Duration
The average and median session duration for those users who conducted more than one search (60% of users) is shown in Table 3. The average amount of time spent on the catalogue was 4.7 minutes. Because the mean can be affected by a few extremely high values, it is also instructive to look at the median, the value above and below which half of the cases fall. The median was 2.2 minutes. In other words, half of all users who conducted more than one search spent less than 2.2 minutes searching the catalogue. It is possible that people are getting faster at using the catalogue, or more impatient, and the data provides some evidence to support this. Between 2005 and 2008, the average session duration slightly decreased and the median session duration decreased by 24 seconds.
The time spent on the catalogue varied according to the type of catalogue access, as shown in Table 4 below. The public using the catalogue inside the library tended to spend longer on the catalogue than the public using the catalogue outside the library. For example, 33.6% of users outside the library spent less than one minute on the catalogue compared to 24.4% of those using a public terminal within the library and 21.5% of those accessing the SLV wireless connection. Almost one in five (19.3%) of those using their own computer within the library spent more than 10 minutes using the catalogue, compared to one in 10 users outside the library (10.3%). There is often competition to use the public terminals within the library, including the catalogue-only terminals. This may explain why those using their own wireless connection tended to spend longer on the catalogue than those using a public terminal within the library.
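The session measures reported above can be illustrated with a short sketch that derives session duration from time-stamped log rows and then compares the mean with the median. The row format and the sample values are hypothetical. As in the analysis, only sessions with more than one search are given a duration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median


def session_durations(rows):
    """Map session id to duration in minutes for multi-search sessions.

    rows is a hypothetical iterable of (session_id, ISO 8601 timestamp)
    pairs, one per logged search.
    """
    stamps = defaultdict(list)
    for session_id, timestamp in rows:
        stamps[session_id].append(datetime.fromisoformat(timestamp))
    return {
        session_id: (max(times) - min(times)).total_seconds() / 60
        for session_id, times in stamps.items()
        if len(times) > 1  # a single search has no measurable duration
    }


# Invented sample: three multi-search sessions and one single-search session.
rows = [
    ("a", "2008-05-01T10:00:00"), ("a", "2008-05-01T10:01:00"),
    ("b", "2008-05-01T11:00:00"),
    ("c", "2008-05-01T12:00:00"), ("c", "2008-05-01T12:02:00"),
    ("d", "2008-05-01T13:00:00"), ("d", "2008-05-01T13:30:00"),
]
durations = list(session_durations(rows).values())
average_minutes = mean(durations)
median_minutes = median(durations)
```

In this invented sample, one long session pulls the mean well above the median, which is why both figures are reported in Table 3.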

Number of searches
The total number of searches undertaken varied according to the number of hits in the first search, as shown in Figure 4. Those users who received only one hit in the first search were the most likely to undertake just one search. This possibly indicates that they found what they wanted to know. Conversely, it could indicate that they gave up. (This possibility is explored in the section on abandonment rates.) Those most likely to stop at one search in the session were those who received either zero hits or more than 45 hits. Those who received no hits in the first search were also slightly more likely to undertake more than 10 searches.
The average number of searches per session was 3.4 and the maximum number of searches undertaken within a session was 207. Two fifths (39.8%) of searchers undertook one search. Just under one third (31.5%) undertook two or three searches, just under one quarter (23.6%) undertook 4-10 searches and 5.1% undertook more than 10 searches.
Users who opted to search first using a Boolean search or Construct a search were much more likely to conduct more than one search (74.3% of those who started with a Boolean search and 61.5% of those who started with Construct a search) than the average searcher (41.0% conducted more than one search). This could be because it took several attempts for those using Boolean or Construct a search to successfully construct their search.
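The distribution of searches per session reported above (one search, two or three, 4-10, and more than 10) can be tabulated with a small helper. The band boundaries come from the text; the function names and the input list are assumptions for the illustration, and every session is assumed to contain at least one search.

```python
from collections import Counter

BANDS = ("1", "2-3", "4-10", "more than 10")


def band(search_count):
    """Assign a session's number of searches to one of the reported bands."""
    if search_count == 1:
        return "1"
    if search_count <= 3:
        return "2-3"
    if search_count <= 10:
        return "4-10"
    return "more than 10"


def band_shares(search_counts):
    """Share of sessions falling in each band.

    search_counts is a hypothetical list with one entry per session.
    """
    counts = Counter(band(n) for n in search_counts)
    total = len(search_counts)
    return {label: counts[label] / total for label in BANDS}
```

The four bands are exhaustive, so their shares sum to one, as do the reported percentages (39.8% + 31.5% + 23.6% + 5.1% = 100.0%).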
A browse search (Subject Browse, Author Browse or Call Number Browse) locates the search term within an ordered list. Other types of searches return a certain number of hits. The maximum number of hits was 193,620 and the average was 5,234 hits. The number of hits varied according to the type of search undertaken. As would be expected, Keyword Anywhere searches were the most likely to receive more than 1,000 hits (73.5% of all searches).

Search success
Failure rates and reasons for failure
Failure rates varied according to the type of search, as shown in Figure 5 below. The Boolean search was most likely to get zero hits (55.6% of all searches of this type) and more than half of title searches received zero hits (51.2% of Journal Title searches and 52.7% of Left Anchored Title searches). As would be expected, Keyword Anywhere searches were by far the least likely to receive zero hits (4.7% of all searches). It is searches that are incorrectly constructed or have spelling errors that are of most concern, as the user may receive zero hits when in fact the collection has relevant items. Failure analysis indicated that 11% of all searches which failed contained spelling mistakes.
As discussed in the methodology section, it is not possible to identify all searches that were incorrectly constructed without knowing what the searcher was after. However, it is possible to identify some of the ways in which searches fail because of incorrect construction. Just over half of all Boolean searches and Left Anchored Title searches received zero hits. Boolean searches require the user to link words with "and", "or" or "not". Without these linking words, an error is registered and the system returns zero hits. The failure analysis indicated that half (50%) of the Boolean searches that retrieved no hits were incorrectly constructed. The title search requires that the user omit any initial articles (the, a, an). The analysis of failures indicated that almost one fifth (18%) of title searches failed because they included an initial article. Overall, 15% of searches that failed had easily identifiable format errors.
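The two easily identifiable format errors described above lend themselves to simple automated checks. The sketch below is illustrative rather than a description of how the failure analysis was coded. It also assumes that a single-word Boolean query is not an error, which the text does not confirm.

```python
INITIAL_ARTICLES = ("the", "a", "an")
BOOLEAN_OPERATORS = {"and", "or", "not"}


def has_initial_article(title_query):
    """True if a title query starts with an article that should be omitted."""
    words = title_query.lower().split()
    return bool(words) and words[0] in INITIAL_ARTICLES


def lacks_boolean_operator(boolean_query):
    """True if a multi-word Boolean query has no linking word.

    Assumption: single-word queries are not flagged.
    """
    words = boolean_query.lower().split()
    return len(words) > 1 and not BOOLEAN_OPERATORS.intersection(words)
```

A check of this kind flags "THE CONSEQUENCES OF MODERNITY" as a title query with an initial article, but it cannot detect a mistyped title such as "CONSEQUENCE OF MODERNITY", which is why the reported format errors understate the true extent of incorrectly constructed searches.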
It was evident from the data that there were searches that failed because of other less obvious types of format error. For example one of the Journal Title searches was for "MOTHER TAKES ARCHIBALD WITH FATHER-IN-LAW IN SHORTS". One can be fairly sure that there is not a journal, magazine or newspaper with this title, although there may well have been a magazine or newspaper article with this title (referring to Davida Allen's winning entry for the Archibald Prize for portraiture). In addition to Journal Title search queries which indicated that the catalogue user thought that they were searching for an article rather than a title, there were Journal Title search queries which indicated that the catalogue user thought that they were doing a Keyword Anywhere search, for example the Journal Title search query "HERB TONIC BACK AILMENT". Without access to the universe of journal, magazine and newspaper titles, it is not possible to measure the extent of this type of error.
The data in Table 5 is included to suggest how frustrating it is when one is unable to formulate catalogue queries correctly. This example illustrates zero hits in the context of a sequence of 35 searches conducted by a user at a public terminal in the State Library. They all concern the one topic of packaging waste. Most of these searches received zero hits. This user tried a variety of different types of search and tried to use more advanced features such as limits and Boolean searches. The details of these searches as shown in Table 5 indicate that the user was unlikely to have found what they were after.
It can be seen that this user started by doing a Keyword Anywhere search on the word "packaging". This search returned 408 hits. Presumably this was too many results for the user to sort through as their next search was another Keyword Anywhere search on the word "packaging" but this time with the following limits: Date -2004, Language -English, Location -Browsing collections, Place of publication -Australia. Because the Keyword Anywhere search had failed to return useful results, the user then undertook a range of Journal Title searches and Boolean searches, some with a range of limits set. Most of these returned zero hits, either because of spelling mistakes or because there were no journals beginning with the specific title typed in. Most of the Boolean searches had the requisite Boolean operator ("and"), but several of them towards the end of the session failed because they did not have a Boolean operator. It can also be seen that the user repeated several searches. It is easy to do this as one has to return to a previous screen to conduct a new search; there is no indication in this new screen of what searches have already been undertaken. One has to move to a separate screen (Search History) to see searches already undertaken. This user spent 24 minutes on this session and may not have found any items of relevance to them; their last three searches returned zero hits.
The data in Figure 6 depicts the failure rates for users based on the location of catalogue access. It can be seen that the outside public were most likely to get zero hits for their first search (17% compared to 11% of those using public terminals and 12% of those using a wireless connection in the library). The outside public were also more likely to get zero hits for their second search (in addition to zero hits for their first search).

Abandonment rates
The proportion of users who abandoned their search after zero hits was analysed longitudinally and compared for different types of catalogue access. Abandonment rates were higher in 2008 than in 2005, as shown in Table 6. The abandonment rates were much higher for those using the catalogue outside the State Library than for those using the public terminals inside the library, as shown in Figure 7 below. One quarter (26%) of users outside the State Library abandoned their search after receiving zero hits for the first search. This compares with one tenth (11%) of users at public terminals in the State Library who abandoned their search after receiving zero hits for the first search. Those outside the State Library who conducted a second search were twice as likely to give up after receiving zero hits for both the first and second search (21% of users outside the State Library compared to 11% of users at public terminals).
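The abandonment measure can be made concrete with a small calculation. The following Python sketch is illustrative only and is not the procedure used in this study. It assumes that each session has been reduced to an ordered list of hit counts, and that the rate is taken over those sessions whose first n searches all returned zero hits. The function name and the sample data are hypothetical.

```python
from typing import Sequence


def abandonment_rate(sessions: Sequence[Sequence[int]], n: int = 1) -> float:
    """Share of sessions that stop after n consecutive zero-hit searches.

    Assumption: the denominator is every session whose first n searches
    all returned zero hits; the numerator is those sessions that contain
    no further search after the n-th one.
    """
    # Sessions whose first n searches all returned zero hits.
    eligible = [s for s in sessions if len(s) >= n and all(h == 0 for h in s[:n])]
    if not eligible:
        return 0.0
    # Of those, the sessions that ended immediately after the n-th search.
    abandoned = [s for s in eligible if len(s) == n]
    return len(abandoned) / len(eligible)


# Hypothetical sessions: each inner list holds the hit counts of one
# user's searches, in the order they were made.
sample_sessions = [[0], [0, 5], [0, 0], [0, 0, 3], [12], [0, 0]]

rate_after_first = abandonment_rate(sample_sessions, n=1)
rate_after_second = abandonment_rate(sample_sessions, n=2)
```

Computing the rate separately for each access location (public terminal, wireless, outside the library) would give the kind of comparison reported above.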

Implications of the analysis of use of the SLV main catalogue
In its analysis of use of the SLV main catalogue, this paper has taken into account the search settings (whether the user is using the catalogue inside the library or remotely), and search strategies (search type, limits placed, session duration and number of searches) and looked at search success (failure rates, reasons for failure, abandonment rates).
Although nothing is known about the users of the SLV catalogue, other than whether they access the catalogue in the library or remotely, they are likely to be a very heterogeneous group (Hancock-Beaulieu & Borgman, 1996). They are likely to be more heterogeneous than users in studies confined to academic libraries, the subject of most literature on OPAC use.
It is not possible to directly compare the search strategies of the SLV catalogue user with the academic library user as search options and user interfaces vary across catalogues. The data does indicate, however, that search strategies of users of a public library catalogue may be simpler than those of users in academic libraries. It has been a consistent finding in the literature on OPAC use in academic libraries that only a minority of end user queries use Boolean searches. For example, Dinet et al. (2004) found that even when the OPAC screen was specifically structured to invite users to use Boolean operators, only 28.2% of queries made use of at least one Boolean operator. Moulaison (2008) found that 15.6% of queries used the "Keyword -Boolean" search, while Lau and Goh (2006) found that 11.8% of searches were Boolean. In this study, less than 1% (0.7%) of users commenced with a Boolean search and only 1.2% of all searches were Boolean searches. In other words, more than one in ten users of the academic library started with a Boolean search compared to about one in one hundred of the users of the SLV (public library) catalogue.
In this study, the default option for searching, Keyword Anywhere, was the most commonly used search type. In studies of academic libraries, it is also the case that the default search type is the most commonly used search type (for example Hildreth, 1997; Lau & Goh, 2006; Moulaison, 2008). It is inappropriate to compare the actual percentages as they depend on the number of search options available to the user.
Perhaps surprisingly, given the heterogeneous nature of the public library user, the failure rates were comparable with those in the literature on users of catalogues in academic libraries. The failure rates were just over 50% for title searches, 56% for Boolean searches and 42% for constructed searches. Obviously some searches "fail" because there are no matching items in the SLV collection. However, many searches fail because they are incorrectly constructed. The analysis of failure rates indicated that 11% of searches which fail contain spelling mistakes. The results also supported Arsenault and Ménard's (2007) finding that Left Anchored Title searches often fail because of the presence of an initial article; 18% of failed Left Anchored Title searches failed for this reason. As discussed, it was not possible to identify all searches that were incorrectly constructed without knowing what the searcher was after.
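The initial-article problem can be illustrated with a short Python sketch. It is a simplified model and does not describe how Voyager itself works: it assumes the title index stores each title without its leading article, and that a Left Anchored Title search succeeds only when the query matches the start of that stored form. The article list, function names and sample title are hypothetical.

```python
# English initial articles only; a real catalogue would need a list
# for every language in its collection (an assumption of this sketch).
INITIAL_ARTICLES = ("the", "a", "an")


def strip_initial_article(query: str) -> str:
    """Drop a leading article from a multi-word query and tidy spacing."""
    words = query.strip().split()
    if len(words) > 1 and words[0].lower() in INITIAL_ARTICLES:
        return " ".join(words[1:])
    return " ".join(words)


def left_anchored_match(query: str, title_key: str) -> bool:
    """True when the indexed title begins with the query (case-insensitive)."""
    return title_key.lower().startswith(query.lower())


# Assumed index entry: the title filed without its initial article.
title_key = "consequences of modernity"
user_query = "The Consequences of Modernity"

matches_as_typed = left_anchored_match(user_query, title_key)
matches_after_strip = left_anchored_match(strip_initial_article(user_query), title_key)
```

Under these assumptions the query fails as typed and succeeds once the article is removed, which is the pattern behind the failed searches described above.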
It is not possible from the data to identify which browse searches fail. The user's search term is placed within a browsing list, which may or may not have entries related to the search term. However, it is likely that many browse searches failed. Subject Browse searches, in particular, could be considered as quite prone to failure because of the difficulty involved. Subject Browse searches require the user to enter in a term that matches a Library of Congress Subject Heading. As Bates observes, the average user "identifies their search term with their whole subject query. It does not occur to them that it might be called other things by the catalog" (2003, p. 7).
Bates also says of the average user, "they look up their topic, do not find it, therefore the library must not have anything on it" (2003, p. 7). This observation seemed to be true for a substantial proportion of searchers as abandonment rates were 22.3% after one search with zero hits and 18.0% after two searches with zero hits. The analysis also identified a set of searchers who receive too many hits in the first search, set limits that are too restrictive and then receive zero hits.
These findings indicate the need to provide assistance to users to help them successfully formulate their search. The Voyager interface has online help screens that briefly outline the mechanics of the catalogue. However, if someone requires help in formulating their search, their options are limited. If they are within the library, they need to physically queue up at the Information desk or hail a passing librarian. If they are accessing the catalogue remotely, they can phone the library. Assistance provided at or before the point of failure could reduce the likelihood of the user abandoning their search. For example, an online chat, "live help" facility could be equally effective in helping both those using the catalogue within the library and those outside the library. Other research indicates that such a facility would be well used. Almost two thirds (62%) of students surveyed by the OCLC said they would use online help from a librarian if it was free (OCLC, 2005).
Given the increasing speed of computers and search engines over the last few years, it could be that people expect to find things faster and are getting less patient with using a catalogue. The data provided some evidence for this with the analysis showing that session duration has been decreasing while abandonment rates have increased slightly since 2005.
It could have been thought that those accessing the catalogue from outside the library are likely to be the more seasoned library users as they have deliberately used the Internet to go to the library catalogue search page. In contrast, a physical visitor to the State Library who sits at a computer sees the library catalogue search page as the default page. However, the analysis shows that those users accessing the catalogue remotely are more likely to fail in their search and more likely to abandon their search.
The high failure rates and abandonment rates temper Hider's (2008) conclusion that the State Library of Victoria catalogues are effective tools for accessing the collection and indicate the limitations of relying on user self-reporting. Self-reporting has the possibility of selection bias in the sample who respond to the survey, with those more familiar with using the catalogue more interested in responding. The analysis in this paper also suggests that surveys of catalogue use should include remote users as well as users within the library; those within the library are more likely to be successful with their searches. More research is needed to find out whether there is a group of users who only access the catalogue remotely.
Subsequent to the conduct of this analysis, a new catalogue interface, Primo by Ex Libris, was introduced in the second half of 2009. Like other next generation tools, this new interface is designed to offer a search experience similar to that offered by sites like Google and Amazon. The searcher can but does not need to choose from among different search types or different parts of the collection. The user-friendly interface allows grouping of search results by various criteria for further search refinement and has Web 2.0 functions such as user tagging and user reviews. Its features also include accommodation of flexible search syntax, "Did you mean?" suggestions as well as additional results based on synonyms. So, for example, using Primo the search query "CONSEQUENCE OF MODERNITY" should successfully find the record for the book "The Consequences of Modernity". Given that the user group should remain relatively constant, at least in the short term, this study provides baseline data for evaluating this new interface to the catalogue.
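To show how a "Did you mean?" suggestion might rescue a misspelt query, the following Python sketch uses approximate string matching from the standard library's difflib module. It is not a description of Primo's algorithm, which is not documented here. The vocabulary, the similarity cutoff of 0.8 and the function name are assumptions chosen for illustration.

```python
import difflib
from typing import Optional, Sequence


def did_you_mean(term: str, vocabulary: Sequence[str], cutoff: float = 0.8) -> Optional[str]:
    """Suggest the closest indexed term for a query term, if any.

    Returns None when the term is already in the vocabulary or when no
    indexed term is similar enough (similarity below the cutoff).
    """
    term = term.lower()
    if term in vocabulary:
        return None
    # get_close_matches ranks candidates by similarity, best first.
    candidates = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return candidates[0] if candidates else None


# Hypothetical index vocabulary drawn from terms mentioned in this paper.
vocabulary = ["packaging", "waste", "recycling", "modernity", "consequences"]

suggestion = did_you_mean("packageing", vocabulary)
```

A catalogue could offer the suggestion alongside the zero-hit message, giving the user a next step at the point of failure rather than leaving them to reformulate the query unaided.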
The ease of use of the library catalogue has implications for the use of the library collection as the catalogue is still the main way of searching the collection of a particular library. This analysis indicates that many public library users in the SLV find catalogues difficult to use. While the failure rates were comparable with those in the literature on users of catalogues in academic libraries, the analysis of search types indicates that search strategies of users of a public library catalogue may be simpler than those of users in academic libraries.
The paper has introduced the concept of abandonment rates as the idea that the searcher "gives up" searching the catalogue because their first or second search is unsuccessful. The operationalisation of this concept in the analysis provides an additional measure of how effectively an OPAC is being used. It shows the importance of comparing the search success of those inside the library with those outside the library, as the analysis shows that those outside the library are more likely to fail in their searches and more likely to abandon their searches.
Importantly, despite suggestions that the catalogue is becoming irrelevant, the data shows that people still want to use the library catalogue. The data from this study indicates a steady increase in the use of the OPAC between 2005 and 2008. At the same time, it also shows an overall increase in abandonment rates over this period. More observational research of people using public library catalogues remotely and onsite is needed in order to be able to interpret this observation and better understand the factors affecting catalogue use.
Two main conclusions can be drawn about how online users can be assisted to find what they are looking for on the library catalogue. Firstly, the data supports the notion that an online chat, "live help" facility could be equally effective in helping both those using the catalogue within the library and those outside the library. Secondly, the analysis provides evidence that the user-friendliness of online catalogues is greatly enhanced through Did You Mean? (spell check) and more flexible syntax. The analysis of reasons for search failure in this paper indicates that, far from being unnecessary features, these capabilities would enable many users to find materials in the collection that they would not otherwise have found.