134 · Representative Documents: Collaborative Shelving Facility Strategies

GEORGIA INSTITUTE OF TECHNOLOGY

Georgia Tech Algorithm (ASERL Collaborative Journal Retention Program)

Georgia Tech Algorithm

The Georgia Tech Algorithm was developed to assign a numeric value to facilitate the review of journals

for the ASERL Cooperative Journal Retention program. The algorithm is designed to assess

completeness of the collection, relevance to the institution and relevance to the ASERL project. The

algorithm consists of 6 elements:

(FirstCopy)² − Missing/10 + (LastCopy OR Currency) + Class + (ASERL × −2.25)

• FirstCopy: Ratio of the title's first published volume to the earliest volume held, squared (Values: 0 to 1)

• Missing: A negative numerical score for missing volumes and issues. Each missing volume counts as 1 and each missing issue counts as .1; the sum is divided by 10. (Values: −n to 0; at GT this ranged from −3.5 to 0)

• LastCopy: For ceased titles only. This is a ratio of owned latest volume to the final volume of the

title. (Values: 0 to 1)

• Currency: For continuing titles. Currently received journals are assigned a value of 1, and .1 is subtracted for each year not held (.9 for 2010 cancellations, .8 for 2009 cancellations, etc.). GT used a floor of 0 for titles cancelled in or before 2000. (Values: 0 to 1)

• Class: A weight added for classes relevant to the library’s mission. At GT we added a weight of

.25 to all LC Q and T titles. (Values: 0 or 0.25)

• ASERL: A proxy variable set if the title has already been nominated for ASERL by another library (0 or −1). We then multiply this proxy by the maximum value of the algorithm, 2.25.
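The six elements above can be sketched as a single scoring function. This is a minimal illustration, not the GT project's actual code: the parameter names and the 2011 reference year for the Currency term are assumptions, and the missing-item penalty is applied as a subtraction of a positive count (equivalent to adding the negative Missing score the bullet describes).

```python
def gt_score(first_held, first_published, missing_volumes, missing_issues,
             ceased, last_held=None, last_published=None,
             cancel_year=None, mission_class=False, aserl_nominated=False,
             current_year=2011):
    """Score one journal title with the Georgia Tech Algorithm (a sketch)."""
    # FirstCopy: ratio of the title's first volume to our earliest held
    # volume, squared to penalize coming into a series late.
    first_copy = (first_published / first_held) ** 2

    # Missing: volumes count 1, issues count .1; divide by 10 to keep
    # the penalty on the same scale as the other terms. Subtracting a
    # positive count here matches the "-n to 0" contribution above.
    missing = (missing_volumes + 0.1 * missing_issues) / 10

    # LastCopy (ceased titles) OR Currency (continuing titles).
    if ceased:
        tail = last_held / last_published
    elif cancel_year is None:              # currently received
        tail = 1.0
    else:                                  # .1 off per year since cancellation, floor 0
        tail = max(0.0, 1.0 - 0.1 * (current_year - cancel_year))

    # Class: weight for LC classes relevant to the mission (Q and T at GT).
    class_weight = 0.25 if mission_class else 0.0

    # ASERL proxy: a title nominated by another library loses the
    # algorithm's maximum value, 2.25.
    aserl = -2.25 if aserl_nominated else 0.0

    return first_copy - missing + tail + class_weight + aserl
```

A complete, current title in Q or T with no gaps scores the maximum, 2.25; the same title nominated elsewhere drops to 0, pushing it to the bottom of the review list.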

Discussion

I created the algorithm to provide a quick assessment of 1,059 journals that had previously been

selected to be withdrawn (see additional background below), but I think that it could also be used as a

starting point for review. It does require a number of data points: earliest volume held, latest volume held, first volume published, last volume published (or knowledge that the title is current, from Ulrich data), a count of missing volumes and issues, and selection by other schools (matched on ISSN + title). I treated continuations as a single title (one call number). I had much of this material from previous projects, and

looked up the remaining information using our catalog, Ulrich, and the ASERL spreadsheets.
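The data points listed above can be gathered into one record per title before scoring. This is a sketch only; the field names and values are illustrative placeholders, not drawn from the GT project files:

```python
# One record per title carrying the data points listed above.
# All field names and values here are illustrative placeholders.
title_record = {
    "call_number": "Q1 .E123",        # continuations merged under one call number
    "issn": "0000-0000",              # used with title to match other schools' picks
    "first_volume_held": 1,
    "last_volume_held": 117,
    "first_volume_published": 1,      # from Ulrich
    "last_volume_published": None,    # None = title is still current
    "missing_volumes": 0,
    "missing_issues": 3,
    "nominated_elsewhere": False,     # matched against the ASERL spreadsheets
}
```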

For FirstCopy, I chose to emphasize owning the first volume by squaring the term, which creates a rapid tail-off for coming into a series late (FirstCopy = .25 if your holdings begin with volume 2, and FirstCopy = .11 if they begin with volume 3). I would caution against assuming that the first volume is volume 1; unaccounted-for title changes and title splits often prove to be exceptions. For Missing, I counted missing issues as −.1. A

more precise way of accounting for missing issues would be to evaluate the frequency (e.g., a missing quarterly would be −.25 and a missing monthly would be −.08), but this added an additional data collection step. I divided the missing count by 10 to make it comparable to the other values in the algorithm. LastCopy is similar to FirstCopy, but I chose not to square this value. Looking at the current
