Author: Joanna
Memorial Minutes
Background
Michael McCarthy approached the Archives & Special Collections Library about digitizing the Memorial Minutes read at faculty meetings. He has some recent ones available at http://aevc.webs.com, dating back to 1990, but would like a more robust list. Michael was referred to us for this digitization.
The Archives & Special Collections Library contains three volumes of Memorial Minutes:
- Volume 1, 1877-1942
- Volume 2, 1943-1960
- Volume 3, 1960-1978
There is a gap from 1978-1989; the minutes for those years need to be pulled from the faculty minutes in Archives & Special Collections. The digital library has already completed this digitization.
Digitization consists of approximately 90-100 pages per volume. The volumes cannot be digitized as one book but will need to be digitized on a person-by-person (memorial-by-memorial) basis. Each volume also contains an archival folder with the same contents except for Volume 3, whose folder contains overlapping but unequal content.
File creation
- Goal is a one-to-one correspondence between person and minute (and, in the case of multiple minutes submitted per person, one-to-many). Thus this project is inherently slow.
- No metadata exists.
- Items must be hand-scanned.
File naming scheme:
aevc_last_XXX_YYY
Where:
- aevc = project code (memorial minutes is too long)
- last = last name of person. When same last names exist, use lastname-firstinitial.
- XXX = number, left-padded with zeroes, indicating which memorial minute per person is being scanned.
- YYY = number, left-padded with zeroes, indicating the page sequence per memorial minute per person.
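The naming scheme above can be sketched as a small helper. This is a minimal illustration, not the project's actual tooling: the function name is hypothetical, the three-digit width is inferred from the XXX and YYY placeholders, and lowercasing the name and dropping spaces and punctuation are assumptions the scheme does not state.

```python
def memorial_minute_name(last_name, minute_number, page_number, first_initial=None):
    """Build an aevc_last_XXX_YYY base name (no file extension)."""
    for value in (minute_number, page_number):
        if not 1 <= value <= 999:
            raise ValueError("minute and page numbers must fit in three digits")
    # Assumption: lowercase the name and keep only letters and digits.
    name = "".join(ch for ch in last_name.lower() if ch.isalnum())
    if first_initial:
        # Used only when two people share a last name: lastname-firstinitial.
        name = f"{name}-{first_initial[0].lower()}"
    return f"aevc_{name}_{minute_number:03d}_{page_number:03d}"


# Hypothetical people, for illustration only.
print(memorial_minute_name("Smith", 1, 2))       # aevc_smith_001_002
print(memorial_minute_name("Smith", 2, 1, "J"))  # aevc_smith-j_002_001
```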
Vassar Wesleyan Program in Paris
Vinay Swamy approached the digital initiatives group about digitizing the old files related to the VWPP program since its inception in 1969. We’ve reviewed the files and are now awaiting Wesleyan’s response. Vassar can do this digitization in-house.
Note: this is an institutional repository project, not a digital library project.
Imaging Specs:
Canon Image Runner 3030: Set dpi to 400, multipage tiffs, send to ftp site.
File name prefix: vwpp
Partnership: Wesleyan University archives. Wesleyan has processed the items and Vassar will use the box/folder number setup for identifiers and filenaming scheme.
Einstein project
The Albert Einstein project, funded by the Polonsky foundation, seeks to digitize and make available the contents in the following collection:
http://specialcollections.vassar.edu/findingaids/einstein_albert.html
Items:
Approximately 290 images will be produced from these series.
Filenaming scheme
einstein_series_subseries_folder_item_page[a].extension
einstein_01_01_014_001_001.tif – Albert Einstein to Otto Nathan, Sept 1936 – first telegram, first page; service copy image
einstein_01_01_014_001_002.tif – Albert Einstein to Otto Nathan, Sept 1936 – first telegram, second page; service copy image
einstein_01_01_014_002_001.tif – Albert Einstein to Otto Nathan, Sept 1936 – second telegram, first page; service copy image
einstein_01_01_014_002_002.tif – Albert Einstein to Otto Nathan, Sept 1936 – second telegram, second page; service copy image
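As a rough check on the scheme, the file names above can be parsed back into their parts. The sketch below assumes the field widths seen in the examples (two digits for series and subseries, three for folder, item, and page); the function name is hypothetical, and the optional trailing "a" from the pattern is captured but not interpreted, since the scheme does not say what it marks.

```python
import re

# Field widths are inferred from the example file names, not from a written spec.
EINSTEIN_NAME = re.compile(
    r"^einstein_(?P<series>\d{2})_(?P<subseries>\d{2})_(?P<folder>\d{3})"
    r"_(?P<item>\d{3})_(?P<page>\d{3})(?P<suffix>a?)\.(?P<extension>[a-z0-9]+)$"
)


def parse_einstein_name(filename):
    """Split a file name into its numbered parts, or raise ValueError."""
    match = EINSTEIN_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a valid einstein file name: {filename!r}")
    fields = match.groupdict()
    for key in ("series", "subseries", "folder", "item", "page"):
        fields[key] = int(fields[key])
    return fields


parts = parse_einstein_name("einstein_01_01_014_002_001.tif")
print(parts["folder"], parts["item"], parts["page"])  # 14 2 1
```

A parser like this is mainly useful as a batch check that scanned files follow the scheme before they are ingested.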
Longfellow and Vassar Songs Sheet Music Collections
This project plan provides background information, special considerations, and digitization recommendations for two related projects:
- Babs and Bella C. Landauer collection of musical settings of the poetry of Henry Wadsworth Longfellow (“Longfellow Collection”)
- Collection of Vassar College song books (“Vassar Songs”)
The information below is not a detailed technical and preservation analysis but a summary of known issues and basic road map for further consideration.
Basic information
- Stakeholder(s): Sarah Canino
- Time frame: AY2012-2013
- Preservation needed?: yes
- Funding opportunity?: yes (Farrish Foundation)
Background and Non-Digital Considerations
The Music Library contains two sheet music collections proposed as good candidates for digitization. In the course of another proposal for digitization (a more generalized sheet music digitization), Sarah Canino provided an analysis of these rare and unique collections. Each features very strong facets:
- Both contain a significant portion of unique materials when searched against the Sheet Music Consortium (SMC) and the Petrucci Music Library Database at the International Music Score Library Project (IMSLP), the two premier repositories of digitized music.
- Both consist largely (if not entirely) of objects created pre-1923, i.e., free from copyright restriction.
- One collection contains items completely unique to Vassar College.
- Although current scope is the boxes of the Vassar Songs collection only, further digital projects can stem from this theme, including audio, other song texts, class parties, and musicals.
In addition, there are some drawbacks:
- There is virtually no item-level (EAD container / <c>) metadata for the items. Sarah is also interested in item-level cataloging / MARC records for these items.
- The Vassar music, in particular, is extremely rare but in poor condition.
- Some further research must occur to properly determine copyright status and weed out duplicates in the Longfellow Collection.
- A precise metadata standard must be met to share items with the SMC and IMSLP databases.
Digital profiles for collections
Unit of consideration in physical collection: sheet / song
Unit of items to be digitized: page
Longfellow Collection
- Item count: 378 objects, most of which are sheet music; about 35 are vocal scores of roughly 50 pages each
- Assumption: 4-6 pages per item; average 5 pages
- Estimated item count: 1,890 pages. N.B.: Sarah will identify and modify this page count. Duplicate items will not be counted.
- Format: loose objects in boxes (4 boxes; identifier = 78 L86 v.1-4)
- Dates: most published pre-1924; most are American publications (some are British). This needs to be verified.
- Oversized materials?: no
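The page estimate for the Longfellow Collection can be reproduced with a short calculation. The sketch below uses only the figures listed above; the second estimate is an illustration of how the total might shift if the roughly 35 vocal scores are counted among the 378 objects at about 50 pages each, which is an assumption pending Sarah's revised count.

```python
ITEM_COUNT = 378        # objects in the collection, possibly including duplicates
VOCAL_SCORES = 35       # approximate count of vocal scores
VOCAL_SCORE_PAGES = 50  # approximate pages per vocal score


def flat_estimate(pages_per_item):
    """Every object is assumed to have the same number of pages."""
    return ITEM_COUNT * pages_per_item


def split_estimate(pages_per_sheet_item):
    """Assumption: vocal scores are part of the 378 and counted at full length."""
    sheet_items = ITEM_COUNT - VOCAL_SCORES
    return sheet_items * pages_per_sheet_item + VOCAL_SCORES * VOCAL_SCORE_PAGES


print([flat_estimate(n) for n in (4, 5, 6)])  # [1512, 1890, 2268]
print(split_estimate(5))                      # 3465
```

The flat estimate at 5 pages per item matches the 1,890 pages stated above; the split figure is only a what-if and will change once duplicates are removed.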
Vassar songs collection
- Item count: 8 volumes plus some additional publications of songs (“Peace I leave with you” and 1903 yearbook).
- Estimated item count: 500 pages (provided by Sarah)
- Format: bound volumes in poor condition
- Dates: 1881-1940
- Oversized materials?: TBD
Digital Considerations
Longfellow Collection
Number of items to digitize
There are 378 pieces in the Longfellow Collection (possibly including duplicates). Sarah has found that approximately 20% of the items in the collection have already been digitized and are available elsewhere. We must ensure that the oversized folios are not too large for the copystand.
Recommendation: we digitize the collection in its entirety — minus the duplicates — and not worry about the overlap. However, in our metadata schema, we should provide reference to a related item that provides a URL to an alternate digital object in another institution. Sharyn Cadogan and Joanna must measure the oversized folios.
Metadata and Copyright Research and Cataloging
There is virtually no metadata for these items, and significant research must be conducted to determine unique items, any background information, and copyright considerations. Additionally, we must create a metadata profile that is flexible and useful locally and worldwide.
Recommendations:
- Joanna works with Sarah and Ann Churukian to create a new metadata profile that maps to Dublin Core or MODS (most likely MODS) in Islandora.
- Music Library uses part of the available funds to hire a library school intern for a paid internship to research each piece, copy metadata when needed, and provide original cataloging of other items, under direction from Ann.
- Cataloging should be done directly in Islandora. This can serve as a pilot project for account management, maintenance, and documentation in our chosen digital library software.
- Once cataloged, Joanna can work with Ann, the library intern, and Laura Streett to create an EAD-compliant finding aid for this collection. Additionally, because data in Islandora is stored as MODS, Joanna can fairly easily transform metadata into other standards, such as data required for the SMC.
- Joan Pirie and Shay Foley should be consulted about formatting data for MARC ingest into the library catalog.
Recommendations and outcomes from 11/2/2012:
- Sarah will work to analyze duplicates
- Sarah will provide basic metadata in electronic format — Title, Composer, Number of Pages — for each item
- Once Joanna has metadata, we can begin digitization
- Bound volumes will be “Phase II”
- Library school intern should be hired for paid internship
- Sabrina and Sarah will identify possible interested faculty in collection
Vassar songs collection
Digitization process
Sharyn and Joanna, with help from Laura Streett, must assess the fragility of the binding in the context of digitization. Laura can determine the fragility of the object itself, while Sharyn and Joanna can determine the amount of shadowing, curvature, and margin; we must understand how much the condition of the item will affect a high-quality digital copy.
Recommendation: Sharyn, Joanna, and Laura examine the Vassar songs collection and take basic measurements. We cannot fully determine the feasibility of digitizing this collection in-house unless we do this critical step.
Metadata
There are some items already digitized, but at the collection / book level. We need metadata at the song / “sheet” level. We need to determine whether or not a one-to-one correspondence exists between song and page; in other words, do songs begin on the same page as other songs, or does a new song begin a new page? If the former, our metadata profile and digitization may be difficult; the easiest way to digitize may be to duplicate pages that contain the end and beginning of songs, adding to our digital count.
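The effect of duplicating shared pages on the digital count can be illustrated with a small calculation. This is a sketch under stated assumptions: each song is recorded as a title with a first and last page, the function names are hypothetical, and the three songs below are invented sample data, not entries from the songbooks.

```python
from collections import Counter


def image_count_with_duplicates(songs):
    """Images needed if every song gets its own full page sequence."""
    return sum(last - first + 1 for _title, first, last in songs)


def shared_pages(songs):
    """Physical pages that hold parts of more than one song."""
    counts = Counter(
        page
        for _title, first, last in songs
        for page in range(first, last + 1)
    )
    return sorted(page for page, n in counts.items() if n > 1)


# Hypothetical volume: Song B begins on the page where Song A ends.
sample = [("Song A", 1, 3), ("Song B", 3, 4), ("Song C", 5, 6)]
print(image_count_with_duplicates(sample))  # 7 images for 6 physical pages
print(shared_pages(sample))                 # [3]
```

If the list of shared pages comes back empty for a volume, songs and pages correspond one-to-one and no duplication is needed.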
Recommendations:
- Joanna should examine the volumes to determine the page-to-song correspondence, which may increase the page count.
- Similar to the recommendations for the Longfellow Collection, it may be useful to provide a paid internship opportunity for the right MLIS student to research and then directly catalog items into Islandora.
Recommendations and outcomes from 11/2/2012:
- Joanna will work with Laura to obtain songbooks
Recommendations and outcomes from 1/4/2012:
- We have asked Hudson Microimaging for a proposal and cost estimate for digitization services
- Item-level metadata will be at BOOK level. We may wish to OCR and then copy the Table of Contents from songbooks (when available) to help identify which songs are in which books
- Books are already cataloged, so should be easy to obtain metadata
Printers’ Marks
About the Project
Working name: | Printers’ Marks |
Sponsors: | Sabrina Pape and Ron Patkus |
Duration: | Summer 2012 |
Nature: [Text; image; text+image; GIS; audio/video; other] | Text and images |
Project track: | 2 – VCL project with special considerations |
Date prepared: | 2012-08-01 |
Background / Purpose
The printers’ marks throughout the Main Library have been of interest to researchers and Vassar community members since they were installed in the early 20th century. A published volume, A list of the printers’ marks in the windows of the Frederick Ferris Thompson Memorial Library, Vassar College, is available online. We will digitize this volume to provide scans with very high resolution, as well as undertake a research project to document the printers, marks, and current locations of each plate. Additionally, we will photograph the current marks in situ. We will apply for a Ford Scholar to assist us in this work in Summer 2013.
Scope
Phases of project (based on item temporal coverage) | Phase 1: Photograph plates, scan images; Phase 2: Develop research with Ford Scholar; Phase 3: Publish online project |
Number of items to be digitized | TBD – there are 16 pages in the volume, and 66 current windows. We will splice marks from the TIFFs created from the book as well; there are 82 marks. |
Total number of images (assumption: one JPG derivative per each archival image created) | 16 page TIFFs + 16 page JPGs + 82 mark TIFFs + 82 mark JPGs + 66 window TIFFs + 66 window JPGs = 328 images |
Total number of records | TBD |
Special considerations | Photography may be difficult |
Location of Physical Items
Book is located in Special Collections; windows are dispersed throughout Main Library.
Hardware/Storage
System type | System | Space required |
Archival image storage | digcol | 164 images; 6560 MB |
Derivative item storage | digcol | 164 images; 1640 MB |
TOTAL SPACE NEEDED | 8200 MB / 8.2 GB |
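The image counts and storage figures for this project can be recomputed from the scope numbers. In the sketch below the per-image sizes (40 MB archival, 10 MB derivative) are not stated anywhere in the plan; they are inferred by dividing the storage rows by 164 images, and the function name is hypothetical.

```python
PAGES, MARKS, WINDOWS = 16, 82, 66

# Inferred from the storage table: 6560 MB / 164 and 1640 MB / 164.
ARCHIVAL_MB_PER_IMAGE = 40
DERIVATIVE_MB_PER_IMAGE = 10


def storage_plan(pages=PAGES, marks=MARKS, windows=WINDOWS):
    """One archival TIFF plus one JPG derivative per page, mark, and window."""
    archival_images = pages + marks + windows
    derivative_images = archival_images
    archival_mb = archival_images * ARCHIVAL_MB_PER_IMAGE
    derivative_mb = derivative_images * DERIVATIVE_MB_PER_IMAGE
    total_mb = archival_mb + derivative_mb
    return {
        "total_images": archival_images + derivative_images,
        "archival_mb": archival_mb,
        "derivative_mb": derivative_mb,
        "total_mb": total_mb,
        "total_gb": total_mb / 1000,  # decimal gigabytes
    }


plan = storage_plan()
print(plan["total_images"], plan["total_mb"], plan["total_gb"])  # 328 8200 8.2
```

Service TIFFs are not included in this total because the storage table lists only archival and derivative images.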
Software
Image capture: | Scanners and cameras to Photoshop |
Metadata capture and storage: | Islandora |
Final product display: | Islandora |
Scanning specifications
We will scan at 400ppi, 3000px for largest dimension. Individual marks at 1200ppi.
File Naming Convention
Formula
For book:
- Prefix: pmarks
- ID: book
- ID part: page number (left pad 3 digits)
- Delimiter: underscore
Example:
Page 5: pmarks_book_005
- Archival file: pmarks_book_005_a.tif
- Service file: pmarks_book_005_s.tif
- Derivative: pmarks_book_005.jpg
For extracted images per page:
- Prefix: pmarks
- ID: book
- ID part: page number (left pad 3 digits)
- ID part: wing (e.g., “West Wing 4th” = ww4)
- ID part: image number in sequence (left pad 3 digits)
- Delimiter: underscore
Example:
Page 5, John Besson 1923 mark:
pmarks_book_005_ww4_001
- Archival file: pmarks_book_005_ww4_001_a.tif
- Service file: pmarks_book_005_ww4_001_s.tif
- Derivative: pmarks_book_005_ww4_001.jpg
For windows:
- Prefix: pmarks
- ID: photo
- ID part: wing (e.g., “West Wing 4th” = ww4)
- ID part: image number in sequence (left pad 3 digits)
- Delimiter: underscore
Example:
John Besson 1923 mark: pmarks_photo_ww4_001
- Archival file: pmarks_photo_ww4_001_a.tif
- Service file: pmarks_photo_ww4_001_s.tif
- Derivative: pmarks_photo_ww4_001.jpg
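The three formulas above share the same parts, so they can be sketched as a few small helpers. The function names are hypothetical; the wing code is passed in as already abbreviated (for example "ww4"), since the plan gives only one example of the abbreviation.

```python
def book_page(page):
    """Base name for a full page scan of the book."""
    return f"pmarks_book_{page:03d}"


def extracted_mark(page, wing, sequence):
    """Base name for a single mark cropped from a book page."""
    return f"pmarks_book_{page:03d}_{wing}_{sequence:03d}"


def window_photo(wing, sequence):
    """Base name for a photograph of a window in situ."""
    return f"pmarks_photo_{wing}_{sequence:03d}"


def file_set(base):
    """Archival, service, and derivative file names for one base name."""
    return {
        "archival": f"{base}_a.tif",
        "service": f"{base}_s.tif",
        "derivative": f"{base}.jpg",
    }


print(file_set(book_page(5))["archival"])                  # pmarks_book_005_a.tif
print(file_set(extracted_mark(5, "ww4", 1))["service"])    # pmarks_book_005_ww4_001_s.tif
print(file_set(window_photo("ww4", 1))["derivative"])      # pmarks_photo_ww4_001.jpg
```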
Bidloo digitization
Proposal to digitize Vassar’s millionth book, Bidloo’s Anatomia. After careful consideration, we realize that we don’t have the equipment in-house to digitize such a large volume, and we’ve asked for estimates from the Northeast Document Conservation Center (NEDCC) for digitization.
Status: approved, estimate received. Digitization will begin in the summer.
Notes:
Functionality needed:
- Zoomable images (400ppi, 48-bit archival TIFFs, jp2 generated)
- Searchable text
- Keep color bars on service copies?
- Essays from faculty and librarians about importance of work?
Stakeholders:
- Susan Kuretsky, Art History
- Libraries
Salmon-Underhill Digital Exhibit
Instructions
Fill out the About the Project information below, and then use the Worksheet for Functional Specifications during consultation with stakeholders to help determine the software and steps used. The project track determination may change over time.
About the Project
Working name: | Salmon-Underhill Digital Exhibit |
Sponsors: | Gretchen Lieb |
Duration: | 3 weeks |
Nature: [Text; image; text+image; GIS; audio/video; other] | Text + Image |
Project track: | Track 2 |
Date prepared: | February 1, 2012 |
Background / Purpose
The purpose of the Salmon-Underhill Digital Exhibit is to provide an Omeka- and CONTENTdm-ready set of images, metadata, and narratives to contribute to the Women’s History Month exhibit sponsored by HRVH.
Scope
Phases of project | One phase only |
Number of items to be digitized | ~ 30 |
Total number of images (assumption: one JPG derivative per each archival image created) | ~ 150 |
Total number of records | |
Special considerations | Some items may be fragile; letters may have bleed-through from recto to verso. |
Location of Physical Items
Unit | Location |
Letters | Special Collections |
Pictures | Special Collections |
Caption/text | Thumb drive |
Hardware/Storage
System type | System | Space required |
Archival image storage | Artfiles server | |
Derivative item storage | Omeka, CONTENTdm | |
TOTAL SPACE NEEDED |
Software
Image capture: | VRL scanning; Archival TIFF, service TIFF |
Metadata capture and storage: | Excel spreadsheet; already-written captions |
Final product display: | Omeka (HRVH); CONTENTdm (VCL and HRVH) |
File Naming Convention
Formula
- Prefix: salmon
- ID: box and folder #, delimited by hyphen
- ID: item number, left-padded with zeroes to 3 characters
- Page sequence: left-padded, 3 characters
- Delimiter: underscore
Example: salmon_46-2_001_001.tif
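The formula can be sketched as a one-line helper. The function name is hypothetical, and reading the middle three-character ID as an item number within the folder is an assumption based on the example.

```python
def salmon_name(box, folder, item, page, extension="tif"):
    """Build a file name such as salmon_46-2_001_001.tif."""
    # Box and folder are joined by a hyphen; all other parts by underscores.
    return f"salmon_{box}-{folder}_{item:03d}_{page:03d}.{extension}"


print(salmon_name(46, 2, 1, 1))  # salmon_46-2_001_001.tif
```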
Turn-of-the-Century Sheet Music (Music Library)
N.B.: this project became the Longfellow sheet music project
About the Project
Working name: | Sheet Music |
Sponsors: | Sarah Canino, Sabrina Pape |
Duration: | TBD |
Nature: [Text; image; text+image; GIS; audio/video; other] | Images, text, possibly audio (see “specialized software”) |
Project track: | |
Date prepared: | November 10, 2011 |
Project status: | Proposed |
Background / Purpose
From Sarah Canino’s proposal:
This collection includes popular sheet music from the mid-1800s to mid-1900s and includes about 2,000 items of about 3-5 pages each. Many have the “decorative” title pages and many have been listed in a FileMaker Pro database.
This collection could be a good choice because it is of interest not only to scholars focusing on music, but also to those interested in visual images of the period and textual representations of inventions (telephone, airplane), events (elections, wars, fairs, etc.) and depictions of race, gender, and ethnicity (in particular African American, Native American, women, Jews). For example, Peter Antelyes drew upon these for shared imagery and text depictions of American Indians and Jews. Even though our collection is relatively small, we may have unique items.
Other schools have done quite a bit with their collections and LC has also included sheet music in its American memory project:
See http://library.duke.edu/music/sheetmusic/collections.html for a selective list.
Other thoughts (from Joanna):
If we are able to get high-quality TIFFs in this process, we could try to use software that uses Optical Music Recognition (OMR) to make a sheet music “transcript” — see http://journal.code4lib.org/articles/84.
Scope
Phases of project (based on item temporal coverage) | Phase 1: Digitize pre-1900 items; Phase 2: Digitize 1900-present, checking for copyright issues; Phase 3: Explore OMR |
Number of items to be digitized | Approx. 2,000 |
Total number of images (assumption: one JPG derivative per each archival image created) | Approx. 3-5 pages per item, for totals of 6,000-10,000 images |
Total number of records | |
Special considerations | Estimates for space needed will vary widely depending on size of items, color depth for certain pages versus all-text pages (if any). |
Location of Physical Items
Units | Location |
Music Library | |
Hardware/Storage
System type | System | Space required |
Archival image storage | ||
Derivative item storage | ||
TOTAL SPACE NEEDED |
Software
Image capture: | TBD |
Metadata capture and storage: | FileMaker database already created, unsure depth of metadata |
Final product display: |
File Naming Convention
Formula
- Prefix: msheet
- ID: 4-digit, string pad left with zeroes, based on FMP item primary key
- ID part: 3-digit based, string pad left with zeroes, based on order of pages
- Delimiter: underscore
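To show how the formula plays out for a whole item, the sketch below lists the base names for every page of one piece of sheet music. The function name and the sample key are hypothetical, and no file extension is added because the formula does not specify one.

```python
def msheet_page_names(item_key, page_count):
    """Base names for all pages of one item, keyed by its FileMaker primary key."""
    if not 0 <= item_key <= 9999:
        raise ValueError("item key must fit in four digits")
    if not 1 <= page_count <= 999:
        raise ValueError("page count must fit in three digits")
    # Pages are numbered from 1 in their physical order.
    return [f"msheet_{item_key:04d}_{page:03d}" for page in range(1, page_count + 1)]


print(msheet_page_names(42, 3))
# ['msheet_0042_001', 'msheet_0042_002', 'msheet_0042_003']
```

A four-digit ID leaves room for the roughly 2,000 items in the collection, provided the FileMaker primary keys themselves stay below 10,000.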
Early Images of Vassar
Currently resides at: http://libweb.vassar.edu/earlyimages/
Needs migration to new home.