Catholic Portal look & feel
Thanks to the good work done by Eric Frierson of St. Edward's University, the "sandbox" of the "Catholic Portal" now sports the look & feel of our public view:
Search the Catholic Portal
CRRA Update NOVEMBER 2010
This posting outlines a possible workflow for getting digitized versions of Notre Dame's Catholic pamphlets into the "Catholic Portal".
The University of Notre Dame owns a significant number of Catholic pamphlets. These materials have been cataloged and denoted as destined for the "Portal" in their MARC records with the letters "CRRA" in field 590$u.
This posting documents how I wrote and edited a couple of VUFind record drivers and Smarty templates for the "Portal" of the Catholic Research Resources Alliance. In writing this posting I hope to support any developer coming behind me as well as inform the wider open source community on how VUFind works.
The Problem
This is the quickest of blog postings outlining how I am initially providing a text mining interface to digitized Catholic pamphlets.
Jean McManus used a scanner to create PDF versions of a few Catholic pamphlets. Along the way, she also had the software do a bit of OCR. She then gave the PDF documents to me with filenames matching MARC 001 fields.
CRRA Update OCTOBER 2010
New Member Highlights
The posting outlines how I have: 1) mirrored metadata and full text content from the Internet Archive, 2) made the mirrored content accessible through VUFind, and 3) implemented a rudimentary text mining interface against the mirror.
The "Catholic Portal" is intended to be a research tool centered around "rare, unique, and uncommon" materials of a Catholic nature. Many of these sorts of things are older as opposed to newer, and therefore, many of these things are out of copyright. Projects such as Google Books and the Open Content Alliance specialize in the mass digitization of out of copyright materials. By extension we can hope some of the things apropos to the Portal have been digitized by one or more of these projects.
This posting outlines how the names & addresses of the "Catholic Portal" are made available. The purpose of this posting is mostly documentation. Documentation for myself, since I always forget. And documentation so somebody else can do the work after I win the lottery and move to the beach to drink cocktails with umbrellas in them.
Here goes:
Today we had a CRRA Digital Access Committee (DAC) meeting via the telephone. Attendees included:
I did a bit of "Portal" show & tell demonstrating the work done to date on indexing EAD files. (See the previous blog posting.) We then discussed ways the indexing/display could be improved. Suggestions included:
This posting outlines how I am currently indexing MARC and EAD files in VUFind with Solr for the CRRA. (Boy, there are a lot of acronyms in that sentence!)
The Catholic Research Resources Alliance (CRRA) is a member-driven organization with the purpose of making available "rare, unique, and uncommon" research materials for Catholic scholarship. Presently the membership is primarily made up of libraries and archives who pool together their metadata records, have them indexed, and provide access to the index. My responsibility is to build and maintain the technical infrastructure supporting this endeavor.
I have made significant progress in the process of harvesting EAD files and preparing them for ingestion into the "Catholic Portal". This posting outlines the successes.
Assuming Catholic Research Resources Alliance members place their EAD files in an HTTP-accessible directory, and those files have a .xml extension, then the following Perl scripts enable me to harvest and prepare them for indexing:
This is the briefest of travelogues reporting on a meeting about EAD files at Marquette University for the Catholic Research Resources Alliance on September 20, 2010.
Today I indexed some of the metadata I extracted yesterday using a script called index-ead.pl. Of all the scripts I've written so far, this one is the most straightforward. Read the locally-developed XML file. Extract the unique identifier, title, and date. Associate each with VUFind/Solr fields. Commit.
You can (temporarily) see the fruits of these labors because all of the records have been associated with the Eric Lease Morgan Foo Bar Library. The result is a list of container-level records with very little additional information.
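The read, extract, associate steps described above can be sketched roughly as follows. This is only an illustration in Python, not the index-ead.pl script itself. The layout of the locally-developed XML file and the names on both sides of FIELD_MAP are assumptions made up for the example. The sketch builds a Solr XML "add" message; posting it to Solr's update handler and then sending a commit is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the locally-developed XML file; the real
# layout is not shown in the posting.
LOCAL_XML = """<records>
  <record>
    <id>abc-0002</id>
    <title>Letters</title>
    <date>1890-1899</date>
  </record>
</records>"""

# Assumed mapping from local element names to Solr field names.
# The Solr field names here are placeholders, not a confirmed schema.
FIELD_MAP = {"id": "id", "title": "title", "date": "publishDate"}


def build_add_message(local_xml):
    """Return a Solr XML <add> message with one <doc> per <record>."""
    add = ET.Element("add")
    for record in ET.fromstring(local_xml).iter("record"):
        doc = ET.SubElement(add, "doc")
        for source, target in FIELD_MAP.items():
            value = record.findtext(source)
            if value:
                # Each extracted value becomes <field name="...">value</field>.
                field = ET.SubElement(doc, "field", name=target)
                field.text = value.strip()
    return ET.tostring(add, encoding="unicode")


message = build_add_message(LOCAL_XML)
print(message)
```

Keeping the mapping in one dictionary means a change of field names touches a single line, which may be handy while the index is still being designed.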
CRRA Update SEPTEMBER 2010
In this update …
This posting outlines how I plan to prepare EAD files for indexing with Solr, the underlying indexing technology of VUFind.
I am aggregating sets of EAD files from Catholic Research Resources Alliance members. I am expected to index these files at the most granular level possible, meaning at the did level. In order to satisfy both human and computer requirements, each indexed record needs at least a unique identifier, a human-readable descriptor, and a location code. The unique identifier can be gotten from the unitid element. The human-readable descriptor can come from the unittitle. The location code can be inferred from the url attribute of the eadid element.
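As a rough sketch of the extraction described above, the following Python walks every did element and collects the three values. It is an illustration only, not the author's code. The sample EAD document is invented, the element names are assumed to carry no XML namespace, and the url attribute is kept as-is because the posting does not say how a location code is derived from it.

```python
import xml.etree.ElementTree as ET

# Invented sample; element names are assumed to be un-namespaced.
SAMPLE_EAD = """<ead>
  <eadheader>
    <eadid url="http://example.org/findingaids/abc.xml">abc</eadid>
  </eadheader>
  <archdesc>
    <did>
      <unitid>abc-0001</unitid>
      <unittitle>Parish correspondence</unittitle>
    </did>
    <dsc>
      <c01>
        <did>
          <unitid>abc-0002</unitid>
          <unittitle>Letters, 1890-1899</unittitle>
        </did>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def extract_records(ead_xml):
    """Return one dict per did element that has both unitid and unittitle."""
    root = ET.fromstring(ead_xml)
    eadid = root.find(".//eadid")
    # The raw url attribute stands in for the location code (assumption).
    location = eadid.get("url") if eadid is not None else None
    records = []
    for did in root.iter("did"):
        unitid = did.findtext("unitid")
        unittitle = did.findtext("unittitle")
        if unitid is None or unittitle is None:
            continue  # skip containers that cannot be indexed meaningfully
        records.append({
            "id": unitid.strip(),
            "title": unittitle.strip(),
            "location": location,
        })
    return records


records = extract_records(SAMPLE_EAD)
for record in records:
    print(record)
```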
This posting outlines how I believe I will add unitid elements to did elements of EAD files.
As the CRRA matures, I expect a greater amount of the metadata ingested into the "portal" will come from EAD files. In order to index EAD files meaningfully, I need to extract unique identifiers from each container-level element, a human-readable description of the container, and a location code. The identifier and human-readable description can easily come from unitid and unittitle elements of did elements.
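One way to picture adding unitid elements to did elements that lack them is sketched below in Python. It is not the approach the posting settles on, only an illustration. The identifier scheme (a caller-supplied prefix plus the did element's position in the file) is an assumption, and nothing here guards against a generated value colliding with an identifier that already exists.

```python
import xml.etree.ElementTree as ET

# Invented sample: the first did has a unitid, the second does not.
SAMPLE_EAD = (
    "<ead><archdesc>"
    "<did><unitid>abc-0001</unitid><unittitle>Papers</unittitle></did>"
    "<dsc><c01><did><unittitle>Letters</unittitle></did></c01></dsc>"
    "</archdesc></ead>"
)


def add_unitids(ead_xml, prefix):
    """Insert a unitid into each did that lacks one; return (xml, count added)."""
    root = ET.fromstring(ead_xml)
    added = 0
    # Materialize the list first so the tree is not modified while iterating.
    for count, did in enumerate(list(root.iter("did")), start=1):
        if did.find("unitid") is not None:
            continue  # leave existing identifiers untouched
        unitid = ET.Element("unitid")
        unitid.text = f"{prefix}-{count:04d}"  # hypothetical numbering scheme
        did.insert(0, unitid)
        added += 1
    return ET.tostring(root, encoding="unicode"), added


updated_xml, added = add_unitids(SAMPLE_EAD, "abc")
print(added)
print(updated_xml)
```

Running the function a second time on its own output adds nothing, so the step can be repeated safely each time a file is re-harvested.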
VUFind is the technical backbone of the "Catholic Portal", and this posting documents my experiences at the VuFind 2.0 Conference held at the Villanova Conference Center on September 15 & 16, 2010. In short, it provided an opportunity for the community to share successes, challenges, and visions for the future.
We invite you to attend the CRRA reunion and discussions in San Diego on Thursday afternoon, January 6, 2011. We are scheduling this meeting before the ALA Midwinter Meeting begins on Friday in hopes that many of you who are attending the ALA meetings will be able to join in the CRRA discussions as well.
At this time, we are putting together what promises to be a set of lively and informative discussions. This will be an opportunity to talk about CRRA activities taking place at your library, to discuss progress to date on the 2010/11 goals in the strategic plan, and to explore our readiness to promote the Catholic portal to librarians and scholars. VuFind 1.0 will be very near to being ready for implementation and this will be an opportunity to explore its functionality. Also, we will take a look at how the contents on the portal are growing particularly in regard to adding rare, unique and uncommon archival collections and other materials. The outlines of the proposal to be submitted to the NEH Challenge Grant will be ready for discussion. And, we want to hear from everyone – new and continuing members – how things are going at your library. Very importantly, this is an occasion to network and socialize with your CRRA colleagues.
An inaugural VUFind "Midwest" User's Group Meeting was held Friday, September 3, and this posting outlines my perceptions of what happened there.
(The following is the current collection policy for the Catholic Portal.)
Collection Policy Statement for the Catholic Portal