Open Source Software and Rhetorical Questions about the Internet's Future: DSpace, Part Dos

Now that I've let the DSpace project sit for a week and a half, my thoughts about using this program as an archival digitization content management system have distilled.

First, let me make a comment about another program that is out there. ContentDM carries an exorbitant annual licensing price, even though it is a fairly good program. In paying that price, the user receives excellent support from OCLC, the Online Computer Library Center, to run ContentDM. For archival digitization management, however, the drawback of ContentDM is that it was designed specifically for libraries. Libraries generally do not host unique materials, and their items are offered for check-out. And, most obviously, libraries mainly stock books and journals. So ContentDM is geared toward library materials, and researchers looking at ContentDM sites will use different access points through the metadata than will researchers seeking archival materials.

Dublin Core: Dublin Core is an internet standard, and a cataloger worth his/her salt will be able to create the access points necessary to bring researchers to the collection. The most important aspect of creating an archival content management system, therefore, is to create access points that state plainly where materials are. If something is in an archive, it is rare or unique, so you really need your researchers following your trail of breadcrumbs.
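To make the access-point idea a bit more concrete, here is a minimal sketch of a simple Dublin Core record built in Python. This is only an illustration, not how DSpace stores or imports metadata: the `build_dc_record` helper, the `record` wrapper element, and every value in the sample item are made up. The only parts taken from the standard are the element names and the Dublin Core namespace URI. Note how the `source` element carries the "where is it" breadcrumb.

```python
import xml.etree.ElementTree as ET

# Namespace URI for the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML record from (element, value) pairs.

    Hypothetical helper for illustration; the <record> wrapper is an
    assumption, not a DSpace import format.
    """
    root = ET.Element("record")
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Hypothetical broadside item; all values are invented for this example.
broadside = [
    ("title", "Public Notice of Land Sale"),
    ("type", "Text"),
    ("format", "image/tiff"),
    ("subject", "Broadsides"),
    ("subject", "Land sales"),
    # The breadcrumb: where the physical original actually lives.
    ("source", "Example Broadside Collection, Box 1, Folder 3"),
    ("rights", "Contact the holding archive for reproduction."),
]

record = build_dc_record(broadside)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Repeating an element (two `subject` entries here) is allowed in Dublin Core, and each one is another access point a researcher can arrive through.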

DSpace itself. I've become increasingly fond of DSpace. I wish I could post my test website here, but unfortunately, this is not possible. Perhaps I'll post a screenshot eventually. As I said earlier, I'd originally tried it on FreeBSD, and that was just a little too unwieldy for me. Gentoo proved surprisingly stress-free to run, and Tomcat on this box has been pretty stable. (I'd be curious to find out others' experience with Tomcat. Most of the discussion I've read about Tomcat relates to DSpace, and typically the errors people are having are standard administrator errors--errors that I myself ran into, too, until I realized what I was doing.)

So once everything was built, it took me about an hour to create graphics that would brand our DSpace test box and make it look like it belonged to the TTU SWCO. The file formats DSpace will display are many and varied. I posted a music collection (mp3) in our archive, as well as a broadside collection (thank you again, Dr. W). Finally, once things were in DSpace, I indexed the content so it would be searchable, and things seem to continue to be running smoothly. I've not had to reboot the system or anything.

Things DSpace is not: It is not a typical content management system. You can't post a wiki or a forum in there to let users comment on your content. (Though it would not surprise me at all if there were modules for that. It is, after all, open source.) I question the accessibility of DSpace for users with disabilities; I should run some tests on that, too, to make sure that DSpace properly tags everything for these users. DSpace has a good variety of bin files to execute the most common tasks (backup, index, create users, etc.). I haven't had any problem with those. DSpace can use either Oracle or PostgreSQL. I'm running this version off PostgreSQL because of the small sample of material and the speed at which PostgreSQL was configurable.

If the collection decides to work with DSpace, things I will have to consider/get:

-Server: Probably just a PE Red Hat
-Server room: In the process of cleaning out an old storage room. This entails everything from throwing garbage away to removing building floor and ceiling tiles to surplussing old equipment.
-Building support: Once we let people understand that digitization will happen HERE (x marks the spot), Jason, I, and a few other folks will provide training and recommendations for tactical support on the metadata creation side.

This is essential for a simple reason: no archival content management system is easy. I think some folks expect digitization to mean "scan this in," but in reality, there is a lot that goes into it, not the least of which is how to make a graphic (or other file) searchable and how to index it. DSpace (like all other digitization programs) provides us with this opportunity. It's important for everyone committed to using DSpace (whether at my office, or in planning to use DSpace elsewhere) to understand that DSpace isn't easy, and it's going to take careful planning. As I explained to one of my bosses, I can't in good conscience recommend anything without strong planning first.

And, finally, the other thing I'll be doing this week is installing Archivists' Toolkit onto the server to be used with DSpace.

Tuesday, March 10, 2009
