Monday, March 30, 2009

This academic political controversy brought to you by DSpace

Article title, "MIT to make all faculty publications open access"

To quote, "If there were any doubt that open access publishing was setting off a bit of a power struggle, a decision made last week by the MIT faculty should put it to rest. Although most commercial academic publishers require that the authors of the works they publish sign all copyrights over to the journal, Congress recently mandated that all researchers funded by the National Institutes of Health retain the right to freely distribute their works one year after publication (several foundations have similar requirements). Since then, some publishers started fighting the trend, and a few members of Congress are reconsidering the mandate. Now, in a move that will undoubtedly redraw the battle lines, the faculty of MIT have unanimously voted to make any publications they produce open access."

and

"The faculty will have to prepare an appropriately formatted copy of their works to the provost for hosting. MIT plans to place them on its DSpace system, a content hosting system it developed with HP and distributes under a BSD license."

This is actually huge news, particularly for academic researchers AND for users and admins of DSpace. I'm hoping this will mean that a lot more funding gets kicked toward DSpace, Inc. Granted, MIT is the institution that built DSpace, but this is the core reason it was created. I'm really eager to see what the future will bring in regard to this.

Friday, March 13, 2009

This is just cool

And this link: gopher://gopher.std.com/11/The%20Online%20Book%20Initiative

This is just cool. Although it's fun to consider the potential implications of Gopher having taken over, it would be sheer speculation!

Tuesday, March 10, 2009

DSpace, Part Dos

Now that I've let the DSpace project sit for a week and a half, my thoughts about using this program as an archival digitization content management system are distilled.

First, let me make a comment about another program that is out there. CONTENTdm is a fairly good program, but it carries an exorbitant annual licensing price. In paying that price, the user receives excellent support from OCLC, the Online Computer Library Center, to run CONTENTdm. The drawback of CONTENTdm in archival digitization management, however, is that it was designed specifically for libraries; libraries do not host unique materials, AND their items are offered for check-out. And, most obviously, libraries mainly stock books and journals. So CONTENTdm is geared more toward library materials, and researchers looking at CONTENTdm sites will be using different access points through the metadata than will researchers seeking archival materials.

Dublin Core: Dublin Core is an internet standard, and a cataloger worth his/her salt will be able to create the access points necessary to bring researchers to the collection. Therefore, the most important aspect of creating an archival content management system is to create access points that state plainly where materials are. If it's in an archive, it is rare or unique, so you really need your researchers following your trail of breadcrumbs.
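To make "access points" a little more concrete, here is a minimal sketch of what a simple Dublin Core record for one archival item might look like when built in Python. The namespace URI is the standard one for the Dublin Core element set; everything else (the `build_dc_record` helper, the `record` wrapper element, and every value in the sample broadside) is made up for illustration and is not taken from my test box or from how DSpace stores metadata internally.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the 15-element Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Build an XML record from (element, value) pairs.

    Hypothetical helper: the <record> wrapper is an assumption,
    not a DSpace or Dublin Core requirement.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Hypothetical broadside; every value below is invented for illustration.
broadside = [
    ("title", "Example broadside announcing a public meeting"),
    ("creator", "Unknown printer"),
    ("subject", "Broadsides"),
    ("subject", "Public meetings"),
    ("type", "Text"),
    ("format", "image/tiff"),
    ("source", "Example Collection, Box 1, Folder 3"),
]

record = build_dc_record(broadside)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each repeated `subject` is one more breadcrumb, and the `source` element is where the record can say plainly which collection, box, and folder hold the original.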

DSpace itself. I've become increasingly fond of DSpace. I wish I could post my test website here, but unfortunately, this is not possible. Perhaps I'll post a screenshot eventually. As I had said earlier, I'd originally tried it on FreeBSD, and that was just a little too unwieldy for me. Gentoo proved surprisingly stress-free to run, and Tomcat on this box has been pretty stable. (I'd be curious to find out others' experience with Tomcat. Most of the discussion I've read about Tomcat relates to DSpace, and typically the errors people are having are standard administrator errors--errors that I myself ran into, too, until I realized what I was doing.)

So once everything was built, it took me about an hour to create graphics that would brand our DSpace test box and make it look like it belonged to the TTU SWCO. The file formats DSpace will display are many and varied. I posted a music collection (mp3) in our archive, as well as a broadside collection (thank you again, Dr. W). Finally, once things were in DSpace, I indexed the content so that it would be searchable, and things seem to continue to be running smoothly. I've not had to reboot the system or anything.

Things DSpace is not: It is not a typical content management system. You can't post a wiki or a forum in there to let users comment on your content. (Though it would not surprise me at all if there were modules for that. It is, after all, open source.) I question the accessibility of DSpace for users with disabilities; I should run some tests on that, too, to make sure that DSpace properly tags everything for these users. DSpace has a good variety of bin files to execute the most common tasks (backup, index, create users, etc.). I haven't had any problem with that. DSpace can use either Oracle or PostgreSQL. I'm running this version off PostgreSQL because of the small sample of material and the speed at which PostgreSQL was configurable.

If the collection decides to work with DSpace, things I will have to consider/get:

-Server: Probably just a PE RedHat
-Server room: In the process of cleaning out an old storage room. This entails everything from throwing garbage away to removing building floor and ceiling tiles to surplussing old equipment.
-Building support: Once we let people understand that digitization will happen HERE (x marks the spot), Jason, I, and a few other folks will provide training and recommendations for tactical support on the metadata creation side.

This is essential for a simple reason: no archival content management system is easy. I think some folks expect digitization to mean "scan this in," but in reality, there is a lot that goes into it, not the least of which is how to make a graphic (or other file) searchable and how to index it. DSpace (like all other digitization programs) gives us this opportunity. It's important for everyone committed to using DSpace (whether at my office or elsewhere) to understand that DSpace isn't easy and that it's going to take careful planning. As I explained to one of my bosses, I can't in good conscience recommend anything without strong planning first.
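For anyone wondering why "scan this in" isn't enough: a scanned image or an mp3 has no words in it for a search box to find, so the only thing that can be searched is the metadata somebody typed in. Here is a toy sketch of that idea in Python, a tiny inverted index over metadata. It is not how DSpace does its indexing; the `build_index` and `search` functions and the two sample items are hypothetical and exist only to show the principle.

```python
import re
from collections import defaultdict


def build_index(records):
    """Map each lowercase word in the metadata to the ids of items that use it."""
    index = defaultdict(set)
    for item_id, metadata in records.items():
        for value in metadata.values():
            for word in re.findall(r"[a-z0-9]+", value.lower()):
                index[word].add(item_id)
    return index


def search(index, query):
    """Return ids of items whose metadata contains every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical items; the files themselves contain no searchable text.
records = {
    "item-1": {"title": "Dance hall recording", "format": "audio/mpeg"},
    "item-2": {"title": "Broadside for a town meeting", "format": "image/tiff"},
}

index = build_index(records)
print(search(index, "broadside meeting"))
```

If nobody had typed "broadside" into the title field, that search would come back empty no matter how good the scan was, which is why the metadata planning has to come first.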

And, finally, the other thing I'll be doing this week is installing Archivists' Toolkit onto the server to be used with DSpace.

Friday, March 6, 2009

Government, Powered by Google


Okay, two links to share before I go into my diatribe:

http://news.slashdot.org/article.pl?sid=09/03/06/1326247&from=rss "America's New CIO Loves Google."

and

http://www.slideshare.net/domainlabs/building-more-transparent-effective-government-presentation "Building More Transparent and Effective Government: The Case Study of Washington D.C."

Okay, for one thing, any case study that uses the word "transparent" should automatically seem suspect. Transparency is the word "experts" use to make their audience think that all the information they need is revealed when, in reality, the reverse is the truth.

Second: I am terrified by the slide that shows the "Business/Gov't Tech Satisfaction" as being low UNTIL it has been powered by Google. When you're an "expert," and you are affiliated with a company AND have been asked by the government to research a solution about something--well, no flipping fig newton, of COURSE you're going to say that the company with which you're affiliated is the solution for fixing all the problems in the government. (And anyone who reads through this with an intelligent eye will recognize these arguments/anxieties.)

Implications: What are the implications for the American economic system of allowing a private corporation to be in charge of the digitization of America's records? I'm thinking about examples we've seen past and present of companies to which the American government (read: the American economy) has been too intimately tied. One example is Ford Motor Company, another example is Halliburton; I think of these two for different reasons, though I'd argue that each had multiple impacts.

What do we have to plan for electronic records management at a state university level, for instance, if the federal government is creating a digitization and ERM model through Google? Since I work at the archive for the TTU campus, I recognize that this is a weighty question for all archivists to consider. Should Google be allowed to plug itself into all governmental archival practices? (And, another important question: has it already?)

I'm not trying to say that Google _IS_ the evil empire, but in some ways it is. I think the Google CEOs have demonstrated a lot of foresight in their digitization business model--and of course they're getting sued, but why shouldn't they?--but I also feel a lot of anxiety for obvious reasons. Google might BE the new Ford Motor Company: it is demonstrating incredible foresight; it is providing innovative solutions to problems that are only going to get worse without a modicum of remediation (and really, A LOT worse, and a LOT more than a modicum is needed); and it demonstrates the ability to achieve good solutions.

Even still, I can't stop the internal shudder when I see that "Business/Gov't Tech User Satisfaction: Powered by Google." And really, I'm using Blogspot, which also is . . . Powered by Google.