Friday, August 29, 2008

Summer Recap and Semester Preview

I didn't blog this summer, though I did continue to do project work in DLS. I was officially still on the EAD "task force", which in my case involved attending the occasional meeting to keep up on the project's progress. There wasn't as much for Joanna and me to do during this time, as the main tasks fell to John (working with the Archon database) and Raj (creating the web interface). There's a day trip planned to the University of Illinois in the next couple of weeks to meet with the Archon developers, but since my involvement at this point is minimal I likely won't go. I do plan to continue going to task force meetings this semester, just to stay informed and help with anything that might be needed.

My focus this summer was on migrating the Virtual Writing University audio files into the Iowa Digital Library. I migrated two sub-collections within the VWU digital collection: recordings from the 2005 NonfictioNow Conference and interviews by Prof. Peter Nazareth with members of the International Writing Program (around 100 files in all). The project involved reformatting the files from WAV to MP3, uploading them to CONTENTdm, making a metadata template, creating the records, and coming up with a preliminary interface design. As a former comparative literature major, I found it really interesting to listen to the recordings and research the writers involved in order to catalog the entries. More sub-collections remain to be added and the interface still needs to be fleshed out; a new set of fellows will continue that work this semester.

This semester, my last, I will be working on the eGranary Digital Library at WiderNet. While I'll miss DLS, it will be good to get some experience in a different setting. This week I met with Cliff Missen and Brent Palmer, my new mentors, to discuss project possibilities. We decided I will start out working to develop the template for a "community information platform", which will allow users of the eGranary to add their own content (for creating local directories, etc.). WiderNet is partnering with Intel, which donated a large sum of money for the project, so it looks to be an interesting opportunity. In fact, the day after our meeting Cliff and Brent left for California to meet with the Intel folks.

I spent the rest of the week familiarizing myself with the eGranary and researching Drupal, the content management system we will use to create the template. While I haven't used Drupal before, the fact that I learned Joomla! in Electronic Publishing last semester will give me a big leg up. Next week, I'll meet with Brent to learn more about the project and discuss next steps.

Saturday, April 26, 2008

Week 10-11: Troubleshooting Archon

I've been a bit remiss about keeping up on the weekly postings. This is partly because school has gotten incredibly busy, and partly because progress on the EAD project is rather slow going so I don't have a lot of regular updates to make.

As I mentioned in a previous post, we ran into trouble importing our XML files into Archon -- basically, it seems it can't be done. We were hoping this would be easily fixed, but it has turned into a significant issue (or at least, a time-consuming one). John Osborn has been diligently trying to figure out a way to work around the problem, which so far involves converting the EAD into delimited files, importing those into Archon, then exporting them as EAD (I hope I've got that right -- it's a little confusing). It's a roundabout method to accomplish what should be a straightforward task, which Archon is supposed to be able to handle. There's obviously some glitch happening, but we don't know for sure what it is.
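For anyone curious what the first step of that workaround might look like, here is a minimal sketch in Python. It is not John's actual script, which I haven't seen in detail. It assumes an EAD file without XML namespaces, and the sample data and the three-column layout (depth, container, title) are made up for illustration; whatever layout Archon's importer really expects may be different.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EAD fragment (not one of our real finding aids).
SAMPLE_EAD = """<ead><archdesc level="collection"><dsc>
<c01><did><container type="box">1</container><unittitle>Correspondence</unittitle></did>
<c02><did><container type="folder">1</container><unittitle>Letters</unittitle></did></c02>
</c01></dsc></archdesc></ead>"""


def ead_to_rows(xml_text):
    """Flatten numbered components (<c01>, <c02>, ...) into one row each."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():  # document order, parents before children
        tag = elem.tag
        if len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit():
            did = elem.find("did")
            container = did.findtext("container", default="") if did is not None else ""
            title = did.findtext("unittitle", default="") if did is not None else ""
            rows.append([str(int(tag[1:])), container.strip(), title.strip()])
    return rows


def rows_to_tsv(rows):
    """Write the rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["depth", "container", "title"])  # assumed column names
    writer.writerows(rows)
    return buf.getvalue()


rows = ead_to_rows(SAMPLE_EAD)
print(rows_to_tsv(rows))
```

The depth column keeps track of where each entry sat in the hierarchy, so the nesting could in principle be rebuilt after the import.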

In searching online, I haven't found as much documentation as I'd hoped about how others have implemented Archon. I did read a blog post written by a librarian who had the same problem with importing XML files, although I haven't found any information indicating the cause or solution. The main task for the remainder of the semester may well be trying to figure this out.

We've been having meetings with Jen, John, Sue, Linda, and a new apps person, Raj, to discuss the import issue and the search interface design. The meetings are great experience because they give Joanna and me an inside view of how project management and workflow actually happen on the job. I feel less like a student worker and more like a professional and team member, which is a good feeling.

We've also been continuing to encode finding aids for the sample set using our template. After much debate about how to handle the intellectual hierarchy of the container lists, which can be hard to discern from the finding aids, we decided to simplify it as much as possible. This means not specifying levels like series, subseries or folder, and just using numbered container tags to indicate different levels. This makes things easier for now, and the levels can always be added later.
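To make the simplified approach concrete, here is a rough sketch in Python. It assumes the numbered tags in question are EAD's numbered component elements (<c01>, <c02>, and so on), and the titles are invented rather than taken from our finding aids. Depth is carried only by the tag number, with no level attribute such as "series" or "file".

```python
import xml.etree.ElementTree as ET


def add_component(parent, depth, title):
    """Append a numbered component (<c01>, <c02>, ...) that holds just a title."""
    comp = ET.SubElement(parent, f"c{depth:02d}")  # deliberately no level attribute
    did = ET.SubElement(comp, "did")
    ET.SubElement(did, "unittitle").text = title
    return comp


# Hypothetical container list: one top-level grouping with one entry inside it.
dsc = ET.Element("dsc")
box = add_component(dsc, 1, "Correspondence")
add_component(box, 2, "Letters, 1950-1960")

xml_text = ET.tostring(dsc, encoding="unicode")
print(xml_text)
```

Adding the levels later would mean setting an attribute on each existing component rather than restructuring the file.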

With the semester coming to a close, we've had to revise our goals somewhat. We hope to have a small sample set completed and imported into Archon (somehow), along with a front-end search interface for delivery. But it will likely be more bare-bones than originally planned, given the Archon issues we're having. However, there is talk of Joanna and possibly me continuing work on the project this summer, so there will be more time to do some fine-tuning.

Monday, April 14, 2008

Week 9: Making a Template

The past couple of weeks, we've been having regular work sessions with Jen. We're each working on encoding separate finding aids, and then comparing our work to catch errors and inconsistencies and answer each other's questions. This is very helpful because when Joanna and I first started doing markup we felt kind of lost and didn't know if we were doing it right. Since Jen has more experience, having her do the coding along with us seems to be speeding up the learning process. I feel much more comfortable now, due to her help as well as the amount of practice I've had since the beginning of the semester.

Joanna and I also finally got a draft done of that template we've been planning for so long. I'm sure it will end up being revised, but it's nice to know we have some form of completed document. Our work in the coming week will likely focus on tweaking the template.

Since my last post, we also had a meeting with DLS folk to go over the search/browse interface for delivering the finding aids on Archon. Sue, Linda, and Bryan will be in charge of making our design a reality. The meeting was surprisingly short; I expected there would be the need for some negotiation of our design, but Sue and Linda didn't see any problem with implementing our original idea. It will be interesting to see what they come up with.

Speaking of Archon, we also tried to import one of our completed EAD XML files into the program. Unfortunately, Archon didn't like that and wouldn't take the file. This is problematic, as the whole point of creating a template is so the resulting customized files can be uploaded directly to Archon. Sue and John Osborn are working on resolving the issue, so we're keeping our fingers crossed. We have a meeting scheduled for next week with John to discuss the problem.

Sunday, March 30, 2008

Week 7: Fun With Photoshop

Joanna and I got really burned out on encoding. I worked on my Louise Noun finding aid on and off for weeks, and am still only halfway through it because the container lists take FOREVER to mark up by hand. Therefore, we took a break from that last week to work on other neglected tasks.

Specifically, we're focusing on designing the interface that will allow users to search and browse the finding aids. This will include simple keyword and advanced searching, as well as browsing by topic, subject, creator, collection title, collection date, and form/genre. We came up with a list of topical categories for browsing, and worked on mapping EAD elements to MARC record codes for controlled access elements.
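As a rough illustration of the mapping exercise, here is a small Python sketch that pairs a few EAD controlled access elements with the MARC subject fields they usually correspond to. This is not our actual mapping document; the element list is partial, and the function name is made up for the example.

```python
# Partial, illustrative mapping from EAD <controlaccess> children to MARC
# subject fields. The real crosswalk has more elements and more options.
EAD_TO_MARC = {
    "persname": "600",   # personal name as subject
    "famname": "600",    # family name as subject
    "corpname": "610",   # corporate name as subject
    "subject": "650",    # topical term
    "geogname": "651",   # geographic name
    "genreform": "655",  # genre/form term
}


def marc_field_for(ead_element):
    """Return the MARC field for an EAD element, or None if it is unmapped."""
    return EAD_TO_MARC.get(ead_element)


for element in ("subject", "genreform", "unitdate"):
    print(element, "->", marc_field_for(element))
```

Elements outside controlled access, such as unitdate, return None here because this sketch only covers the browse categories that come from controlled vocabulary.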

We also created a screen shot in Photoshop. It was fun to do something purely creative for a change, but we seemed to be barking up the wrong tree (we were sure Jen would be blown away by our idea to use expandable boxes rather than drop-down menus, but she politely nixed that), so we'll be going back and overhauling the whole thing. That's ok though, because it's an iterative process, as we learned in seminar last semester.

Our next goal is to complete a sample set of finding aids from the three collections (ignoring the container lists for now), and bring those along with our search page ideas to a meeting with Sue and Bryan at the end of the week. They will be responsible for the actual programming, so we'd like to give them something to work with ASAP.

Saturday, March 8, 2008

Week 6: Markup Madness

This post will be relatively short because I already wrote it once, after which it was immediately lost to the Internet ether when Blogger went down briefly (and didn't auto save my draft! Bad Blogger!). Here's a quick recap.

My main task this week was to continue encoding a sample finding aid in EAD using the oXygen text editor. Joanna and I each chose finding aids from the Iowa Women's Archives to mark up separately, after which we will compare notes and try to come up with an EAD template that can be used for all future finding aids. The next step after that will be to figure out how to plug the template into Archon.

The encoding has been a slow process, because we're using EAD samples we've found online to guide us and they're all a little different. Thus, there is a lot of revision and backtracking as we go. We've learned that EAD is pretty flexible and provides a lot of room for customization. At the same time, this can make it seem somewhat ambiguous and confusing. But I feel I'm making definite progress, and am learning a lot by getting my hands dirty with the actual encoding.

Saturday, March 1, 2008

Weeks 4 & 5: Field Trip

Oops. I forgot to post last week because all of us DLS fellows were immersed in the TEI workshop at UIUC. The two-day workshop was intensive and kind of exhausting, but it was relevant to my project and helpful for understanding the text encoding process better. We learned how to use the oXygen text editor, which will carry over to my work with EAD. And having experience with XML-based markup languages will be useful in general, so it was worth the trip.

To backtrack a bit, the week before the workshop Joanna and I had meetings with Wendy Robertson and John Osborn. Wendy discussed strategies for batch migration of legacy finding aids, and we spent most of the rest of the week playing around with that. The process we're testing involves creating tab-delimited text files of the finding aids, plugging them into an Excel spreadsheet, and figuring out how to make the various headings match up. This is easy if the finding aids have a consistent format, but not so easy otherwise. It's very time-consuming to get the formatting right, so I'm not sure this method will work for all of the finding aids. We definitely have more work to do to figure it out.
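The heading-matching step is the kind of thing a short script could help with. Here is a minimal Python sketch, assuming a tab-delimited file whose first line holds the headings. The alias table and the sample line are invented for illustration; real finding aids would need a much longer list of variants.

```python
import csv
import io

# Hypothetical variants seen in different finding aids, mapped to one label each.
HEADING_ALIASES = {
    "box": "container",
    "box no.": "container",
    "folder title": "title",
    "description": "title",
    "dates": "date",
    "date": "date",
}


def normalize_headings(tsv_text):
    """Read tab-delimited text and return rows keyed by standardized headings."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    raw_headings = next(reader)
    headings = []
    for heading in raw_headings:
        key = heading.strip().lower()
        headings.append(HEADING_ALIASES.get(key, key))  # unknown headings pass through
    return [dict(zip(headings, row)) for row in reader]


sample = "Box No.\tDescription\tDates\n1\tLetters\t1950-1960\n"
records = normalize_headings(sample)
print(records)
```

This only handles the easy case where the variation is in the heading names. Finding aids whose columns are arranged differently, or not in columns at all, would still need hand work.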

We also talked with John Osborn from ITS about streamlining the process using some form of programming script. He's working on a template script for migrating the finding aids into Archon. Honestly, I'm not clear on this method yet, but hopefully will understand it better as we work on it more.

This week, Joanna and I continued working on the Excel spreadsheets and also practiced marking up finding aids in oXygen. I'm not sure what else may be in store for next week, but will find out when we have a check-in meeting with Jen on Monday.

Saturday, February 16, 2008

Week 3: Meetings

This week consisted of many meetings. On Monday, Joanna, Jen and I met with Sue Julich to learn about Archon, an open-source program for delivering finding aids online in EAD. Sue showed us the basics of the program, and Joanna and I later spent time exploring it ourselves. We plugged in some legacy (i.e. old) finding aids to see how it worked. The nice thing about the software is that you don't have to actually write any EAD code yourself, but can copy and paste text into boxes with the appropriate headers and an EAD document is automatically created. The downside to this might be a lack of control, and that's something we'll have to evaluate as we assess the program further.

We also had meetings with David McCartney from University Archives and Janet Weaver from the Iowa Women's Archives. As with our meeting with Greg Prickman, our purpose was to learn about the current state of finding aids and the typical workflow in those units. The upshot is that all of them have their finding aids available online as static HTML pages. The format and means of processing them differ somewhat from unit to unit, but there don't seem to be any major differences beyond that.

Finally, I spent more time researching EAD to get a firmer grasp on how it is structured and what the different tags mean. I still have a ways to go on that front, as the tag library is quite large. I also took notes from the book "Arranging and Describing Archives and Manuscripts" by Kathleen Roe, to get a clearer understanding of terminology used in the archives world (fonds, series, manuscript vs. record group, etc.) that may be useful for learning EAD.

Next week, there's a data migration meeting planned with Wendy Robertson on Monday. Aside from that I plan to continue doing research, working with Archon, and playing around with EAD tags. Over the weekend, I'll be attending a TEI (Text Encoding Initiative) workshop at the University of Illinois. TEI is a markup language similar to EAD. Hopefully, learning it will prove to be very helpful for this project.