Summary of MDG session, 3-18-08

* Written by Jenn Riley *

The article for discussion this month was:

Yakel, Elizabeth, Seth Shaw, and Polly Reynolds. “Creating the Next Generation Archival Finding Aids.” D-Lib Magazine 13, no. 5/6 (May/June 2007). Available from http://www.dlib.org/dlib/may07/yakel/05yakel.html.

Early on, the discussion focused on the predictability (or lack thereof) of EAD files. EAD as a markup language is designed to be flexible, to allow the encoding of many different types of finding aids. This means that any two EAD-encoded finding aids may not look very much alike. Using a common controlled vocabulary across finding aids was envisioned as one way to tackle this fundamental unpredictability. The group expressed the idea that for sharing, broad subject headings are good, despite the claim of the article that these weren’t adequate. However, within the local environment, the more specific headings the article says were needed make sense.

A large part of the group’s discussion of this article worked through how better access could be provided to these materials with some reasonable level of expediency. While detailed analysis such as noting that a proverb appears within a story within a volume in the IU Folklore collection could be beneficial, it’s unlikely we can afford to be this detailed. Respondents reported that it’s often difficult to resist the urge to provide this detailed analysis, even though there is pressure to process collections quickly. One has to ask, how meaningful will the description be if I don’t go into more detail? One has to stop and think. General practice is to only pull out the “important” data, such as only some names rather than all of them. One participant noted that a recent article in the American Archivist (Fall/Winter 2007, Vol. 70, No. 2), “Archives of the People, by the People, for the People,” by Max J. Evans discusses how one might get more mileage out of an EAD-encoded finding aid. [Note from after the meeting: this same volume has another article on the Polar Bear Expedition project which might address some of the issues the group was wishing were discussed in the article we read for this week. “Interaction in Virtual Archives: The Polar Bear Expedition Digital Collections Next Generation Finding Aid,” by Magia Ghetu Krause and Elizabeth Yakel.]

Some members of the group expressed interest in studying how keyword indexing of full text could be used to help add description for archival collections (although the group realized automatically generating transcriptions of scanned handwritten documents is currently not very feasible). The CLiMB project at Columbia was noted as an example of how this technology might work. The possibility of capturing transcriptions from users was discussed.

Participants noted the potential utility of user-supplied information, as these users often have a vested interest in and knowledge of the materials.

The group wondered why the project staff was hesitant to include information from the database with data on soldiers, including birth dates, death dates, etc. The prevailing thought in the room was that if the catalog can include this sort of information, it should – that this sort of information was not fundamentally out of scope of the “catalog.”

Participants noted several features of the Polar Bear Expedition site that they believed had been implemented well, including providing coherence to a collection brought together by theme rather than by format, effective browsing (although it was noted the browse might be used more because the search feature was not very full-featured!), and the fact that the entire collection had been digitized rather than just highlights. Some drawbacks were mentioned as well, most notably the current lack of a critical mass of user comments and the lack of clear information on what brings these various collections together.

A “wish list” for more information on this project emerged, including specifics on the metadata implementation (e.g., what controlled vocabularies were used), and to what degree site features were developed in response to use cases and user studies. For example, the “visitor awareness” feature appears to be a way of getting users to talk to each other. The article didn’t describe how this feature was determined to be a priority – was it implemented in response to a defined need or just because it was interesting? Participants also wanted more information on balancing this sort of functionality with user privacy issues, while recognizing that this sort of project can open users’ minds as to what is possible, allow us to get feedback from them, and let us ask them what they want while they’re using it.

The challenges described in this article were disheartening to some participants, who felt that this project represents a best possible case, with all the material already digitized. The fact that there were still so many problems is a bit scary, as the mantra we’ve been hearing is that online materials were supposed to make this sort of thing much easier. Or are we just making these systems too complex? Flickr seems to work, and it operates at a much simpler level. To what degree does the system need to reflect the complexity of the collections and the items within them?

Summary of MDG session, 2-26-08

* Written by Jenn Riley *

The February 2008 meeting of the Metadata Discussion Group drew about 50 attendees. Thank you to everyone who continues to make this group a success.

The article for discussion this month was:

Chapman, John. “The Roles of the Metadata Librarian in a Research Library.” Library Resources & Technical Services v. 51 no. 4 (October 2007).

The discussion began by examining which job responsibilities presented in the article represented entirely new tasks, which were slight evolutions from current practice, and which seemed pretty much the same as duties of some current technical services positions. For the most part, the group felt that the four areas described (collaboration, research, education, development) were generally part of technical services responsibilities currently. Collaboration, especially with collection managers (Archive-It is an example of this at IU), was an area participants felt was already a strong part of technical services jobs, although expanding the scope to working directly with faculty might be necessary in the future.

The area of development was thought to be the most different from “traditional” technical services positions, requiring a stronger need to think about the final form of access for materials being described. Technical services staff needed to deal with this in the early days of automation, but since access hasn’t changed much since then, there needs to be more thinking in this area. Staff will increasingly need to deal with different levels and types of metadata – some web presentable, some more internally-focused. They will need to work closely with technology-intensive positions (although they do this already with MARC data). Designing new platforms and interfaces is what’s new.

The decision in this article to only look at positions within technical services may have been a practical one, but it does potentially introduce homogeneity into an inherently heterogeneous environment. Dealing with this heterogeneity is a key role of metadata staff. The MARC/AACR2 stack is well-tested; some of the newer ones aren’t. Metadata librarians will have to determine for each new set of materials which sets of standards to use. A major weak point now is that our mainstream cataloging system can only handle the one set of standards, so our users have to go to multiple places to access content. This heterogeneity makes interoperability difficult. How do we allow things to be different when they need to be but not make them different just to be different? It’s fun to make up new things, but we have to be sustainable.

A participant noted that a colleague from another university had observed a trend that in places where there is significant funding for digital library work, metadata tends to be a separate operation, outside of technical services, and in places where there’s no funding, technical services is often asked to take non-MARC metadata work on themselves. Library organizational models are so fluid, it’s no surprise there are so many different models out there for metadata librarians and digital library work.

Asking technical services to do non-MARC metadata is a huge investment – it’s asking already busy people to do more things. We also think we need higher-level salary lines for this planning work. But technical services budgets are being cut. How do we deal with this? Don’t think of it as dumping more work on folks. Think of it as adapting to the world as it changes. It’s an exciting opportunity. Think outside the box – metadata work in acquisitions perhaps.

Regardless of the reporting structure, the group felt a strong need to move to mainstream processes. We know enough about how to deal with many types of material, even with non-MARC metadata, to make it operationalized.

A participant posed an interesting hypothetical situation: tomorrow we all come to work and all jobs with “cataloging” in the title have changed to “metadata”. What would we need to make that happen? The first reaction to this proposal was that MARC is metadata so this could be true now. To expand into other types of metadata, staff would need training. More contact would be required with subject specialists to learn about needs these staff would not currently be aware of based on current standards. Staff would need a lot of support during the transition.

One function of metadata is to organize information; another is to make connections between things. This means subject specialties will be more important in the future.

People think of cataloging online resources when you say metadata. But many definitions of metadata are broader, so the word is almost useless now in many cases. A view raised in discussion was that metadata facilitates online discovery, whether the material is online or not. It’s the long-held idea of metadata as a document surrogate.

Although descriptive metadata is what’s primarily being discussed, acquisitions departments may need to deal with other types of metadata, especially rights metadata. Flexible staff that can take on digital library type activities when other duties lull are likely to be needed. Libraries will need to continue to prepare individuals for new work.

Summary of MDG session, 1-29-08

* Written by Jenn Riley *

The second meeting of the Metadata Discussion Group was a lively session, with about 50 people in attendance. We’re glad to have a new exit installed in the staff lounge so that we no longer have to count heads to keep under room capacity!

The article for discussion in the January session was:

Elings, Mary W. and Günter Waibel. “Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums.” First Monday 12, no. 3 (March 2007). http://www.firstmonday.org/issues/issue12_3/elings/index.html

The discussion began with the degree to which the article successfully described the different perspectives of the library, archives, and museum communities. Some in the group thought it was a good introduction, but felt it didn’t make clear possible differences in broad vs. narrow scope or unique vs. non-unique materials. One participant noted that these institutions all have collection management needs in common.

Overall, the group felt that choosing standards based on material type rather than institution type made a great deal of sense. Visual materials were mentioned especially as needing a different approach than textual materials, and one participant postulated that this new way of thinking has been influenced heavily by the visual resources community, where content and structure standards developed side by side and influenced each other. Many questions about how to put this approach into practice were raised, however. Are we breaking user expectations by using different models for different types of materials? Some of our current categorizations of materials are hard to separate out by type – government documents, for example, are a mixture of archival-style things, visual materials, and texts, and a complete collection of the photographic work of one person would benefit from the context provided by archival description but would also benefit from in-depth item-level indexing.

The group discussed for a time why things developed the way they have. One participant noted that the article (politely!) casts blame and says we insist on using the wrong tools because we’re used to them. There’s a reason we use the tools we have: we face financial and administrative pressures to produce more. We care about interoperability, but we can’t afford it. Implementing a major shift in how we approach things has enormous financial implications. There are no good models for institutions making this shift on a large scale, and our technological tools haven’t caught up to this new way of thinking either. We need to “expand our personal toolkits.”

The group spent a significant amount of time discussing issues of efficiency and streamlining the descriptive process. The Greene/Meissner report cited by the article was mentioned, as was RLG Programs’ recent report: Shifting Gears: Gearing Up to Get Into the Flow. The disconnect between collection- or file-level description and item-level digitization was noted, and seemed problematic. Questions of efficiency led us to discuss user-contributed metadata, especially as seen in the recent LC experiment with Flickr. Participants felt this approach could shift some burden from us to our users and increase exposure for our collections. But we still need to process collections, and do a great deal of work with them. We need to experiment – things will keep changing. Questions about the need for oversight of user-driven data were raised, with overall acknowledgment of the problem but no concrete solutions.

Participants raised some specific questions with regard to the article:

  • A version of the chart in the conclusion adding visual resources as another category was distributed at a conference last summer. Is this simply another category or does it change the argument somewhat?
  • Could the chart in the conclusion include MARC as data structure for each community?
  • Where does Dublin Core fit in all of this?

The session ended on a somewhat philosophical note, with participants commenting that the shift in thinking proposed by this article reflects changes in society in general – more collaboration is happening, and themes are emerging between groups. The social definition of knowledge is changing. The group closed by noting interesting content on the web and remembering that the “interesting stuff” is why we do all of this in the first place.

Our next session will be Tuesday, February 26. The article for discussion for that session will be distributed by February 12. Please send ideas for topics for future sessions to the MDG listserv, and feel free to use the listserv for discussion between in-person sessions.

Summary of MDG session, 11-27-07

* Written by Jenn Riley *

The first meeting of the IUB Metadata Discussion Group seems to have been an unqualified success. Although we held the session in the room with the largest seating capacity in the Wells Library, we still had to turn some people away. My apologies go out to those who wanted to attend but were unable to because of space. By our next session, a second door should be installed in the room, which will raise its legal maximum capacity. Thank you to all who attended, and who tried to.

The article for discussion this month was:

Gilliland, Anne J. “Setting the Stage.” In Introduction to Metadata: Pathways to Digital Information, ed. Murtha Baca. Online edition, version 2.1. Available from http://www.getty.edu/research/conducting_research/standards/intrometadata/setting.pdf

The group used the general principles outlined in the article to discuss the role of metadata in libraries and their technical services departments. Participants appreciated the breadth and high-level focus of the article, but expressed an interest in balancing this approach with more practical approaches in future meetings. The difficulty of describing the concept of “metadata” in any succinct way was noted by participants.

Two features of the article were brought out in discussion: the thought of metadata as something that grows and changes over time, and the fact that “lay” metadata is important in addition to “expert” metadata.

Regarding the continued accrual of metadata over the lifecycle of an object, the group discussed the potential effects of this need on copy cataloging, noted that WorldCat Local could play a part in this, and postulated that one of the roles of a technical services department could be adding to metadata records over time.

The concepts of “Lay” vs. “Expert” metadata, not surprisingly, generated a good deal of discussion. No participant voiced the sometimes-heard opinion that metadata from lay sources such as users, publishers, etc. (including user reviews, sales data, tagging of images, etc.) had no place in the library environment, although several individuals cautioned that the metadata we maintain must support effective retrieval and that more uncontrolled metadata could threaten that goal. One participant voiced an opinion that one role of libraries is to supplement lay metadata with expert metadata, to help ensure authority, a sentiment that seemed to have general agreement.

From this point, the discussions turned to the role of systems in providing services based on metadata. Participants felt that our systems needed to handle both factual, structured data like ISBNs and more fluid, organic, unstructured data like that our users can provide. It was noted that to provide high-quality services on these different types of metadata, our systems need to have *more* structure on the back end, rather than less. While the discussion didn’t delve very far into specific metadata formats, there was a general sense that the data being recorded was more important than the format in which it was stored. One participant summarized this view as “I don’t need to have MARC, I need to have the specificity of MARC.” The need for different approaches for different types of materials was raised, which led to a request for future MDG sessions to study in more depth these different approaches, the standards that emerge from them, and the communities behind them, and to discuss whether there is more these communities can do to work together. Participants also expressed interest in system design issues, allowing complex linking of records but still allowing them to make sense out of context.

Throughout the discussion, possible roles for technical services staff in the metadata environment emerged. Most ideas centered around creating and maintaining descriptive metadata, although a need was expressed for all involved in metadata creation to know about all types of metadata being created and how it is used. Possible roles for technical services staff included:

  • Recording relationships between information objects that are not possible to generate automatically. (For one initiative designed to help automatic recognition of relationships between objects, see http://www.openarchives.org/ore/)
  • Authority control, to allow more powerful discovery mechanisms
  • “Expert” metadata to supplement that from other sources
  • Describing hidden collections with no or inadequate existing descriptive metadata
  • Describing Web sites intended for archiving
  • Describing objects deposited into IU ScholarWorks
  • Targeted projects to enhance older metadata
  • Providing value-added content
  • Managing groups of records
  • Providing acquisition information to fund managers

We had a lively discussion, with many points of view raised. Our next session will be Tuesday, January 29. The article for discussion for that session will be distributed by January 15. Please send ideas for topics for future sessions to the MDG listserv, and feel free to use the listserv for discussion between in-person sessions.