Spoonfeeding library data to search engines

When you talk to a search engine, you need to realize that it's just a humongous baby. You can't expect it to understand complicated things. You would never try to teach language to a human baby by reading it Nietzsche, and you shouldn't expect a baby google to learn bibliographic data by feeding it MARC (or RDA or METS or MODS, or even ONIX).

When a baby says "goo-goo" to you, you don't criticize its misuse of the subjunctive. You say "goo-goo" back. When Google tells you that it wants to hear "schema.org" microdata, you don't try to tell it about the first indicator of the 856 ‡u subfield. You give it schema.org microdata, no matter how babyish that seems.

Great summary by Eric Hellman about using schema.org microdata to disclose book metadata to search engines.

It's another nail in the coffin of Dublin Core (I suspect)... pitched at more or less the same level but more detailed in parts, less so in others - a nice mix of traditional library properties and things we are more used to seeing in the web world, all wrapped up in an easy-to-embed format.
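
To give a feel for how little is involved, here is a rough Python sketch of turning a simple book record into schema.org microdata. The function name `book_to_microdata`, the dictionary keys and the placeholder record are all made up for illustration; only the `Book` type and the `name`, `author`, `isbn` and `datePublished` properties come from schema.org. Treat it as one possible approach, not a recommended implementation.

```python
from html import escape

# Hypothetical mapping from our own record keys to schema.org Book properties.
FIELD_TO_PROPERTY = (
    ("title", "name"),
    ("author", "author"),
    ("isbn", "isbn"),
    ("published", "datePublished"),
)


def book_to_microdata(book: dict) -> str:
    """Render a flat book record as an HTML fragment carrying schema.org microdata."""
    lines = ['<div itemscope itemtype="https://schema.org/Book">']
    for key, prop in FIELD_TO_PROPERTY:
        value = book.get(key)
        if value:  # skip fields the record does not have
            lines.append(f'  <span itemprop="{prop}">{escape(str(value))}</span>')
    lines.append("</div>")
    return "\n".join(lines)


# Placeholder record, not a real book.
record = {"title": "Example & Title", "author": "Example Author"}
fragment = book_to_microdata(record)
print(fragment)
```

The resulting fragment can be dropped into an existing catalogue page template; fields missing from the record are simply left out rather than emitted empty.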

Transforming the Library of Congress bibliographic framework

The Library of Congress will address these issues:

  • Determine which aspects of current metadata encoding standards should be retained and evolved into a format for the future.  We will consider MARC 21, in which billions of records are presently encoded, as well as other initiatives.
  • Experiment with Semantic Web and linked data technologies to see what benefits to the bibliographic framework they offer our community and how our current models need to be adjusted to take fuller advantage of these benefits.
  • Foster maximum re-use of library metadata in the broader Web search environment, so that end users may be exposed to more quality metadata and/or use it in innovative ways.
  • Enable users to navigate relationships among entities—such as persons, places, organizations, and concepts—to search more precisely in library catalogs and in the broader Internet.  We will explore the use of promising data models such as Functional Requirements for Bibliographic Records (FRBR) in navigating relationships, whether those are actively encoded by librarians or made discernible by the Semantic Web.
  • Explore approaches to displaying metadata beyond current MARC-based systems.
  • Identify the risks of action and inaction, including an assessment of the pace of change acceptable to the broader community: will we take incremental steps or take bolder, faster action?
  • Plan for bringing existing metadata into new bibliographic systems within the broader Library of Congress technical infrastructure—a critical consideration given the size and value of our legacy databases.

The Library of Congress’s process will be fully collaborative.  We will consult our partners and customers in the metadata community, standards experts in and out of libraries, and designers and builders of systems that make use of library metadata.  We intend to host meetings during conferences of the American Library Association, specialized library associations, and international organizations, as well as special “town hall” meetings open to the metadata community, to gather input from all interested parties.  We plan to establish an electronic discussion group for constant communication during the effort of reshaping our bibliographic framework, and we expect to host a series of invitational meetings of experts and stakeholders in 2012 and 2013.

A somewhat woolly but generally positive statement of future metadata activity at the Library of Congress. I particularly like "identify the risks of action and inaction" :-)