Catalog


Mostly written by Liza Loop. This page, like all pages on this wiki, is a work in progress. Please comment or edit.

( related pages: Catalog Data Entry Process, Database Team, Relational Structure, Discussing the HCLE Catalog, Catalog Subject View, Catalog Technical View, The Omeka Catalog Platform )




Introduction



The HCLE online "Catalog" has many functions to perform and, therefore, is undergoing restructuring almost continuously.

In the Spring and Summer of 2016, we conducted a Make versus Buy comparison, weighing our custom software development process against existing open source and commercial products. As was discussed at a 2016 meeting of the Online Archive of California, no single solution meets all needs. The trade-offs are cost, time to implement, and functionality. Our preliminary conclusion, based on a review of about two dozen systems, is to test a commercial cloud-based system called Collector Systems. The test should be completed by the end of Fall.

As of Spring 2015, we have a custom data entry system for capturing information (data and metadata) about the items we know of that relate to our topic. This is called the Catalog Maintenance System. We also have a Constituency Relationship Management system built on the open source CiviCRM platform. We are about to partner with the California Digital Library to host "finding aids" that make all the items in our Catalog easy to discover and access through your browser.


Our Challenge: Managing Collections, Content and Constituency

Here's the challenge as of the end of August 2014, as viewed by Liza Loop. We have a collection of physical items: books, papers, letters, videos, audio recordings, software on all media, URLs, program listings, course syllabi, etc. Most of this "stuff" is on paper. I expect to have at least 10,000 items and grow from there. In addition, we have hundreds of web links to digital items other people and organizations have posted on the web. By combining these items in many different ways we can tell the story of how computing was learned and became a tool for learning in general. We need a comprehensive catalog to help us find these items.

We have three types of information to be managed -- three C's. The physical paper needs to be scanned to create digital images readable online. Then both physical and digital items need to be cataloged. All of this stuff is the "collection" and should be described in the Collection Catalog (first C). We also need some kind of constituency relationship management software (CRM) to keep track of members, donors, potential funders, authors, staff, volunteers -- all the people and institutions that are related to a museum or library or archive. This will probably start with about 3,000 entries and grow. "Constituency" is the second C. I want to relate the people to the items in the catalog without having to enter any of the data twice. For example, the volunteer who enters a piece of software into the catalog should have a record in the CRM and an identifying field (element) in the catalog. The third C is content, specifically, the content of the web site we are building as a Virtual Museum. So our Collections Management System has to talk to our Constituency Management System and both have to work with our (Web) Content Management System. What are the best (most functional and easiest to maintain) open source tools to use for this? Simple, eh?
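The "no double entry" goal above can be sketched with a small example. This is only an illustration, not HCLE's actual schema: the table names `contact` and `catalog_item`, the column `entered_by`, and the sample rows are all hypothetical, and SQLite stands in for whatever database the teams eventually choose. The idea is that a person is stored once on the constituency side and the catalog record carries only an identifying field that points to that person.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one CRM table and one catalog table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Constituency side (hypothetical): each person is entered once.
        CREATE TABLE contact (
            contact_id   INTEGER PRIMARY KEY,
            display_name TEXT NOT NULL
        );
        -- Collection side (hypothetical): the item stores only the person's id.
        CREATE TABLE catalog_item (
            item_id    INTEGER PRIMARY KEY,
            title      TEXT NOT NULL,
            entered_by INTEGER NOT NULL REFERENCES contact(contact_id)
        );
        """
    )
    return conn


def items_with_cataloger(conn):
    """Return (title, display_name) pairs by joining catalog items to contacts."""
    rows = conn.execute(
        """
        SELECT i.title, c.display_name
        FROM catalog_item AS i
        JOIN contact AS c ON c.contact_id = i.entered_by
        ORDER BY i.item_id
        """
    )
    return rows.fetchall()


conn = build_demo_db()
conn.execute(
    "INSERT INTO contact (contact_id, display_name) VALUES (1, 'Example Volunteer')"
)
conn.execute(
    "INSERT INTO catalog_item (title, entered_by) VALUES ('Example software disk', 1)"
)
print(items_with_cataloger(conn))
```

Because the volunteer's name lives only in `contact`, correcting it there corrects it everywhere the catalog displays it. The same pattern would apply if the person records lived in CiviCRM and the catalog stored only the CRM's contact identifier.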

HCLE will have at least three teams that will concern themselves with implementing our catalog: a Database Team, a Metadata Team and a Data Entry Team. During this design phase of the museum these teams will work on getting some workable tools in use so that we can demonstrate a "Proof of Concept" for the museum. Thereafter the teams will refine and upgrade HCLE's infrastructure to keep abreast of rapid development in this field.

A Few Terms, Tools and Standards

2014 is a banner year for museums, libraries, archives and private collections to go digital. If each of us invents a different way of describing what we have to the world of web users, most of our valued items will not be found. Hence the need for a common vocabulary (or "ontology") and global standards. These indexing systems are not handed down from on high by some higher-than-human authority; they are created by groups of humans. The process comprises holding a series of meetings (who attends such meetings is a whole other issue), proposing a list of terms with descriptions, publicizing this list among potential users, trying it out over several years and eventually converging on a single "core" list with idiosyncratic additions (extensions) needed by differing communities of practice (say, automobile parts dealers and 4th grade school teachers). The builders of HCLE wish to be as compatible as possible with global standards as they emerge. The commentary in this section is aimed at exploring the major standards now being developed for describing the kind of "stuff" in our collection. To begin to see what a tough nut this is to crack, take a look at this Online Dictionary for Library and Information Science.

The task of the Metadata Team is to choose how to describe the items in various databases. The Database Team will then implement the utilities needed to create and maintain the catalog. Then the Data Entry Team will get to work filling up the database so that museum visitors can search for (and find) all the items related to HCLE's mission. Some of these items will be in the HCLE archive. Most will be physically scattered around the world.

Digital Resource Locators

As our collection grows, more of our digital items are being hosted (stored on a computer connected to the Internet) by other institutions (e.g. Stanford University Libraries Special Collections and Internet Archive) and not necessarily in HCLE's own digital repository. We don't want visitors to HCLE to be concerned with where "in the cloud" the images they encounter come from. For these items to show up on a museum visitor's screen, each one must have its own internet address. There are several competing methods for identifying online resources and HCLE is working on choosing which one to use.

Catalog Version Definitions
The catalog will go through many updates. Version definitions are arbitrary, but they demonstrate progress and also establish near-term goals.

Version 1) Alpha - basic operations but only available internally
  • 1.0
    • log in via web site encrypted string
  • 1.1
    • automatic log in - needs wordpress expert
  • 1.2
    • incorporate Svetlana's metadata work

Version 2) Beta - basic operations established, preliminary public interface available by invitation, efficiency improvements incorporated
Version 3) Public Launch - 80% functionality, public access read only
Version 4) Public Access - 95% functionality, moderated public access sufficient for crowdsourcing
Version 5) Public Exhibits - UI improvements sufficient to allow general public to create HCLE exhibits


Metadata and Ontologies (Link to HCLE Metadata discussions)

The question of how to describe different kinds of objects (items) online is being hotly debated today. Books are fairly straightforward since librarians have been exploring this issue for thousands of years. Other media, such as software, or complex content, such as a programmed teaching workbook, may require more description. HCLE needs a volunteer specialist who can advise us as we proceed down the metadata path. So far we have identified the following resources:


Metadata Agencies and Standards Committees

Some examples of metadata schemes


HCLE's 92 fields were built from the Qualified Dublin Core.
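To make the relationship between a core standard and local extensions concrete, here is a minimal sketch. It assumes only the fifteen elements of the simple Dublin Core Metadata Element Set; the qualified refinements and HCLE's actual 92 fields are not listed on this page, so they are not modeled. The function name `split_dc_fields`, the field `hcle_entered_by` and the sample record are hypothetical.

```python
# The fifteen elements of the simple Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def split_dc_fields(record):
    """Separate core Dublin Core elements from local extension fields."""
    core = {key: value for key, value in record.items() if key in DC_ELEMENTS}
    local = {key: value for key, value in record.items() if key not in DC_ELEMENTS}
    return core, local


# Hypothetical catalog record: three core elements plus one local field.
record = {
    "title": "Example BASIC program listing",
    "creator": "Example Author",
    "type": "Software",
    "hcle_entered_by": "Example Volunteer",
}

core, local = split_dc_fields(record)
print(sorted(core))
print(local)
```

A split like this shows which parts of a record other institutions could interpret without knowing anything about HCLE, and which parts are extensions specific to our community of practice.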

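ONIX (ONline Information eXchange) is a metadata standard used by publishers and book sellers to exchange information about books; it is maintained by EDItEUR.
The ONIX Best Practices recommend that only one Audience Code, the one which best fits the work, be supplied.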

X.12 832 Audience Code List (Audience Code, Description)
  • COL- College
  • JUV- Juvenile
  • PSP- Professional and Scholar
  • SCH- Elementary/High School
  • TRA- Trade
  • YA- Young Adult

ONIX Code List 28 Audience Code List (Audience Code, Description)
  • 01- General/Trade
  • 02- Children/Juvenile
  • 03- Young Adult
  • 04- Primary & Secondary/Elementary & High School
  • 05- College/Higher Education
  • 06- Professional and Scholarly
  • 07- English as a Second Language
  • 08- Adult Education
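
The two audience code lists above can be lined up with a simple lookup table. The sketch below is an assumption, not an official crosswalk: the pairs in `X12_TO_ONIX` were inferred only by matching the descriptions in the two lists on this page, and the function name `x12_to_onix` is hypothetical.

```python
# ONIX Code List 28 audience codes, as listed on this page.
ONIX_AUDIENCE = {
    "01": "General/Trade",
    "02": "Children/Juvenile",
    "03": "Young Adult",
    "04": "Primary & Secondary/Elementary & High School",
    "05": "College/Higher Education",
    "06": "Professional and Scholarly",
    "07": "English as a Second Language",
    "08": "Adult Education",
}

# Assumed crosswalk, inferred by matching descriptions (not an official mapping).
X12_TO_ONIX = {
    "COL": "05",  # College
    "JUV": "02",  # Juvenile
    "PSP": "06",  # Professional and Scholar
    "SCH": "04",  # Elementary/High School
    "TRA": "01",  # Trade
    "YA": "03",   # Young Adult
}


def x12_to_onix(code):
    """Translate an X.12 audience code to an (ONIX code, description) pair."""
    onix_code = X12_TO_ONIX[code.strip().upper()]
    return onix_code, ONIX_AUDIENCE[onix_code]


print(x12_to_onix("SCH"))
```

Note that ONIX codes 07 and 08 have no counterpart in the X.12 list shown here, so a translation in the other direction would lose information for those two audiences.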

Google Books and Metadata - see


HCLE Tools
So far I've explored MS Access, MySQL, Omeka and I want to look at CiviCRM. One of our volunteer consultants has suggested that we should think of the task as implementing CiviCRM and extending it to include the catalog. I prefer to have the catalog be a single, simple, flat table rather than a complicated relational structure. I'm collecting opinions on this from advisers who have experience in this rather than depending on my own limited knowledge. I'm a terrible programmer so I will either have to be dependent on volunteers or raise the money for paid consultants.


In addition to the catalog and the constituency relationship databases, HCLE needs a web interface. Right now this wiki is it. We are looking into developing the Museum interface in Drupal.



Earlier Notes


As the proof of concept develops, we'll provide a glimpse of the catalog - the main event for researchers where every document will be available and searchable after it has been scanned. This may not be the most heavily visited page, but it may be the most valuable both for HCLE staff and outside researchers wanting to see what's available in our collection.

Right now (Sept. 28, 2013) we are working on implementing the catalog using an open source (read: free and maintained voluntarily by a community of programmers and users) platform called Omeka. You can see our progress by clicking here or by going to loopcntr.net/hclecatalog.

We have the beginnings of a Catalog development team. Click here if you'd like to follow along with the team's conversation.

Send an email to Liza@hcle.org if you would like to join the team.

Previously (August 2013 and looking back to 2006), our catalog had been implemented in an MS Access database on Liza Loop's laptop. This is not a winning strategy for a publicly accessible catalog, so we are working to migrate it to MySQL hosted "in the cloud" as they say these days. This, of course, is not as easy as it might seem because the Access system provides the user interface and the database structure together. MySQL does not.

A new page on database structure was added on Aug 26th. You can jump directly to it by clicking here.

Stay tuned.