Monday, December 5, 2011

The Organization of Information

Chapter 11: Systems for Categorization


Categorization is all around us. We see it in the grocery store, at home and especially in schools, from Pre-Kindergarten to the college level. People categorize out of a need for order, to make things easier to find and to make sense of the world around them.  In libraries, children's picture books have a designated place that is different from the books that help you learn a foreign language. Furthermore, items within those groups are sorted into subgroups as well.

One of the kings of categorization was Melvil Dewey.  In 1876, Dewey devised a way of organizing books by separating knowledge into 10 main classes, each one broken down into 10 divisions and each division into 10 sections that are further divided, all corresponding to variations of a particular subject. With this appropriately named Dewey Decimal Classification (DDC) system, when you enter my library looking for books on dogs, you will be led to the 600s. This is the technology (applied sciences) class, which includes farming, pets, machines, etc. Books on dogs are found in the 636s, with different kinds of dog books being assigned numbers that combine whole numbers and decimals.
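The way a Dewey number narrows from a broad class down to a section can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind the hierarchy; the function name `ddc_hierarchy` is my own invention and is not part of any library software.

```python
from decimal import Decimal


def ddc_hierarchy(call_number: str) -> dict:
    """Split a Dewey number into its main class, division and section."""
    # Drop the decimal part: "636.7" becomes 636.
    whole = int(Decimal(call_number))
    return {
        "main_class": f"{whole // 100 * 100:03d}",  # hundreds, e.g. 600
        "division": f"{whole // 10 * 10:03d}",      # tens, e.g. 630
        "section": f"{whole:03d}",                  # whole number, e.g. 636
        "full_number": call_number,                 # decimals refine the section
    }


print(ddc_hierarchy("636.7"))
```

Each level is simply the number rounded down to the nearest hundred, ten or one, which is why a shelf browse moves from general to specific as the digits fill in.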

For more information on the DDC and the Library of Congress Classification (LCC), an enumerative system of categorization that provides a guide to subject groupings and subclasses of books that are actually in a library, please visit: http://www.oclc.org/dewey/resources/summaries/#nb and http://www.loc.gov/catdir/cpso/lcco/, respectively.

The Organization of Information

Chapter 10:  Systems for Vocabulary Control

Bass. Bass. Bass.  A fish? A guitar? A low male singing voice?  If you were in a music store, bass (which rhymes with base) numbers 2 and 3 would pique your interest.  Planning a weekend with your friends at the lake might leave you more interested in the first bass, which rhymes with class.  Searching online would definitely present challenges if the pronunciation and background knowledge of the words were not known.  That is why there is such a dire need for systems of vocabulary control.

A controlled vocabulary is a database of terms in which all of the words and phrases that represent a concept are brought together.  Most often, one word or phrase is designated as the preferred term for retrieving the desired information. Nevertheless, there are challenges to creating such a vocabulary.  Among other concerns, issues of homographs (Polish, a nationality, vs. polish, to make smooth or shiny), homophones (sea, the ocean, vs. see, to visualize) and abbreviations and acronyms (Ph.D., Doctor of Philosophy, and NASA, National Aeronautics and Space Administration) arise. To make searching easier and more efficient, online subject heading lists (Library of Congress Subject Headings, Medical Subject Headings), thesauri (Thesaurus of ERIC Descriptors and Art & Architecture Thesaurus) and ontologies (Unified Medical Language System and the Semantic Web) are readily available.
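To make the idea concrete, here is a minimal Python sketch of a controlled vocabulary. The tiny term lists and the qualifiers in parentheses are made up for illustration and are not taken from any published subject heading list.

```python
# Hypothetical mini-vocabulary: every entry term points to one preferred term.
PREFERRED_TERM = {
    "nasa": "National Aeronautics and Space Administration",
    "national aeronautics and space administration":
        "National Aeronautics and Space Administration",
    "ph.d.": "Doctor of Philosophy",
    "doctor of philosophy": "Doctor of Philosophy",
}

# Homographs share a spelling, so each meaning gets a qualifier (invented here).
HOMOGRAPHS = {
    "bass": ["Bass (Fish)", "Bass (Guitar)", "Bass (Voice)"],
}


def lookup(query: str) -> list[str]:
    """Return the preferred term(s) a searcher should use for a query."""
    key = query.strip().lower()
    if key in HOMOGRAPHS:
        # Ambiguous word: offer every qualified meaning and let the user choose.
        return HOMOGRAPHS[key]
    if key in PREFERRED_TERM:
        return [PREFERRED_TERM[key]]
    return []  # term is not in the vocabulary


print(lookup("NASA"))
print(lookup("bass"))
```

A search for the acronym and a search for the spelled-out name land on the same preferred term, while the homograph returns a short list of choices instead of a mixed pile of results.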

Sunday, December 4, 2011

The Organization of Information

Chapter 9:  Subject Analysis

What is this? What is it for? What is it about? These are some of the questions asked about information resources in order to assign the correct metadata and make them available for use.  This process is called subject analysis, and it involves conceptual analysis, that is, carefully examining an item to determine the answers to questions such as those above. Once the "aboutness" is determined, that information is used to apply the proper terminology to the item. Dewey Decimal Classification, Library of Congress Subject Headings and Sears List of Subject Headings are a few of the authoritative sources consulted when seeking to assign appropriate terms.

The information that Taylor and Joudrey provide in The Organization of Information (2009) on how to examine an item is invaluable. I purchase books from online vendors like Scholastic.com or Amazon.com, so I have to decide where to house them in the school library. Knowing how to think about the physical item (cover/illustrations, size), skimming the text for key words and phrases or subjects/topics, and reading and understanding the title (for example, finding out that a book called Yummy is about a troubled boy who likes eating candy but not about junk food or candy) can make a big difference in the organization of materials in the library.

The Organization of Information

Chapter 8  Metadata:  Access and Authority Control

At one time or another, almost everyone has had difficulty finding a book, an article or other source of information on an online catalog.  After typing in numerous words or phrases, including their synonyms or a rearranged version of the same words, one can still be left without any resources. There are countless ways people try to search for what they want because no two people think exactly alike. That is why there is a great need for authority control.


Authority control is the result of the process of maintaining consistency of access points, the words or phrases used to obtain information from an organized system.  Among other things, this enables users to identify the creator of a particular work using their own vocabulary, to collocate (bring together) related information resources and to decide if the information provided by the search is what they are looking for. Take note of the following real-life situation. Look for what happened when different versions of the authors' names were entered and how resources were collocated.

As part of our schoolwide reading program, students are encouraged to read fairy tales and folklore throughout the month of December. (Last month's genre was Historical Fiction/Nonfiction.)  From Pre-K to 6th grade, everyone seems to love Princess Furball, a fairy tale based on the Brothers Grimm's Thousandfurs. When I searched the Library of Congress database for the published works by the siblings, I found several hundred pages when I used "Grimm Brothers" and "Brothers Grimm". "Grimm" alone retrieved over 3500 pages of records, the first several of them written in German. Princess Furball was listed three times, once as a video and twice as a book. Thousandfurs was mentioned in a book titled Rare Treasures from Grimm.
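The collocation that authority control makes possible can be sketched in Python. The heading "Grimm, Jacob and Wilhelm" and the sample records below are invented for illustration; they are not copied from the Library of Congress authority file.

```python
from collections import defaultdict

# Hypothetical authority file: each variant form points to one authorized heading.
AUTHORITY_FILE = {
    "grimm brothers": "Grimm, Jacob and Wilhelm",
    "brothers grimm": "Grimm, Jacob and Wilhelm",
    "grimm, jacob and wilhelm": "Grimm, Jacob and Wilhelm",
}


def authorized_heading(name: str) -> str:
    """Return the authorized form of a name, or the name itself if unknown."""
    cleaned = name.strip()
    return AUTHORITY_FILE.get(cleaned.lower(), cleaned)


def collocate(records):
    """Group (title, name_as_entered) pairs under their authorized heading."""
    grouped = defaultdict(list)
    for title, name in records:
        grouped[authorized_heading(name)].append(title)
    return dict(grouped)


# Made-up records in which the same creators were entered three different ways.
sample = [
    ("Sample record A", "Brothers Grimm"),
    ("Sample record B", "Grimm Brothers"),
    ("Sample record C", "grimm, jacob and wilhelm"),
]
print(collocate(sample))
```

All three records end up under one heading, which is the behavior a searcher hopes for no matter which version of the name comes to mind first.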

The Organization of Information

Chapter 7 Metadata:  Description

Of her beloved Romeo Montague, Shakespeare's Juliet Capulet ponders:

What's in a name? that which we call a rose
By any other name would smell as sweet

The young maiden is in love with Romeo. She is convinced that if Romeo were not a Montague, her parents and other relatives would be unable to deny Romeo's sweetness, too.

In contrast, when creating metadata, one must be very concerned with names. There are particular descriptors for particular items. When creating metadata--data about data--for a resource,
1.  provide a description of the resource, being sure to include any pertinent information that will aid in managing and preserving it,
2.  provide ways to access the description, and
3.  encode the description of the information resource (book, CD, DVD, etc.). Encoding will enable each part of the record to be displayed and searched according to the wishes of whoever designs the way in which the resource will be displayed.
These surrogate records, which describe the title, creator, subject, etc., of an information resource, must be exact so the desired information can be retrieved by each query.

The International Standard Bibliographic Description (ISBD) is a schema, or set of metadata elements set aside for a particular type of resource, used in metadata description.  The prescribed punctuation (commas, colons, semicolons and more) in a record precedes each element and signals what kind of data follows. Also, since each of the eight areas of the ISBD contains more than one element, the order of the data is prescribed. Area 1 contains the title and statement of responsibility whereas Area 8 has information that identifies the resource and its terms of availability. The ISBD, DACS (Describing Archives:  A Content Standard) and VRA Core (from the Visual Resources Association) are just three of many sets of standards by which books, works of art, cultural objects, archival documents and more are described in a consistent manner, one that is oftentimes dependent upon the community for which the resources were created.
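Here is a small Python sketch of how prescribed punctuation works. It covers only two areas and a handful of marks (space-slash-space before the statement of responsibility, space-colon-space before the publisher, a comma before the date and ". -- " between areas); the real ISBD has many more rules, and the function names are my own.

```python
def isbd_area1(title: str, responsibility: str) -> str:
    """Area 1: title proper, then ' / ' before the statement of responsibility."""
    return f"{title} / {responsibility}"


def isbd_area4(place: str, publisher: str, date: str) -> str:
    """Publication area (Area 4): place, ' : ' before publisher, ', ' before date."""
    return f"{place} : {publisher}, {date}"


def isbd_description(*areas: str) -> str:
    """Join areas with the point-space-dash-space separator, typed here as '. -- '."""
    return ". -- ".join(areas)


description = isbd_description(
    isbd_area1("Betty Doll", "Patricia Polacco"),
    isbd_area4("New York", "Philomel Books", "c2001"),
)
print(description)
# Betty Doll / Patricia Polacco. -- New York : Philomel Books, c2001
```

Because the punctuation is fixed, a reader (or a program) can tell which element is which even without labels or knowledge of the language of the record.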

The Organization of Information

Chapter 6:  Systems and System Design


     An information system is an organized, purposeful structure made up of interrelated and interdependent elements that directly or indirectly influence each other to achieve a goal. This type of system performs three functions: storage (to organize data), retrieval (to search for data based on queries) and display of information (related to the interface design, the part of the system design that controls the interaction between the computer and the user).  An ILS, or Integrated Library System, is a type of system that unifies and then shares information from multiple databases. It serves many purposes, including keeping track of items owned and patrons who borrow materials.
     Online Public Access Catalogs, OPACs, like the Chicago Public Schools' S.O.A.R. (Seeking Online Access to Resources), are very useful systems that allow users remote access to search for and retrieve online information. One can find out what items school libraries have, whether those items are currently available and what form they are in, e.g., e-book, magazine, hardcover book.  The information gathered by these systems is readily available wherever there is Internet access.  The arrangement of the information is of little or no significance to the user since each arrangement is created in response to queries.
     WorldCat is the world's largest online library catalog. It keeps a record of the collections of over 70,000 libraries in 170 countries and territories that participate in the global cooperative OCLC, the Online Computer Library Center.  Go to http://www.worldcat.org/ to find more than one billion items in libraries near you!

The Organization of Information

Chapter 5:  Encoding Standards

There are many ways to encode data. Unless the surrogate records of an information resource (title, creator, subject, etc.) are encoded in electronic form, they cannot be accessed on the World Wide Web.  During my two years as a school librarian, I have learned about two ways that records can be encoded--the MARC (MAchine-Readable Cataloging) format and HTML (HyperText Markup Language).


Encoding surrogate records allows individual parts of a record to be set aside and used for specific purposes. When my students search our online library catalog for their favorite authors, the catalog searches only the fields in the MARC records that are designated for author entries, either 100 or 700.  Thanks to The Library Corporation (TLC), the family-owned company that created S.O.A.R. (Seeking Online Access to Resources), the Integrated Library System, or ILS, for Chicago Public Schools, I do not have to remember which field (title, author, publisher, etc.) is associated with which number. TLC provides us with all of that information. See a few of the components of a MARC record below. Then, notice a rough example of how that information looks when I enter it into the S.O.A.R. database.


MARC

1XX  Main entry field (usually author's name)
2XX  Title and edition field
3XX  Physical description field


100 |a Polacco, Patricia.
245 |a Betty Doll / |c Patricia Polacco.
260 |a New York : |b Philomel Books, |c c2001.

S.O.A.R.

Name                         Polacco, Patricia
Title                          Betty Doll
Publisher                    Philomel Books
Date of Publication     2001
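The step from the MARC lines to the labeled form can be sketched in Python. This is a rough illustration that reads the simplified text layout shown above, not real MARC files, and it is certainly not how TLC's software works; the function names and the label mapping are my own assumptions.

```python
import re

# The simplified MARC lines from this post (tag, then |code subfields).
MARC_LINES = [
    "100 |a Polacco, Patricia.",
    "245 |a Betty Doll/|cPatricia Polacco.",
    "260 |a New York: |b Philomel Books |c c2001.",
]


def parse_marc_line(line):
    """Split one line into its tag and a {subfield code: value} dictionary."""
    tag, _, rest = line.partition(" ")
    pairs = re.findall(r"\|(\w)\s*([^|]*)", rest)
    return tag, {code: value.strip() for code, value in pairs}


def clean(value):
    """Strip the trailing punctuation that separates subfields."""
    return value.rstrip(" /:,.")


def to_display(lines):
    """Map numbered fields to the plain labels a data-entry form might show."""
    record = dict(parse_marc_line(line) for line in lines)
    return {
        "Name": clean(record["100"]["a"]),
        "Title": clean(record["245"]["a"]),
        "Publisher": clean(record["260"]["b"]),
        # Drop the leading "c" (copyright) from the date for display.
        "Date of Publication": clean(record["260"]["c"]).lstrip("c"),
    }


for label, value in to_display(MARC_LINES).items():
    print(f"{label}: {value}")
```

The tags and subfield codes are what let the program pick out the author or the publisher without having to guess from the wording.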

Another encoding method, HTML (HyperText Markup Language), brings power to the people! Almost anyone can create Web pages.  HTML defines the structure and display of documents on the Web. With this markup language, images can be displayed and documents can be linked.  I have included an excerpt from a Wikipedia article on Patricia Polacco. Compare that to the HTML version of the text.


Patricia Barber Polacco (b. July 11, 1944, Lansing, Michigan) is the author and illustrator of numerous picture books for children.  She struggled in school because she was unable to read until age 14 due to dyslexia; she found relief by expressing herself through art. Polacco endured teasing and hid her disability until a schoolteacher recognized that she could not read and began to help her. Her book Thank You, Mr. Falker is Polacco's retelling of this encounter and its outcome.


<!-- bodycontent -->
<div lang="en" dir="ltr" class="mw-content-ltr"><p><b>Patricia Barber Polacco</b> (b. July 11, 1944, <a href="/wiki/Lansing,_Michigan" title="Lansing, Michigan">Lansing, Michigan</a>) is the author and illustrator of numerous picture books for children.</p>
<p>She struggled in school because she was unable to read until age 14 due to <a href="/wiki/Dyslexia" title="Dyslexia">dyslexia</a>; she found relief by expressing herself through art. Polacco endured teasing and hid her disability until a schoolteacher recognized that she could not read and began to help her. Her book <i>Thank You, Mr. Falker</i> is Polacco's retelling of this encounter and its outcome.</p>

I do not know how to read this encoded data, but thank you, Mr. Tim Berners-Lee, for being instrumental in creating HTML so everyday people can create audible and visual web pages!
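For the curious, the tags in the excerpt above can be read by a program as well as by a browser. The following Python sketch uses the standard library's `html.parser` module to pull the links out of one sentence of that HTML; the class name `LinkCollector` is my own.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag we are currently inside
        self._text = []     # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None


snippet = (
    '<p>She struggled in school because she was unable to read until age 14 '
    'due to <a href="/wiki/Dyslexia" title="Dyslexia">dyslexia</a>; she found '
    'relief by expressing herself through art.</p>'
)
collector = LinkCollector()
collector.feed(snippet)
print(collector.links)
```

The `<a>` tag marks the word as a link and its `href` attribute says where the link goes, which is exactly the "documents can be linked" part of HTML.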

The Organization of Information

Chapter 4: Metadata

Metadata is data about data, and it plays a very significant role in the way we understand the world around us. Granted, our textbook for LIMS 5320 is a book, but it is not the same kind of book as a cookbook or Patricia Polacco's children's book, Pink and Say. By providing us with information, that is, organized data, metadata increases our understanding of objects, and it enables us to save time, as we are relieved of the burden of searching through huge amounts of information that we may not need.

One type of metadata encoding is the MARC, or Machine-Readable Cataloging, format. MARC records are the behind-the-scenes, encoded metadata that give extensive information (title, author, publication, edition, number of pages, subject, etc.) about a book. This information is not readily visible to library patrons, but librarians and catalogers can retrieve it as they include this and similar works in their collections. Below is a partial example of a MARC record:

100 1_|a Polacco, Patricia.
245 10|a Pink and Say / |c Patricia Polacco.
260 __|a New York : |b Philomel, |c 1994.


Granularity is the fineness with which data in particular fields is subdivided.  This breakdown is evident in MARC records, where the information about a particular object (a book or DVD) is more extensive than what we see on the front of a book (title, author and sometimes publisher).  Information can go from before 1XX, the creator of the object, to beyond 7XX, added entries about it.  Note the examples of low, high and higher granularity with my school's address:


Low granularity:
1.     address:  3244 W. Ainslie St. Chicago, IL 60625 USA

High granularity:

1.     street address:  3244 W. Ainslie St.
2.     city:  Chicago
3.     state and postal code: IL 60625
4.     country:  USA

Higher granularity:

1.     street number:  3244
2.     street name:  W. Ainslie St.
3.     city:  Chicago
4.     state:  IL
5.     postal code: 60625
6.     country:  USA
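
The three levels above can be sketched in Python. The class and method names are my own; the point is only that finely divided fields can always be combined into coarser ones, while going the other direction takes guesswork.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """The 'higher granularity' version: one field per element."""
    street_number: str
    street_name: str
    city: str
    state: str
    postal_code: str
    country: str

    def high(self) -> dict:
        """Combine some elements into the four-field 'high granularity' form."""
        return {
            "street address": f"{self.street_number} {self.street_name}",
            "city": self.city,
            "state and postal code": f"{self.state} {self.postal_code}",
            "country": self.country,
        }

    def low(self) -> str:
        """Collapse everything into the single 'low granularity' string."""
        return (
            f"{self.street_number} {self.street_name} {self.city}, "
            f"{self.state} {self.postal_code} {self.country}"
        )


school = Address("3244", "W. Ainslie St.", "Chicago", "IL", "60625", "USA")
print(school.low())
print(school.high())
print(school.postal_code)  # a single element is directly searchable
```

With the finest version, a search for every record in postal code 60625 is a simple field match; with the single string, the system would have to pick the number out of a longer piece of text.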