Metadata is loosely defined as data about data. The concept applies mainly to electronically archived or presented data and is used to describe the (a) definition, (b) structure and (c) administration of data files, with all contents in context, to ease further use of the captured and archived data. For example, a web page may include metadata specifying what language it's written in, what tools were used to create it, where to go for more on the subject and so on.
Metadata definition
Metadata is defined as data providing information about one or more other pieces of data, such as:
- means of creation of the data
- purpose of the data,
- time and date of creation,
- creator or author of data,
- placement on a network (electronic form) where the data was created,
- what standards were used,
- etc.
For example: A digital image may include metadata that describes how large the picture is, the color depth, the image resolution, when the image was created, and other data. A text document's metadata may contain information about how long the document is, who the author is, when the document was written, and a short summary of the document.
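The two examples above can be sketched as simple records. This is only an illustration: the class names, field names and sample values are assumptions made for the sketch and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class ImageMetadata:
    """Hypothetical record holding the image properties named in the text."""
    width_px: int
    height_px: int
    color_depth_bits: int
    resolution_dpi: int
    created: date


@dataclass
class DocumentMetadata:
    """Hypothetical record holding the text-document properties named in the text."""
    length_words: int
    author: str
    written: date
    summary: str


# Illustrative values only; they do not describe a real file.
photo = ImageMetadata(
    width_px=1024,
    height_px=768,
    color_depth_bits=24,
    resolution_dpi=72,
    created=date(2010, 1, 1),
)
essay = DocumentMetadata(
    length_words=1200,
    author="A. Author",
    written=date(2010, 1, 2),
    summary="A short introduction to metadata.",
)

# The records describe the image and the document without containing them.
photo_fields = asdict(photo)
essay_fields = asdict(essay)
```

Neither record contains the picture or the text itself, only statements about them.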
Metadata is data. As such, metadata can be stored and managed in a database, often called a registry or repository. However, it is impossible to identify metadata just by looking at it. We don't know when data is metadata or just data.[1]
Libraries
Metadata has been used in various forms as a means of cataloging archived information. The Dewey Decimal System employed by libraries for the indexing of books is an early example of metadata usage. This system used small 3x5 inch cards to display a book's title, author, subject matter, and a brief plot synopsis, along with an abbreviated alpha-numeric identification system which indicated the physical location of the book within the library's shelves. Such data helps classify, aggregate, identify, and locate a particular book. Another older form of metadata collection is the use by the US Census Bureau of what is known as the "Long Form". The Long Form asks questions that are used to create demographic data and to find patterns of distribution.[2]
The term metadata was coined in 1968 by Philip Bagley, one of the pioneers of computerized document retrieval.[3][4] Since then the fields of information management, information science, information technology, librarianship and GIS have widely adopted the term. In these fields the word metadata is defined as "data about data".[5] While this is the generally accepted definition, various disciplines have adopted their own more specific explanations and uses of the term.
For the purposes of this article, an "object" refers to any of the following:
- a physical item such as a book, CD, DVD, map, chair, table, flower pot, etc
- an electronic file such as a digital image, digital photo, document, program file, database table etc
Photographs
Metadata may be written into a digital photo file that will identify who owns it, copyright & contact information, what camera created the file, along with exposure information and descriptive information such as keywords about the photo, making the file searchable on the computer and/or the Internet. Some metadata is written by the camera and some is input by the photographer and/or software after downloading to a computer.
Photographic metadata standards are developed and maintained by several organizations. They include, but are not limited to:
- IPTC Information Interchange Model, IIM (International Press Telecommunications Council),
- IPTC Core Schema for XMP,
- XMP - Extensible Metadata Platform (an Adobe standard)
- Exif - Exchangeable image file format, maintained by CIPA (Camera & Imaging Products Association) and published by JEITA (Japan Electronics and Information Technology Industries Association)
- Dublin Core (Dublin Core Metadata Initiative - DCMI)
- PLUS (Picture Licensing Universal System)
Video
Metadata is particularly useful in video, where the contents are not directly understandable by a computer but efficient search is desirable. Information about the contents, such as transcripts of conversations and text descriptions of scenes, can be stored as metadata and searched instead.
Web pages
Web pages often include metadata in the form of meta tags. Such elements are placed in the head section of an HTML or XHTML document. Description and keywords meta tags are commonly used to describe the Web page's content. Most search engines use this data when adding pages to their search index.
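As a rough illustration of how a program might read such tags, the following sketch collects named meta elements with the HTML parser from the Python standard library. The sample page, the class name MetaTagCollector and the function extract_meta are made up for this example; a real indexer would handle far more cases.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page used only for this sketch.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<meta name="description" content="A short page about metadata.">
<meta name="keywords" content="metadata, cataloging">
</head><body><p>Hello</p></body></html>"""


def extract_meta(html_text):
    """Return a dict mapping meta tag names to their content values."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.meta


page_meta = extract_meta(SAMPLE_PAGE)
```

Here page_meta holds the description and keywords entries, while the visible body text of the page is ignored.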
Creation of metadata
Metadata can be created either by automated information processing or by manual work. Elementary metadata captured by computers can include information about when a file was created, who created it, when it was last updated, file size and file extension.
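A minimal sketch of automated capture, assuming the file system is the only source of information: the helper elementary_metadata below (a hypothetical name) reads size, extension and last-modified time through os.stat. Creation time and the creating user are left out because how they are exposed differs between operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def elementary_metadata(path):
    """Return a few file properties that the operating system records."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "extension": os.path.splitext(path)[1],
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Create a throwaway file so the sketch does not depend on existing files.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "notes.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("hello")
    sample = elementary_metadata(sample_path)
```

None of these values had to be typed in by a person, which is what distinguishes them from manually created metadata such as keywords or summaries.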
Metadata structures
Metadata is typically structured according to a standardised concept using a well defined metadata scheme, including metadata standards and metadata models. Tools such as controlled vocabularies, taxonomies, thesauri, data dictionaries and metadata registries can be used to apply further standardisation to the metadata.
Metadata syntax
Metadata syntax refers to the rules created to structure the fields or elements of metadata.[6] A single metadata scheme may be expressed in a number of different markup or programming languages, each of which requires a different syntax. For example, Dublin Core may be expressed in plain text, HTML, XML and RDF.[7]
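To make the distinction between scheme and syntax concrete, the sketch below writes one small Dublin Core record in three syntaxes. The record values, the function names and the wrapping "metadata" element in the XML output are assumptions for illustration; the "DC." name prefix in HTML and the namespace URI are the commonly used Dublin Core conventions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Illustrative record using three of the Dublin Core elements.
record = {
    "title": "Metadata basics",
    "creator": "A. Author",
    "date": "2010-01-01",
}


def as_plain_text(rec):
    """One 'element: value' pair per line."""
    return "\n".join(f"{name}: {value}" for name, value in rec.items())


def as_html_meta(rec):
    """HTML meta elements using the 'DC.' name prefix."""
    return "\n".join(
        f'<meta name="DC.{name}" content="{escape(value, quote=True)}">'
        for name, value in rec.items()
    )


def as_xml(rec):
    """XML with namespaced elements; the wrapper element is an assumption."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, value in rec.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


text_form = as_plain_text(record)
html_form = as_html_meta(record)
xml_form = as_xml(record)
```

All three outputs carry the same elements and values; only the rules for writing them down differ.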
Metadata types
Metadata is applied in a large variety of fields, so there is no single universal model for specifying types of metadata, only specialised and well-accepted ones. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata.[8] Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes. Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language. According to Ralph Kimball, metadata can be divided into two similar categories: technical metadata and business metadata. Technical metadata corresponds to internal metadata, business metadata to external metadata. Kimball adds a third category named process metadata. On the other hand, NISO distinguishes between three types of metadata: descriptive, structural and administrative.[5] Descriptive metadata is the information used to search for and locate an object, such as title, author, subjects, keywords and publisher; structural metadata gives a description of how the components of the object are organised; and administrative metadata refers to the technical information, including file type. Two sub-types of administrative metadata are rights management metadata and preservation metadata.
Hierarchical, linear and planar schemata
Metadata schemas can be hierarchical in nature, where relationships exist between metadata elements and elements are nested so that parent-child relationships exist between the elements. An example of a hierarchical metadata schema is the IEEE LOM schema, where metadata elements may belong to a parent metadata element. Metadata schemas can also be one dimensional, or linear, where each element is completely discrete from other elements and classified according to one dimension only. An example of a linear metadata schema is the Dublin Core schema. Metadata schemas are often two dimensional, or planar, where each element is completely discrete from other elements but classified according to two orthogonal dimensions.[9]
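The three shapes can be sketched with ordinary dictionaries. Everything here is illustrative: the element names only loosely echo LOM and Dublin Core, and the two dimensions chosen for the planar case (purpose and audience) are assumptions made for the example.

```python
# Hierarchical: elements are nested under parent elements.
hierarchical = {
    "general": {"title": "Intro to metadata", "language": "en"},
    "technical": {"format": "text/html", "size": "2048"},
}

# Linear: every element is discrete and sits on a single level.
linear = {"title": "Intro to metadata", "language": "en", "format": "text/html"}

# Planar: discrete elements, each classified along two dimensions
# (here: purpose, audience).
planar = {
    "title": ("descriptive", "human"),
    "keywords": ("descriptive", "machine"),
    "format": ("administrative", "machine"),
}


def element_paths(schema, prefix=()):
    """List the path from the top level down to each leaf element."""
    paths = []
    for name, value in schema.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            paths.extend(element_paths(value, path))
        else:
            paths.append(path)
    return paths


def select(schema, purpose=None, audience=None):
    """Pick planar elements by either dimension, or by both."""
    return sorted(
        name
        for name, (elem_purpose, elem_audience) in schema.items()
        if (purpose is None or elem_purpose == purpose)
        and (audience is None or elem_audience == audience)
    )


nested_paths = sorted(element_paths(hierarchical))
flat_paths = element_paths(linear)
machine_elements = select(planar, audience="machine")
```

In the hierarchical case every path has a parent and a child, in the linear case every path has length one, and in the planar case the same flat elements can be viewed along either dimension.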
Metadata hypermapping
Wherever a metadata schema goes beyond what a planar depiction can show, some type of hypermapping is required to display and view the metadata according to a chosen aspect and to serve special views. Hypermapping frequently applies to the layering of geographical and geological information overlays.[10]
Granularity
Granularity is a term that applies to data as well as to metadata. The degree to which metadata is structured is referred to as its granularity. Metadata with a high granularity allows for deeper structured information and enables greater levels of technical manipulation; however, a lower level of granularity means that metadata can be created at considerably lower cost but will not provide as detailed information. The major impact of granularity is not only on creation and capture but, even more, on maintenance. As soon as the metadata structures become outdated, access to the referred data becomes outdated as well. Hence the choice of granularity should take into account the effort to create the metadata as well as the effort to maintain it.
Metadata standards
International standards apply to metadata. Much work is being accomplished in the national and international standards communities, especially ANSI (American National Standards Institute) and ISO (International Organization for Standardization), to reach consensus on standardizing metadata and registries.
The core standard is ISO/IEC 11179-1:2004 [11] and subsequent standards (see ISO/IEC 11179). All registrations published so far according to this standard cover just the definition of metadata and do not serve the structuring of metadata storage or retrieval, nor any administrative standardisation.
Metadata usage
Statistics and census services
Standardisation work has had a large impact on efforts to build metadata systems in the statistical community. Several metadata standards are described, and their importance to statistical agencies is discussed. Applications of the standards at the Census Bureau, Environmental Protection Agency, Bureau of Labor Statistics, Statistics Canada, and many others are described. Emphasis is on the impact a metadata registry can have in a statistical agency.