Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification[4] produced by the W3C, and in several other related specifications, all of them gratis open standards.[5]

XML's design goals emphasize simplicity, generality, and usability over the Internet.[6] It is a textual data format with strong support via Unicode for the languages of the world. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.

Many application programming interfaces (APIs) have been developed that software developers use to process XML data, and several schema systems exist to aid in the definition of XML-based languages.

As of 2009, hundreds of XML-based languages have been developed,[7] including RSS, Atom, SOAP, and XHTML. XML-based formats have become the default for most office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org (OpenDocument), and Apple's iWork.[8]

Key terminology

The material in this section is based on the XML Specification. This is not an exhaustive list of all the constructs which appear in XML; it provides an introduction to the key constructs most often encountered in day-to-day use.

(Unicode) Character
By definition, an XML document is a string of characters. Almost every legal Unicode character may appear in an XML document.
Processor and Application
The processor analyzes the markup and passes structured information to an application. The specification places requirements on what an XML processor must do and not do, but the application is outside its scope. The processor (as the specification calls it) is often referred to colloquially as an XML parser.
Markup and Content
The characters which make up an XML document are divided into markup and content. Markup and content may be distinguished by the application of simple syntactic rules. All strings which constitute markup either begin with the character "<" and end with a ">", or begin with the character "&" and end with a ";". Strings of characters which are not markup are content.
Tag
A markup construct that begins with "<" and ends with ">". Tags come in three flavors: start-tags, for example <section>, end-tags, for example </section>, and empty-element tags, for example <line-break/>.
Element
A logical component of a document which either begins with a start-tag and ends with a matching end-tag, or consists only of an empty-element tag. The characters between the start- and end-tags, if any, are the element's content, and may contain markup, including other elements, which are called child elements. An example of an element is <Greeting>Hello, world.</Greeting>. Another is <line-break/>.
Attribute
A markup construct consisting of a name/value pair that exists within a start-tag or empty-element tag. In the example (below) the element img has two attributes, src and alt: <img src="madonna.jpg" alt='by Raphael'/>. Another example would be <step number="3">Connect A to B.</step> where the name of the attribute is "number" and the value is "3".
XML Declaration
XML documents may begin by declaring some information about themselves, as in the following example.
<?xml version="1.0" encoding="UTF-8" ?>

Example

Here is a small, complete XML document, which uses all of these constructs and concepts.

<?xml version="1.0" encoding="UTF-8" ?>
<painting>
<img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
<caption>This is Raphael's "Foligno" Madonna, painted in
<date>1511</date>–<date>1512</date>.
</caption>
</painting>

There are five elements in this example document: painting, img, caption, and two dates. The date elements are children of caption, which is a child of the root element painting. img has two attributes, src and alt.
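
As a rough illustration, the document above can be handed to an XML processor and walked programmatically. The sketch below assumes Python's standard xml.etree.ElementTree module, which is not mentioned in the specification; the variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# The example document from above, as UTF-8 bytes so that the
# encoding declaration matches the data handed to the parser.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8" ?>
<painting>
<img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
<caption>This is Raphael's "Foligno" Madonna, painted in
<date>1511</date>–<date>1512</date>.
</caption>
</painting>
""".encode("utf-8")

root = ET.fromstring(DOCUMENT)

# iter() walks the root element and all of its descendants in document order.
element_names = [element.tag for element in root.iter()]

# Attributes are exposed as a dictionary on each element.
img_attributes = root.find("img").attrib

# Iterating over an element yields its direct children.
caption_children = [child.tag for child in root.find("caption")]

print(element_names)
print(img_attributes)
print(caption_children)
```

The three printed values correspond to the five elements, the two attributes of img, and the two date children of caption described above.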

Characters and escaping

XML documents consist entirely of characters from the Unicode repertoire. Except for a small number of specifically excluded control characters, any character defined by Unicode may appear within the content of an XML document. The selection of characters which may appear within markup is somewhat more limited but still large.

XML includes facilities for identifying the encoding of the Unicode characters which make up the document, and for expressing characters which, for one reason or another, cannot be used directly.

Valid characters

Main article: Valid Characters in XML

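Unicode code points in the following ranges are valid in XML 1.0 documents:[9] U+0009, U+000A, and U+000D (the only C0 controls accepted in XML 1.0); U+0020–U+D7FF and U+E000–U+FFFD (which excludes the surrogates, U+FFFE, and U+FFFF); and U+10000–U+10FFFF (all code points in the supplementary planes).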

XML 1.1[10] extends the set of allowed characters to include all the above, plus the remaining characters in the range U+01–U+1F. At the same time, however, it restricts the use of C0 and C1 control characters other than U+09, U+0A, U+0D, and U+85 by requiring them to be written in escaped form (for example U+01 must be written as &#x01; or its equivalent). In the case of C1 characters, this restriction is a backwards incompatibility; it was introduced to allow common encoding errors to be detected.

The code point U+00 is the only character encoded in Unicode and ISO/IEC 10646 that is not permitted in any XML 1.0 or 1.1 document.
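
The character rules can be written down as simple range checks. The following is a minimal sketch based on the Char productions of the XML 1.0 and XML 1.1 specifications; the function names are made up for this example, and the XML 1.1 check only says whether a code point is allowed at all, not whether it must be written in escaped form.

```python
def is_xml10_char(code_point: int) -> bool:
    """Return True if the code point may appear in an XML 1.0 document."""
    return (
        code_point in (0x09, 0x0A, 0x0D)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def is_xml11_char(code_point: int) -> bool:
    """Return True if the code point may appear in an XML 1.1 document,
    either directly or as a character reference."""
    return (
        0x01 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


# U+0001 is excluded from XML 1.0 but allowed (escaped) in XML 1.1;
# U+0000 is excluded from both.
print(is_xml10_char(0x01), is_xml11_char(0x01))
print(is_xml10_char(0x00), is_xml11_char(0x00))
```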

Encoding detection

The Unicode character set can be encoded into bytes for storage or transmission in a variety of different ways, called "encodings". Unicode itself defines encodings which cover the entire repertoire; well-known ones include UTF-8 and UTF-16.[11] There are many other text encodings which pre-date Unicode, such as ASCII and ISO/IEC 8859; their character repertoires in almost every case are subsets of the Unicode character set.

XML allows the use of any of the Unicode-defined encodings, and any other encodings whose characters also appear in Unicode. XML also provides a mechanism whereby an XML processor can reliably, without any prior knowledge, determine which encoding is being used.[12] Encodings other than UTF-8 and UTF-16 will not necessarily be recognized by every XML parser.
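
The detection mechanism relies on the first bytes of the document: a byte order mark if there is one, otherwise the byte pattern of the opening "<?xml", and finally the encoding named in the XML declaration. The sketch below is a simplified illustration of that idea, loosely following the non-normative autodetection appendix of the specification. The function name sniff_encoding is hypothetical, and UTF-32 and EBCDIC detection are deliberately left out.

```python
import re

# Byte order marks that identify an encoding unambiguously.
_BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# Encoding name inside an ASCII-compatible XML declaration.
_DECLARED = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its first bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # "<?" encoded as UTF-16 without a byte order mark.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    match = _DECLARED.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # No byte order mark and no declaration: XML defaults to UTF-8.
    return "utf-8"


print(sniff_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(sniff_encoding(b"<a/>"))
```

A real processor would then decode the document with the detected encoding and report an error if the declaration and the actual bytes disagree.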

Escaping

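There are several reasons why it may be difficult or impossible to include some characters directly in an XML document: the characters "<" and "&" are key syntax markers and may not appear literally in content; some character encodings support only a subset of Unicode (a document encoded in ASCII, for example, has no code point for "é"); and it may not be possible to type the character on the author's machine.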

For these reasons, XML provides escape facilities for referencing problematic or unavailable characters. There are five predefined entities: &lt; represents "<", &gt; represents ">", &amp; represents "&", &apos; represents ', and &quot; represents ". All permitted Unicode characters may be represented with a numeric character reference. Consider the Chinese character "中", whose numeric code in Unicode is hexadecimal 4E2D, or decimal 20,013. A user whose keyboard offered no method for entering this character could still insert it in an XML document encoded either as &#20013; or &#x4e2d;. Similarly, the string "I <3 Jörg" could be encoded for inclusion in an XML document as "I &lt;3 J&#xF6;rg".

"&#0;" is not permitted, however, as the null character is one of the control characters excluded from XML, even when using a numeric character reference.[14] An alternative encoding mechanism such as Base64 is needed to represent such characters.

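The escaping rules can be tried out with a few lines of code. This sketch assumes Python's standard library (xml.sax.saxutils.escape and xml.etree.ElementTree); it escapes the sample string, shows that decimal and hexadecimal references name the same character, and checks that a reference to the null character is rejected by this parser.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() replaces "&", "<" and ">" with the predefined entities.
escaped = escape("I <3 Jörg")

# Encoding to ASCII with "xmlcharrefreplace" turns every character that
# ASCII lacks into a decimal numeric character reference.
ascii_bytes = escaped.encode("ascii", "xmlcharrefreplace")

# Decimal and hexadecimal references resolve to the same character.
decimal_text = ET.fromstring("<t>&#20013;</t>").text
hex_text = ET.fromstring("<t>&#x4e2d;</t>").text

# A reference to the null character is rejected by the parser.
try:
    ET.fromstring("<t>&#0;</t>")
    null_accepted = True
except ET.ParseError:
    null_accepted = False

print(escaped)
print(ascii_bytes)
print(decimal_text, hex_text, null_accepted)
```

Note that the ASCII output uses the decimal form &#246; for "ö", which is equivalent to the hexadecimal &#xF6; used in the text.
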
Comments

Comments may appear anywhere in a document outside other markup. They must not appear before the XML declaration, which, when present, has to be the very first thing in the document. The string "--" (double-hyphen) is not allowed inside a comment, because it is part of the comment delimiters, and entity references are not recognized within comments.

An example of a valid comment: "<!-- no need to escape <code> & such in comments -->"
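
The comment rules can be checked against a real parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name parses is made up for this example, and the sample documents wrap the comments in a hypothetical doc element so that each one is a complete document.

```python
import xml.etree.ElementTree as ET


def parses(document: str) -> bool:
    """Return True if the parser accepts the document without error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Markup-like characters inside a comment need no escaping.
valid_comment = "<doc><!-- no need to escape <code> & such in comments --></doc>"

# A double hyphen inside the comment body is rejected.
double_hyphen = "<doc><!-- not -- allowed --></doc>"

# A comment placed before the XML declaration is rejected.
comment_first = '<!-- too early --><?xml version="1.0"?><doc/>'

print(parses(valid_comment), parses(double_hyphen), parses(comment_first))
```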

International use

XML supports the direct use of almost any Unicode character in element names, attributes, comments, character data, and processing instructions (other than the ones that have special symbolic meaning in XML itself, such as the open corner bracket, "<"). Therefore, the following is a well-formed XML document, even though it includes both Chinese and Cyrillic characters:

<?xml version="1.0" encoding="UTF-8"?>
<外语>Китайська мова</外语>
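
A quick way to confirm that such a document is accepted is to parse it. This sketch assumes Python's standard xml.etree.ElementTree module and passes the document as UTF-8 bytes so that the data matches its encoding declaration.

```python
import xml.etree.ElementTree as ET

# The document from the text, encoded as UTF-8 to match its declaration.
document = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<外语>Китайська мова</外语>\n"
).encode("utf-8")

root = ET.fromstring(document)

# The element name and its content come back as ordinary Unicode strings.
print(root.tag, root.text)
```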

Well-formedness and error-handling

The XML specification defines an XML document as a text which is well-formed, i.e. it satisfies a list of syntax rules provided in the specification. The list is fairly lengthy; some key points are: the document contains only properly encoded legal Unicode characters; none of the special syntax characters such as "<" and "&" appear except in their markup-delineation roles; start-tags, end-tags, and empty-element tags are correctly nested, with none missing and none overlapping; element names are case-sensitive, so start- and end-tags must match exactly; and a single root element contains all the other elements.

The definition of an XML document excludes texts which contain violations of well-formedness rules; they are simply not XML. An XML processor which encounters such a violation is required to report such errors and to cease normal processing. This policy, occasionally referred to as draconian, stands in notable contrast to the behavior of programs which process HTML, which are designed to produce a reasonable result even in the presence of severe markup errors. XML's policy in this area has been criticized as a violation of Postel's law.[15]
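
The report-and-stop behavior is easy to observe in practice. The sketch below assumes Python's standard xml.etree.ElementTree module, whose parser raises ParseError at the first well-formedness violation; the function name check_well_formed and the sample texts are made up for this example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text: str):
    """Return (True, None) for well-formed text, else (False, error message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as error:
        # The parser stops at the first violation and reports it.
        return False, str(error)
    return True, None


samples = [
    "<p><b>word</b></p>",      # correctly nested
    "<p><i><b>word</i></p>",   # overlapping tags
    "<Note>text</note>",       # start- and end-tag differ in case
    "<a/><b/>",                # more than one root element
    "<p>fish & chips</p>",     # bare ampersand in content
]

results = [check_well_formed(sample)[0] for sample in samples]
print(results)
```

Only the first sample is accepted; for the others the second item of the returned tuple carries the parser's error message, and no partial document tree is produced.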

Schemas and validation

In addition to being well-formed, an XML document may be valid. This means that it contains a reference to a Document Type Definition (DTD), and that its elements and attributes are declared in that DTD and follow the grammatical rules for them that the DTD specifies.

XML processors are classified as validating or non-validating depending on whether or not they check XML documents for validity. A processor which discovers a validity error must be able to report it, but may continue normal processing.

A DTD is an example of a schema or grammar. Since the initial publication of XML 1.0, there has been substantial work in the area of schema languages for XML. Such schema languages typically constrain the set of elements that may be used in a document, which attributes may be applied to them, the order in which they may appear, and the allowable parent/child relationships.
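
To make the idea of such constraints concrete, the following toy sketch checks a parsed document against a hand-written rule table. It is not a DTD or any real schema language: the RULES table and the validity_errors function are hypothetical, element order is not checked, and Python's standard xml.etree.ElementTree parser, used here, is itself non-validating.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table, standing in for a real schema: for each element
# name, the attributes it may carry and the child elements it may contain.
RULES = {
    "painting": {"attributes": set(), "children": {"img", "caption"}},
    "img": {"attributes": {"src", "alt"}, "children": set()},
    "caption": {"attributes": set(), "children": {"date"}},
    "date": {"attributes": set(), "children": set()},
}


def validity_errors(element):
    """Collect rule violations for an element and all of its descendants."""
    rule = RULES.get(element.tag)
    if rule is None:
        return [f"undeclared element <{element.tag}>"]
    errors = [
        f"undeclared attribute {name!r} on <{element.tag}>"
        for name in element.attrib
        if name not in rule["attributes"]
    ]
    for child in element:
        if child.tag not in rule["children"]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        errors.extend(validity_errors(child))
    return errors


valid = ET.fromstring('<painting><img src="m.jpg" alt="x"/><caption/></painting>')
invalid = ET.fromstring('<painting><date/><img href="m.jpg"/></painting>')

print(validity_errors(valid))
print(validity_errors(invalid))
```

Both sample documents are well-formed; only the second one breaks the rules. Because all violations are collected rather than raised, the check can report them and carry on, which is the behavior permitted for validity errors.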


The above information uses material from Wikipedia and is licensed under the GNU Free Documentation License.

