XML:
The Foundation for the Future
By Mike Hogan, POET Software,
A Sponsor Member of OASIS
Executive Summary
XML is incredibly hot these days. Articles in major business magazines are
trumpeting XML as the heir apparent to HTML in the Internet. Technology leaders
like Microsoft, Netscape, Sun, Novell and others have announced new products
and technologies based on XML that promise to have a dramatic impact on the
computing landscape. Bill Gates has stated that Microsoft Office will support
XML, giving it critical mass. Internet visionaries are talking about the impact
XML will have on Internet search engines, electronic commerce, intelligent agents,
seamless roaming, file systems, electronic data interchange (EDI), push technologies,
software distribution, data re-purposing and more.
In describing XML, it is important to first understand HTML. HTML helped establish
the Internet by providing a universal way to present information. However, HTML
only addresses the presentation of data. XML takes this one step further by
addressing the context, or meaning of the data. For example, using XML, the
word "bill" can be tagged as a name, a charge, a paper currency, a
proposed law, or the mouth of a bird. Tagging the data enables machine interpretation
with great precision. This is even more important when dealing with numbers,
which have no inherent context. For example, 1000 might be a good price, in
dollars, for a state-of-the-art laptop. However, it would be a very bad average
number of days required for delivery. As a result, the tag that puts the 1000
in context is critical.
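
To make the point concrete, the following is a minimal sketch in Python using the standard xml.etree.ElementTree module. The element names (offer, price, deliveryDays) and the currency attribute are hypothetical; they are not drawn from any published vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names are illustrative assumptions.
doc = ET.fromstring("""
<offer>
  <product>laptop</product>
  <price currency="USD">1000</price>
  <deliveryDays>1000</deliveryDays>
</offer>
""")

# The same digits carry different meanings depending on the enclosing tag.
price = int(doc.findtext("price"))
currency = doc.find("price").get("currency")
delivery_days = int(doc.findtext("deliveryDays"))

print(f"price: {price} {currency}")
print(f"delivery: {delivery_days} days")
```

A program reading this data never has to guess which 1000 is the price; the tag tells it.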
In addition, these tags have associations or structure. A product has a price,
a tax rate, a delivery charge, etc. By defining the structure of data with XML
tags, finding, manipulating, acting on and interacting with the data are much
easier. If presenting the data in a universal way, via HTML, created the Internet,
think of what will happen when the data has a structure that is both universally
understood, and capable of being processed by machines. XML provides this structure.
Because of this, XML is the foundation for the next exciting wave of change
to reshape the computing landscape.
Although the development of XML was a lengthy evolutionary process, the impact
of XML will be nothing short of revolutionary. However, achieving this potential
requires a scalable object-oriented solution for persistence that supports the
structured content provided by XML. The need for flexible, scalable storage
is universal, and applies to both the client and the server. This storage must
also be capable of seamlessly incorporating and linking with other legacy data
formats, in an intelligent manner, in order to smooth the evolution toward XML,
while addressing current needs in a competitive manner.
When formulating the requirements of an XML-savvy repository there are many
very important criteria. From a functional point of view, the ideal XML repository
should handle large data volumes (larger than physical memory), concurrent editing,
link management and navigation, full-text queries, standardized API access,
BLOB (Binary Large OBject) management, revision control, legacy entity management,
metadata with associations to legacy data types, and hierarchical namespaces.
When evaluating the available options for such persistence (namely relational
databases, file systems and object databases), object databases are the only option
that supports these requirements. However, object databases, by themselves,
do not address all of these needs. What is required is a framework built on
top of the ODBMS to address the XML-specific object model, while making it extensible
via an easy-to-use, component-oriented API. In addition, it would be highly
advantageous if the object database and the XML framework could be embedded
in all clients and servers, in much the same way that file systems are today,
to create a layer of consistency enabling more intelligent interaction between
the client and the server.
The Internet is a massive and unwieldy collection of unstructured data. The
broad scale adoption of XML will change the Internet into a structured, easily
navigable transport for sophisticated business and personal transactions. While
the Internet will be the showcase for the benefits of XML, the tools to drive
this revolution in the Internet, intranet and extranet are being delivered today
by leading companies like Microsoft, Netscape, Sun Microsystems, POET, Novell
and others. The adoption of XML will usher in the second phase of the Internet,
supplanting the first phase, which was characterized by HTML. By combining the
benefits of XML with rich client- and server-based persistence, the Internet
will undergo a dramatic metamorphosis, becoming the dominant transport for business-to-business
and business-to-consumer communication. The impact of this revolution will be
as dramatic as the creation of the Internet itself. Finally, businesses will
have access to secure ad hoc commerce and EDI. Individuals will be able to participate
in commerce, communications, publishing and information consumption in an intelligent
fashion. Information overload will give way to dynamic, targeted, granular and
personal information manipulation. The Internet will be transformed from a massive
unstructured, unmanaged and cumbersome collection of documents, into a structured,
interactive, navigable and useful part of our everyday lives.
XML will spawn a new wave of applications, and these applications will have
a new set of data storage requirements. XML storage requirements are very different
from those of its predecessor, HTML. The architectures of relational databases
and file systems do not map well to the richly interlinking hierarchical architecture
of structured XML content. Only object databases can effectively store, manage
and manipulate XML data. As a result, the adoption of XML will drive the adoption
of object database technology by the mainstream corporate market. Because of
this, XML may very well change both the Internet and the database landscapes.
An Example of XML's Impact on Our Everyday Lives
Alice arrives at work at 8AM, grabs a coffee and sits down in front of her
computer. Her automated agent has posted a message on her active desktop, with
links to a short company profile, the company's 90-day stock history, and its
web site. All of this information has been formatted according to her personal
stylesheet. The company, named simply "It", specializes in on-line
entertainment. Alice runs a full-text search on the company's name in her document
repository of press releases and news articles and finds three links. The most
recent article explains that It is engaged in talks with Janet Jackson about
future rights to her interactive music videos. If the deal goes through this
would mean dramatically increased revenues and visibility for It.
Alice deduces that the deal is imminent and that word has started to leak
out, pushing up the stock. She drags the category indicated in the original
alert, "on-line entertainment", to her address book, which builds
a mailing list of clients interested in companies in this area. She drags the
press release and stock history into the message window and adds a short note
asking if anyone would be interested in acquiring the stock. A large private
investor quickly replies with a request for 20,000 shares at the current price
of $4. She drags the order from this message to her automated ordering agent,
which places the order on-line and archives the transaction record, complete
with the digital signatures of both buyer and seller.
Three hours later Alice's mailer beeps, triggered by another automated agent
instructed to notify Alice of any new press releases from companies in which
she has recently bought stock. Sure enough, the deal between It and Janet Jackson
has been announced and It stock is shooting up. It reaches $9 before leveling
off. Not bad, she thinks, not even noon and I've already made over $100,000
for my client. Just then she hears another beep. It is her mother's birthday.
Alice's calendar program calls up an on-line buying agent, which automatically
grabs her mother's mailing address from her address book. It prompts her to
enter a short note and then goes out in search of an on-line flower shop with
delivery service to Raleigh, North Carolina. The agent searches for one dozen
large long-stem roses, at the best price, including delivery and taxes. As for
Alice, she settles back in her chair feeling vaguely guilty about how easy it
all is.
Wait a minute! Doesn't this sound a bit too much like science fiction? Just
getting two different software applications to work together is close to impossible,
let alone a fully integrated array of software agents connected to an Active
Desktop, calendar, mail program, financial information provider, document repository
and Internet-based purchasing system. Speaking of which, an automatic purchasing
agent has a hard enough time getting consistent price information from different
on-line stores; how could it also find out where delivery is available? And
while viewing an on-line stock history is no problem, formatting it with a custom
stylesheet in an e-mail message is very difficult. How could the formatting
program figure out how to rearrange the pieces of information according to Alice's
stylesheet? And wouldn't a full-text query on "It" retrieve every
document in the database?
The technology that is making scenarios like this feasible is XML, which is
now clearly emerging as HTML's equivalent for data. Unlike HTML, XML data elements
have well defined tags that describe their content, enabling applications and
autonomous agents to extract useful information from them. In addition, the
available tags are not fixed but can be defined in a Document Type Definition
(DTD) for a given application. The DTD describes which tags are allowed in the
document as well as defining a hierarchy that determines where tags may occur
(e.g. a "paragraph" may occur in a "chapter", but not vice
versa). Although the XML standard is very new, mainstream DTDs for describing
software components have already been deployed for applications needing standards-based,
open access to information. Examples of these standard DTDs include CDF (Channel
Definition Format) for describing push content and OSD (Open Software Description).
Because the investment in XML development tools and expertise can be leveraged
across a wide range of applications, communication based on well-defined, standard
DTDs will become increasingly important in providing universal access to information
across applications from different vendors. In Alice's case, XML DTDs for address
book entries, stock price histories, company information and the like are what
enable her diverse set of applications to extract and act on or interact with
the relevant information.
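
The hierarchy rule described above can be sketched as follows. In an actual DTD the rule would be written with element declarations such as <!ELEMENT chapter (title, paragraph*)>. The Python below is only a simplified stand-in, assuming a hypothetical table of allowed child tags; it ignores the ordering and cardinality that a real DTD also expresses, and it is not a validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model standing in for a DTD:
# each parent tag maps to the set of child tags it may contain.
ALLOWED_CHILDREN = {
    "book": {"chapter"},
    "chapter": {"title", "paragraph"},
    "title": set(),
    "paragraph": set(),
}

def violations(element):
    """Return (parent, child) tag pairs that the content model forbids."""
    found = []
    allowed = ALLOWED_CHILDREN.get(element.tag, set())
    for child in element:
        if child.tag not in allowed:
            found.append((element.tag, child.tag))
        found.extend(violations(child))
    return found

good = ET.fromstring(
    "<book><chapter><title>T</title><paragraph>P</paragraph></chapter></book>"
)
# A chapter nested inside a paragraph breaks the hierarchy.
bad = ET.fromstring(
    "<book><chapter><paragraph><chapter/></paragraph></chapter></book>"
)

print(violations(good))
print(violations(bad))
```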
Since XML provides a mechanism for tagging data with fields that describe the
data, the applications can pinpoint the exact data they need and share it in
a manner that is easily brokered between the applications. For example, searching
for flowers for Alice's mother simply entails:
- Searching a directory for on-line flower shops
- Connecting to these flower shops and searching for XML tags that correspond
to the appropriate delivery area <Delivery Area> for Raleigh, NC
- Then searching this subset of shops for the appropriate flowers: <Flower
Class> Rose, <Color> Red, <Stem> Long, <Vase> none, etc.
- From this subset, the agent computes the corresponding costs by searching
for the information tagged by <price>, <tax rate> and <delivery
fee> and doing the appropriate calculation. It then selects and purchases
the least expensive offering.
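
The steps above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the shop names, the catalog layout, and the cost formula (tax applied to the price, delivery fee added afterwards). Because XML names cannot contain spaces, the tags are written as DeliveryArea, FlowerClass, taxRate and deliveryFee.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical catalogs, as an agent might receive them from three shops.
CATALOGS = {
    "Shop A": "<shop><DeliveryArea>Raleigh, NC</DeliveryArea><offer>"
              "<FlowerClass>Rose</FlowerClass><Color>Red</Color><Stem>Long</Stem>"
              "<price>40.00</price><taxRate>0.05</taxRate>"
              "<deliveryFee>10.00</deliveryFee></offer></shop>",
    "Shop B": "<shop><DeliveryArea>Raleigh, NC</DeliveryArea><offer>"
              "<FlowerClass>Rose</FlowerClass><Color>Red</Color><Stem>Long</Stem>"
              "<price>35.00</price><taxRate>0.07</taxRate>"
              "<deliveryFee>12.00</deliveryFee></offer></shop>",
    "Shop C": "<shop><DeliveryArea>Boston, MA</DeliveryArea><offer>"
              "<FlowerClass>Rose</FlowerClass><Color>Red</Color><Stem>Long</Stem>"
              "<price>20.00</price><taxRate>0.05</taxRate>"
              "<deliveryFee>5.00</deliveryFee></offer></shop>",
}
WANTED = {"FlowerClass": "Rose", "Color": "Red", "Stem": "Long"}

def cheapest(catalogs, area, wanted):
    """Return (shop name, total cost) of the cheapest matching offer, or None."""
    best = None
    for name, xml_text in catalogs.items():
        shop = ET.fromstring(xml_text)
        # Keep only shops that deliver to the requested area.
        if area not in [a.text for a in shop.findall("DeliveryArea")]:
            continue
        for offer in shop.findall("offer"):
            # Keep only offers whose tagged fields match the request.
            if any(offer.findtext(t) != v for t, v in wanted.items()):
                continue
            # Compute the total from the tagged price, tax rate and fee.
            price = Decimal(offer.findtext("price"))
            tax = Decimal(offer.findtext("taxRate"))
            fee = Decimal(offer.findtext("deliveryFee"))
            total = price * (1 + tax) + fee
            if best is None or total < best[1]:
                best = (name, total)
    return best

print(cheapest(CATALOGS, "Raleigh, NC", WANTED))
```

Shop C is the cheapest on price alone, but it is filtered out because it does not deliver to Raleigh. That filtering is only possible because the delivery area is tagged.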
In fact, there is more to this story. Most of the tasks that could be accomplished
in this way would occur in the background without any interaction from Alice.
XML-Enabled Technologies
The following are just a few examples of some of the exciting new technologies
enabled by XML:
Internet Search Engines:
Imagine a search engine that understands and uses contextual information when
performing a full-text search. Searching for information about the Java programming
language would no longer yield links to coffee sites or the Island of Java.
This is because searching for the term "Java" is narrowed down to
those fields tagged as a "programming language". As a result, the
speed and accuracy of the search is dramatically improved. Widespread use of
XML repository technology on Web servers will play a vital role in easing the
"information overload" currently suffered by Internet users. For example,
when searching for information on a subject that is contained in a single chapter
or even a single page within a book, XML enables you to retrieve only that chapter
or page, while HTML currently gives you the entire book. Of course all of these
benefits require a sophisticated, scalable and fast repository. This repository
must be able to manage the rich XML links and understand XML structure so that
it indexes text based on its context and use in a document.
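
As an illustration only, the sketch below contrasts a plain full-text match with a match restricted to a tagged context. The tag names (programmingLanguage, beverage, island) and the article ids are hypothetical, and a real search engine would use an index rather than a linear scan.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged documents; the vocabulary is an assumption.
DOCS = ET.fromstring("""
<library>
  <article id="a1"><programmingLanguage>Java</programmingLanguage></article>
  <article id="a2"><beverage>Java</beverage></article>
  <article id="a3"><island>Java</island></article>
</library>
""")

def search(root, term, context=None):
    """Return ids of articles containing term, optionally only inside one tag."""
    hits = []
    for article in root.findall("article"):
        for field in article:
            in_context = context is None or field.tag == context
            if in_context and term in (field.text or ""):
                hits.append(article.get("id"))
                break
    return hits

print(search(DOCS, "Java"))                         # context-free search
print(search(DOCS, "Java", "programmingLanguage"))  # context-aware search
```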
Electronic Commerce:
The long-expected rise of electronic commerce has been stymied by the difficulty
encountered by consumers in finding the desired product among the myriad of
vendors setting up shop on the Internet, all with different product lines, prices,
on-line viewing capabilities, delivery options and so forth. So-called intelligent
agents have not helped because they have an even harder time than humans in
trying to make sense of the digital morass presented by HTML. With XML repository
technology, on-line stores can present product information in a standard, structured
format, independent of page design. Electronic commerce is obviously focused
on financial transactions. Using HTML, the user must manually wade through HTML
information to extract relevant data like price, tax, etc. And unlike text,
numbers have no inherent context. In other words, price means something, but
how do you know whether a number is associated with a price, a tax, an address
or anything? XML creates this association, making human and machine interpretation
a reality. XML is the catalyst that will finally unleash the explosive potential
of electronic commerce. The XML-aware query facilities of the repository make
it possible to retrieve relevant information directly and re-purpose it as needed
for processing by an automatic agent or a user. By reducing the time needed
to locate a product, a price, or any other relevant information on the Internet,
XML repositories will play an important role in making on-line shopping more
efficient and enjoyable.
Self-describing BLOBs and Distributed Object File Systems:
File systems today store files as BLOBs (Binary Large OBjects), whether they
are word processing documents, presentations, databases, pictures, CORBA objects,
etc. The information is locked inside the file, and is not accessible to the
file system. The file system has only minimal information about the files. File
systems could do no better with XML data. However, by leveraging a sophisticated
object-oriented structure, the contents and structure of XML data can be indexed,
searched and manipulated in a sophisticated and granular fashion. As a result,
the XML data can be managed in a distributed fashion much like the Internet
itself. A powerful XML linking mechanism also makes it possible to tightly bind
a BLOB with a set of XML metadata describing it, sort of a machine-readable
summary. Since an ODBMS-powered XML repository can natively manage structured
content, binary data types and arbitrary links between the two, it is
the only solution with the power to build distributed object repositories that
offer scalable management of legacy documents and XML data.
Electronic Data Interchange (EDI):
EDI is carried on today through secure Value Added Networks (VANs) that map
the data between companies and their disparate applications. By leveraging
XML, the applications easily broker information between themselves. Mapping
data from one company's purchasing system to another company's inventory is
just a matter of understanding the XML tags on the data. XML becomes the universal
format for EDI, enabling companies to create ad hoc secure extranets with other
companies over the Internet transport, while eliminating the VAN middleman.
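
A minimal sketch of such tag-level mapping follows. The two vocabularies (partNo, qty and unitCost on the purchasing side; sku, quantity and price on the inventory side) are hypothetical, as is the choice to drop fields the receiver does not recognize. Real EDI exchanges would also need agreement on units, identifiers and security, which this sketch does not address.

```python
import xml.etree.ElementTree as ET

# Hypothetical correspondence between one company's purchasing tags
# and another company's inventory tags.
TAG_MAP = {"partNo": "sku", "qty": "quantity", "unitCost": "price"}

def translate(order_xml):
    """Rewrite a purchase order using the receiving company's tag names."""
    source = ET.fromstring(order_xml)
    target = ET.Element("inventoryRequest")
    for field in source:
        mapped = TAG_MAP.get(field.tag)
        if mapped is None:
            continue  # assumption: unknown fields are dropped
        ET.SubElement(target, mapped).text = field.text
    return ET.tostring(target, encoding="unicode")

order = ("<purchaseOrder><partNo>A-100</partNo><qty>12</qty>"
         "<unitCost>3.50</unitCost><buyerNote>rush</buyerNote></purchaseOrder>")
print(translate(order))
```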
Data Re-purposing:
By breaking documents into discrete elements, it becomes very easy for individuals
to extract the truly relevant information from several sources and reassemble
it into any format (e.g. web page, document, presentation, whatever). This helps
to address the current information overload, because the user receives only
the relevant information. In fact, the information might even be assembled by
a personal agent. This ability also facilitates the acceleration of learning
since it becomes much easier to assemble the "current" body of work
on a particular subject, and then take it a step further, pushing the development
of human knowledge ever forward.
Content Personalization (intelligent pull, agent accumulation, and push):
Today, people use the Internet as a news service. By defining a few key words
or general topics of interest in specific industries you can get a fairly good
news service. However, this type of service requires human interaction
to determine what is actually new. The alternative is to monitor a few news
sites. Unfortunately, this approach limits the ability to filter and personalize
the information. However, XML, combined with a sophisticated repository, can
solve all of these problems. Using XML, you could create a very sophisticated
personal news filter that spans multiple sites or the entire Internet. The XML
repository would provide the date stamp, enabling agents or search engines to
filter the information to extract only the "new" information. Then,
the information could be easily extracted, formatted, and delivered in any way
that you choose, whether it is an Active Desktop, a personal news web page,
email, pager, anything. This capability will allow all individuals to create
"custom" newspapers with the latest information formatted and delivered
any way they want it.
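
The filtering step might look like the sketch below, which assumes a hypothetical feed in which every item carries a topic tag and a published date stamp in ISO format. The feed contents are placeholders, not real news.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical feed; tag names and values are placeholders.
FEED = ET.fromstring("""
<feed>
  <item><topic>databases</topic><published>1998-03-02</published>
        <headline>Headline A</headline></item>
  <item><topic>databases</topic><published>1998-02-10</published>
        <headline>Headline B</headline></item>
  <item><topic>sports</topic><published>1998-03-03</published>
        <headline>Headline C</headline></item>
</feed>
""")

def new_items(feed, topics, since):
    """Return headlines on the chosen topics published after the given date."""
    return [
        item.findtext("headline")
        for item in feed.findall("item")
        if item.findtext("topic") in topics
        and date.fromisoformat(item.findtext("published")) > since
    ]

print(new_items(FEED, {"databases"}, date(1998, 3, 1)))
```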
Customized Bandwidth Allocation:
Allocation and prioritization of bandwidth is an increasingly important issue.
The allocation of bandwidth can be accomplished at a number of points in the
flow of data, including: the information serving company, their access provider,
the Internet backbone, the recipient's local service provider, or the recipient's
company (of course, this is an oversimplification). At each of these points,
bandwidth can be prioritized, but the problem is how to determine the priority
and then how to bill accordingly. By attaching the appropriate XML tags, or
simply reading the appropriate tags, this process can be implemented in an efficient
and consistent fashion. For example, "urgent" e-mail messages might
be tagged urgent and therefore have certain additional rights, also determined
by the recipient, e.g. superior bandwidth, auto-routing to a pager, placing
a call to the recipient's cell phone to give a machine translation of the message
over the phone, and more. The repository plays a critical role in associating
the appropriate routing tags, managing these tags throughout the system, identifying
and billing the appropriate customer and more.
Individual Content Cache (Local Cache):
A local XML cache provides a means for more efficient utilization of the facilities
enabled by the Internet. As individuals surf the web, transact business, compile
information, communicate, etc. a local content cache could facilitate the process
in many ways. For example, as you or your automated agent transact business,
the local cache could provide a dated receipt storage facility. This local cache
might also generate an index on the fly as you surf the Internet, enabling you
to query your past findings for that gem of a site you now need. A local cache
could also store content that is incrementally updated based on time stamps
and element-level delta synchronization. Or, you might choose to cache selected
data for off-line browsing. The exciting opportunities enabled by a local content
cache are boundless.
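
Element-level delta synchronization can be sketched as follows, assuming each element is identified by a hypothetical id and carries an integer time stamp. Only elements that are new or newer than the cached copy are transferred; deletions and conflict handling are left out.

```python
def sync(cache, server):
    """Copy into cache only the elements whose server time stamp is newer.

    Both arguments map element id -> (time stamp, content).
    Returns the ids that were added or replaced.
    """
    changed = []
    for element_id, (stamp, content) in server.items():
        cached = cache.get(element_id)
        if cached is None or cached[0] < stamp:
            cache[element_id] = (stamp, content)
            changed.append(element_id)
    return changed

cache = {"headline": (1, "Old headline"), "body": (1, "Body text")}
server = {
    "headline": (2, "New headline"),
    "body": (1, "Body text"),
    "sidebar": (1, "Sidebar"),
}
changed = sync(cache, server)
print(changed)
```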
These are just a few of the exciting technologies enabled by XML. Looking at
these examples, it is easy to understand why XML is creating such excitement
in the Internet community. As software developers begin to implement XML applications,
however, they will have to address the need to turn these ideas into reality,
while keeping up with the ever shortening development cycles characteristic
of Web development. In many cases, developers will find that their prototypes
work fine in the test laboratory but do not scale to address real world conditions
of concurrent usage and data volume. XML's rich interlinking and hierarchical
naming structure introduces a whole new set of requirements that bring solutions
based on the file system or relational architectures to their knees. An XML-savvy
object repository, designed to be embedded in XML applications of all types,
or the operating system itself, is the only solution that provides the functionality
and scalability required to drive the realization of this vision of a new generation
of networked applications.
XML Repository Requirements
All of the exciting applications described above require data persistence,
and these requirements are very different from the requirements of monolithic
file storage we are used to. XML extends HTML's simple unidirectional
linking, adding support for links to multiple targets, indirect addressing and
bidirectionality. Handling this rich linking requires a storage mechanism with
far more powerful management of references between objects than that provided
by the file system or relational databases. In order to effectively address
the XML opportunity the storage mechanisms must also be able to understand the
structure of XML content, which is composed of a dynamic number of objects, while
scaling effectively to handle increased usage load and data volume. Furthermore,
XML's object-centric focus will create the need for an API that enables rich
object-centric manipulation from object-oriented languages like C++ and Java.
In other words, what is needed is an XML-aware object repository. Only an object
database management system (ODBMS) can maintain information about XML document
structure in a scalable manner while handling standard data types and BLOBs
with rich hyperlinking and navigation in the database.
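
To illustrate what managing links to multiple targets in both directions involves, here is a minimal in-memory sketch. The LinkIndex class and the element identifiers are hypothetical; a repository would persist this information and keep it consistent as elements are edited or deleted.

```python
from collections import defaultdict

class LinkIndex:
    """Hypothetical index that records each link in both directions."""

    def __init__(self):
        self._outgoing = defaultdict(set)
        self._incoming = defaultdict(set)

    def link(self, source, *targets):
        """Register one link from source to one or more targets."""
        for target in targets:
            self._outgoing[source].add(target)
            self._incoming[target].add(source)

    def targets(self, source):
        return sorted(self._outgoing.get(source, ()))

    def sources(self, target):
        return sorted(self._incoming.get(target, ()))

index = LinkIndex()
index.link("press-release-7", "company-profile", "stock-history")
index.link("news-article-2", "company-profile")

print(index.targets("press-release-7"))  # forward traversal
print(index.sources("company-profile"))  # backward traversal
```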
File System Storage of XML Data
HTML storage management is almost always implemented using flat file storage.
This is because HTML, lacking any definable structure, is stored as a monolithic
block. Using the file system this way provides acceptable functionality, and
therefore wins out because it is extremely easy to implement. There are a number
of tools that build on the file system's functionality, and file systems are
included in operating systems at no additional charge. XML, however, has very
different storage requirements. XML applications must store and index the fine-grained
elements as well as the document structure. In addition, they must be capable
of linking these fine-grained elements directly to each other and to a variety
of data types containing associated information. The increased demands implied
by this functionality mean that additional care must be taken to build a system
which scales under increasing load. Attempts to parse XML data of any complexity
would overwhelm the capabilities of the file system to maintain the rich linking
structure and semantics of the data. Of course, the alternative is to store
the XML data as BLOBs and then parse it on the fly each time it is used. This
results in sub-optimal performance, because of repeated parsing. In addition,
once the data were parsed, the file system could not execute the complex data
manipulation required. Furthermore, this BLOB storage approach undermines the
ability to link various disparate elements into a rich tapestry of information
that models real-world usage cases. Of course, implementing all of these features
on top of the file system is a possibility, through custom development, but
this would essentially recreate object database functionality from scratch.
Relational Database Storage of XML Data
Relational database management systems (RDBMS) are the other plausible candidate.
Unfortunately, their table-based data model is very poorly suited to the hierarchical,
interconnected nature of structured XML content. Never the best systems for
managing variable length data and BLOBs, RDBMSs are further hampered by the
fact that they must represent the tree structure of XML content with an inefficient
set of tables and joins. Relational databases disassemble the XML objects in
order to fit them into their tabular architecture. As a result the XML object's
structure and semantics are either lost, minimizing its value, or they must
be duplicated in the design of the database. Duplicating the structure and semantics
of complex XML objects in the design of the database is very difficult, particularly
if the structure of the XML data is variable, as it almost always is. The rigidity
of the relational design is a poor fit with the dynamic assembly and manipulation
of XML data. Relational databases also cannot handle object-level locking; the
best they can provide is row-level locking. Since relational databases decompose
XML elements into various tables, linked via keys, it is very difficult to implement
an effective locking scheme that doesn't dramatically hinder concurrent use
and scalability. In concurrent editing environments, there will be an increase
in demand for related objects from disparate users. Relational databases respond
by locking entire rows across multiple tables. This can cause unacceptable performance
degradation if multiple users are requesting different objects that are locked
via this broad locking scheme. If the DBA responds by separating the information
into a larger number of more granular tables, the performance is degraded by
the number of joins required to model the richly linked structure of
XML content. In addition, relational databases are typically too heavyweight
to form an infrastructure for embeddable storage and require substantial development
to adapt to the complex structure of XML content. Quite simply, relational
databases, while excellent for many purposes, are not architecturally compatible
with the storage needs of XML data.
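
The sketch below shows one common way of decomposing an XML tree into a relational table, using Python's built-in sqlite3 module. The single node table with a parent_id column is an assumption chosen for brevity, not the only possible relational mapping, and the sketch illustrates the join pattern only; it makes no performance measurement.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred(conn, element, parent_id=None):
    """Store one element as a row, then recurse into its children."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, tag, content) VALUES (?, ?, ?)",
        (parent_id, element.tag, (element.text or "").strip()),
    )
    node_id = cur.lastrowid
    for child in element:
        shred(conn, child, node_id)
    return node_id

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, content TEXT)"
)
root_id = shred(conn, ET.fromstring(
    "<product><name>Laptop</name>"
    "<pricing><price>1000</price><taxRate>0.05</taxRate></pricing></product>"
))

# Reading product -> pricing -> price takes one self-join per level of depth.
row = conn.execute(
    """
    SELECT price.content
    FROM node AS product
    JOIN node AS pricing
      ON pricing.parent_id = product.id AND pricing.tag = 'pricing'
    JOIN node AS price
      ON price.parent_id = pricing.id AND price.tag = 'price'
    WHERE product.id = ?
    """,
    (root_id,),
).fetchone()
node_count = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
print(row[0], node_count)
```

A five-element document becomes five rows, and each additional level of nesting adds another join to the query.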
Object Database Storage of XML Data
The architecture of object databases is ideally suited to handling XML data.
In fact, the adoption of XML data could be the "killer application"
for the object database market. Object databases are designed to handle
objects in their native forms. The objects then maintain their own data,
methods, relationships and the semantics of the whole model. This is ideal for
the creation and management of hierarchical XML trees, while providing both
hierarchical tree navigation and rich link traversal. For example, traversing
to the other side of a tree in the two dimensional relational model forces the
developer to climb up the tree, through joins, and then back down the other
side. The rich relationship linking of an object oriented model enables both
hierarchical navigation and rapid branch traversal, reducing computation and
increasing performance. Object databases are also designed to handle arbitrary,
variable-length data types and interrelated data. This is critical due to the
various data types linked within structured XML content. XML also enables an
ever-changing web of relationships between hyperlinked data elements, such as
on-the-fly creation of documents. Object databases, with their flexibility and
rich relationship management are ideal for managing this type of information.
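
The difference between climbing the tree and following a direct link can be sketched with plain Python objects. The Node class and the element names are hypothetical, and the hop counts below describe this toy tree only; they are not a benchmark of any database product.

```python
class Node:
    """Hypothetical element object with tree references and direct links."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.links = []  # direct references to related objects
        if parent is not None:
            parent.children.append(self)

def ancestors(node):
    """Return the node followed by each of its ancestors up to the root."""
    chain = [node]
    while node.parent is not None:
        node = node.parent
        chain.append(node)
    return chain

def tree_hops(a, b):
    """Count the steps needed to climb from a to the common ancestor and down to b."""
    up, down = ancestors(a), ancestors(b)
    common = next(n for n in up if n in down)
    return up.index(common) + down.index(common)

catalog = Node("catalog")
products, suppliers = Node("products", catalog), Node("suppliers", catalog)
laptop, acme = Node("laptop", products), Node("acme", suppliers)
laptop.links.append(acme)  # a link that cuts across the two branches

print(tree_hops(laptop, acme))  # via the hierarchy
print(laptop.links[0].tag)      # via the direct link, a single step
```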
Those object databases that allow object-level locking provide much
more granular locking than relational or file system-based solutions. This granular
locking is critical for user scalability, since it limits the conflicts between
user requests for data. Object databases are also designed to handle
larger-than-memory content, providing for content scalability. Object databases also
offer more simplified creation and management of distributed partitions, which
further addresses the issue of database volume scalability and distributed implementation.
In short, object databases were designed to address the very requirements that
XML is just now starting to force upon tomorrow's storage solutions.
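
The effect of granular locking on concurrent editing can be sketched with a simple lock table keyed by element rather than by document. The ElementLocks class, the element paths and the user names are hypothetical, and the sketch is single-threaded; a real database would make these operations atomic and durable.

```python
class ElementLocks:
    """Hypothetical lock table keyed by element id, not by whole document."""

    def __init__(self):
        self._owners = {}

    def acquire(self, element_id, user):
        """Grant the lock if the element is free or already held by this user."""
        holder = self._owners.setdefault(element_id, user)
        return holder == user

    def release(self, element_id, user):
        """Release the lock only if this user holds it."""
        if self._owners.get(element_id) == user:
            del self._owners[element_id]

locks = ElementLocks()
# Two editors working on different elements of the same document do not collide.
alice_ok = locks.acquire("chapter-1/para-3", "alice")
bob_ok = locks.acquire("chapter-2/para-1", "bob")
# A conflict arises only when both want the same element.
bob_blocked = not locks.acquire("chapter-1/para-3", "bob")
locks.release("chapter-1/para-3", "alice")
bob_retry = locks.acquire("chapter-1/para-3", "bob")

print(alice_ok, bob_ok, bob_blocked, bob_retry)
```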
Just as most software developers would never consider developing their own
encryption technology for a new product, licensing it instead from a third-party
vendor selling best-of-breed solutions, it makes sense for XML developers to
seek a third-party storage engine that instantly provides them with the feature
set they require for effective XML data management. These applications
need a mature database engine well-suited to the structure and data types associated
with XML.
The Ideal Repository for XML Data
The ideal XML repository should address the needs of XML as well as the needs
of the associated applications. In evaluating the requirements of applications
in this field, the following criteria are critical: (1) scalability, (2) language
support, (3) ease of programming and (4) embeddability. Scalability will be
very important since the XML applications described above will run on both the
client and the server. It is important that the object database scale down as
well as up, while leveraging the same APIs, to simplify application development.
Language support is also important. Ease of programming is critical due to the
compressed development cycles, particularly in the Internet. Embeddability encompasses
two criteria--zero-management and low memory footprint. Embeddability is an
interesting issue since it has both short and long-term ramifications. In the
short-term, embeddability is important because most initial XML applications,
lacking sufficient XML support in the file system, will build in this support.
However, long-term the object database will replace the standard file manager
running on top of the file system. This of course will make embeddability an
absolute requirement.
Storing and retrieving XML data is only the start. In addition to standard
XML data storage, the ideal repository would offer tightly integrated XML-specific
tree navigation, versioning, management of arbitrary links, import/export, publishing
of structured content on the web, support for object-oriented programming languages
as well as common scripting languages, and more. Building these facilities on
top of an object database is non-trivial, yet they are critical to the actual
process of managing and manipulating the XML data stored in the object database.
Conclusion
While XML has evolved from SGML and HTML, its impact will not be evolutionary; it
will be revolutionary! XML will transform the Internet from a massive collection
of unmanageable data into the intelligent transport we have all been waiting
for. The development community is just beginning to recognize the far-reaching
benefits of XML as a standard, flexible, structured content format. Existing
XML DTDs such as CDF, for push channel management, and OSD, for on-line software
distribution, merely hint at the types of applications that will benefit from
XML's unique characteristics. The announcement by Bill Gates that future versions
of Microsoft Office will support XML virtually assures its broad adoption.
Because XML applications have a far wider scope than their HTML counterparts,
in terms of both application domain and type of data being managed, existing
solutions for HTML storage are not sufficient for XML. File system-based solutions
do not support XML's rich linking and hierarchical tree structure, negating
XML's value-add. Relational databases are based on a two-dimensional table-based
architecture that is also ill-suited to the needs of XML. Attempts to force
a relational database into storing XML will result in sub-optimal performance,
concurrent access and scalability due to the architectural mismatch between
XML content and relational databases. The only database architecture that is
suited to the demands of XML is the object database. Only object databases support
granular element access and locking for superior concurrent access, rich high-performance
hyperlinking and fast hierarchical navigation.
Supporting Materials: The Evolution of XML
XML is the culmination of an evolution toward a standard portable structured
document format that provides information about both content and context.
Efforts to create a standard open format for structured documents that could
be exchanged and manipulated date back to the 1960s, when GenCode was introduced
by the Graphic Communications Association (GCA). IBM then introduced Generalized
Markup Language (GML). In the early 1980s, representatives of the GenCode and
GML communities combined to form the ANSI committee on Computer Languages for
the Processing of Text in an attempt to unify these languages into a common
standard for document markup. In 1986, the ISO standard for Standard Generalized
Markup Language (SGML) was created. SGML provides tags for defining context
within a portable, human-readable format. In 1990, Tim Berners-Lee, inventor
of the World Wide Web, created a language based on SGML that addressed only
presentation of the data, not context, but he added links. By 1992, Tim Berners-Lee's
format evolved into HTML (HyperText Markup Language). This development, combined
with the Mosaic browser, led to the explosion of interest in the Internet that is
still going strong today. However, the effort to simplify SGML to create HTML
went too far. While SGML is hopelessly complex, HTML's fixed tag set and lack
of structure severely limit the potential of the Web. This deficiency is now
being rectified by the rapid adoption of eXtensible Markup Language (XML), which
offers the benefits of SGML without the complexity.
HTML has a common tag set that is understood by all browsers. The difficulty
for any alternative format is establishing critical mass, so that both the readers
and publishers of data understand the same tag set; this requirement inherently
limits a format's richness and flexibility. XML overcomes this chicken-and-egg hurdle because
it is self-descriptive. The document description is provided in the Document
Type Definition (DTD) which is attached to each XML file, and serves as a sort
of lexicon for the document. These tags are extensible, providing a mechanism
for communities of interest to create their own ontology; in other words, a
common system codifying the concepts that are meaningful to that community.
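To make the idea of a self-descriptive document concrete, here is a minimal
sketch of an XML file that carries its own DTD, followed by a few lines of
Python that read it. The order vocabulary (order, customer, charge) is
hypothetical, and the DTD is embedded in the document's internal subset as one
possible way of attaching it. Note that Python's standard
xml.etree.ElementTree parser reads the declarations but does not validate the
document against them.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the DTD declares which tags exist and how they nest.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ELEMENT order (customer, charge)>
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT charge (#PCDATA)>
  <!ATTLIST charge currency CDATA #REQUIRED>
]>
<order>
  <customer>Bill</customer>
  <charge currency="USD">1000</charge>
</order>
"""


def describe(xml_text):
    """Return (tag, text) pairs for each child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


if __name__ == "__main__":
    # The tag, not the text, tells a program what each value means.
    for tag, text in describe(DOCUMENT):
        print(tag, "=", text)
```

Here "Bill" is unambiguously a customer name and 1000 is unambiguously a charge
in dollars, because the tags supply the context that the raw text lacks.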
"XML: The Foundation for the Future" was written by
Mike Hogan, POET Software,
(www.poet.com).
POET Software is a sponsor member of OASIS, the Organization for the Advancement
of Structured Information Standards (www.oasis-open.org).
OASIS is a nonprofit, international consortium dedicated to accelerating
the adoption of product-independent formats based on public standards. These
standards include XML, SGML and HTML as well as others that are related to structured
information processing. Members of OASIS are providers, users and specialists
of the technologies that make these standards work in practice.
© 1997 POET Software Corporation, San Mateo, and POET Software GmbH, Hamburg.
All rights reserved.
The information in this document is subject to change without notice and
does not represent a commitment on the part of POET Software or OASIS. No part
of this document may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying and recording, for any
purpose without the express written consent of POET Software.
POET Software
999 Baker Way - Suite 100
San Mateo, CA 94404 USA
Phone: +1.650.286.4640
Fax: +1.650.286.4630
E-mail: info@poet.com