XML and Web Services In The News - 11 August 2006
Provided by OASIS
Edited by Robin Cover
This issue of XML Daily Newslink is sponsored by IBM
HEADLINES:
Solr: Indexing XML with Lucene and REST
Bertrand Delacretaz, XML.com
Solr (pronounced "solar") builds on the well-known Lucene search
engine library to create an enterprise search server with a simple
HTTP/XML interface. Using Solr, large collections of documents can be
indexed based on strongly typed field definitions, thereby taking
advantage of Lucene's powerful full-text search features. This article
describes Solr's indexing interface and its main features, and shows
how field-type definitions are used for precise content analysis. Solr
began at CNET Networks, where it is used to provide high-relevancy
search and faceted browsing capabilities. Although quite new as a
public project (the code was first published in January 2006), it
is already used for several high-traffic websites. The project is
currently in incubation at the Apache Software Foundation (ASF). This
means that it is a candidate for becoming an official project of the
ASF, after an observation phase during which the project's community
and code are examined for conformity to the ASF's principles. With Solr
you have all the indexing power of Lucene under the hood, with its
highly customizable analyzers, similarity searches, controlled ranking
of results, faceted browsing, etc. Also, because it was designed for
high-traffic systems, Solr's performance and scalability are already
up there with the best. Index replication between search servers is
available, Solr's no-nonsense HTTP interface makes it possible to
create search clusters using common HTTP load-balancing mechanisms, and
powerful internal caches help get the most out of each Solr instance.
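A minimal sketch of what an indexing request body might look like, assuming Solr's XML update format of an <add> element wrapping <doc> and <field> elements. The field names used here (id, title, tag) are hypothetical and would have to match the field definitions in a real schema. The message is only built, not sent to a server.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Serialize a list of {field_name: value} dicts as an <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # A list value becomes several <field> elements with the same name.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; field names are illustrative assumptions.
message = build_add_message(
    [{"id": "doc-1", "title": "Indexing XML", "tag": ["xml", "search"]}]
)
print(message)
```

In a real deployment this string would be sent in an HTTP POST to the server's update handler, whose URL depends on the installation.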
See also: Solr Apache Incubator Project
W3C Releases SVG Tiny 1.2 As a Candidate Recommendation
Ola Andersson, Robin Berjon, et al. (eds), W3C Technical Report
W3C has announced the advancement of the "Scalable Vector Graphics (SVG)
Tiny 1.2 Specification" to W3C Candidate Recommendation as of 10 August
2006. With native support shipping in Opera and Firefox browsers on
desktops, the SVG language describes interactive vector graphics, text,
images, animation and graphical applications in XML. SVG Tiny 1.2 is
designed for Web access by devices of all sizes from handhelds to
desktops, automobile media centers and entertainment consoles. The
specification describes a collection of abstract modules that provide
specific units of functionality. These modules may be combined with
each other and with modules defined in other specifications (such as
XHTML) to create SVG subset and extension document types that qualify
as members of the SVG family of document types. The SVG Working Group
expects to request that the Director advance this document to Proposed
Recommendation once the Working Group has demonstrated at least two
interoperable implementations for each test in the SVG Tiny 1.2 test
suite; furthermore, at least one of the passing implementations must
be on a mobile platform. The SVG Working Group, working closely with
the developer community, expects to show these implementations by
January 2007. This estimate is based on the Working Group's preliminary
implementation report. The Working Group expects to revise this report
over the course of the implementation period. The Working Group does
not plan to request to advance to Proposed Recommendation prior to
10 November 2006. The companion "SVGT 1.2 Requirements" specification
has also been updated.
See also: the announcement
BPEL: Creating Simple Asynchronous & Synchronous Business Processes
Gopalan Suresh Raj, Web Cornucopia
This tutorial provides an overview of the sample project,
AsynchronousSample, and illustrates deploying, executing and testing
an asynchronous BPEL process using the NetBeans 5.5 Early Access bundle
with all the necessary runtimes. The process is simple. It is basically
an echo process, but it is an asynchronous echo, not a synchronous echo.
A client sends the process a message. The process receives the input
message and returns immediately. Then the process asynchronously calls
the original client and sends the same message back. An asynchronous
process is used when the BPEL process is long running (takes a long
time to compute the result) and the results are returned to the client
by doing an invocation on the client. In this tutorial you will use a
simple BPEL project called AsynchronousSample and a Composite Application
project called AsynchronousSampleApplication. The project includes WSDL
and Schema files, a deployment descriptor, and input files for testing.
The web service interface for this process is a single asynchronous
operation. The NetBeans Enterprise Pack 5.5 Early Access that is part
of the Java EE Tools Bundle is a free download that comes with a
plethora of tooling that helps the SOA Developer. Tools like the XML
Schema (XSD) Editor, the WSDL Editor, the BPEL Visual Designer, and all
the other tools that are part of this download help the SOA developer
be extremely productive.
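The asynchronous echo pattern described above can be sketched outside BPEL. The following is a rough Python analogy using asyncio, not the tutorial's BPEL process: the names echo_process and client_callback are hypothetical, and asyncio.sleep(0) stands in for long-running work.

```python
import asyncio


async def echo_process(message, callback):
    """Accept a message, return at once, and echo it back later via callback."""

    async def call_back_later():
        await asyncio.sleep(0)  # stand-in for long-running work
        await callback(message)

    # Schedule the callback without waiting for it, then return immediately.
    return asyncio.create_task(call_back_later())


async def main():
    events = []
    received = []

    async def client_callback(msg):
        events.append("callback")
        received.append(msg)

    task = await echo_process("hello", client_callback)
    events.append("request returned")  # recorded before the echo arrives
    await task  # wait so the callback has a chance to run
    return events, received


events, received = asyncio.run(main())
print(events, received)
```

The point of the design is that the caller regains control before the result exists, and the result arrives through a separate invocation on the client.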
See also: BPEL Designer Feature
The Java XML Validation API: Check Documents for Conformance to Schemas
Elliotte Rusty Harold, IBM developerWorks
Validation reports whether a document adheres to the rules specified by
the schema. It enables you to quickly check that input is roughly in
the form you expect and quickly reject any document that is too far away
from what your process can handle. If there's a problem with the data,
it's better to find out earlier than later. Different parsers and tools
support different schema languages such as DTDs, the W3C XML Schema
Language, RELAX NG, and Schematron. In the context of Extensible Markup
Language (XML), validation normally involves writing a detailed
specification for the document's contents in any of several schema
languages such as the World Wide Web Consortium (W3C) XML Schema Language
(XSD), RELAX NG, Document Type Definitions (DTDs), and Schematron.
Sometimes validation is performed while parsing, sometimes immediately
after. However, it's usually done before any further processing of the
input takes place. Until recently, the exact Application Programming
Interface (API) by which programs requested validation varied with the
schema language and parser. DTDs and XSD were normally accessed as
configuration options in Simple API for XML (SAX), Document Object Model
(DOM), and Java API for XML Processing (JAXP). RELAX NG required a custom
library and API. Schematron might use the Transformation API for
XML (TrAX), and still other schema languages required programmers to
learn still more APIs, even though they were performing essentially the
same operation. Java 5 adds a uniform validation API that can compare
documents to schemas written in these and other languages.
Yahoo Delivers Resource for Python Developers
Darryl K. Taft, eWEEK
Yahoo has created a new resource called the 'Yahoo Developer Network -
Python Developer Center'. The Center is a Web site that provides
Python developers with access to information to help them build
applications in the Python object-oriented dynamic language. Yahoo
officials said the Sunnyvale, Calif., company quietly launched the
site as a developer resource for information about using Python with
Yahoo Web Services APIs. Simon Willison, the developer who put together
the Yahoo site, said the bulk of the information on the site is
"how-tos" that show developers how to do various things with Python.
Some of the specific how-tos Willison pointed out include: Make Yahoo
Web Service REST calls with Python, Cache API calls using Python, Parse
JSON using Python, Parse XML using Python, Access the Yahoo Search
APIs using pYsearch, and Access Yahoo RSS feeds using Python. pYsearch
is an open-source Python library for accessing the Yahoo Search APIs.
The Yahoo Python Developer Center also features links to several Python
educational resources, including Python.org, the home of Python on
the Web; the Python Cookbook, a collection of useful Python code
snippets; and the Python Package Index, which offers a range of
open-source Python packages for developers to install.
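As a rough illustration of two of the how-to topics listed above (parsing JSON and parsing XML), here is a minimal sketch using only the modules that ship with current Python. The payloads are made up for the example and do not reproduce any real Yahoo response format. The function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Made-up payloads for illustration only; not a real Yahoo API response.
SAMPLE_JSON = '{"results": [{"title": "Python.org", "url": "http://www.python.org/"}]}'
SAMPLE_XML = (
    "<results><result><title>Python.org</title>"
    "<url>http://www.python.org/</url></result></results>"
)


def titles_from_json(text):
    """Return the title of every entry in a JSON result list."""
    return [item["title"] for item in json.loads(text)["results"]]


def titles_from_xml(text):
    """Return the title of every <result> element in an XML result list."""
    root = ET.fromstring(text)
    return [result.findtext("title") for result in root.findall("result")]


print(titles_from_json(SAMPLE_JSON))
print(titles_from_xml(SAMPLE_XML))
```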
See also: Python Developer Center
XML in Focus
Ken North, DB2 Magazine, Special Issue on XML
DB2 9's "pureXML" technology is speeding development for early customers,
including financial-services giant Storebrand. Explore the developer-
friendly features behind the radical improvements. Since IBM introduced
object-relational technology with DB2 Universal Database 5.0, Internet
technology, distributed computing, and, most recently, Extensible Markup
Language (XML) have exerted a major influence on computing. XML turns
a spotlight on document-centric computing, new standard formats for
office documents, and SQL/XML:2003, the successor to the SQL standard.
Content management and Web-facing applications often involve storing
and retrieving XML data. XML provides the underpinnings for data
integration, process integration, and enterprise information integration.
XML also provides enabling technology for a new distributed computing
model that includes Web services, grid services, and service-oriented
architectures (SOA). DB2 9's ability to process both XML and SQL is a
substantial benefit. It enables the use of a single database platform
for data processing, document processing, and SOA. To someone grounded
in SQL and tabular structures, XML opens the door to a structured
document mindset and new query technology. A common approach to
integrating XML into an SQL platform is to support queries over XML by
mapping to relational algebra. This approach uses the existing relational
engine, which DB2 XML Extender has done since DB2 UDB 6.1. In DB2 9, a
single engine (optimized for both XML and relational data) processes
relational and XML (hierarchical) data; however, the two data types
reside in separate storage layers. The new engine treats an XML document
as a parsed, annotated tree structure and supports indexing parts of
documents. Hand-in-hand with the new XML data store, DB2 9 supports the
SQL/XML:2003 XML type, SQL/XML functions, and XQuery. DB2 9 lets you
query XML data using XQuery alone, SQL alone, XQuery that invokes SQL,
and SQL/XML functions that execute XQuery expressions.
See also: the Editor's intro
Call for Asia to Adopt ODF
Aaron Tan, ZDNet Asia News
An official from the United Nations (U.N.) has called for countries
in the Asia-Pacific region to embrace the OpenDocument format. Sunil
Abraham, manager of the International Open Source Network (IOSN) at
the U.N., told ZDNet Asia that most governments in the region have
already stated their support for open standards, through their
respective government interoperability frameworks. He hopes that
governments in the region will now extend that support and "seriously
consider" the OpenDocument Format (ODF). Last month, Malaysia became
one of the first Asian countries to propose the use of ODF as a
national standard for office documents. Hasannudin Saidin, a member
of Sirim, the country's standards development agency, said on his
blog last month that the proposal will now undergo approval from a
higher-level committee within Sirim. Public consultation on the
proposal will stretch over two months, beginning in September and
ending in October 2006, after which comments will be raised to the
Malaysian Minister of Science, Technology and Innovation. According
to Saidin, ODF is expected to become a Malaysian-defined standard, MS
26300, by year-end. In the Philippines, there is no official
policy on the adoption of ODF in the country, according to Peter
Antonio Banzon, division chief of the Philippines' Advanced Science
& Technology Institute, although the government agency has already
standardized its internal documents on ODF.
Why Microsoft Should Open XAML
Jon Udell, InfoWorld
Open standards are key to leading the rich Internet applications market.
In his recent blog entry, Google's Joe Beda accepts partial blame for
the excruciatingly slow progress of the Windows Presentation Foundation
(aka Avalon). The idea, he admits, was to go big and 'build something
only Microsoft can build.' With 20/20 hindsight, Beda wishes things had
been done differently: a smaller team, incremental releases. And he
holds out some hope for the awkwardly named Windows Presentation
Foundation/Everywhere (WPF/E), the lightweight, portable, .Net-based
'Flash killer,' that I discussed in my interview with Bill Gates from
the 2005 Professional Developers Conference. The WPF/E runtime won't
implement all of XAML (Extensible Application Markup Language), a .Net
language tuned for declarative application layout. But 'the portion of
XAML we've picked,' Gates told me, 'will be everywhere, absolutely
everywhere, and it has to be.' Here's a crazy idea: Open-source the
WPF/E, endorse a Mono-based version, and make XAML an open standard.
Why? Because an Adobe/Microsoft arms race ignores the real competition:
Web 2.0, and the service infrastructure that supports it. The HTML/
JavaScript browser has been shown to be capable of tricks once thought
impossible. Meanwhile, though, we're moving inexorably toward so-called
RIAs (rich Internet applications) that are defined, at least in part,
by such declarative XML languages as Adobe's MXML, Microsoft's XAML,
Mozilla's XUL (XML User Interface Language), and a flock of other
variations on the theme. Imagine a world in which browsers are
ubiquitous, yet balkanized by incompatible versions of HTML. That's
just where RIA players and their XML languages are taking us. Is
there an alternative? Sure. Open XAML. There's a stake in the ground
that future historians could not forget.
Healthcare, Meet Open Source
Sean Michael Kerner, InternetNews.com
Though the ability to collaborate and share information is a critical
component of modern IT infrastructures, it is often lacking in
healthcare environments, where siloed information is the norm. Such
information is housed on proprietary computing architectures that can't
always be accessed by different platforms. Taking its cue to deliver a
salve for this situation, IBM this week said it is open sourcing
technology to the Eclipse Foundation's Open Healthcare Framework (OHF)
project in an effort to bridge the information silos. "Medical
facilities and doctors all have their own ways of communicating and
distributing medical information, much of it hard copy," said Scott Handy,
vice president of worldwide Linux and open source at IBM. "There is no
good way to transmit medical information because there was no standard."
Even with a standard in place, solutions would still be difficult to
come by, which is why IBM is open sourcing an implementation of a
health care information exchange standard. Handy said that because
abstract specs are often so hard to collaborate on among different
vendors, an open source implementation of a specification is the best
way to collaborate. Eclipse OHF is endeavoring to create a standards-
based platform for the healthcare software industry. IBM is no novice
in open sourcing healthcare software. In 2005, the systems vendor began
an effort called the Interoperable Healthcare Information Infrastructure
(IHII) project, which includes an SOA approach to exchange information
using OHF.
See also: Open Healthcare Framework (OHF) Project
XML Programming with PHP and Ajax
Hardeep Singh and Cindy Saracco, DB2 Magazine
DB2 and other relational databases have matured considerably in their
XML offerings, making them an ideal choice to store and manage XML data
in addition to relational data. DB2 9 XML support (called pureXML)
provides the capability to store XML in its pure form (in other words,
in annotated, tree-like, hierarchical storage). Inside DB2 9, XML data
can be indexed using XML patterns, composed from relational data,
decomposed to relational data, and queried, transformed, and published
stand-alone or combined with relational data using a mix of SQL/XML
and XQuery. Web browsers are also providing more functionality to
client script to efficiently handle XML. Using Asynchronous JavaScript
and XML (Ajax), Web pages can now make direct remote procedure calls
to application servers and use DOM APIs on any returned XML data. This
article shows how to exploit the capabilities provided by DB2 XML, Ajax,
and PHP Hypertext Preprocessor (PHP) to write simple XML-based
applications. With the help of a sample scenario, you will learn how
to make JavaScript calls to a PHP application, how to modify any XML
data using DOM and SimpleXML APIs, how to transfer the XML from the
client to application to database, and how to create a PHP Web service
to publish reports on the XML data using SQL/XML and XQuery. XML
provides developers with the ability to define rules and structures for
business documents as well as to instantiate the documents in memory as
hierarchical objects that can be navigated, modified, and serialized in
any of the tiers using standard APIs. Ajax enables Web-based client
scripts to call DOM APIs and make remote procedure calls to a middle
tier. PHP provides one of the simplest approaches for handling XML
and Web services, making it a perfect fit for XML-based application
development.
Authoritative Metadata
Roy T. Fielding and Ian Jacobs (eds), Approved W3C TAG Finding
Vincent Quint announced that the W3C Technical Architecture Group
(TAG) has approved the finding on "Authoritative Metadata." This release
is an update to the previously approved finding of 25-February-2004. W3C
created the TAG to document and build consensus around principles of Web
architecture and to interpret and clarify these principles when necessary.
The TAG also resolves issues involving general Web architecture brought
to the TAG, and helps coordinate cross-technology architecture
developments inside and outside W3C. In Web architecture, communication
between agents consists of exchanging messages with predefined syntax
and semantics: a shared expectation of how each message's control data
and payload (representation data and metadata) will be interpreted by
the recipient. When supported by the communication protocol, the Web
architecture uses representation metadata to indicate the sender's
intentions regarding how the recipient should interpret the
representation data. For example, HTTP and MIME use the value of the
"Content-Type" header field to indicate the Internet media type of
the representation, which influences the dispatching of handlers and
security-related decisions made by recipients of the message. The key
architectural points of this finding: (1) Metadata received in an
encapsulating container, such as the metadata within the header fields
of a message that describe the data enclosed within that message, is
authoritative in defining the nature of the data received. (2)
Inconsistency between representation data and metadata is an error that
should be discovered and corrected rather than silently ignored. (3) An
agent MUST NOT ignore or override authoritative metadata without the
consent of the party employing the agent. (4) Specifications MUST NOT
work against the Web architecture by requiring or suggesting that a
recipient override authoritative metadata without user consent.
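One way a recipient might apply points (1) through (3) is sketched below, under stated assumptions. The Content-Type value is treated as authoritative, a mismatch with the body is reported rather than silently ignored, and the declared type is overridden only with explicit consent. The looks_like_html heuristic, the interpret function, and the user_consents_to_override flag are hypothetical names for illustration and are not taken from the finding.

```python
from email.message import Message


def media_type(content_type_header):
    """Return the lowercased type/subtype from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def looks_like_html(body):
    """Crude, hypothetical heuristic, used only to detect inconsistency."""
    head = body.lstrip()[:64].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def interpret(content_type_header, body, user_consents_to_override=False):
    """Return (effective media type, warnings) for a received representation."""
    declared = media_type(content_type_header)
    effective = declared  # the declared metadata is authoritative by default
    warnings = []
    if declared == "text/plain" and looks_like_html(body):
        # Report the inconsistency instead of silently ignoring it.
        warnings.append("body looks like HTML but was declared text/plain")
        if user_consents_to_override:
            effective = "text/html"
    return effective, warnings


print(interpret("text/plain; charset=utf-8", "<html><body>hi</body></html>"))
```

Without consent the call above keeps text/plain and returns a warning; passing user_consents_to_override=True is the only path that changes the effective type.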
See also: the document overview
XML.org is an OASIS Information Channel
sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun
Microsystems, Inc.
Use http://www.oasis-open.org/mlmanage
to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml
for the list archives.