XML and Web Services In The News - 19 May 2006
Provided by OASIS
Edited by Robin Cover
This issue of XML.org Daily Newslink is sponsored by SAP
HEADLINES:
Search as the User Interface for the Rest of Us
Guy Creese, DMReview.com
Google's announcement of Google OneBox on April 19, 2006, is one more
tremor signaling a tectonic plate shift that will have an impact on
the market landscape for years to come. In a sentence, OneBox lets
employees use the same familiar Google interface that they use to search
the Web to access information within business applications. For example,
they can pull up a purchase order stored in an Oracle application via
Google, rather than using the typical Oracle application interface.
This ability is going to have a far-reaching impact on business
intelligence (BI) applications and interfaces, and is, therefore, worth
talking about in detail. The Google OneBox search appliance is a
physical box that enterprises install behind their firewall. The
appliance indexes information by crawling corporate repositories, and
then lets users search for it via the Google search box. Users log on
to the system like they do any other business application and, therefore,
can see only that information that they're allowed to see. OneBox does
this by supporting native LDAP authentication as well as the Google
Search Authorization Service Provider Interface. The system can support
up to 3 million documents on a single server, and up to 25 queries per
second. Users narrow their search via keywords. For example, if they
are interested in purchase order number 060875, they type in the string
"po 060875." This will retrieve that PO's information (e.g., PO total,
supplier name, buyer name, payment terms, carrier, freight terms) from
an enterprise resource planning (ERP) system and display it in the
Google interface. Google OneBox can retrieve information from systems
such as Oracle Financials, Cisco Call Manager, Cognos, SAS,
salesforce.com, Employease and NetSuite. OneBox uses a REST-based
application programming interface (API) to make a call to the
application; the application needs to reply via XML.
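
The request/reply pattern described above (a keyword query goes to the
application, which answers in XML) can be sketched roughly as follows. This
is only an illustration under stated assumptions: the element names, the
PURCHASE_ORDERS table, and the functions parse_po_query and build_reply are
hypothetical and are not Google's actual OneBox result format or API.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for an ERP system. The field names follow
# the PO attributes mentioned in the article; the values are invented.
PURCHASE_ORDERS = {
    "060875": {
        "total": "1250.00",
        "supplier": "Example Supplier",
        "buyer": "Example Buyer",
        "payment_terms": "Net 30",
    },
}


def parse_po_query(query):
    """Return the PO number if the query looks like 'po <digits>', else None."""
    match = re.fullmatch(r"\s*po\s+(\d+)\s*", query, flags=re.IGNORECASE)
    return match.group(1) if match else None


def build_reply(query):
    """Build an XML reply for a search query (element names are illustrative)."""
    root = ET.Element("results")
    po_number = parse_po_query(query)
    record = PURCHASE_ORDERS.get(po_number) if po_number else None
    if record is not None:
        po = ET.SubElement(root, "purchase_order", number=po_number)
        for name, value in record.items():
            ET.SubElement(po, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_reply("po 060875"))
```

A query that does not match the keyword pattern yields an empty results
element, which the calling search front end could simply ignore.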
Search Considered Integral
Ryan Barrows, Jim Traverso, Morgan Stanley; ACM Queue
A combination of tagging, categorization, and navigation can help
end-users leverage the power of enterprise search. "Most corporations
must leverage their data for competitive advantage. The volume of data
available to a knowledge worker has grown dramatically over the past
few years, and, while a good amount lives in large databases, an
important subset exists only as unstructured or semi-structured data.
Without the right systems, this leads to a continuously deteriorating
signal-to-noise ratio, creating an obstacle for busy users trying to
locate information quickly. Three flavors of enterprise search
solutions help improve knowledge discovery: Raw engines, Intranet
appliances, and desktop search. All three search solutions are likely
to show up in an enterprise with massive information management
challenges. At Morgan Stanley we have had a group working on intranet
search and raw search engines for more than five years and have been
experimenting with desktop search since 2004. A fourth piece of this
puzzle has yet to be popularized: combining tagging, categorization,
and navigation to improve the overall experience for the end user.
This piece is needed, as machine-relevance algorithms alone are not
good enough to produce high-quality intranet results. In this article
we discuss what such a system looks like, with a particular emphasis
on solving enterprise-scale problems."
Web Inventor Sees His Brainchild Ready for Big Leap
Lucas van Grinsven, eWEEK
The World Wide Web is on the cusp of making its next big leap to
become an open environment for collaboration and its inventor said he
has not been so optimistic in years. Still, Tim Berners-Lee, the Briton
who invented and then gave away the World Wide Web, warns that Internet
crime and anti-competitive behavior need to be fought tooth and nail.
Currently the director of the World Wide Web Consortium (W3C), a
U.S.-headquartered forum of companies and organizations working to improve
the Web, Berners-Lee is only now realising his early vision of a two-way
Web where people can easily work together on the same page and where
the content on a page can be recognized by computers. Google Maps, whose
geographic maps turn up on other sites combined with services, and photo
sharing site Flickr, where members comment on each other's postings and
developers can use the pictures to create new applications, are early
examples of how Web sites can combine data from different sources. A new
query language, SPARQL (pronounced "Sparkle"), is designed to make Web
pages easier for machines to read, allowing all sorts of different data
to be put to work on the Web. Berners-Lee is also concerned about how
some Internet providers in the United States have started to filter data,
giving priority to premium data for which the operator receives an
additional fee. They can do this, because they own the cables, the
service, the portals and other key applications. "The public will demand
an open Internet... I tried then to make the Web technology, in turn, a
universal, neutral, platform... It is of the utmost importance that, if
I connect to the Internet, and you connect to the Internet, that we can
then run any Internet application we want, without discrimination as to
who we are or what we are doing."
See also: on Net Neutrality
First Public Draft of Open XML is Published by Ecma
Andy Updegrove, Consortium Standards Bulletin
"The first draft of Open XML has been posted for public viewing at the
Ecma Website, five months after Ecma accepted Microsoft's submission of
what was then less-appealingly referred to as the XML Reference Schema.
The most detailed source of information I've found so far is this page
at Brian Jones' blog, which focuses heavily on XML in Office and the
development work on Open XML file formats. Brian is a Microsoft Office
Program Manager who has frequently provided public comments on the
progress and purpose of Open XML. According to Jones, the specification
is now 4,000 pages long, roughly twice its original size, and has been
the subject of weekly two hour conference calls and three day F2F
meetings about every two months. A key decision in the creation of any
standard is the level of detail to standardize upon. If the level is too
low, then interoperability will suffer, because much of what is needed
to make the product useful is left up to the vendor, and those additional
features will be proprietary. But if the level is too high, then only
clones can be built, which is good for interoperability, but death to
innovation. It can also be death to competition, since if (as in this
case) the standard is based on an existing product, then no would-be
competitor would ever expect to be able to catch up with the incumbent,
much less compete on price. The [1.3] the specification may be fine and
even perhaps very good for making it possible for end users and external
developers to do more with Office documents, but it may be useless for
creating true competition in the marketplace..."
See also: Brian Jones' blog
SOA Product Review: ActiveBPEL 2.0 from Active Endpoints
Paul Maurer, Enterprise OpenSource
Business Process Execution Language (BPEL) support is at the top of
every enterprise SOA punch list. It's an XML-based language designed
to support long-running complex business transactions in the form of
orchestrated Web Service interactions. Like most XML formats, you
wouldn't want to construct and debug a process of any complexity by
hand and an "engine" is required to recognize and execute BPEL. This
is where the tool vendors come in and Active Endpoints, Inc. has a
design tool and engine product combination that we'll cover in this
review. ActiveBPEL Designer is a world-class visual environment for
working with BPEL-based processes. ActiveBPEL Designer is built on
the seemingly ubiquitous Eclipse extensible development platform and
has an interface with a clear and logical layout. The "Navigator"
tab in the upper left region displays a hierarchical view of projects,
folders, and files in the workspace. To the right of the Navigator is
the "Web References" tab. This tab contains a registry of namespaces,
messages, type definitions, and sample data, used in BPEL processes.
It's populated automatically as WSDL files and XML schemas are added
to the workspace. The "Web References" tab has many features for
slicing and dicing the view, but my favorite is its ability to drag
Web references and drop them in the process editor canvas. Active
Endpoints has created an excellent BPEL design tool and execution
engine that is freely downloadable, well documented and has good
community support. There's virtually no cost of entry and enterprise
reliability features can be purchased for mission-critical applications.
See also: BPEL references
RDFa Primer 1.0: Embedding RDF in XHTML
Ben Adida and Mark Birbeck, Updated W3C Working Draft
W3C has announced the publication of a new working draft for "Embedding
RDF in XHTML," produced by the RDF in XHTML Task Force (HTML) of the
W3C Semantic Web Best Practices and Deployment Working Group (SWBPD)
and the W3C HTML Working Group. "Current web pages, written in HTML,
are chock-full of structured data. When publishers can express the
document's metadata, and when tools can read it, a new world of user
functionality becomes available, letting users copy and paste structured
data between applications and web sites. An event on a web page can be
directly imported into a user's desktop calendar. A license on a
document can be automatically detected so that the user is informed of
his rights automatically. A photo's creator, camera setting information,
resolution, and topic can be published to enable structured search and
sharing. RDFa is a syntax for expressing such metadata in XHTML. The
rendered, hypertext data of XHTML is reused by the RDFa markup, so that
publishers don't repeat themselves. The underlying abstract metadata
representation is RDF, which lets publishers build their own metadata
vocabulary, extend others, and evolve their vocabulary with maximal
interoperability over time. The metadata is closely tied to the data it
describes, so that rendered data can be copied and pasted along with
its relevant structure."
See also: W3C Semantic Web
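
As a rough illustration of metadata carried in XHTML attributes, the sketch
below reads a small hand-written XHTML fragment and lists
(subject, property, value) triples. It assumes "about" and "property"
attributes in the style the Working Draft describes; the sample markup, the
example.org URL, and the function extract_properties are hypothetical, and
this is not a conformant RDFa parser (prefixes such as "dc:" are left
unexpanded, and nested "about" scopes are not handled).

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment: the visible text doubles as the metadata value.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <body>
    <div about="http://example.org/photo.jpg">
      <span property="dc:title">Sunset</span> by
      <span property="dc:creator">Jane Doe</span>
    </div>
  </body>
</html>"""


def extract_properties(xhtml):
    """Collect (subject, property, text) for elements inside an 'about' scope."""
    triples = []
    root = ET.fromstring(xhtml)
    for subject_el in root.iter():
        subject = subject_el.get("about")
        if subject is None:
            continue
        # Every descendant carrying a 'property' attribute describes the subject.
        for el in subject_el.iter():
            prop = el.get("property")
            if prop is not None:
                triples.append((subject, prop, (el.text or "").strip()))
    return triples


if __name__ == "__main__":
    for triple in extract_properties(XHTML):
        print(triple)
```

The rendered text ("Sunset", "Jane Doe") is reused as the metadata value, so
the publisher does not have to state it twice.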
UnREST over WS-* and Other "Enterprisey" Things
Anne Thomas Manes, Blog
The single, most important feature that inspires my enthusiasm about
WS-* is that it has universal support from all the major vendors. The
technology has become pretty much pervasive (although the industry is
still struggling with interoperability issues), and there's a huge
ecosystem of vendors and products and tools that support it. WS-* also
has some really interesting innovations (separation of header and body,
the composability of the various SOAP extensions, policy-based
management and control via intermediaries, etc), which I think make
it particularly well-suited for enterprise-class service-oriented
application systems. There. I've qualified it. WS-* is enterprisey.
But is that really such a bad thing? If you need comprehensive
enterprise-class semantics (security, reliability, session management,
transactions, etc), then it really helps to use an enterprisey
middleware system. But I can't ignore the debate between REST and WS-*.
I'm a huge proponent of the KISS principle. So I don't recommend using
WS-* for all service interactions. If an application doesn't require
enterprisey infrastructure semantics, then it's much more appropriate
to use a simpler middleware system, such as "plain old XML" (POX)
over HTTP. In fact, for applications that require Internet scalability
(e.g., mass consumer-oriented services), POX is a much better solution
than WS-*.
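
To make the "plain old XML over HTTP" option concrete, here is a minimal
sketch using only the Python standard library: the request body is a bare
XML document sent by HTTP POST, with no SOAP envelope or WS-* headers. The
element names (quoteRequest, symbol, quoteResponse, price), the example.org
URL, and both function names are hypothetical, chosen only for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_pox_request(url, symbol):
    """Prepare (but do not send) an HTTP POST whose body is a plain XML document."""
    root = ET.Element("quoteRequest")
    ET.SubElement(root, "symbol").text = symbol
    body = ET.tostring(root, encoding="utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def parse_pox_response(body):
    """Read the price out of a plain XML reply."""
    root = ET.fromstring(body)
    return float(root.findtext("price"))


if __name__ == "__main__":
    request = build_pox_request("http://example.org/quotes", "XYZ")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending would be urllib.request.urlopen(request); omitted because the
    # endpoint above is a placeholder.
```

Everything beyond the payload (security, reliability, transactions) is left
to HTTP itself or to the application, which is the trade-off the post
describes.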
XML.org is an OASIS Information Channel sponsored by Innodata Isogen and SAP.
Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.