XML and Web Services In The News - 14 July 2006
Provided by OASIS
Edited by Robin Cover
This issue of XML Daily Newslink is sponsored by SAP
HEADLINES:
Streaming Techniques for XML Processing - Part 3
Tobias Trapp, SAP Blog
In the first part of this weblog I introduced STX and mentioned
validation techniques beyond W3C XML Schema as an application. STX is
a transformation language in XML syntax that works in an event-based
fashion. As in XSLT, we can define templates that represent rules. These
rules are evaluated while processing the input XML document in a linear way.
There is a working draft of the STX specification, and Joost, an STX
processor running under Java, implements most features of the
specification. Now I want to put it all together for an application in
data exchange. We do data exchange to link electronic business processes
by making the data of one system available to another. Usually we
don't want to accept just any data -- we only accept valid XML documents.
Schema languages like W3C XML Schema bring several advantages, but
validation against a W3C XML Schema also has disadvantages: (1) W3C
XML Schema can't perform many kinds of checks -- numerical checks,
for example. (2) An XML message can contain thousands of serialized
business objects. Sometimes we don't want to reject a huge XML message
because a single business object is faulty. (3) The error protocol
produced by validation is hard to interpret. We would like to
have error codes that are readable or can be interpreted by computer
programs. Validation languages like Schematron sometimes perform
better because we can code rules and assertions. Unfortunately, most
Schematron implementations rely on XSLT, so you can't check huge
XML documents. In this weblog I will present a self-made prototype
of a validation language, STV (Streaming Validation for XML), that is
based on STX, so I expect good performance. Compared to Schematron
it lacks expressiveness, but combined with W3C XML Schema it is a
powerful tool. An STV transformation defines a set of rules that
consist of assertions. An assertion can be coded with variables that
have to be assigned first. Within a rule we can initialize buffers
that can be appended to and processed. We can use STV to code checks
that will be performed on a certain XML document. Using an XSLT 2.0
transformation, we generate an STX program that performs those checks.
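The per-object, streaming style of checking that STV aims at can be sketched in plain Python with a SAX handler. The element names (Orders, Order, amount) and the error codes are invented for illustration -- they are not taken from STV or the weblog -- but the sketch shows the two points above: a numerical check W3C XML Schema cannot easily express, and per-object error codes instead of rejecting the whole message:

```python
# Sketch: event-based validation of each business object independently,
# so one faulty object yields an error code rather than a rejection of
# the entire XML message. Element names and codes are hypothetical.
import xml.sax

class PerObjectValidator(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.errors = []        # (object_index, error_code) pairs
        self.index = -1         # index of current business object
        self.in_amount = False
        self.text = ""

    def startElement(self, name, attrs):
        if name == "Order":
            self.index += 1
        elif name == "amount":
            self.in_amount = True
            self.text = ""

    def characters(self, content):
        if self.in_amount:
            self.text += content

    def endElement(self, name):
        if name == "amount":
            self.in_amount = False
            # A numerical check beyond typical schema validation:
            try:
                if float(self.text) <= 0:
                    self.errors.append((self.index, "AMOUNT_NOT_POSITIVE"))
            except ValueError:
                self.errors.append((self.index, "AMOUNT_NOT_NUMERIC"))

doc = (b"<Orders><Order><amount>12.5</amount></Order>"
       b"<Order><amount>-3</amount></Order></Orders>")
handler = PerObjectValidator()
xml.sax.parseString(doc, handler)
print(handler.errors)   # [(1, 'AMOUNT_NOT_POSITIVE')]
```

Because the document is processed linearly, memory use stays constant no matter how many serialized business objects the message contains -- the property that makes the STX/STV approach attractive for huge messages.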
See also: Part 1
The GML Simple Feature Profile and You
Sam Bacharach, Directions Magazine
Here's a problem that a growing number of geospatial software
developers face: adding support for the Open Geospatial Consortium's
(OGC) OpenGIS Geography Markup Language Encoding Specification (GML).
Simply stated, GML is a standard to encode geometry and attributes
using XML. Once the marketing department and user input confirm that
supporting this standard is worth doing, programmers have to make it
happen. Sure, programmers can do that; they can do anything. What's
the big deal? The big deal is that the current GML specification runs
600 pages, details 1,000 tags (named objects), defines many of the
geometries for describing features on the earth, and also supports
the ability to encode coverages (including imagery), topology, time,
metadata and dynamic features. GML was designed to be very broad and
cover many needs. Recall, too, that to fully implement the
specification, the programmers have to create software that will not
only write out data in this form, but also can read it in. It's
perhaps akin to requesting support for the 64 colors in the big crayon
box. After some discussion, the group has decided to include just
"simple features." In essence, only the vocabulary of "simple features"
is supported in the profile. Officially, the profile includes "points,
lines, and polygons (and collections of these), with linear
interpolation between vertices of lines, and planar (flat) surfaces
within polygons." The GML Simple Feature Profile and other GML
profiles that will appear in the coming months and years offer ways
to create the right tool for the job, thus making everyone's
geospatial life not only more interoperable, but also easier.
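For orientation, the "points, lines, and polygons" vocabulary of the profile amounts to markup like the hypothetical point feature below (the feature name City, the srsName, and the coordinate values are invented for illustration; the gml namespace URI is the OGC one). A minimal reader needs nothing more than a namespace-aware XML parser:

```python
# Parsing a hypothetical GML simple-feature point with the stdlib.
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
doc = """
<City xmlns:gml="http://www.opengis.net/gml">
  <gml:Point srsName="EPSG:4326">
    <gml:pos>45.256 -110.45</gml:pos>
  </gml:Point>
</City>
"""
root = ET.fromstring(doc)
# gml:pos holds coordinates as a whitespace-separated list
pos = root.find("{%s}Point/{%s}pos" % (GML, GML)).text
lat, lon = (float(v) for v in pos.split())
print(lat, lon)   # 45.256 -110.45
```

Supporting only this restricted vocabulary, rather than the full 1,000-tag specification, is exactly the trade the profile offers implementers.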
See also: GML references
Introducing DB2 9: Application Development Enhancements
Rav Ahuja, IBM developerWorks
DB2 9 (formerly codenamed "Viper") provides numerous enhancements that
simplify database application development, reduce development time,
and improve developer productivity. In addition to providing a platform
for robust enterprise applications, DB2 9 is also optimized for rapidly
building a new breed of "Web 2.0" applications based on Web services,
XML feeds, data syndication, and more. New enhancements for developers
in IBM DB2 9 for Linux, UNIX, and Windows include a new Developer
Workbench, deeper integration with .NET environments, rich support for
XML and SOA environments, new drivers and adapters for PHP and Ruby on
Rails, and new application samples. DB2 9 features pureXML technology
that provides a unique set of capabilities for managing and serving XML
data in a highly efficient manner. pureXML technology consists of a true
XML data type (that stores XML in its hierarchical format rather than
as a large object or stuffed into relational columns), XML indexing,
XML text search support, SQL/XML and XQuery support, schema evolution
flexibility, and numerous other capabilities. DB2 add-ins for Visual
Studio contain full support for pureXML including the functionality to
perform several actions, including: update, import, and export XML data;
validate XML documents against a registered XML schema; register and
unregister XML schemas; generate sample data based on an XML schema. The
DB2 driver for PHP is also included as part of the Zend Core for IBM --
a seamless, out-of-the-box, easy to install, and supported PHP
development and production environment tailored for DB2, IBM Cloudscape,
or Apache Derby data servers. This article, the final article in a
series introducing the features of DB2 9, provides an overview of these
enhancements.
See also: XML and Databases
What's on O'Reilly's Open Source Executive Radar?
Matt Asay, InfoWorld
From the Open Source Executive Briefing for presentation at the
O'Reilly Open Source Convention (OSCON) in Portland, Oregon: (1) Open
Source as Asymmetric Competition -- For years the software industry
has largely competed on the basis of symmetry: Oracle versus IBM in
databases; BEA versus IBM in application servers; etc. Feature wars,
price wars, but not true competition wars. That is, competing by
playing a different game, with different rules. Open source enables an
alternative battleground upon which to compete, with community, code,
and culture the new competitive tools. (2) Operations as Advantage - In
a world where software is delivered as a service, the quality of a
company's operational infrastructure is a key source of competitive
advantage. This is a world where scale matters. (3) Open Data - Tim
O'Reilly has long believed that "data is the Intel Inside" of Web 2.0
applications, the source of competitive advantage and lock-in. As a
consequence, he also believes that it won't be long before "open data"
becomes as hot-button an issue as open source software has been. (4)
Open Source and Web 2.0 - Everyone knows that Google, Yahoo!, and many
other "Web 2.0" companies are built on top of open source, but how
exactly do they use it?
See also: the OSCON web site
W3C Semantic Web Activity to Include GRDDL, Deployment Working Groups
Staff, W3C Announcement
W3C has announced the renewal of the Semantic Web Activity with the
chartering of three new groups. The new Working Groups have been formed
to work on Semantic Web deployment, extracting RDF from XML (e.g., to
process microformats), education, and outreach. The W3C Advisory
Committee also approved the continuing work in RDF data access, rules
interchange, and health care and life sciences. The mission of the
GRDDL Working Group is to complement the concrete RDF/XML syntax with
a mechanism to relate other XML syntaxes (especially XHTML dialects
or "microformats") to the RDF abstract syntax via transformations
identified by URIs. The goal of the Semantic Web initiative is as broad
as that of the Web: to create a universal medium for the exchange of
data. It is envisaged to smoothly interconnect personal information
management, enterprise application integration, and the global sharing
of commercial, scientific and cultural data. Semantic Web technologies
allow data to be shared and reused across applications, enterprises,
and communities. The principal technologies of the Semantic Web fit
into a set of layered specifications. The current components are the
Resource Description Framework (RDF) Core Model, the RDF Schema language
and the Web Ontology language (OWL). Building on these core components
is a standardized query language for RDF, SPARQL, enabling the 'joining'
of decentralized collections of RDF data. These languages all build on
the foundation of URIs, XML, and XML namespaces.
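The 'joining' that SPARQL enables can be illustrated with a toy example: plain Python stands in for a query engine, and the two triple sets, their URIs, and property names are all hypothetical. Because both decentralized collections use the same URI for the same resource, facts from each can be combined:

```python
# Toy illustration (not a SPARQL engine) of joining decentralized RDF
# data on a shared URI. All URIs and triples are invented.
site_a = {("http://ex.org/alice", "foaf:name", "Alice")}
site_b = {("http://ex.org/alice", "foaf:mbox", "mailto:alice@ex.org")}

# Roughly: SELECT ?name ?mbox
#          WHERE { ?p foaf:name ?name . ?p foaf:mbox ?mbox }
joined = [(name, mbox)
          for (subj_a, _, name) in site_a
          for (subj_b, _, mbox) in site_b
          if subj_a == subj_b]
print(joined)   # [('Alice', 'mailto:alice@ex.org')]
```

The URI acting as the join key is the point of the layered design: identifiers, not database schemas, are what the decentralized collections have in common.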
See also: the W3C Semantic Web Activity
Thinking XML: Manage XML Data Sets for Security
Uche Ogbuji, IBM developerWorks
This article discusses principles for managing XML deployment to avoid
vulnerabilities. The principles are quite simple, and yet not often
enough discussed among XML professionals. XML involves an interesting
perspective on data management, one which many developers find new and
strange at first. XML offers flexible support for loosely structured
and hierarchical data, but it also comes with inevitable performance
problems. Unfortunately, developers often don't consider problems that
can arise from XML's transparency. Many XML applications build on raw
XML dumps from databases and legacy applications. Software vendors
have encouraged this approach by making monolithic XML dumps the most
prominent XML features in their repertoire. The promised ease with
which you can transform one XML format to another using XSLT leads to
a cavalier philosophy: "Throw it all out as XML, and pick through for
what you need." The problem is that this leaves the door wide open to
security issues, such as XPath injection attacks. Good design for
security is not all that different from good design for software
quality. The more you clump and tangle things together, the harder it
is to spot and protect against problems. The increased transparency of
XML data requires an increased transparency of application processing
workflow in order to mitigate problems from security to state control.
Applications that work with large dumps of XML data, and use complex
processing to extract needed information from these data sets, are
vulnerable to a sophisticated attacker who takes advantage of your
blind spots. If you design applications that package and exchange
small, controlled chunks of XML data in manageable processing stages,
you reduce these blind spots and make the application easier to
maintain. Understanding the implications of transparent data flow is
key to the security of XML-based applications.
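The mitigation the article argues for can be sketched as follows; the document layout, the whitelist pattern, and the function name are illustrative assumptions, not taken from the article:

```python
# Sketch: guarding a string-built XPath query by validating user input
# before it reaches the expression. Document structure is hypothetical.
import re
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<users>"
    "<user name='alice' role='admin'/>"
    "<user name='bob' role='guest'/>"
    "</users>"
)

def find_user(name):
    # Splicing raw input into the expression is the risky pattern: with
    # a full XPath engine, input like "' or '1'='1" would match every
    # user. Whitelisting the input before it reaches the query string
    # closes that hole.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError("rejected suspicious user name: %r" % name)
    return doc.findall(".//user[@name='%s']" % name)

print(find_user("alice")[0].get("role"))   # admin
try:
    find_user("' or '1'='1")
except ValueError:
    print("injection attempt rejected")
```

Exchanging small, controlled chunks like this single query result, rather than handing the attacker-facing layer a monolithic XML dump, is the design the article recommends.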
See also: the W3C XML Processing Model Working Group
Family Tree of Schema Languages for Markup Languages (2006)
Rick Jelliffe, O'Reilly Blog
This diagram from Rick Jelliffe presents an evolutionary view of
"schema" languages from 1986 through 2006: "I've updated my 1999
diagram "Family Tree of Schema Languages for Markup Languages" to
include the innovation coming from OASIS, ISO, W3C and other places
since [W3C] XSD came out. I put ASL in, but left out things like ISO
Topic Map Constraint Language, OASIS CAM and all the little toy
languages that have fed into ISO DSDL. It's also rearranged to
clarify where all the parts of DSDL fit. There is also activity at
the next level up: RDF and business rules, that don't fit here but
are good. The diagram was quite popular when it came out, I think
largely so that people could figure out which abbreviations and
acronyms to ignore."
See also: XML Schema Languages
WS-I Basic Security Profile Enhanced Logging Specification Requirements
Ram Poornalingam (ed), WS-I Working Group Draft
This specification defines the enhanced logging facilities used by the
WS-I Test Tools to support the Basic Security Profile. Verifying Basic
Security Profile conformance requires SOAP stack instrumentation. This
Enhanced Logging specification addresses why instrumentation is
necessary and how it can be achieved. The document assumes that the
reader understands the usage of the WS-I Interoperability testing tools
version 2.0. The WS-I Testing Tools are designed to help developers
determine whether their Web services are conformant with WS-I profile
guidelines. Complete BSP verification of encrypted SOAP messages emitted
by an application is not possible, because Basic Profile verification,
itself a requirement of the BSP, cannot be performed on encrypted
messages: BP verification requires the unencrypted form of the message.
Without adhering to this specification, profile conformance coverage
can be achieved only at the surface level.
See also: the WS-I web site
XML.org is an OASIS Information Channel sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun Microsystems, Inc.
Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.