XML and Web Services In The News - 09 June 2006
Provided by OASIS
Edited by Robin Cover
This issue of XML.org Daily Newslink is sponsored by SAP
HEADLINES:
Technical Context and Cultural Consequences of XML
Sharon Adler, Roberta Cochrane, et al., IBM Systems Journal
This lead article in IBM Systems Journal (Volume 45, Number 2, 2006)
"Celebrating 10 Years of XML" represents a landmark publication, as
well as a fitting overview of XML for the special issue. Abstract:
"The Extensible Markup Language (XML) is an open standard for
creating domain- and industry-specific markup vocabularies. XML has
become the predominant mechanism for electronic data interchange between
information systems and can be described as a universally applicable,
durable 'Code of Integration.' As we celebrate its tenth anniversary,
it is appropriate to reflect on the role XML has played and the
technical ecosystem in which it functions. In this paper, we discuss
both the environment from which XML arose and its technical
underpinnings, and we relate these topics to companion papers in this
issue of the "IBM Systems Journal". We discuss the broad consequences
of XML and argue that XML will take its place among the technical
standards having the greatest impact on the world in which we live.
We conclude with some reflections on the significant technical,
economic, and societal consequences that XML is likely to have in the
future."
See also: the special issue Preface
Emerging Patterns in the Use of XML for Information Modeling in Vertical Industries
S. Hinkelman, D. Buddenbaum, and L.-J. Zhang, IBM Systems Journal
The use of XML (Extensible Markup Language) for information modeling
within vertical industries has taken many diverse forms. Some, but not
all, of these forms have been influenced by the emerging service-
oriented architecture (SOA) XML infrastructures. Despite the diversity
of approaches taken by industry-level consortiums working with XML,
there is a great deal of commonality, as exemplified by four basic
patterns for XML business content design which have recently emerged
within vertical industry consortiums. These patterns are (1) Business
Content Envelope, (2) Web-Services-Based Infrastructure, (3) Wrapped
Content, and (4) Top-Down Modeling. This set of patterns, though limited,
provides a framework that can aid Web Services adoption efforts by
industry standards organizations. In this paper, we begin with a
review of the history of the development of a selection of XML
standards. Next, we focus on the emergence of the aforementioned
industry-level patterns in XML business content design and describe
these patterns in detail. We then describe the associated effects and
implications of mapping (i.e., 'binding') these patterns to a
Web Services infrastructure.
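As a rough illustration of pattern (1), the Business Content Envelope, the sketch below wraps an XML business payload in an envelope carrying routing metadata. All element names here are invented for illustration; real consortium vocabularies define their own envelope structures.

```python
import xml.etree.ElementTree as ET

def wrap_in_envelope(payload_xml, sender, msg_id):
    """Wrap an XML business payload in a routing envelope.
    Element names (Envelope, Header, Body, ...) are hypothetical."""
    env = ET.Element("Envelope")
    header = ET.SubElement(env, "Header")
    ET.SubElement(header, "Sender").text = sender
    ET.SubElement(header, "MessageId").text = msg_id
    body = ET.SubElement(env, "Body")
    # The business content travels untouched inside the Body.
    body.append(ET.fromstring(payload_xml))
    return ET.tostring(env, encoding="unicode")

msg = wrap_in_envelope("<PurchaseOrder/>", "acme-corp", "42")
```

The point of the pattern is that routing and processing metadata live in the envelope, so intermediaries never need to understand the payload vocabulary.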
Generation of Efficient Parsers Through Direct Compilation of XML Schema Grammars
E. Perkins, M. Matsa, et al., IBM Systems Journal
With the widespread adoption of SOAP and Web services, XML-based
processing, and parsing of XML documents in particular, is becoming a
performance-critical aspect of business computing. In such scenarios,
XML is often constrained by an XML Schema grammar, which can be used
during parsing to improve performance. Although traditional grammar-
based parser generation techniques could be applied to the XML Schema
grammar, the expressiveness of XML Schema does not lend itself well
to the generic intermediate representations associated with these
approaches. In this paper we present a method for generating efficient
parsers by using the schema component model itself as the representation
of the grammar. We show that the model supports the full expressive
power of XML Schema, and we present results demonstrating
significant performance improvements over existing parsers.
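The core idea, letting a schema-derived content model drive validation during parsing rather than going through a generic intermediate grammar, can be sketched in miniature. The content model below is a hypothetical toy, not the paper's schema component model:

```python
import xml.etree.ElementTree as ET

# Toy content model: each element maps to its required children,
# in order. A real system would derive this from an XML Schema.
CONTENT_MODEL = {
    "order": ["customer", "item"],
    "customer": [],
    "item": [],
}

def parse_constrained(xml_text):
    """Parse, then check every element against the content model."""
    root = ET.fromstring(xml_text)

    def check(elem):
        expected = CONTENT_MODEL.get(elem.tag)
        if expected is None:
            raise ValueError("unknown element: " + elem.tag)
        children = [child.tag for child in elem]
        if children != expected:
            raise ValueError(
                "<%s> expected children %r, got %r"
                % (elem.tag, expected, children))
        for child in elem:
            check(child)

    check(root)
    return root

tree = parse_constrained("<order><customer/><item/></order>")
```

The paper's contribution is doing this kind of constraint checking *during* parsing, compiled directly from the schema, rather than as a second pass as sketched here.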
XHTML Basic 1.1 Adds Features for Small Devices
Mark Baker, Masayasu Ishikawa, Shinichi Matsui (eds), W3C Working Draft
Members of W3C's HTML Working Group have released the First Public
Working Draft of XHTML Basic 1.1. The draft adds four new features
aimed at small devices, the language's primary users. Version 1.1
is intended to be the convergence of the XHTML Basic 1.0 W3C
Recommendation for mobile devices, released in coordination with the
WAP Forum in 2000, and the Open Mobile Alliance (OMA) XHTML Mobile
profile. In this revision, four new features have been incorporated:
(1) Intrinsic Events; (2) the target attribute; (3) the style element;
(4) the inputmode attribute. The XHTML
Basic document type includes the minimal set of modules required to be
an XHTML host language document type, and in addition it includes
images, forms, basic tables, and object support. It is designed for
Web clients that do not support the full set of XHTML features; for
example, Web clients such as mobile phones, PDAs, pagers, and set-top
boxes. The document type is rich enough for content authoring.
See also: the W3C news item
How do You Tell Humans and Computers Apart?
David L. Margulius, InfoWorld
While IT security pros have been working hard on systems to make sure
users are who they say they are, Web 2.0 developers have been studying
a related problem: how to make sure users are actually human beings,
rather than machines. The result is a variety of implementations of
CAPTCHA, which stands for "Completely Automated Public Turing Test to
Tell Computers and Humans Apart." You can see CAPTCHA at work in those
little boxes on large Web sites like Yahoo or Ticketmaster, where you
must input some distorted letters in a box before proceeding to buy
your concert tickets or open an e-mail account. It's a crude line of
defense against bulk spammers and their ilk. CAPTCHA, like many
authentication schemes, suffers from the childproof-cap problem: It
doesn't fully keep out unwanted intruders, while frustrating the heck
out of many legitimate users. As a W3C Working Group Note on CAPTCHA
reported, 'this system can be defeated by those who benefit most from
doing so ... spammers can pay a programmer to aggregate these images
and feed them one by one to a human operator, who could easily verify
hundreds of them each hour.' In the meantime, CAPTCHA schemes put off
whole groups of humans, primarily the visually impaired, but also
people with dyslexia and short-term memory problems. For financial
services firms, there may be some interesting learning here in the
run-up to compliance with the FFIEC two-factor authentication
guidelines later this year; for example, many sites are now offering
audio CAPTCHA for the visually impaired.
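Stripped of the image distortion, the core CAPTCHA flow is just challenge generation plus verification. A minimal text-only sketch (a real deployment would render the challenge as a distorted image or audio clip and keep it server-side):

```python
import secrets
import string

def make_challenge(length=6):
    """Generate a random challenge string; a real CAPTCHA would
    render this as a distorted image rather than send it as text."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

def verify(challenge, response):
    """Compare case-insensitively and ignore surrounding whitespace,
    to avoid frustrating legitimate users over trivia."""
    return challenge.lower() == response.strip().lower()

challenge = make_challenge()
```

The article's point survives even in this toy: the scheme's security rests entirely on the rendering being hard for machines to read, not on the challenge/verify logic itself.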
See also: Inaccessibility of CAPTCHA
Understand the Relationships of Web Standards
Peter V. Mikhalenko, ZDNet Asia
There are many debates on the Internet about relationships between
Resource Description Framework (RDF), Topic Maps, and various ontology
languages. Some fuel has been added to the fire with the
introduction of other ontology languages such as OWL and SKOS. The
World Wide Web Consortium (W3C) has made an attempt to establish
standard guidelines for RDF/Topic Maps interoperability by
consolidating the existing proposals of integrating RDF and Topic
Maps data. The primary goal of W3C was to achieve interoperability
between RDF and Topic Maps at the data level. This means that it
should be possible to translate data from one form to the other without
unacceptable loss of information or corruption of the semantics. It
should also be possible to query the results of a translation in
terms of the target model and it should be possible to share
vocabularies across the two paradigms. In this article I'll try to
analyze the development background of both standards, and give you an
overview of five different relationship proposals. These five proposals
have been chosen as being sufficiently complete and well-documented
to be suitable for detailed examination. Among the several possible
criteria for evaluating these proposals, two -- completeness and
naturalness -- have been selected as the most relevant and appropriate
for evaluating the qualities and limitations of each proposal. Analysis
of the proposals identified two main approaches towards translation,
which we dubbed "object mapping" (providing a translation of every
structural component of the source paradigm) and "semantic mapping"
(providing a structure corresponding to every conceptual structure of
the source model). The analysis of the options and solutions provided
in the literature, therefore, clearly shows the advantages of semantic
mapping, but at the same time lists the issues that need to be
addressed and solved in any future translation approach. However, now
that both RDF and Topic Maps have formal data models, and with the
help of RDF Schema and OWL, it seems likely that most, if not all, of
the issues we have listed here can be resolved without resorting to the
restricted interoperability offered by object mapping.
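To make the two approaches concrete, here is a purely illustrative translation of a single Topic Maps-style topic into RDF-like triples. The topic data and the predicate names (tm:name, rdfs:label, ex:born) are invented shorthand, not drawn from any of the five proposals:

```python
# A Topic Maps topic, reduced to a plain dictionary for illustration.
topic = {
    "id": "ex:puccini",
    "names": ["Giacomo Puccini"],
    "occurrences": {"ex:born": "1858"},
}

def object_mapping(t):
    """Object mapping: every structural component of the source
    paradigm (name, occurrence) becomes its own TM-specific
    construct on the RDF side."""
    triples = [(t["id"], "tm:name", n) for n in t["names"]]
    triples += [(t["id"], "tm:occurrence", k + "=" + v)
                for k, v in t["occurrences"].items()]
    return triples

def semantic_mapping(t):
    """Semantic mapping: translate the *meaning* into natural RDF
    vocabulary, discarding the Topic Maps-specific structure."""
    triples = [(t["id"], "rdfs:label", n) for n in t["names"]]
    triples += [(t["id"], k, v) for k, v in t["occurrences"].items()]
    return triples
```

Note how the semantic mapping yields triples an ordinary RDF consumer can use directly, while the object mapping preserves Topic Maps structure at the cost of a TM-aware vocabulary — the trade-off the article's analysis turns on.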
Google Office: It's About File Formats, Not MS Office
Ken Fisher, ars technica
Debates in the aftermath of the Google Spreadsheet announcement have
climbed the mountains and traversed the valleys of Google's supposed
master plan. They've covered the Google vs. Microsoft gorge, the
trickling AdSense stream. What they haven't discussed is the file
format war, and I suspect that this is far more important than it may
at first seem. These days it's not hard to pitch anything Google does
as part of some brilliant strategy to dethrone Microsoft. Despite the
fact that the two companies' businesses touch at only a few points, the
meme is that the two giants are fighting over the same pot of honey.
While we marvel at the size of the web, we sometimes forget about the
mountains of word processor, spreadsheet, and database information
housed "offline." And much of that is proprietary to some extent,
stored in file formats that are not accessible or only semi-accessible
to your average PC user with a web browser. This is where the
OpenDocument Format (ODF) steps in. As you may know, ODF is an open format
for word processor, spreadsheet, database, and presentation files,
based on XML. Google's Writely can import Microsoft Word's DOC files,
supports viewing and editing HTML documents, and ODF conversion is
already supported as well. Google Spreadsheet will support CSV and
Excel's file format, XLS, at launch. ODF support is only a matter of
time. In my view, Writely and Google Spreadsheet are ultimately about
promoting open file exchange and, eventually, the ODF file format.
Openize Denmark, Parliament Orders
gotze.eu, John Goetze's Blog
On Friday (June 2, 2006), the Danish Parliament (Folketinget) had its
last session before the Summer break, and on a very long agenda, the
very last issue (#57) was the second and last reading of Morten
Helveg's Proposal for Parliamentary Resolution on Open Standards (B103).
Earlier this week it was still pending, and [we stated] that it was
opposed by the Government. That was accurate information as of a week
ago. But politics is the art of changing things, and over the last
week, crafty politicians have been at work, and changed things. Morten
Helveg pushed for settlement, and then Danish People's Party's Morten
Messerschmidt and Joergen Dohrman put their fingerprint on the
resolution with an amendment, so that a majority could be reached.
And to cut a long story short, on Friday afternoon, the Parliament
voted [for approval of the resolution.] In conclusion, the vote in
Parliament ended in a unanimous decision, but not in fence-mending.
Quite the contrary, actually. But at the end of the day, and that's
what counts, Denmark is now a nation that has a parliamentary mandate
for open standards. Thank you to the three Mortens: Morten Helveg,
Morten Messerschmidt and Morten Oestergaard, and to Joergen Dohrman
and Anne Grete Holmsgaard for carrying this through, and thanks also
to Michael Aastrup Jensen and Helge Sander, and all other MPs for
voting for this historic resolution.
Prepare for the Coming of RFID
John Blau, InfoWorld
It's not a question of if but when RFID (radio frequency identification)
technology will dominate the supply chains of manufacturers, retailers,
and just about any company or organization that needs to trace products,
parts and other items, according to senior executives at SAP. Prices
have been dropping, down to below $0.10 per tag in large numbers from
around $0.30, thanks in large part to the introduction of the new Gen-2
standard, said Eric Donski, RFID solution director at SAP. Even if the
day when every yogurt container is tagged with a smart chip is still a
few years away, a tag on numerous pharmaceutical drugs could be just
around the corner, according to Donski. "Of all the industries looking
at RFID, pharmaceutical has the best business case," he said. "The
tracking and tracing of drugs from the manufacturer right up to the
pharmacy is important for recalls and building consumer confidence."
Pharmaceutical companies account for some of the 450 customer RFID
projects that SAP has running in 15 industries and 16 countries. To
date, one of the
biggest users of the wireless identification technology is the retail
sector, and one company in particular: Wal-Mart Stores. By the end of
2006, Wal-Mart plans to add more than 300 suppliers to its list of
companies shipping products with RFID tags, bringing its total to
nearly 600. In addition, the retailer aims to have RFID in 1,000 of
its more than 5,600 stores, beginning in the southern half of the U.S.
See also: PML for RFID
XML.org is an OASIS Information Channel sponsored by Innodata Isogen and SAP.
Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.