XML and Web Services In The News - 14 December 2006

Provided by OASIS | Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by SAP AG



HEADLINES:

 IBM Project Aims to Help Blind Use ODF Applications
 Teradata Unveils New Advanced Analytics for Data Warehousing
 ebXML Registry Profile for Web Ontology Language (OWL) Version 1.5
 Health Insurers Create Personal Health Records
 Semantic Wikis and Disaster Relief Operations
 W3C Workshop Report: Keeping Privacy Promises
 Simple Content Management System: XProc Pipeline Controller in XSLT
 Social Context for Data Analysis


IBM Project Aims to Help Blind Use ODF Applications
Elizabeth Montalbano, InfoWorld
When Massachusetts' government decided to use Open Document Format (ODF) as the default document file format throughout its agencies, a key concern was that ODF would not allow the visually impaired to use assistive computer technologies. On Wednesday, IBM said it has helped solve that problem by developing technology that will allow applications based on ODF to better communicate with products used by the blind to access visual information on computer screens. Through Project Missouri, IBM developed application programming interfaces (APIs), collectively called iAccessible2. These APIs will make it easier for visuals in applications based on ODF and other Web technologies to be interpreted by screen readers that reproduce that information verbally, IBM said. IBM spokesman Ari Fishkind said that in the past it has been hard for screen-reading technology to keep up with cutting-edge development and file formats such as ODF, AJAX (Asynchronous JavaScript and XML) and DHTML (Dynamic HTML). The latter two technologies allow increasingly complex visuals to be rendered in Web browsers, and those are difficult to translate for screen readers. iAccessible2 will not only help ODF communicate better with screen readers that assist blind computer users, but will also allow charts, pictures and other visuals based on AJAX and DHTML to be discerned by the visually impaired through those readers. "It's like a universal decoder ring," he said of iAccessible2. The technology is based on interfaces IBM originally developed with Sun Microsystems to make programs on Java and Linux platforms accessible to the blind.

Teradata Unveils New Advanced Analytics for Data Warehousing
Chris Preimesberger, eWEEK
Data warehouse giant Teradata has introduced a new suite of data-mining software for enterprises that it says significantly automates data preparation and accelerates the performance of its partners' data-mining tools. Teradata Warehouse Miner 5.0 lets users advance beyond simply describing what their customers did last quarter to predicting future buying behavior. Using the data exploration features, businesses can identify and resolve a wide range of data-quality issues, including identifying duplicate records and missing data, validating data accuracy, verifying the format of data, and identifying outlying data that may skew an analysis. ADS generation: The Teradata ADS (Analytic Data Set) Generator is a flexible, open data-mining product that streamlines the most time-consuming and critical step of data mining: the preparation of data for analysis. Most businesses spend up to 70 percent of their data-mining time and resources just getting data ready for analysis, a waste of valuable resources. Extended Predictive Model Markup Language: Teradata has extended support of the XML-based PMML (Predictive Model Markup Language) to additional data-mining vendors. PMML is an open standard that allows businesses to share analytic models among applications, enabling them to port desktop analytic models to a large parallel database with minimal effort. The use of PMML promotes performance and enables scalability.
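To make the model-sharing idea concrete, here is a minimal sketch of a PMML document being inspected with Python's standard library. The field names, model, and coefficients are invented for illustration; only the general document shape (a PMML root with a DataDictionary of DataFields) follows the PMML standard.

```python
# Illustrative PMML fragment showing how an analytic model travels between
# applications as plain XML. Field names and coefficients are invented.
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-3_0"

pmml_doc = f"""<PMML version="3.0" xmlns="{PMML_NS}">
  <Header description="Example regression model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="recency_days" optype="continuous" dataType="double"/>
    <DataField name="will_buy" optype="categorical" dataType="string"/>
  </DataDictionary>
  <RegressionModel modelName="buy_propensity" functionName="regression">
    <RegressionTable intercept="0.25">
      <NumericPredictor name="recency_days" coefficient="-0.01"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

def list_fields(xml_text):
    """Return the data-field names declared in a PMML document."""
    root = ET.fromstring(xml_text)
    return [f.attrib["name"] for f in root.iter(f"{{{PMML_NS}}}DataField")]

print(list_fields(pmml_doc))  # ['recency_days', 'will_buy']
```

Because the model is self-contained XML, a consuming application only needs to agree on the PMML vocabulary, not on the producing tool's internal format.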
See also: on PMML

ebXML Registry Profile for Web Ontology Language (OWL) Version 1.5
Asuman Dogac (ed), OASIS Committee Draft
OASIS announced the publication of a public review draft of the "ebXML Registry Profile for Web Ontology Language (OWL) Version 1.5" specification, with the review period ending 11 February 2007. Produced by members of the ebXML Registry Semantic Content Management Subcommittee, this document defines the ebXML Registry profile for publishing, management, discovery, and reuse of OWL Lite ontologies. The SC was chartered to define use cases and requirements for managing semantic content within the ebXML Registry 4.0, seeking to establish a formal liaison with relevant groups within the Semantic Web Activity (SWA) at W3C. The requirements include the ability to utilize ontologies for classifying RegistryObjects and to enable intelligent discovery using ontology-based queries. The SCMSC was tasked to identify specific Semantic Web technologies (e.g., RDF, OWL) that are necessary to support the requirements identified for semantic content management. From the specification introduction: The ebXML Registry holds the metadata for the RegistryObjects, while the documents pointed at by the RegistryObjects reside in an ebXML repository. The basic semantic mechanisms of ebXML Registry are classification hierarchies (ClassificationScheme) consisting of ClassificationNodes and the Association Types among RegistryObjects. Furthermore, RegistryObjects can be assigned properties through a slot mechanism and can be classified using instances of Classification, ClassificationScheme and ClassificationNode. Given these constructs, a considerable amount of semantics can be defined in the registry. However, semantics has become a much broader issue than it used to be, since several application domains are making use of ontologies to add knowledge to their data and applications. This document normatively defines the ebXML Registry profile for Web Ontology Language (OWL) Lite.
More specifically, this document normatively specifies how OWL Lite constructs should be represented by ebXML RIM constructs without causing any changes in the core ebXML Registry specifications. Furthermore, it normatively specifies the code to process some of the OWL semantics through parameterized stored procedures that should be made available from the ebXML Registry. Although this profile is related to the ebXML Registry specifications and not to any particular implementation, the freebXML Registry implementation is used in order to give concrete examples.
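The flavor of mapping the profile describes can be sketched in a few lines: each owl:Class found in an OWL Lite document is turned into a registry classification entry. The ontology content and the dictionary stand-in for a ClassificationNode below are invented for the example; the profile itself defines the normative mapping.

```python
# Sketch: extract owl:Class declarations from an OWL Lite (RDF/XML) document
# and represent each as a registry classification entry. Illustrative only.
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

owl_doc = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:owl="{OWL_NS}">
  <owl:Class rdf:ID="Invoice"/>
  <owl:Class rdf:ID="PurchaseOrder"/>
</rdf:RDF>"""

def classes_to_nodes(xml_text):
    """Map each owl:Class to a dict standing in for a ClassificationNode."""
    root = ET.fromstring(xml_text)
    return [{"objectType": "ClassificationNode",
             "code": cls.attrib[f"{{{RDF_NS}}}ID"]}
            for cls in root.iter(f"{{{OWL_NS}}}Class")]

for node in classes_to_nodes(owl_doc):
    print(node["code"])
```

Once the classes live in the registry as classification nodes, existing ebXML discovery queries can use them without changes to the core registry model, which is the point of the profile.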
See also: the SC

Health Insurers Create Personal Health Records
Grant Gross, InfoWorld
Two large health insurance trade groups based in the U.S. have released a model for personal health records, a portable, Web-based tool that includes a customer's insurance claims, immunization records, medication records and other health information. America's Health Insurance Plans (AHIP) and the Blue Cross and Blue Shield Association, whose members provide health insurance to about two thirds of U.S. residents, unveiled the personal health record model Wednesday. The two groups saw the importance of working together on the project, said Susan Pisano, vice president of communications for AHIP. "This is really an effort that cries out for collaboration," she said. U.S. President George Bush has pushed for electronic health records to be available to all U.S. residents by 2014. Backers of such records say they will improve the efficiency of the U.S. health-care system and cut down on errors such as drug interaction problems. PHRs are similar to other electronic health records, although they include less specific treatment information. Electronic health records typically are used by health-care providers to store and manage detailed clinical information. Patients will be able to enter information into their PHRs, in addition to information from pharmacies, laboratories and medical providers, the groups said. The model released Wednesday includes definitions of data elements that should be included in PHRs, such as risk factors, family history, health-care facilities and medications taken. The model also includes standards for the PHRs to be portable between insurers and providers, and rules about when insurers can share the information.
See also: the announcement

Semantic Wikis and Disaster Relief Operations
Soenke Ziesche, XML.com
Access to timely information is critical for relief operations in emergency situations. Over the last few years, social-networking web systems such as wikis have become more and more sophisticated and can also be applied fruitfully in humanitarian information management. However, a major drawback of the Web currently is that its content is not machine-readable, a shortcoming that is addressed by the Semantic Web approach. The web sites hosting information on humanitarian emergencies and disasters rarely use the social-networking concept of fast and massive user participation. This becomes apparent when listing the information products: situation reports, press releases, contact lists, databases of assessments, who-does-what-where, etc. Certainly, those products are produced by or based on the input of the concerned community. But with wikis, information can be provided much more quickly and directly, which is critical in humanitarian disasters, particularly in the early stages. In this article, I'll first propose using wikis to share information faster and more easily during emergencies, and secondly, I'll introduce a way to enhance them semantically. It is particularly promising to create a Semantic Web extension for wikis, i.e., to provide them with an underlying model of the knowledge described in their entries. While conventional wikis offer only a full-text search of their content as well as a categorization of articles, a semantic wiki would provide a query facility based on an RDF query language such as SPARQL. However, for relief workers under extreme time pressure, a convenient interface is necessary, such as that provided by [2] in their implementation. Queries over typed links can be reduced to 3-tuples (subject article, typed link, object article) in which one or two fields are left empty, while attribute-value queries are 2-tuples. The results are given by two- or three-column tables, respectively.
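The triple-pattern queries described above can be sketched very compactly: typed links are (subject article, typed link, object article) triples, and a query leaves one or two fields open as wildcards. The sample wiki data below is invented for illustration; a real semantic wiki would answer the same patterns via SPARQL over an RDF store.

```python
# Minimal sketch of pattern matching over typed wiki links; None is a wildcard.
def query(triples, subject=None, link=None, obj=None):
    """Return all triples matching the given pattern."""
    return [(s, l, o) for (s, l, o) in triples
            if (subject is None or s == subject)
            and (link is None or l == link)
            and (obj is None or o == obj)]

# Invented disaster-relief wiki links for the example.
wiki_links = [
    ("Camp A", "locatedIn", "District North"),
    ("Camp B", "locatedIn", "District South"),
    ("NGO X", "operates", "Camp A"),
]

# "Which camps are in District North?" -- link and object fixed, subject open.
print(query(wiki_links, link="locatedIn", obj="District North"))
```

The matched field(s) left open correspond to the two- or three-column result tables the article mentions.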

W3C Workshop Report: Keeping Privacy Promises
Staff, W3C Report
A W3C Privacy Workshop Report recommending next steps for keeping privacy promises when exchanging sensitive information on the Web is now available. In October 2006, privacy and access control experts from America, Australia, Asia and Europe met to study Web privacy issues and solutions. On the Web, information collection and transfer are routine, often conducted by multiple parties in a manner transparent to the user. As more parties are granted access to information, it becomes more challenging to track chains of privacy promises and to enforce them. Tools can help, but tools require descriptions of access privileges, and such descriptions can be hard to formulate when so many parties are involved. Though we may be familiar with scenarios such as a doctor exchanging patient information with a laboratory, these issues are not limited to large-scale enterprises. More individuals are sharing personal information (photos, blog entries, etc.) on the Web. They too recognize the need for more effective approaches for managing personal information, for describing who can access their information, and for learning who is to be held accountable when a given service does not respect their privacy preferences. One key issue for near-term follow-up was the area of policy interoperability and mapping: while there seemed to be no interest among participants in creating a new, all-encompassing access control and obligation language, there was significant interest in exploring the interfaces between different, possibly domain-specific policy languages. Ontologies and common modeling principles could help combine these languages and also help enable automatic translation between different languages. Important contributions in this area could include a standardized language to describe evidence and mechanisms for the discovery of ontologies. More than a third of the participants in the workshop indicated interest in launching a W3C Interest Group to further explore this space.
Other relevant questions in this context concerned unifying frameworks for access control, data handling, and usage control languages; this area of work could help leverage languages developed in the DRM space for privacy protection, and could help clarify the applicability of access-control languages such as XACML in the privacy space. There was also discussion of developing and expressing pre-defined sets of user preferences, in order to improve the usability of policy-based technologies.
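To give a feel for what "pre-defined sets of user preferences" means in practice, here is a toy sketch (deliberately not XACML) of checking a data-handling request against a user's stated preferences. All rule fields, categories, and values are invented for illustration.

```python
# Toy sketch of matching a data request against pre-defined privacy
# preferences; not a real policy language. All vocabulary is invented.
def permits(preferences, request):
    """A request is allowed only if some preference rule covers its
    purpose and recipient for the requested data category."""
    return any(rule["category"] == request["category"]
               and request["purpose"] in rule["purposes"]
               and request["recipient"] in rule["recipients"]
               for rule in preferences)

prefs = [
    {"category": "photos", "purposes": {"sharing"}, "recipients": {"friends"}},
    {"category": "health", "purposes": {"treatment"},
     "recipients": {"doctor", "lab"}},
]

print(permits(prefs, {"category": "health", "purpose": "treatment",
                      "recipient": "lab"}))      # True
print(permits(prefs, {"category": "health", "purpose": "marketing",
                      "recipient": "insurer"}))  # False
```

The interoperability problem the workshop identified is precisely that each real system encodes rules like these in its own vocabulary, so mapping between vocabularies is needed before one party's promises can be checked by another's tools.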
See also: the announcement

Simple Content Management System: XProc Pipeline Controller in XSLT
Steve Ball, Software Announcement
"Here at Explain we're very interested in XML pipelines, because they lie at the heart of the content management systems we create, like the Simple Content Management System (SCMS), a major component of the Packaged Press service. We started using the XML Pipeline Definition Language for specifying pipelines, but now we are building systems around XProc. Our first foray into the XProc world is the imaginatively named xproc application, an implementation of an XProc pipeline controller in XSLT. It comes in two flavours: (1) Command line: use the sh-libxml-fop.xsl XSL stylesheet to produce a Bourne shell script. Evaluating the shell script will build the products of the pipeline. This implementation uses libxslt's xsltproc and FOP. (2) GUI application: xproc.exe is an application for MS Windows that evaluates an XProc pipeline. It works by executing the XSL stylesheet tcl.xsl, which produces a Tcl script that is then evaluated to build the products of the pipeline. The main difference between these implementations is that the shell script uses temporary files for the intermediate build products, whereas the Tcl/Tk application keeps the intermediate documents in memory (as DOM trees). Architecture: The core stylesheet module is xsl/xproc.xsl. This module interprets a pipeline document and determines the products to build and their dependencies. The module makes calls to various named templates to do the real work, but within the module these are stubs. A higher-level stylesheet imports the core module and overrides the stubs to provide an implementation. Two examples are provided in the distribution: xsl/sh-libxml-fop.xsl and xsl/tcl.xsl. Both produce text output: scripts that are evaluated to run the pipeline."
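The in-memory strategy of the Tcl/Tk flavour can be sketched in a few lines: each pipeline step names its input, and intermediate products live in a dictionary rather than in temporary files. The step names and string transforms below are invented stand-ins; a real controller would apply XSLT transformations and the steps would come from an XProc pipeline document.

```python
# Sketch of an in-memory pipeline runner: intermediates are kept in a dict
# (like the Tcl/Tk flavour's DOM trees) instead of temporary files.
# Steps are listed in dependency order; transforms here are invented.
def run_pipeline(steps, source):
    """steps: list of (product, input_name, transform) tuples."""
    products = {"source": source}          # intermediates stay in memory
    for product, input_name, transform in steps:
        products[product] = transform(products[input_name])
    return products

steps = [
    ("normalized", "source", str.strip),   # stand-in for an XSLT step
    ("report", "normalized", str.upper),   # stand-in for a second step
]

result = run_pipeline(steps, "  hello pipeline  ")
print(result["report"])  # HELLO PIPELINE
```

The shell-script flavour would instead write each `products[...]` value to a temporary file and pass file names between steps, which is the trade-off the announcement describes.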

Social Context for Data Analysis
Jon Udell, InfoWorld
I'm a huge fan of the CAPStat (formerly DCStat) program. At InfoWorld's recent SOA Executive Forum this fall, I taped a video interview with Dan Thomas. His innovative efforts led to the Web release of a set of data feeds from the office of Washington, D.C.'s CTO, detailing information about such areas as real estate, reported crime, licensing, and service requests. Earlier I published a podcast and a column on this topic. But despite my cheerleading, the hoped-for citizen-led mashups haven't yet materialized in a big way. ... In principle, the data is there for the taking, and there's an open invitation for anyone to scoop it up and do useful analysis. In practice, only half the battle is won — thanks to the immediate availability of data represented as RSS, Atom, and the district's own, richer flavor of XML. It's great to lay your hands on the data, but as Bob Glushko rightly insists on reminding me, XML only seems to be a self-describing format. What do tags or field names really mean? Which elements or fields are or are not comparable? We can only answer these questions by pointing to instances of data (records, documents), discussing them, and coming to agreements.
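Glushko's point about XML only seeming self-describing is easy to demonstrate: a feed parses cleanly, yet the tag names alone don't settle what the values mean. The feed content below is invented; the district's real feeds use RSS, Atom, and a richer XML vocabulary.

```python
# An invented RSS fragment: parsing succeeds, but the meaning of a field
# like <category> still has to be agreed on between producer and consumer.
import xml.etree.ElementTree as ET

feed = """<rss version="2.0"><channel>
  <item><title>Service request 311</title><category>pothole</category></item>
  <item><title>Service request 312</title><category>streetlight</category></item>
</channel></rss>"""

root = ET.fromstring(feed)
categories = [item.findtext("category") for item in root.iter("item")]
print(categories)  # ['pothole', 'streetlight']
# The parse tells us nothing about whether this 'category' is comparable to
# another feed's 'type' field; that agreement comes from discussing concrete
# records, as the column argues.
```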


XML.org is an OASIS Information Channel sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun Microsystems, Inc.

Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.

