XML and Web Services In The News - 14 December 2006
Provided by OASIS
Edited by Robin Cover
This issue of XML Daily Newslink is sponsored by SAP AG
HEADLINES:
IBM Project Aims to Help Blind Use ODF Applications
Elizabeth Montalbano, InfoWorld
When Massachusetts' government decided to use Open Document Format
(ODF) as the default document file format throughout its agencies, a
key concern was that ODF would not allow the visually impaired to use
assistive computer technologies. On Wednesday, IBM said it has helped
solve that problem by developing technology that will allow applications
based on ODF to better communicate with products used by the blind to
access visual information on computer screens. Through Project Missouri,
IBM developed application programming interfaces (APIs), collectively
called iAccessible2. These APIs will make it easy for visuals in
applications based on ODF and other Web technologies to be interpreted
by screen readers that reproduce that information verbally, IBM said.
IBM spokesman Ari Fishkind said that in the past it has been hard for
screen-reading technology to keep up with the advent of cutting-edge
development techniques and file formats such as ODF, AJAX (Asynchronous
JavaScript and XML), and DHTML (Dynamic HTML). The latter two
technologies allow increasingly complex visuals to be rendered in Web
browsers, and those are difficult to translate for screen readers.
iAccessible2 not only will help ODF communicate better with screen
readers that assist blind computer users, but it will also allow charts,
pictures and other visuals based on AJAX and DHTML to be discerned by
the visually impaired through those readers. "It's like a universal
decoder ring," he said of iAccessible2. The technology is based on
interfaces IBM originally developed with Sun Microsystems to make
programs on Java and Linux platforms accessible to the blind.
Teradata Unveils New Advanced Analytics for Data Warehousing
Chris Preimesberger, eWEEK
Data warehouse giant Teradata has introduced a new suite of data-mining
software for enterprises that it says significantly automates data
preparation and accelerates performance of its partners' data-mining
tools. Teradata Warehouse Miner 5.0 lets users advance beyond simply
describing what their customers did last quarter to predicting future
buying behavior. Using the data exploration features, businesses can
identify and resolve a wide range of data-quality issues, including
identifying duplicate records and missing data, validating data accuracy,
verifying data formats, and identifying outlying data that may skew
an analysis. ADS generation: The Teradata ADS (Analytic Data Set)
Generator is a flexible, open data-mining product that streamlines the
most time-consuming and critical steps of data mining: the preparation
of data for analysis. Most businesses spend up to 70 percent of their
data-mining time and resources just getting data ready for analysis.
Extended Predictive Model Markup Language: Teradata has extended support
of the XML-based PMML (Predictive Model Markup Language) to additional
data-mining vendors. PMML is an open standard for sharing analytic models
among applications; it enables businesses to port desktop analytic models
to a large parallel database with minimal effort and to leverage its
power. The use of PMML improves performance and scalability.
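For readers new to the standard, a minimal and purely hypothetical PMML
document might look like the following sketch; the element names follow
the DMG PMML schema, while the model name, field names, and coefficients
are invented for illustration:

  <PMML version="3.1" xmlns="http://www.dmg.org/PMML-3_1">
    <Header copyright="Example" description="Hypothetical model for illustration only"/>
    <DataDictionary numberOfFields="2">
      <DataField name="tenure" optype="continuous" dataType="double"/>
      <DataField name="spend" optype="continuous" dataType="double"/>
    </DataDictionary>
    <RegressionModel modelName="SpendForecast" functionName="regression">
      <MiningSchema>
        <MiningField name="tenure"/>
        <MiningField name="spend" usageType="predicted"/>
      </MiningSchema>
      <RegressionTable intercept="12.5">
        <NumericPredictor name="tenure" coefficient="3.2"/>
      </RegressionTable>
    </RegressionModel>
  </PMML>

A model exported in this form from a desktop tool can, in principle, be
consumed by any PMML-aware application, which is what makes moving models
into the warehouse tractable.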
See also: on PMML
ebXML Registry Profile for Web Ontology Language (OWL) Version 1.5
Asuman Dogac (ed), OASIS Committee Draft
OASIS announced the publication of a public review draft for the "ebXML
Registry Profile for Web Ontology Language (OWL) Version 1.5"
specification, with the review period ending 11 February 2007. Produced by members of the
ebXML Registry Semantic Content Management Subcommittee, this document
defines the ebXML Registry profile for publishing, management, discovery,
and reuse of OWL Lite Ontologies. The SC was chartered to define use
cases and requirements for managing semantic content within the ebXML
Registry 4.0, seeking to establish a formal liaison with relevant groups
within the Semantic Web Activity (SWA) at W3C. The requirements
include the ability to utilize ontologies for classifying RegistryObjects
and to enable intelligent discovery using ontology-based queries. The
SCMSC was tasked to identify specific Semantic Web technologies (e.g.
RDF, OWL) that are necessary to support the requirements identified for
semantic content management. From the specification introduction: The
ebXML Registry holds the metadata for the RegistryObjects, while the
documents pointed at by the RegistryObjects reside in an ebXML
repository. The basic semantic mechanisms of ebXML Registry are
classification hierarchies (ClassificationScheme) consisting of
ClassificationNodes and the Association Types among RegistryObjects.
Furthermore, RegistryObjects can be assigned properties through a slot
mechanism and RegistryObjects can be classified using instances of
Classification, ClassificationScheme and ClassificationNodes. Given
these constructs, a considerable amount of semantics can be defined in
the registry. However, semantics has become a much broader issue than
it used to be, since several application domains are making use of
ontologies to add knowledge to their data and applications.
This document normatively defines the ebXML Registry profile for Web
Ontology Language (OWL) Lite. More specifically, this document
normatively specifies how OWL Lite constructs should be represented by
ebXML RIM constructs without causing any changes in the core ebXML
Registry specifications. Furthermore, this document normatively
specifies the code to process some of the OWL semantics through
parameterized stored procedures that should be made available from the
ebXML Registry. Although this Profile is related to the ebXML Registry
specifications and not to any particular implementation, the freebXML
Registry implementation is used in order to give concrete examples.
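To make the subject matter concrete, the following is a minimal,
hypothetical OWL Lite ontology in RDF/XML (the class and property names
are invented; the normative representation of such constructs as ebXML
RIM objects is defined by the profile itself):

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
           xmlns:owl="http://www.w3.org/2002/07/owl#"
           xml:base="http://example.org/procurement">
    <owl:Ontology rdf:about=""/>
    <owl:Class rdf:ID="Service"/>
    <owl:Class rdf:ID="PaymentService">
      <rdfs:subClassOf rdf:resource="#Service"/>
    </owl:Class>
    <owl:ObjectProperty rdf:ID="providedBy">
      <rdfs:domain rdf:resource="#Service"/>
    </owl:ObjectProperty>
  </rdf:RDF>

Once such an ontology has been published to the registry, RegistryObjects
classified against it can be discovered through ontology-based queries,
for example queries that exploit the subClassOf hierarchy.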
See also: the SC
Health Insurers Create Personal Health Records
Grant Gross, InfoWorld
Two large health insurance trade groups based in the U.S. have released
a model for personal health records, a portable, Web-based tool that
includes a customer's insurance claims, immunization records, medication
records and other health information. America's Health Insurance Plans
(AHIP) and the Blue Cross and Blue Shield Association, whose members
provide health insurance to about two-thirds of U.S. residents, unveiled
the personal health record model Wednesday. The two groups saw the
importance of working together on the project, said Susan Pisano, vice
president of communications for AHIP. "This is really an effort that
cries out for collaboration," she said. U.S. President George Bush has
pushed for electronic health records to be available to all U.S.
residents by 2014. Backers of such records say they will improve the
efficiency of the U.S. health-care system and cut down on errors such
as drug interaction problems. Personal health records (PHRs) are similar to other electronic health
records, although they include less specific treatment information.
Electronic health records typically are used by health-care providers to
store and manage detailed clinical information. Patients will be able
to enter information into their PHRs, in addition to information from
pharmacies, laboratories and medical providers, the groups said. The
model released Wednesday includes definitions of data elements that
should be included in PHRs, such as risk factors, family history,
health-care facilities and medications taken. The model also includes
standards for the PHRs to be portable between insurers and providers,
and rules about when insurers can share the information.
See also: the announcement
Semantic Wikis and Disaster Relief Operations
Soenke Ziesche, XML.com
Access to timely information is critical for relief operations in
emergency situations. Over the last few years, social-networking Web systems,
such as wikis, have become more and more sophisticated and can also be
applied fruitfully in humanitarian information management. However, a
major drawback of the Web currently is that its content is not machine-
readable, a shortcoming that is addressed by the Semantic Web approach.
The web sites hosting information on humanitarian emergencies and
disasters rarely use the social-networking concept of fast and massive
user participation. This becomes apparent when listing the information
products: situation reports, press releases, contact lists, databases
of assessments, who-does-what-where, etc. Certainly, those products are
produced by or based on the input of the concerned community. But with
wikis, information can be provided much more quickly and directly, which
is critical in humanitarian disasters — particularly in the early
stages. In this article, I'll first propose using wikis to share
information faster and more easily during emergencies, and secondly,
I'll introduce a way to enhance them semantically. It is particularly
promising to create a Semantic Web extension for wikis, i.e., to provide
them with an underlying model of the knowledge described in their
entries. While conventional wikis offer only a full-text search of
their content as well as a categorization of articles, a semantic wiki
would provide a query opportunity based on an RDF query language such
as SPARQL. However, for relief workers under extreme time pressure, a
convenient interface is necessary, such as that provided by [2] in
their implementation. Queries over typed links can be reduced to
3-tuples (subject article, typed link, object article) in which one or
two fields are left empty, while queries over attribute-value pairs are
2-tuples. The results are given as two- or three-column tables,
respectively.
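As a hypothetical illustration (the article and property names are
invented), a typed link and an attribute-value pair from a
relief-operations wiki could be represented in RDF/XML roughly as
follows; a query such as "which camps are located in District_North?"
then simply leaves the subject position open:

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:wiki="http://example.org/reliefwiki/property#">
    <rdf:Description rdf:about="http://example.org/reliefwiki/Camp_Alpha">
      <!-- typed link: (Camp_Alpha, locatedIn, District_North) -->
      <wiki:locatedIn rdf:resource="http://example.org/reliefwiki/District_North"/>
      <!-- attribute-value pair: (Camp_Alpha, capacity, 1200) -->
      <wiki:capacity rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">1200</wiki:capacity>
    </rdf:Description>
  </rdf:RDF>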
W3C Workshop Report: Keeping Privacy Promises
Staff, W3C Report
A W3C Privacy Workshop Report recommending next steps for keeping
privacy promises when exchanging sensitive information on the Web is
now available. In October 2006, privacy and access control experts
from America, Australia, Asia and Europe met to study Web privacy
issues and solutions. On the Web, information collection and transfer
are routine, often conducted by multiple parties in a manner transparent
to the user. As more parties are granted access to information, it
becomes more challenging to track chains of privacy promises and to
enforce them. Tools can help, but tools require descriptions of
access privileges, and such descriptions can be hard to formulate
when so many parties are involved. Though we may be familiar with
scenarios such as a doctor exchanging patient information with a
laboratory, these issues are not limited to large-scale enterprises.
More individuals are sharing personal information (photos, blog entries,
etc.) on the Web. They too recognize the need for more effective
approaches for managing personal information, for describing who
can access their information, and for learning who is to be held
accountable when a given service does not respect their privacy
preferences. One key issue for near-term follow-up was the area of
policy interoperability and mapping: While there seemed to be no
interest among participants in creating a new, all-encompassing access
control and obligation language, there was significant interest in
exploring the interfaces between different, possibly domain-specific
policy languages. Ontologies and common modeling principles could help
combine these languages and also help enable automatic translation
between different languages. Important contributions in this area could
include a standardized language to describe evidence and mechanisms for
the discovery of ontologies. More than a third of the participants in
the workshop indicated interest in launching a W3C Interest Group to
further explore this space. Other relevant questions in this context
concerned unifying frameworks for access control, data handling, and
usage control languages; this area of work could help leverage
languages developed in the DRM space for privacy protection, and could
help to clarify the applicability of access-control languages such as
XACML in the privacy space. There was also discussion of developing
and expressing pre-defined sets of user preferences, in order to
improve the usability of policy-based technologies.
See also: the announcement
Simple Content Management System: XProc Pipeline Controller in XSLT
Steve Ball, Software Announcement
"Here at Explain we're very interested in XML pipelines. This is because
they lie at the heart of content management systems that we create,
like the Simple Content Management System (SCMS), a major component of
the Packaged Press service. We started using the XML Pipeline Definition
Language for specifying pipelines, but now we are building systems
around XProc. Our first foray into the XProc world is the imaginatively
named xproc application. This is an implementation of an XProc pipeline
controller in XSLT. It comes in two flavours: (1) Command line: use the
sh-libxml-fop.xsl XSL stylesheet to produce a Bourne shell script.
Evaluating the shell script will build the products of the pipeline.
This implementation uses libxslt's xsltproc and FOP. (2) GUI application:
xproc.exe is an application for MS Windows that evaluates an XProc
pipeline. It works by executing the XSL stylesheet tcl.xsl that produces
a Tcl script, which is then evaluated to build the products of the
pipeline. The main difference between these implementations is that the
shell script uses temporary files for the intermediate build products,
whereas the Tcl/Tk application keeps the intermediate documents in memory
(as DOM trees). Architecture: The core stylesheet module is xsl/xproc.xsl.
This module interprets a pipeline document and determines the products to
build and their dependencies. The module makes calls to various named
templates to do the real work, but within the module these are stubs. A
higher level stylesheet imports the core module and overrides the stubs
to provide an implementation. Two examples are provided in the
distribution: xsl/sh-libxml-fop.xsl and xsl/tcl.xsl. Both produce text
output: scripts that are evaluated to run the pipeline."
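As a rough sketch of the import-and-override pattern described above (the
template and parameter names below are hypothetical and do not reflect
the actual stub names in the distribution), a command-line flavour could
override a stub so that each XSLT step in the pipeline is emitted as an
xsltproc invocation:

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Import the core interpreter; its named templates are stubs -->
    <xsl:import href="xproc.xsl"/>
    <xsl:output method="text"/>
    <!-- Override a stub: emit one shell command per XSLT step -->
    <xsl:template name="run-xslt-step">
      <xsl:param name="stylesheet"/>
      <xsl:param name="input"/>
      <xsl:param name="output"/>
      <xsl:text>xsltproc -o </xsl:text>
      <xsl:value-of select="concat($output, ' ', $stylesheet, ' ', $input)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:template>
  </xsl:stylesheet>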
Social Context for Data Analysis
Jon Udell, InfoWorld
I'm a huge fan of the CAPStat (formerly DCStat) program. At InfoWorld's
recent SOA Executive Forum this fall, I taped a video interview with Dan
Thomas. His innovative efforts led to the Web release of a set of data
feeds from the office of Washington, D.C.'s CTO, detailing information
about such areas as real estate, reported crime, licensing, and service
requests. Earlier I published a podcast and a column on this topic. But
despite my cheerleading, the hoped-for citizen-led mashups haven't yet
materialized in a big way. ... In principle, the data is there for the
taking, and there's an open invitation for anyone to scoop it up and do
useful analysis. In practice, only half the battle is won — thanks to
the immediate availability of data represented as RSS, Atom, and the
district's own, richer flavor of XML. It's great to lay your hands on
the data, but as Bob Glushko rightly insists on reminding me, XML only
seems to be a self-describing format. What do tags or field names really
mean? Which elements or fields are or are not comparable? We can only
answer these questions by pointing to instances of data (records,
documents), discussing them, and coming to agreements.
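A small, invented example makes the point. Two feeds describing the same
kind of service request might use markup like this (both fragments are
hypothetical):

  <!-- Feed A -->
  <serviceRequest>
    <ward>6</ward>
    <opened>2006-11-02</opened>
    <type>POTHOLE</type>
  </serviceRequest>

  <!-- Feed B -->
  <request district="6" date="11/02/2006" category="Road - Pothole"/>

Whether "ward" and "district" are comparable, and which date convention
each feed follows, is exactly the kind of question that can only be
settled by that social process of pointing, discussing, and agreeing.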
XML.org is an OASIS Information Channel
sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun
Microsystems, Inc.
Use http://www.oasis-open.org/mlmanage
to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml
for the list archives.