There’s an mSpace demo online.
Archive for February 2005
While researching and evaluating user interfaces and management tools for semantic structures (ontologies, taxonomies, thesauri, etc.), I’ve come across or been directed to two good surveys of tools.
The first, courtesy of HP Labs and the SIMILE project, is Review of existing tools for working with schemas, metadata, and thesauri. Thanks to Will Evans for pointing this out.
The second is a comprehensive review of nearly 100 ontology editors, or applications offering ontology editing capabilities, put together by Michael Denny at XML.com. You can read the full article Ontology Building: A Survey of Editing Tools, or go directly to the Summary Table of Survey Results.
The original date for this is 2002 — it was updated July of 2004.
mSpace, which appeared on Slashdot this morning, is a new framework (including a user interface) for interacting with semantically structured information.
According to the supporting literature, mSpace handles both ontologically structured data and RDF-based information that is not modelled with ontologies.
What is potentially most valuable about the mSpace framework is a useful, usable interface for both navigating / exploring RDF-based information spaces, and editing them.
From the mSpace SourceForge site:
“mSpace is an interaction model designed to allow a user to navigate in a meaningful manner the multi-dimensional space that an ontology can provide. mSpace offers potentially useful slices through this space by selection of ontological categories.
mSpace is fully generalised and as such, with a little definition, can be used to explore any knowledge base (without the requirement of ontologies!).
Please see mspace.ecs.soton.ac.uk for more information.”
From the abstract of the technical report, titled “mSpace: exploring the Semantic Web”:
“Information on the web is traditionally accessed through keyword searching. This method is powerful in the hands of a user that is experienced in the domain they wish to acquire knowledge within. Domain exploration is a more difficult task in the current environment for a user who does not precisely understand the information they are seeking. Semantic Web technologies can be used to represent a complex information space, allowing the exploration of data through more powerful methods than text search. Ontologies and RDF data can be used to represent rich domains, but can have a high barrier to entry in terms of application or data creation cost.
The mSpace interaction model describes a method of easily representing meaningful slices through these multidimensional spaces. This paper describes the design and creation of a system that implements the mSpace interaction model in a fashion that allows it to be applied across almost any set of RDF data with minimal reconfiguration. The system has no requirement for ontological support, but can make use of it if available. This allows the visualisation of existing non-semantic data with minimal cost, without sacrificing the ability to utilise the power that semantically-enabled data can provide.”
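To make the idea of "slices" a little more concrete, here is a minimal sketch in Python. It is my own illustration, not mSpace's code: the triples, the predicate names, and the `slice_space` function are all hypothetical, and I am assuming that a slice is simply an ordered list of dimensions in which each selection narrows the choices offered in the next one.

```python
from collections import defaultdict

# Hypothetical toy data as (subject, predicate, object) triples.
# None of these names come from mSpace or its technical report.
TRIPLES = [
    ("piece:1", "era", "Baroque"),
    ("piece:1", "composer", "Bach"),
    ("piece:2", "era", "Baroque"),
    ("piece:2", "composer", "Vivaldi"),
    ("piece:3", "era", "Romantic"),
    ("piece:3", "composer", "Chopin"),
]


def index_by_subject(triples):
    """Group predicate/object pairs under each subject."""
    index = defaultdict(dict)
    for subject, predicate, obj in triples:
        index[subject][predicate] = obj
    return index


def slice_space(triples, dimensions, selections):
    """Return the values available in the next unselected dimension.

    `dimensions` is the ordered list of predicates that forms the slice;
    `selections` holds the values already chosen for the leading dimensions.
    """
    if len(selections) >= len(dimensions):
        return []  # every dimension already has a selection
    index = index_by_subject(triples)
    # Keep only the subjects that agree with every selection made so far.
    matching = [
        props
        for props in index.values()
        if all(props.get(dim) == value for dim, value in zip(dimensions, selections))
    ]
    next_dimension = dimensions[len(selections)]
    return sorted({props[next_dimension] for props in matching if next_dimension in props})


# Choosing "Baroque" in the first column narrows the composer column.
print(slice_space(TRIPLES, ["era", "composer"], ["Baroque"]))
# Reordering the dimensions gives a different slice through the same data.
print(slice_space(TRIPLES, ["composer", "era"], ["Chopin"]))
```

Note that the sketch needs no ontology at all, only a list of predicates to treat as dimensions, which is roughly the "little definition" the SourceForge description mentions.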
After missing the first four IA summits, I’m very much looking forward to this year’s IA Summit in lovely Montréal.
This is a follow-on to the better part of a year spent working on the strategy, design, and development of a suite of executive dashboards and portals for major pharmaceutical clients.
In the look-over-your-shoulder-as-you-run-forward mode typical of most consulting roles, it’s quite a bit different from the semantic web / semantic architecture work I’m engaged with now. But that’s the joy of always being on to new things.
How Much Information? 2003 is an update to a project first undertaken by researchers at the School of Information Management and Systems at UC Berkeley in 2000. Their intent was to study information storage and flows across print, film, magnetic, and optical media.
It’s not surprising that the United States produces more information than any other single country, but it was eye-opening to read that about 40% of the new stored information in the world every year comes from the U.S.
Also surprising is the total amount of instant message traffic in 2002, estimated at 274 terabytes, and the fact that email is now the second largest information flow, behind the telephone.
Some excerpts from the executive summary:
“Print, film, magnetic, and optical storage media produced about 5 exabytes of new information in 2002. Ninety-two percent of the new information was stored on magnetic media, mostly in hard disks.”
“How big is five exabytes? If digitized with full formatting, the seventeen million books in the Library of Congress contain about 136 terabytes of information; five exabytes of information is equivalent in size to the information contained in 37,000 new libraries the size of the Library of Congress book collections.”
“Hard disks store most new information. Ninety-two percent of new information is stored on magnetic media, primarily hard disks. Film represents 7% of the total, paper 0.01%, and optical media 0.002%.”
“The United States produces about 40% of the world’s new stored information, including 33% of the world’s new printed information, 30% of the world’s new film titles, 40% of the world’s information stored on optical media, and about 50% of the information stored on magnetic media.”
“How much new information per person? According to the Population Reference Bureau, the world population is 6.3 billion, thus almost 800 MB of recorded information is produced per person each year. It would take about 30 feet of books to store the equivalent of 800 MB of information on paper.”
“Most radio and TV broadcast content is not new information. About 70 million hours (3,500 terabytes) of the 320 million hours of radio broadcasting is original programming. TV worldwide produces about 31 million hours of original programming (70,000 terabytes) out of 123 million total hours of broadcasting.”
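The headline comparisons in these excerpts are easy to sanity-check. The short Python sketch below redoes the arithmetic from the quoted figures only; the one assumption is that the units are decimal (1 exabyte = 10^18 bytes), since the excerpts do not say which convention the report uses.

```python
# Assumption: decimal (SI) units. The quoted excerpts do not state the convention.
EXABYTE = 10**18
TERABYTE = 10**12
MEGABYTE = 10**6

new_information_bytes = 5 * EXABYTE           # "about 5 exabytes of new information in 2002"
library_of_congress_bytes = 136 * TERABYTE    # "about 136 terabytes"
world_population = 6.3e9                      # "6.3 billion"

# How many Library of Congress book collections fit in five exabytes?
libraries = new_information_bytes / library_of_congress_bytes

# How much new stored information per person per year, in megabytes?
megabytes_per_person = new_information_bytes / world_population / MEGABYTE

print(f"Library of Congress equivalents: {libraries:,.0f}")
print(f"MB per person per year: {megabytes_per_person:,.0f}")
```

Under that assumption the results land close to the report's rounded figures of 37,000 libraries and "almost 800 MB" per person.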
rdfdata.org offers a great collection of RDF data sets and services that generate RDF.
In the latest issue of ACM Queue, Tim Bray is interviewed about his career path and early involvement with the SGML and XML standards. While recounting this history, Bray makes four points about the slow pace of adoption for RDF, and reiterates his conviction that the current quality of RDF-based tools is an obstacle to their adoption and the success of the Semantic Web.
Here are Bray’s points, with some commentary based on recent experiences with RDF- and OWL-based ontology management tools.
1. Motivating people to provide metadata is difficult. Bray says, “If there’s one thing we’ve learned, it’s that there’s no such thing as cheap meta-data.”
This is plainly a problem in spaces much beyond RDF. I hold the concept and the label meta-data itself partly responsible, since the term meta-data explicitly separates the descriptive/referential information from the idea of the data itself. I wager that user adoption of meta-data tools and processes will increase as soon as we stop dissociating a complete package into two distinct things, with different implied levels of effort and value. I’m not sure what a unified label for the base level unit construct made of meta-data and source data would be (an asset maybe?), but the implied devaluation of meta-data as an optional or supplemental element means that the time and effort demands of accurate and comprehensive tagging seem onerous to many users and businesses. Thus the proliferation of automated taxonomy and categorization generation tools…
2. Inference-based processing is ineffective. Bray says, “Inferring meta-data doesn’t work… Inferring meta-data by natural language processing has always been expensive and flaky with a poor return on investment.”
I think this isn’t specific enough to agree with without qualification. However, I have seen analyses of a number of inferencing systems, and they tend to be slow, especially when processing and updating large RDF graphs. I’m not a systems architect or an engineer, but it does seem that none of the various solutions now available directly solves the problem of allowing rapid, real-time inferencing. This is an issue with structures that change frequently, or during high-intensity periods of the ontology life-cycle, such as initial build and editorial review.
3. Bray says, “To this day, I remain fairly unconvinced of the core Semantic Web proposition. I own the domain name RDF.net. I’ve offered the world the RDF.net challenge, which is that for anybody who can build an actual RDF-based application that I want to use more than once or twice a week, I’ll give them RDF.net. I announced that in May 2003, and nothing has come close.”
Again, I think this needs some clarification, but it brings out a serious potential barrier to the success of RDF and the Semantic Web: the poor quality of existing tools directly lowers user satisfaction. I’ve heard this from users working with both commercial and home-built semantic structure management tools, and at all levels of usage from core to occasional.
To this I would add the idea that RDF was meant for interpretation by machines, not people, and as a consequence the basic user experience paradigms for displaying and manipulating large RDF graphs and other semantic constructs remain unresolved. Mozilla and Netscape did wonders to make the WWW apparent in a visceral and tangible fashion; I suspect RDF may need the same to really take off and enter the realm of the less-than-abstruse.
4. RDF was not intended to be a Knowledge Representation language. Bray says, “My original version of RDF was as a general-purpose meta-data interchange facility. I hadn’t seen that it was going to be the basis for a general-purpose KR version of the world.”
This sounds a bit like a warning, or at least a strong admonition against reaching too far. OWL and variants are new (relatively), so it’s too early to tell if Bray is right about the scope and ambition of the Semantic Web effort being too great. But it does point out that the context of the standard bears heavily on its eventual functional achievement when put into effect. If RDF was never meant to bear its current load, then it’s not a surprise that an effective suite of RDF tools remains unavailable.
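Returning briefly to the inferencing cost raised under point 2: a toy example shows why updates to a large graph can hurt. The Python sketch below is a deliberately naive fixpoint over "is a subclass of" pairs. It is my own illustration, not how any particular reasoner works; the `transitive_closure` function and the class names are hypothetical, and real engines use incremental and indexed techniques that this ignores.

```python
def transitive_closure(edges):
    """Naive fixpoint: keep adding (a, d) whenever (a, b) and (b, d) exist.

    Returns the full set of pairs plus the number of passes over the data.
    Every pass compares each known pair with every other known pair, so the
    work grows roughly with the square of the graph size per pass.
    """
    inferred = set(edges)
    passes = 0
    while True:
        passes += 1
        new_pairs = {
            (a, d)
            for (a, b) in inferred
            for (c, d) in inferred
            if b == c
        } - inferred
        if not new_pairs:
            return inferred, passes
        inferred |= new_pairs


# Hypothetical subclass chain: A under B, B under C, C under D.
asserted = {("A", "B"), ("B", "C"), ("C", "D")}
closure, passes = transitive_closure(asserted)
print(len(closure), "pairs after", passes, "passes")

# A single editorial change means rerunning the whole computation.
closure_after_edit, _ = transitive_closure(asserted | {("D", "E")})
print(len(closure_after_edit), "pairs after adding one assertion")
```

In this naive form, one added assertion forces the entire closure to be recomputed, which is the kind of behavior that makes frequent edits during initial build or editorial review painful.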
Two thumbs up to Anders Ramsay for organizing IA meetups down in NYC. I had the chance to come to one of these regular get-togethers in January, and meet Anders, Lou Rosenfeld, Liz Danzico, Peter Van Dijk, and quite a few others while in town to see clients. After some refreshing beverages at Vig Bar, we moved on to the Mercer Kitchen for a swanky, tasty dinner. Word of mouth has it that the duck there is a religious experience. And it’s always nice to put faces to a great many blog posts.
Anders posted some photos here:
I don’t see any of the umbrellas decorating the interior of the main dining area in the photos — but you had to look up to see them hanging from the ceiling in the first place.
Visual Puzzler Challenge: someone in these photos is a System Architect masquerading as an IA. Can you spot the imposter?