mSpace Online Demo

February 20th, 2005 — 2:48pm

There’s an mSpace demo online.

Two Surveys of Ontology / Taxonomy / Thesaurus Editors

February 18th, 2005 — 2:46pm

While research­ing and eval­u­at­ing user inter­faces and man­age­ment tools for seman­tic struc­tures — ontolo­gies, tax­onomies, the­sauri, etc — I’ve come across or been directed to two good sur­veys of tools.
The first, cour­tesy of HP Labs and the SIMILE project is Review of exist­ing tools for work­ing with schemas, meta­data, and the­sauri. Thanks to Will Evans for point­ing this out.
The sec­ond is a com­pre­hen­sive review of nearly 100 ontol­ogy edi­tors, or appli­ca­tions offer­ing ontol­ogy edit­ing capa­bil­i­ties, put together by Michael Denny at You can read the full arti­cle Ontol­ogy Build­ing: A Sur­vey of Edit­ing Tools, or go directly to the Sum­mary Table of Sur­vey Results.
The survey was originally published in 2002 and was updated in July 2004.

mSpace: A New (Usable?) Semantic Web Interface

February 18th, 2005 — 10:56am

mSpace, which appeared on Slashdot this morning, is a new framework (including a user interface) for interacting with semantically structured information.
Accord­ing to the sup­port­ing lit­er­a­ture, mSpace han­dles both onto­log­i­cally struc­tured data, and RDF based infor­ma­tion that is not mod­elled with ontolo­gies.
What is poten­tially most valu­able about the mSpace frame­work is a use­ful, usable inter­face for both nav­i­gat­ing / explor­ing RDF-based infor­ma­tion spaces, and edit­ing them.
From the mSpace source­forge site:
“mSpace is an inter­ac­tion model designed to allow a user to nav­i­gate in a mean­ing­ful man­ner the multi-dimensional space that an ontol­ogy can pro­vide. mSpace offers poten­tially use­ful slices through this space by selec­tion of onto­log­i­cal cat­e­gories.
mSpace is fully gen­er­alised and as such, with a lit­tle def­i­n­i­tion, can be used to explore any knowl­edge base (with­out the require­ment of ontolo­gies!).
Please see for more information.”
From the abstract of the Tech­ni­cal report, titled mSpace: explor­ing the Seman­tic Web
“Infor­ma­tion on the web is tra­di­tion­ally accessed through key­word search­ing. This method is pow­er­ful in the hands of a user that is expe­ri­enced in the domain they wish to acquire knowl­edge within. Domain explo­ration is a more dif­fi­cult task in the cur­rent envi­ron­ment for a user who does not pre­cisely under­stand the infor­ma­tion they are seek­ing. Seman­tic Web tech­nolo­gies can be used to rep­re­sent a com­plex infor­ma­tion space, allow­ing the explo­ration of data through more pow­er­ful meth­ods than text search. Ontolo­gies and RDF data can be used to rep­re­sent rich domains, but can have a high bar­rier to entry in terms of appli­ca­tion or data cre­ation cost.
The mSpace inter­ac­tion model describes a method of eas­ily rep­re­sent­ing mean­ing­ful slices through these mul­ti­di­men­sional spaces. This paper describes the design and cre­ation of a sys­tem that imple­ments the mSpace inter­ac­tion model in a fash­ion that allows it to be applied across almost any set of RDF data with min­i­mal recon­fig­u­ra­tion. The sys­tem has no require­ment for onto­log­i­cal sup­port, but can make use of it if avail­able. This allows the visu­al­i­sa­tion of exist­ing non-semantic data with min­i­mal cost, with­out sac­ri­fic­ing the abil­ity to utilise the power that semantically-enabled data can provide.”
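To make the idea of "slices" concrete, here is a minimal sketch in Python. It is not mSpace's actual implementation: the triples, the dimension names, and the `slice_values` helper are all hypothetical, invented for illustration. It assumes plain subject/predicate/object data with no ontology behind it, and shows how an ordered list of dimensions can act as columns, where a selection in one column narrows what appears in the columns to its right.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); no schema or ontology.
TRIPLES = [
    ("piece:1", "composer", "Bach"),
    ("piece:1", "period", "Baroque"),
    ("piece:1", "form", "Fugue"),
    ("piece:2", "composer", "Vivaldi"),
    ("piece:2", "period", "Baroque"),
    ("piece:2", "form", "Concerto"),
    ("piece:3", "composer", "Mozart"),
    ("piece:3", "period", "Classical"),
    ("piece:3", "form", "Concerto"),
]


def index_by_subject(triples):
    """Group triples into one {predicate: object} record per subject."""
    records = defaultdict(dict)
    for subject, predicate, obj in triples:
        records[subject][predicate] = obj
    return records


def slice_values(triples, dimensions, selections):
    """Return one list of values per dimension (column).

    Each column shows only the values still reachable after applying
    the selections made in the columns to its left.
    """
    matching = list(index_by_subject(triples).values())
    columns = []
    for dim in dimensions:
        columns.append(sorted({r[dim] for r in matching if dim in r}))
        chosen = selections.get(dim)
        if chosen is not None:
            matching = [r for r in matching if r.get(dim) == chosen]
    return columns


if __name__ == "__main__":
    # Slice ordered period -> composer -> form, with "Baroque" selected.
    print(slice_values(TRIPLES, ["period", "composer", "form"],
                       {"period": "Baroque"}))
```

Reordering the `dimensions` list gives a different slice through the same data, which is roughly the "minimal reconfiguration" the abstract describes.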

See You At the Information Architecture Summit

February 11th, 2005 — 4:43pm

After miss­ing the first four IA sum­mits, I’m very much look­ing for­ward to this year’s IA Sum­mit in lovely Mon­tréal.
For this year’s gathering, I’m presenting a poster on how a simple set of IA building blocks can support powerful information architectures, in the context of interconnected executive portals. Aside from benefits in terms of user experience consistency, learnability, and increased rates of user satisfaction and adoption, the true business value of a system of simple information objects that conveys a tremendous variety of content is in meeting diverse needs for decision making inputs across a wide variety of audiences and functional requirements.
This is a follow-on to the bet­ter part of a year spent work­ing on the strat­egy, design, and devel­op­ment of a suite of exec­u­tive dash­boards and por­tals for major phar­ma­ceu­ti­cal clients.
In the look-over-your-shoulder-as-you-run-forward mode typ­i­cal of most con­sult­ing roles, it’s quite a bit dif­fer­ent from the seman­tic web / seman­tic archi­tec­ture work I’m engaged with now. But that’s the joy of always being on to new things :)

Joining Blogstreet

February 8th, 2005 — 11:54am

I’m explor­ing some blog tools, like blogstreet…

How Much Information Does the World Produce In a Year?

February 8th, 2005 — 10:45am

How Much Infor­ma­tion? 2003 is an update to a project first under­taken by researchers at the School of Infor­ma­tion Man­age­ment and Sys­tems at UC Berke­ley in 2000. Their intent was to study infor­ma­tion stor­age and flows across print, film, mag­netic, and opti­cal media.
It’s not surprising that the United States produces more information than any other single country, but it was eye-opening to read that about 40% of the new stored information in the world every year comes from the U.S.
Also sur­pris­ing is the total amount of instant mes­sage traf­fic in 2002, esti­mated at 274 ter­abytes, and the fact that email is now the sec­ond largest infor­ma­tion flow, behind the tele­phone.
Some excerpts from the exec­u­tive sum­mary:
“Print, film, magnetic, and optical storage media produced about 5 exabytes of new information in 2002. Ninety-two percent of the new information was stored on magnetic media, mostly in hard disks.”
“How big is five exabytes? If digitized with full formatting, the seventeen million books in the Library of Congress contain about 136 terabytes of information; five exabytes of information is equivalent in size to the information contained in 37,000 new libraries the size of the Library of Congress book collections.”
“Hard disks store most new information. Ninety-two percent of new information is stored on magnetic media, primarily hard disks. Film represents 7% of the total, paper 0.01%, and optical media 0.002%.”
“The United States produces about 40% of the world’s new stored information, including 33% of the world’s new printed information, 30% of the world’s new film titles, 40% of the world’s information stored on optical media, and about 50% of the information stored on magnetic media.”
“How much new information per person? According to the Population Reference Bureau, the world population is 6.3 billion, thus almost 800 MB of recorded information is produced per person each year. It would take about 30 feet of books to store the equivalent of 800 MB of information on paper.”
“Most radio and TV broad­cast con­tent is not new infor­ma­tion. About 70 mil­lion hours (3,500 ter­abytes) of the 320 mil­lion hours of radio broad­cast­ing is orig­i­nal pro­gram­ming. TV world­wide pro­duces about 31 mil­lion hours of orig­i­nal pro­gram­ming (70,000 ter­abytes) out of 123 mil­lion total hours of broadcasting.”
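The two headline comparisons in the excerpts (37,000 Libraries of Congress, and almost 800 MB per person) can be checked with a few lines of arithmetic. This is a rough sketch that assumes decimal units (1 terabyte = 10^12 bytes, 1 exabyte = 10^18 bytes). The report's own unit conventions are not stated in the excerpts, so that is an assumption, and the function names are made up for illustration.

```python
# Assumed decimal (SI) units; the excerpts do not say which convention is used.
MEGABYTE = 10**6
TERABYTE = 10**12
EXABYTE = 10**18

# Figures quoted in the executive summary excerpts.
NEW_INFO_BYTES = 5 * EXABYTE              # new stored information in 2002
LIBRARY_OF_CONGRESS_BYTES = 136 * TERABYTE
WORLD_POPULATION = 6.3e9


def libraries_of_congress(total_bytes, library_bytes=LIBRARY_OF_CONGRESS_BYTES):
    """How many Library of Congress book collections fit in total_bytes."""
    return total_bytes / library_bytes


def megabytes_per_person(total_bytes, population=WORLD_POPULATION):
    """Average new stored information per person, in megabytes."""
    return total_bytes / population / MEGABYTE


if __name__ == "__main__":
    print(round(libraries_of_congress(NEW_INFO_BYTES)))  # about 37,000
    print(round(megabytes_per_person(NEW_INFO_BYTES)))   # just under 800
```

Under these assumptions both results land close to the rounded figures in the summary.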

Public RDF Data Sets at

February 8th, 2005 — 9:56am

The site offers a great collection of RDF data sets and services that generate RDF.

Tim Bray and the RDF Challenge: Poor Tools Are A Barrier For The Semantic Web

February 7th, 2005 — 4:56pm

In the latest issue of ACM Queue, Tim Bray is interviewed about his career path and early involvement with the SGML and XML standards. Along the way, Bray makes four points about the slow pace of adoption for RDF, and reiterates his conviction that the current quality of RDF-based tools is an obstacle to their adoption and the success of the Semantic Web.
Here are Bray’s points, with some com­men­tary based on recent expe­ri­ences with RDF and OWL based ontol­ogy man­age­ment tools.
1. Motivating people to provide metadata is difficult. Bray says, “If there’s one thing we’ve learned, it’s that there’s no such thing as cheap meta-data.”
This is plainly a problem in spaces well beyond RDF. I hold the concept and the label meta-data itself partly responsible, since the term meta-data explicitly separates the descriptive/referential information from the idea of the data itself. I wager that user adoption of meta-data tools and processes will increase as soon as we stop dissociating a complete package into two distinct things, with different implied levels of effort and value. I’m not sure what a unified label for the base level unit construct made of meta-data and source data would be (an asset maybe?), but the implied devaluation of meta-data as an optional or supplemental element means that the time and effort demands of accurate and comprehensive tagging seem onerous to many users and businesses. Thus the proliferation of automated taxonomy and categorization generation tools…
2. Inference based processing is ineffective. Bray says, “Inferring meta-data doesn’t work… Inferring meta-data by natural language processing has always been expensive and flaky with a poor return on investment.”
I think this isn’t specific enough to agree with without qualification. However, I have seen analyses of a number of inferencing systems, and they tend to be slow, especially when processing and updating large RDF graphs. I’m not a systems architect or an engineer, but it does seem that none of the various solutions now available directly solves the problem of allowing rapid, real-time inferencing. This is an issue with structures that change frequently, or during high-intensity periods of the ontology life-cycle, such as initial build and editorial review.
3. Bray says, “To this day, I remain fairly unconvinced of the core Semantic Web proposition. I own the domain name I’ve offered the world the challenge, which is that for anybody who can build an actual RDF-based application that I want to use more than once or twice a week, I’ll give them I announced that in May 2003, and nothing has come close.”
Again, I think this needs some clar­i­fi­ca­tion, but it brings out a seri­ous poten­tial bar­rier to the suc­cess of RDF and the Seman­tic Web by show­cas­ing the poor qual­ity of exist­ing tools as a direct neg­a­tive influ­encer on user sat­is­fac­tion. I’ve heard this from users work­ing with both com­mer­cial and home-built seman­tic struc­ture man­age­ment tools, and at all lev­els of usage from core to occa­sional.
To this I would add the idea that RDF was meant for inter­pre­ta­tion by machines not peo­ple, and as a con­se­quence the basic user expe­ri­ence par­a­digms for dis­play­ing and manip­u­lat­ing large RDF graphs and other seman­tic con­structs remain unre­solved. Mozilla and Netscape did won­ders to make the WWW appar­ent in a vis­ceral and tan­gi­ble fash­ion; I sus­pect RDF may need the same to really take off and enter the realm of the less-than-abstruse.
4. RDF was not intended to be a Knowledge Representation language. Bray says, “My original version of RDF was as a general-purpose meta-data interchange facility. I hadn’t seen that it was going to be the basis for a general-purpose KR version of the world.”
This sounds a bit like a warn­ing, or at least a strong admo­ni­tion against reach­ing too far. OWL and vari­ants are new (rel­a­tively), so it’s too early to tell if Bray is right about the scope and ambi­tion of the Seman­tic Web effort being too great. But it does point out that the con­text of the stan­dard bears heav­ily on its even­tual func­tional achieve­ment when put into effect. If RDF was never meant to bear its cur­rent load, then it’s not a sur­prise that an effec­tive suite of RDF tools remains unavailable.

NYC Information Architecture Meetup

February 7th, 2005 — 9:49am

Two thumbs up to Anders Ram­say for orga­niz­ing IA mee­tups down in NYC. I had the chance to come to one of these reg­u­lar get-togethers in Jan­u­ary, and meet Anders, Lou Rosen­feld, Liz Danz­ico, Peter Van Dijk, and quite a few oth­ers while in town to see clients. After some refresh­ing bev­er­ages at Vig Bar, we moved on to the Mer­cer Kitchen for a swanky, tasty din­ner. Word of mouth has it that the duck at is a reli­gious expe­ri­ence. And it’s always nice to put faces to a great many blog posts.
Anders posted some pho­tos here:
I don’t see any of the umbrel­las dec­o­rat­ing the inte­rior of the main din­ing area in the pho­tos — but you had to look up to see them hang­ing from the ceil­ing in the first place.
Visual Puzzler Challenge: someone in these photos is a System Architect masquerading as an IA. Can you spot the imposter?

