Tag: visualization


Defining Discovery: Core Concepts

October 18th, 2013 — 12:33pm

Discovery tools have had a referenceable working definition since at least 2001, when Ben Shneiderman published ‘Inventing Discovery Tools: Combining Information Visualization with Data Mining’. Dr. Shneiderman suggested the combination of the two distinct fields of data mining and information visualization could manifest as a new category of tools for discovery, an understanding that remains essentially unaltered over ten years later. An industry analyst report titled Visual Discovery Tools: Market Segmentation and Product Positioning from March of this year, for example, reads, “Visual discovery tools are designed for visual data exploration, analysis and lightweight data mining.”

Tools should fol­low from the activ­i­ties peo­ple under­take (a foun­da­tional tenet of activ­ity cen­tered design), how­ever, and Dr. Shnei­der­man does not in fact describe or define dis­cov­ery activ­ity or capa­bil­ity. As I read it, dis­cov­ery is assumed to be the implied sum of the sep­a­rate fields of visu­al­iza­tion and data min­ing as they were then under­stood.  As a work­ing def­i­n­i­tion that cat­alyzes a field of prod­uct pro­to­typ­ing, it’s ade­quate in the short term.  In the long term, it makes the bound­aries of dis­cov­ery both derived and tem­po­rary, and leaves a sub­stan­tial gap in the land­scape of core con­cepts around dis­cov­ery, mak­ing con­sen­sus on the nature of most aspects of dis­cov­ery dif­fi­cult or impos­si­ble to reach.  I think this def­i­n­i­tional gap is a major rea­son that dis­cov­ery is still an ambigu­ous prod­uct landscape.

To help close that gap, I’m suggesting definitions of four core aspects of discovery. These come out of our sustained research into discovery needs and practices, and have the goal of clarifying the relationship between discovery and other analytical categories. They are suggested, but should be internally coherent and consistent.

Discovery activity is: “Purposeful sense making activity that intends to arrive at new insights and understanding through exploration and analysis (and for these we have specific definitions as well) of all types and sources of data.”

Dis­cov­ery capa­bil­ity is: “The abil­ity of peo­ple and orga­ni­za­tions to pur­pose­fully real­ize valu­able insights that address the full spec­trum of busi­ness ques­tions and prob­lems by engag­ing effec­tively with all types and sources of data.”

Dis­cov­ery tools: “Enhance indi­vid­ual and orga­ni­za­tional abil­ity to real­ize novel insights by aug­ment­ing and accel­er­at­ing human sense mak­ing to allow engage­ment with all types of data at all use­ful scales.”

Dis­cov­ery envi­ron­ments: “Enable orga­ni­za­tions to under­take effec­tive dis­cov­ery efforts for all busi­ness pur­poses and per­spec­tives, in an empir­i­cal and coöper­a­tive fashion.”

Note: applic­a­bil­ity to a world of Big data is assumed — thus the refs to all scales / types / sources — rather than stated explic­itly.  I like that Big Data doesn’t have to be writ­ten into this core set of def­i­n­i­tions, b/c I think it’s a tran­si­tional label — the new ver­sion of Web 2.0 — and goes away over time.


Comment » | Language of Discovery

The Architecture of Discovery: Slides from Discover Conference 2011

April 16th, 2011 — 1:11pm

Endeca invites cus­tomers, part­ners and lead­ing mem­bers of the broader search and dis­cov­ery tech­nol­ogy and solu­tions com­mu­ni­ties to meet annu­ally, and show­case the most inter­est­ing and excit­ing work in the field of dis­cov­ery.  As lead for the UX team that designs Endeca’s dis­cov­ery prod­ucts, I shared some of our recent work on pat­terns in the struc­ture of dis­cov­ery appli­ca­tions, as well as best prac­tices in infor­ma­tion design and visu­al­iza­tion that we use to drive prod­uct def­i­n­i­tion and design for Endeca’s Lat­i­tude Dis­cov­ery Framework.

This mate­r­ial is use­ful for pro­gram and project man­agers and busi­ness ana­lysts defin­ing require­ments for dis­cov­ery solu­tions and appli­ca­tions, UX and sys­tem archi­tects craft­ing high-level struc­tures and address­ing long-term growth, inter­ac­tion design­ers and tech­ni­cal devel­op­ers defin­ing and build­ing infor­ma­tion work­spaces at a fine grain, and

There are three major sec­tions: the first presents some of our tools for iden­ti­fy­ing and under­stand­ing people’s needs and goals for dis­cov­ery in terms of activ­ity (the Lan­guage of Dis­cov­ery as we call it), the sec­ond brings together screen-level, appli­ca­tion level, and user sce­nario / use-case level pat­terns we’ve observed in the appli­ca­tions cre­ated to meet those needs, and the final sec­tion shares con­densed best prac­tices and fun­da­men­tal prin­ci­ples for infor­ma­tion design and visu­al­iza­tion based on aca­d­e­mic research dis­ci­plines such as cog­ni­tive sci­ence and infor­ma­tion retrieval.

It’s no coin­ci­dence that these sec­tions reflect the appli­ca­tion of the core UX dis­ci­plines of user research, infor­ma­tion archi­tec­ture, and inter­ac­tion design to the ques­tion of “who will need to encounter infor­ma­tion for some end, and in what kind of expe­ri­ence will they encounter it”.  This flow and order­ing is delib­er­ate; it demon­strates on two lev­els the results of our own efforts apply­ing the UX per­spec­tive to the ques­tions inher­ent in cre­at­ing dis­cov­ery tools, and shares some of the tools, insights, tem­plates, and resources we use to shape the plat­form used to cre­ate dis­cov­ery expe­ri­ences across diverse industries.

Ses­sion outline

  1. Under­stand­ing User Needs
  2. Design Pat­terns for Dis­cov­ery Applications
  3. Design Principles and Guidelines for Information Interaction and Visualization

Ses­sion description

“How can you harness the power and flexibility of Latitude to create useful, usable, and compelling discovery applications for enterprise discovery workers? This session goes beyond the technology to explore how you can apply fundamental principles of information design and visualization, analytics best practices and user interface design patterns to compose effective and compelling discovery applications that optimize user discovery, success, engagement, & adoption.”

The pat­terns are prod­uct spe­cific in that they show how to com­pose screens and appli­ca­tions using the pre­de­fined com­po­nents in the Dis­cov­ery Frame­work library.  How­ever, many of the product-specific com­po­nents are built to address com­mon or recur­ring needs for inter­ac­tion with infor­ma­tion via well-known mech­a­nisms such as search, fil­ter­ing, nav­i­ga­tion, visu­al­iza­tion, and pre­sen­ta­tion of data.  In other words, even if you’re not using the lit­eral Dis­cov­ery Frame­work com­po­nent library to com­pose your spe­cific infor­ma­tion analy­sis work­space, you’ll find these pat­terns rel­e­vant at work­space and appli­ca­tion lev­els of scale.

The deeper story of these patterns is in demonstrating the evolution of discovery and analysis applications over time. Typically, discovery applications begin by offering users a general-purpose workspace that satisfies a wide range of interaction tasks in an approximate fashion. Over time, via successive expansions in the scope and variety of data they present, and the discovery and analysis capabilities they provide, discovery applications grow to include several different types of workspaces that individually address distinct sets of needs for visualization and sense making by using very different combinations of components. As a composite, these functionally and informationally diverse workspaces span the full range of interaction needs for differing types of users.

I hope you find this toolkit and col­lec­tion of pat­terns and infor­ma­tion design prin­ci­ples use­ful.  What are some of the resources you’re using to take on these challenges?

User Expe­ri­ence Archi­tec­ture For Dis­cov­ery Appli­ca­tions from Joe Laman­tia

Comment » | Dashboards & Portals, Enterprise, Information Architecture, User Experience (UX)

8 Waves of Change Shaping Digital Experiences

December 11th, 2008 — 5:21am

I’ve been focused on under­stand­ing future direc­tions in the land­scape of dig­i­tal expe­ri­ences recently (which nicely par­al­lels some of the work I’ve been doing on design and futures in gen­eral), so I’m shar­ing a sum­mary of the analy­sis that’s come out of this research.
This pre­sen­ta­tion shares an overview of all the major waves of change affect­ing dig­i­tal expe­ri­ences, some of the espe­cially forward-looking insights around shifts in our iden­ti­ties, and the impli­ca­tions for those cre­at­ing dig­i­tal expe­ri­ences.
The 8 waves discussed here (are there more? let me know!):

  • Dig­i­tal = Social
  • Co-Creation
  • Dig­i­tal Natives
  • Itʼs All a Game
  • Take Away
  • Every­ware
  • Con­ver­gence
  • See­ing Is Believing

Comment » | Ideas, The Media Environment, User Experience (UX)

Is Daylife the Collective Conscious?

July 20th, 2007 — 3:55pm

Jung posited the idea of the col­lec­tive uncon­scious (later refined, but a good point of depar­ture). Do Daylife and sim­i­lar stream aggre­ga­tors / visu­al­iz­ers (I’m reach­ing for a han­dle to describe these enti­ties) like Uni­verse, point at what a col­lec­tive con­scious could be?
Uni­verse
Some pre­cur­sors might be Yahoo’s Taglines and TagMaps, Google Zeit­geist / Trends, and the var­i­ous cloud style visu­al­iza­tions like clouda­li­cious, etc.
Plainly, the num­ber and vari­ety of tools and des­ti­na­tions for visu­al­iz­ing what’s on the mind of groups is grow­ing rapidly.
If the par­al­lelism holds, mean­ing Daylife and kin are them­selves points of depar­ture, where is this going? I’m not think­ing of col­lec­tive intel­li­gence — just the visu­al­iza­tion aspect, and how that may evolve.

Comment » | Ideas

Watching Ideas Bloom: Text Clouds of the Republican Debate At Democrats.org

May 4th, 2007 — 8:07pm

A meme is emerging for the use of text clouds as a visualization for, and a source of insight into, political speeches and speakers.
Text clouds of the Republican Presidential candidates’ debate appear front and center on the DNC blog democrats.org, in Tag Clouds Can Tell Us a Lot… (sourced from media analysis firm Upstream Analysis via Pollster.com).
[Images: Giuliani and Brownback text clouds]
As you can see in the quote from the writeup below, we’re quickly devel­op­ing sophis­ti­cated read­ings of the (com­par­a­tively) sim­ple visu­al­iza­tion meth­ods used to gen­er­ate text clouds.
But some­times a cloud also reflects con­cerns that vot­ers share about a can­di­date. This is because the can­di­date gets asked about the issue–a lot–and then has to talk about it.
Check out the large “Pro-Life” tag in flip-flopping Romney’s cloud, or the large “Think” tag in Giuliani’s cloud–the can­di­date noto­ri­ous for leap­ing first and think­ing later.

Polit­i­cal inter­pre­ta­tions aside, this is a nuanced read­ing of the result­ing clouds: it rec­og­nizes the dynamic feed­back link between inten­tions and responses that becomes vis­i­ble in the ren­dered clouds. For a visu­al­iza­tion geek, these clouds show the dif­fer­ing agen­das of can­di­dates and audi­ence as they played out, a nice exam­ple of social mech­a­nisms in action.
Note to the tool builders of the world
How about putting together a visu­al­iza­tion toolset that shows evolv­ing text clouds as the debate pro­gresses? I’m imag­in­ing a time­line plus tran­script plus cloud view of the accu­mu­lat­ing text cloud for each can­di­date, with options for mov­ing for­ward or back in the stream of words.
What could be better than watching words and ideas bloom over time, the same way we see flowers in a garden blossom, open, and close in time-lapse photography? I’d like to think we can grow something poetic and beautiful, as well as useful, from the (sadly debased) soil of politicized sound bites surrounding us.

Or, with a nod to the brutal competition built into most natural systems, you may choose to watch the struggle of water lilies for sunlight, in this clip from The Amazing Life of Plants.
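For the tool builders: the evolving cloud suggested in the note above could start from something as small as the sketch below. It is only an illustration under my own assumptions, not a description of any existing tool. The transcript format (a list of speaker and utterance pairs), the tiny stop-word list, and the names `tokenize` and `cumulative_clouds` are all hypothetical, and the sample debate lines are invented.

```python
from collections import Counter
import re

# Hypothetical stop-word list; a real tool would need a much fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "that", "i", "we", "it"}


def tokenize(text):
    """Lowercase the text and return its word tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def cumulative_clouds(transcript):
    """Yield (turn_index, speaker, snapshot) for each turn of the transcript.

    transcript is a list of (speaker, utterance) pairs in spoken order.
    Each snapshot maps every speaker heard so far to a Counter of the
    words that speaker has accumulated up to and including this turn.
    """
    totals = {}
    for index, (speaker, utterance) in enumerate(transcript):
        totals.setdefault(speaker, Counter()).update(tokenize(utterance))
        # Copy the counters so that earlier snapshots stay frozen in time.
        yield index, speaker, {name: Counter(c) for name, c in totals.items()}


# Invented sample data, for illustration only.
transcript = [
    ("Candidate A", "We need security and we need jobs."),
    ("Candidate B", "Jobs come from lower taxes."),
    ("Candidate A", "Security first, then taxes."),
]
snapshots = list(cumulative_clouds(transcript))
```

Moving forward or back in the stream of words is then just a matter of indexing into `snapshots`; each entry holds the word counts a renderer would need to draw every candidate's cloud as it stood at that point in the debate.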

Comment » | Tag Clouds

Text Clouds and Advertising: Microsoft's Community Buzz Project

April 28th, 2007 — 2:47pm

Thanks to Datamining for posting a writeup and screenshot of a prototype of Community Buzz, which features a text cloud. Community Buzz is a Microsoft Research project, and this is a perfect use of a text cloud to visualize concepts and further comprehension in a body of text.
More inter­est­ing than the text cloud is the space in the screen­shot that looks like a place­holder for adver­tis­ing dri­ven by the con­tents of the text cloud. The anno­ta­tion reads “Con­tex­tual ads based on the Buzz cloud key­words”, imply­ing an adver­tis­ing based rev­enue mech­a­nism dri­ven by cre­ation and analy­sis of a text cloud.
Com­mu­nity Buzz Screen­shot

The description of Community Buzz posted on the TechFest 2007 page includes the following, making the connection to an advertising model explicit:
Com­mu­nity Buzz com­bines text min­ing, social account­ing (Netscan/MSR-Halo), and new visu­al­iza­tion tech­niques to study and present the con­tent of com­mu­ni­ca­tion threads in online dis­cus­sion groups. The merg­ing of these research tech­nolo­gies results in a sys­tem that gives great value to com­mu­nity par­tic­i­pants, enables highly directed adver­tis­ing, and sup­plies rich met­rics to prod­uct man­agers.
Assuming it’s possible to provide highly directed advertising and rich metrics based on text clouds, I can see the benefits for advertisers, product managers, and researchers of many kinds. Yet I’m not convinced of the benefits for community participants. Where will the text clouds come from, and how will their content reflect the needs of the community? How will social dynamics shape or affect these text clouds, to make it possible for them to leverage network effects, differential participation, and the scale benefits of connected social systems?
Text clouds — at least at this stage of devel­op­ment — sup­port rapid but shal­low com­pre­hen­sion: maybe this is per­fect for adver­tis­ing pur­poses…
Like a pile of dry bones that used to make up a skele­ton, text clouds lack the spe­cific struc­ture and con­text of their source, and so can­not replace com­pre­hen­sion. Text clouds decon­struct the word ele­ments that make up a body of text the same way spec­trum analy­sis iden­ti­fies the dif­fer­ent wave­lengths of light from a dis­tant star. It’s a bit like using sta­tis­ti­cal analy­sis to read King Lear, instead of using a vari­ety of tools to learn more about what Lear might have to say.
A better use of text clouds, or any other type of deconstructive method (a variant of semiotics), is as a tool for enhancing comprehension. Text clouds seem to bypass distinctions between high context and low context that present barriers to understanding deep context, by focusing on the raw content of the source, on the level of its constituent elements.
The goal of exam­in­ing the fun­da­men­tal or essen­tial makeup of some­thing we’re explor­ing — as a way of bet­ter under­stand­ing that thing over­all — is an epis­te­mo­log­i­cal method pur­sued by Plato and a host of other West­ern philoso­phers and nat­ural sci­en­tists. We should be cau­tious with new tools, how­ever, as the urge to illu­mi­nate and dis­sect the fun­da­men­tal makeup of that which is com­plex and nuanced can go too far, cross­ing from the insight­ful to the ster­ile domain of soul­less reduc­tivism. Wit­ness the responses of cor­rupt offi­cials to Javier Bardem’s char­ac­ter Agustín, in John Malkovich’s direc­to­r­ial debut The Dancer Upstairs.
Agustín is a police hero who saves his coun­try from a crim­i­nal and oppres­sive gov­ern­ment, social dis­in­te­gra­tion, and guerilla takeover. He then sur­ren­ders all prospects of win­ning the pres­i­dency and lead­ing his strug­gling nation to pros­per­ity for the unre­quited love of a woman who aided the same guerilla leader he helped cap­ture. Agustín strikes a secret bar­gain to secure her free­dom with the cor­rupt pow­ers that be, on con­di­tion that he with­draw from pub­lic life. His choice is incom­pre­hen­si­ble to the soul­less offi­cials in power. To these peo­ple, who buy, sell, and exe­cute hun­dreds with­out a thought, Agustín’s lover “…is just a girl — 70% water.“
For ref­er­ence, the overview of Com­mu­nity Buzz:

  • Com­mu­nity Buzz com­bines analy­sis of the con­tent of online dis­cus­sions and social struc­ture of the com­mu­ni­ties to iden­tify hot top­ics and visu­al­ize how they evolve over time.
  • Through search and Buzz cloud users can access rel­e­vant dis­cus­sion threads and adverts linked to the search results and Buzz keywords.
  • Visualization of keyword trends enables the users to monitor the popularity of selected topics. Messages can be filtered based on the ‘social status’ of the author in the community.

And the com­plete descrip­tion of the demo men­tioned by Dat­a­min­ing:

Com­mu­nity Buzz is a new win­dow into online com­mu­ni­ties! Inter­est­ing and use­ful con­ver­sa­tions, authors, and groups are dis­cov­ered eas­ily using this tool, jointly devel­oped by Microsoft Research Redmond’s Com­mu­nity Tech­nolo­gies group and Microsoft Research Cambridge’s Inte­grated Sys­tems team, with spon­sor­ship from Live Labs. Com­mu­nity Buzz com­bines text min­ing, social account­ing (Netscan/MSR-Halo), and new visu­al­iza­tion tech­niques to study and present the con­tent of com­mu­ni­ca­tion threads in online dis­cus­sion groups. The merg­ing of these research tech­nolo­gies results in a sys­tem that gives great value to com­mu­nity par­tic­i­pants, enables highly directed adver­tis­ing, and sup­plies rich met­rics to prod­uct man­agers.

Comment » | Tag Clouds

Text Clouds: A New Form of Tag Cloud?

March 15th, 2007 — 12:04am

Dur­ing 2006, tag clouds moved beyond their well-known role as nav­i­ga­tion mech­a­nisms and indi­ca­tors of activ­ity within social media expe­ri­ences, emerg­ing as a stan­dard visu­al­iza­tion tech­nique for texts and tex­tual data in gen­eral.
This use of tag clouds does not com­monly involve tags, social net­works, emer­gent archi­tec­tures, folk­sonomies, or meta­data.
“Text cloud” might be a more accu­rate label for these visu­al­iza­tions than tag cloud. In addi­tion to rec­og­niz­ing fun­da­men­tal dif­fer­ences — text clouds dif­fer from tag clouds in com­po­si­tion (no tags at all) and pur­pose (pre­dom­i­nantly com­pre­hen­sion, rather than access or nav­i­ga­tion) — dis­tin­guish­ing the two types of clouds will make it much eas­ier to assess their abil­i­ties to sup­port user expe­ri­ence needs and busi­ness goals.
The emer­gence of this new form of text cloud looks like a good exam­ple of spe­ci­a­tion in action (though it’s too early to tell whether the end result will be clado­ge­n­e­sis or ana­ge­n­e­sis).
Major and minor pub­li­ca­tions feature(d) text clouds as visu­al­iza­tions in 2006, both per­ma­nently and temporarily:

The Economist’s Text cloud

In 2006, sev­eral free and pub­lic tools for gen­er­at­ing text clouds locally on the desk­top or via a ser­vice avail­able through the Web were released. The increase in the num­ber and vari­ety of spe­cific text cloud tools reflects embrace and enthu­si­asm for text clouds in com­mu­ni­ties of inter­est for infor­ma­tion visu­al­iza­tion, lan­guage pro­cess­ing, and seman­tics.
Some of the bet­ter known exam­ples of text cloud tools include:

The Many Eyes Cloud

The text clouds cre­ated with these tools range across a wide spec­trum of speeches and writing:

Text clouds are meant to facil­i­tate rapid under­stand­ing and com­pre­hen­sion of a body of words, links, phrases, etc. Any block of infor­ma­tion com­posed of text is open to analy­sis as a text cloud, as these screen cap­tures of text clouds for restau­rant menus, ingre­di­ents, wikipedia, mag­a­zine cov­ers, and even poems demon­strate.
Tim O’Reilly uses text clouds for a num­ber of pur­poses:

We used them a bunch to analyze the topics, companies and people at the last FOO Camp, and they were the most useful of the visualizations we did. They helped us see where we were under- and over-represented in terms of companies and particular technologies we were wanting to explore. …So they have many uses beyond just showing what we normally think of as tags.

Non-linear Access
The emer­gence of text clouds shows con­tin­u­ing explo­ration and refine­ment of cloud style dis­plays as a new form of user inter­face, adapted to spe­cific con­texts. Con­tin­ued refine­ment of text clouds in this direc­tion may indi­cate an expand­ing role for com­monly avail­able and sophis­ti­cated text visu­al­iza­tion tools to sup­port spe­cial­ized goals for infor­ma­tion dis­play and under­stand­ing.
Remember that Google is busy right now scanning thousands of books per day from several of the world’s major academic libraries, as part of its self-appointed labor of organizing the world’s information. That’s a lot of new text. How will people work effectively with such an overwhelming amount of text, of so many different kinds, from so many different sources?
Con­sider the fol­low­ing, from Ulysses’ With­out Guilt by Stacy Schiff (in the New York Times):
Recently Cath­leen Black, pres­i­dent of Hearst Mag­a­zines, urged a group of pub­lish­ing exec­u­tives to think of their audi­ence as con­sumers rather than read­ers. She’s onto some­thing: arguably the very def­i­n­i­tion of read­ing has changed. So Google asserts in defend­ing its right to scan copy­righted mate­ri­als. The process of dig­i­tiz­ing books trans­forms them, the com­pany con­tends, into some­thing else; our engage­ment with a text is dif­fer­ent when we call it up online. We are no longer read­ing. We’re search­ing — a func­tion that con­ve­niently did not exist when the con­cept of copy­right was estab­lished.
On a larger scale, the growing use of text clouds hints at a (potential) deeper cultural shift in the way we go about reading and comprehension: a shift from linear modes based on reading words and sentences, to nonlinear modes based on viewing summaries of content in aggregate as a way of discovering concepts and patterns. (Finally, a legitimate use for Twitter…) Experimenting with text clouds for non-linear reading and comprehension (now that’s a sexy term…) is a natural evolution of the role cloud style displays play as an alternative / complement / supplement to the list based navigation now dominant in user experiences.
A Text Cloud of Twit­ter Posts (A Twit­ter­Cloud?)

cre­ated at TagCrowd.com


I’m not pre­dict­ing the end of read­ing as we know it, nor the end of nav­i­ga­tion as we know it: both will be with us for a long, long time. But I do believe that text clouds might con­sti­tute an emerg­ing method for aug­ment­ing com­pre­hen­sion and dis­play of text, with broad poten­tial uses.
Enter­pris­ing Clouds
What about some­one lack­ing time to fully read a Shake­speare play, or a fad­dish busi­ness book, but who needs to under­stand some­thing about that book’s mean­ing and sub­stance? A text cloud cre­ation tool could extract the most com­monly men­tioned terms, and oth­er­wise pro­file the words that make up the text. It would be risky to rely on a shal­low text cloud (and Tim O’Reilly men­tions this specif­i­cally) for deep com­pre­hen­sion, but it would be enough to under­stand the con­cepts that appear, and allow polite con­ver­sa­tion at a net­work­ing event, or lunch with that cer­tain man­ager who rec­om­mended the book.
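To make “extract the most commonly mentioned terms” concrete, here is a minimal sketch of that profiling step. It is my own illustration, not how TagCrowd or any other tool mentioned here actually works; the function name `top_terms`, its `limit` and `min_length` parameters, and the stop-word list are all assumptions.

```python
from collections import Counter
import re

# Hypothetical stop-word list; real tools ship far longer ones.
STOP_WORDS = {"the", "and", "for", "that", "with", "this", "you", "not", "are", "but"}


def top_terms(text, limit=10, min_length=3):
    """Return up to `limit` (term, count) pairs, most frequent first.

    Words shorter than `min_length` and words in STOP_WORDS are ignored,
    which is a crude way of dropping filler before counting.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOP_WORDS
    )
    return counts.most_common(limit)


# One line of King Lear, standing in for the full text of a play or book.
sample = "Nothing will come of nothing: speak again."
profile = top_terms(sample, limit=3)
```

The counts are exactly the shallow profile described above: enough to see which terms dominate, with none of the structure or context of the source.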
If I were entrepreneurial, I’d source a set of free electronic versions of classic texts, process them with one of the free text cloud tools, apply some XSLT and other transformations to generate consistent readable formatting, and sell the results as a line of ebooks called “Cloud Notes”. Of course, someone’s beaten me to it already.
What’s in store for the future?
In this fashion, text clouds may become a generally applied tool for managing growing information overload by using automated synthesis and summarization. In the information saturated future (or the information saturated present), text clouds are the common executive summary on steroids and acid simultaneously; assembled with muscular syntactical and semantic processing, and fed to reading-fatigued post-literates as swirling blobs of giant words in wild colors, they consist of signifiers for reified concepts that tweak the eye-brain-language conduit directly.

12 comments » | Tag Clouds

10 Best Practices For Displaying Tag Clouds

February 25th, 2007 — 1:31am

This is a short list of best practices for rendering and displaying tag clouds that I originally circulated on the IXDG mailing list, and am now posting in response to several requests. These best practices are not in order of priority; they’re simply enumerated.

  1. Use a sin­gle color for the tags in the ren­dered cloud: this will allow vis­i­tors to iden­tify finer dis­tinc­tions in the size dif­fer­ences. Employ more than one color with dis­cre­tion. If using more than one color, offer the capa­bil­ity to switch between sin­gle color and mul­ti­ple color views of the cloud.
  2. Use a sin­gle sans serif font fam­ily: this will improve the over­all read­abil­ity of the ren­dered cloud.
  3. If accu­rate com­par­i­son of rel­a­tive weight (see­ing the size dif­fer­ences amongst tags) is more impor­tant than over­all read­abil­ity, use a mono­space font.
  4. If com­pre­hen­sion of tags and under­stand­ing the mean­ing is more impor­tant, use a vari­ably spaced font that is easy to read.
  5. Use con­sis­tent and pro­por­tional spac­ing to sep­a­rate the tags in the ren­dered tag cloud. Pro­por­tional means that the spac­ing between tags varies based on their size; typ­i­cally more space is used for larger sizes. Con­sis­tent means that for each tag of a cer­tain size, the spac­ing remains the same. In html, spac­ing is often deter­mined by set­ting style para­me­ters like padding or mar­gins for the indi­vid­ual tags.
  6. Avoid sep­a­ra­tor char­ac­ters between tags: they can be con­fused for small tags.
  7. Carefully consider rendering in Flash, or another vector-based method, if your users will experience the cloud largely through older browsers / agents: the font rendering in older browsers is not always good or consistent, but it is important that the cloud offer text that is readily digestible by search and indexing engines, both locally and publicly.
  8. If ren­der­ing the cloud in html, set the font size of ren­dered tags using whole per­cent­ages, rather than pixel sizes or dec­i­mals: this gives the dis­play agent more free­dom to adjust its final rendering.
  9. Do not insert line breaks: this allows the ren­der­ing agent to adjust the place­ment of line breaks to suit the ren­der­ing context.
  10. Offer the ability to change the order between at least two options: alphabetical, and one variable dimension (overall weight, frequency, recency, etc.).
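Several of these practices translate directly into rendering code. The sketch below is one possible reading, not a reference implementation: it uses a single color and a single sans serif family (1, 2), em-based padding so spacing is proportional and consistent (5), no separator characters (6), whole-percentage font sizes (8), no inserted line breaks (9), and a choice of two orderings (10). The function names, the 80% to 240% size range, and the color value are my own assumptions.

```python
from html import escape


def font_percent(weight, min_weight, max_weight, smallest=80, largest=240):
    """Map a tag weight onto a whole-number font-size percentage."""
    if max_weight == min_weight:
        return (smallest + largest) // 2
    scale = (weight - min_weight) / (max_weight - min_weight)
    return round(smallest + scale * (largest - smallest))


def render_cloud(weights, order="alphabetical"):
    """Return the cloud as a single line of HTML.

    weights maps each tag to a numeric weight (frequency, recency, ...).
    order is "alphabetical" or "weight" (heaviest first).
    """
    if order == "alphabetical":
        tags = sorted(weights, key=str.lower)
    elif order == "weight":
        tags = sorted(weights, key=lambda t: (-weights[t], t.lower()))
    else:
        raise ValueError(f"unknown order: {order}")
    if not tags:
        return '<div style="font-family: sans-serif; color: #336699;"></div>'

    low, high = min(weights.values()), max(weights.values())
    spans = []
    for tag in tags:
        size = font_percent(weights[tag], low, high)
        # Padding in em units scales with each tag's own font size, so spacing
        # is proportional to size and identical for tags of the same size.
        spans.append(
            f'<span style="font-size: {size}%; padding: 0 0.4em;">{escape(tag)}</span>'
        )
    # Plain spaces between tags: no separator characters and no forced line
    # breaks, so the browser decides where the cloud wraps.
    body = " ".join(spans)
    return f'<div style="font-family: sans-serif; color: #336699;">{body}</div>'


# Invented weights, for illustration only.
weights = {"font": 5, "color": 3, "size": 1}
alphabetical_html = render_cloud(weights)
by_weight_html = render_cloud(weights, order="weight")
```

Offering both `alphabetical_html` and `by_weight_html` behind a toggle would cover practice 10. Practices 3 and 4 come down to which font family you put in the style attribute.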

For fun, I’ve run these 10 best prac­tices through Tagcrowd. The major con­cepts show up well — font, color, and size are promi­nent — but obvi­ously the specifics of the things dis­cussed remain opaque.
Best Prac­tices For Dis­play as a Text Cloud

12 comments » | Tag Clouds

Cartograms, Tag Clouds and Visualization

May 22nd, 2006 — 10:56pm

I was enjoy­ing some of the engag­ing car­tograms avail­able from Worldmap­per, when I real­ized tag clouds might have some strong par­al­lels with car­tograms. After a quick sub­sti­tu­tion exer­cise, I’ve come to believe tag clouds could be to lists of meta­data what car­tograms are to maps; attempted solu­tions to sim­i­lar visu­al­iza­tion prob­lems dri­ven by com­mon and his­tor­i­cally con­sis­tent infor­ma­tion needs.
Here’s the train of thought behind the analogy. Cartograms are the distorted but captivating maps that change the familiar shapes of places on a map to visually show data about geographic locations. Cartograms change the way locations appear to make a point or communicate relative differences in the underlying data; for example, by making countries with higher GDP (gross domestic product) bigger, and those with lower GDP smaller. In the example below, Japan’s size is much larger than its geographic area, because its GDP is so high (it’s the dark green blob on the far right, much larger than China or India), while Africa is nearly invisible.
Gross Domes­tic Prod­uct

Tag clouds pursue the same goal: to enhance our understanding by communicating contextual meaning through changes in the way a set of things is visualized, relying on additional dimensions of information to make context explicit. Where cartograms change geographic units, tag clouds change the display of a list of labels (the end point of a chain of linkages connecting concepts to focuses) to communicate the semantic importance or context of the underlying concepts shown in the list.
Visu­ally, the rela­tion­ship of clouds to lists is sim­i­lar to that of maps and car­tograms; com­pare these two ren­der­ings of the most pop­u­lar search terms recorded by nytimes.com, one a sim­ple list and the other a tag cloud.
List Ren­der­ing of Search Terms

Cloud Ren­der­ing of Search Terms

This explanation of cartograms from Cartogram Central, a site supported by the U.S. Geological Survey and the National Center for Geographic Information and Analysis, makes the parallels clearer, in greater detail.
“A cartogram is a type of graphic that depicts attributes of geographic objects as the object’s area. Because a cartogram does not depict geographic space, but rather changes the size of objects depending on a certain attribute, a cartogram is not a true map. Cartograms vary on their degree in which geographic space is changed; some appear very similar to a map, however some look nothing like a map at all.”
Now for the cut and paste. Substitute ‘tag cloud’ for cartogram, ‘semantic’ for geographic, and ‘list’ for map, and the same explanation reads:
“A tag cloud is a type of graphic that depicts attributes of semantic objects as the object’s area. Because a tag cloud does not depict semantic space, but rather changes the size of objects depending on a certain attribute, a tag cloud is not a true list. Tag clouds vary on their degree in which semantic space is changed; some appear very similar to a list, however some look nothing like a list at all.”
This is a good match for the cur­rent under­stand­ing of tag clouds.
Div­ing in deeper, Car­togram Cen­tral offers an excerpt from Car­tog­ra­phy: The­matic Map Design, that goes into more detail about the spe­cific char­ac­ter­is­tics of car­tograms.
“Erwin Raisz called cartograms ‘diagrammatic maps.’ Today they might be called cartograms, value-by-area maps, anamorphated images or simply spatial transformations. Whatever their name, cartograms are unique representations of geographical space. Examined more closely, the value-by-area mapping technique encodes the mapped data in a simple and efficient manner with no data generalization or loss of detail. Two forms, contiguous and non-contiguous, have become popular. Mapping requirements include the preservation of shape, orientation, contiguity, and data that have suitable variation. Successful communication depends on how well the map reader recognizes the shapes of the internal enumeration units, the accuracy of estimating these areas, and effective legend design. Complex forms include the two-variable map. Cartogram construction may be by manual or computer means. In either method, a careful examination of the logic behind the use of the cartogram must first be undertaken.”
Doing the same sub­sti­tu­tion exer­cise on this excerpt with the addi­tion of ‘rel­e­vance’ for value, ‘size’ for area, and ‘term’ for shape, yields sim­i­lar results:
“Erwin Raisz called tag clouds ‘diagrammatic lists.’ Today they might be called tag clouds, relevance-by-size lists, anamorphated images or simply spatial transformations. Whatever their name, tag clouds are unique representations of semantic space. Examined more closely, the relevance-by-size listing technique encodes the listed data in a simple and efficient manner with no data generalization or loss of detail. Two forms, contiguous and non-contiguous, have become popular. Listing requirements include the preservation of term, orientation, contiguity, and data that have suitable variation. Successful communication depends on how well the list reader recognizes the terms (of the internal enumeration units), the accuracy of estimating these sizes, and effective legend design. Complex forms include the two-variable list. Tag cloud construction may be by manual or computer means. In either method, a careful examination of the logic behind the use of the tag cloud must first be undertaken.”
The cor­re­spon­dence here is strong as well.
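To make the ‘relevance-by-size’ encoding concrete, here is a minimal sketch of how a weight attached to each term might be turned into a display size. Everything in it is an assumption for illustration: the function name `scale_font_sizes`, the pixel range, the sample weights, and the choice of simple linear scaling (a logarithmic scale is a common alternative when a few terms dominate).

```python
def scale_font_sizes(weights, min_px=12, max_px=36):
    """Map each term's weight onto a font size between min_px and max_px.

    `weights` is a dict of term -> numeric weight (hypothetical data).
    Linear scaling is assumed here purely for illustration.
    """
    if not weights:
        return {}
    lowest = min(weights.values())
    highest = max(weights.values())
    span = highest - lowest
    sizes = {}
    for term, weight in weights.items():
        # If every weight is identical, fall back to the smallest size.
        fraction = (weight - lowest) / span if span else 0.0
        sizes[term] = round(min_px + fraction * (max_px - min_px))
    return sizes


# Hypothetical weights for three terms.
sample_sizes = scale_font_sizes({"maps": 3, "lists": 1, "clouds": 2})
```

The parallel with the cartogram is direct: the attribute value drives the rendered size of the object, while the object itself (the term, like the shape of a region) stays recognizable.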
Sta­ble Need
The fact that cartograms and tag clouds show close parallels means that while the tag cloud may be a new user interface element emerging for the Web (and major desktop applications like Outlook, in the case of Taglocity), tag clouds as a type of visualization have strong precedents in other much more mature user experience contexts, such as the display of multiple dimensions of information within geographic or geospatial frames of reference. Instances of strong correspondence of problem solving approach in both mature and emerging contexts could indicate simple application of parallel framing (from the mature context to the emerging context) as an untested conditional, until the true extent of divergence separating the two contexts is understood. This is very common in new media.
Instead, in the case of tag clouds, I think it points at sta­ble needs dri­ving struc­turally sim­i­lar solu­tions to the basic prob­lem of how to visu­ally com­mu­ni­cate impor­tant rela­tion­ships and addi­tional dimen­sions of mean­ing under the lim­i­ta­tions of inher­ent flat­ness. The par­al­lels between car­tograms and tag clouds place the appear­ance of the tag cloud within the larger his­tory of con­tin­u­ing explo­ration of new ways of visu­al­iz­ing infor­ma­tion. In this view, tag clouds are a recent man­i­fes­ta­tion of the sta­ble need to cre­ate strong and effec­tive visual ways of con­vey­ing more than mem­ber­ship in a one-dimensional set (the list), or loca­tion and extent within a two-dimensional coör­di­nate sys­tem (the map).

1 comment » | Ideas, Tag Clouds

NYTimes.com Redesign Includes Tag Clouds

April 11th, 2006 — 9:58pm

Though you may not have noticed it at first (I didn’t — they’re located a few steps off the front page), the recently launched design of NYTimes.com includes tag clouds. After a quick review, I think their ver­sion is a good exam­ple of a cloud that offers some increased capa­bil­i­ties and con­tex­tual infor­ma­tion that together fall in line with the likely direc­tions of tag cloud evo­lu­tion we’ve con­sid­ered before.
Specif­i­cally, the New York Times tag cloud:

  1. allows users to change the cloud’s con­text — and thus its con­tent — with a set of con­trols (vis­i­ble as tabs, run­ning across the top)
  2. lets cloud con­sumers change the dis­play behav­ior of the cloud by switch­ing modes from list to cloud in-line, not out­side the user’s area of activity
  3. sup­ports the chain of under­stand­ing for cloud con­sumers by pro­vid­ing clear indi­ca­tion of the time period cov­ered (the note about update frequency)
  4. offers [lim­ited] capa­bil­i­ties to work with / share tag cloud con­tent out­side the cloud via email — though the mes­sage con­tains only a link to the cloud page, and not a full rendering

NYTimes.com Tag Cloud

The NYTimes.com tag cloud shows the most pop­u­lar search terms used by read­ers within three time frames: the last 24 hours, the last 7 days, and the last 30 days. Choos­ing search terms as the makeup for a cloud is a bit curi­ous — but it may be as close to socially gen­er­ated meta­data as seemed rea­son­able for a first explo­ration (one that doesn’t require a sub­stan­tial change in the busi­ness or pub­lish­ing model).
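As a rough illustration of what sits behind a cloud like this, the sketch below tallies search terms from a log over the three time frames mentioned above. It is not a description of how NYTimes.com builds its cloud: the log format (timestamp and term pairs), the function name `popular_terms`, and the sample entries are all assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta

# The three time frames described above.
WINDOWS = {
    "24 hours": timedelta(hours=24),
    "7 days": timedelta(days=7),
    "30 days": timedelta(days=30),
}


def popular_terms(search_log, now, top_n=3):
    """Return the top_n (term, count) pairs for each time frame.

    `search_log` is an iterable of (timestamp, term) pairs, a
    hypothetical format chosen for this sketch.
    """
    result = {}
    for label, width in WINDOWS.items():
        cutoff = now - width
        counts = Counter(
            term.lower()
            for when, term in search_log
            if cutoff <= when <= now
        )
        result[label] = counts.most_common(top_n)
    return result


# Hypothetical log entries, for illustration only.
now = datetime(2006, 4, 11, 22, 0)
log = [
    (now - timedelta(hours=1), "immigration"),
    (now - timedelta(hours=2), "Immigration"),
    (now - timedelta(hours=3), "duke"),
    (now - timedelta(days=3), "judas"),
    (now - timedelta(days=3), "judas"),
    (now - timedelta(days=4), "judas"),
    (now - timedelta(days=20), "duke"),
    (now - timedelta(days=21), "duke"),
    (now - timedelta(days=22), "duke"),
]
clouds = popular_terms(log, now)
```

Because each time frame produces its own ranked list, the same data could feed either the list view or the cloud view, with the counts serving as the weights.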
Given the way that clouds lend them­selves to show­ing mul­ti­ple dimen­sions of mean­ing, such as change over time, I think the Times tag cloud would be more valu­able if it offered the option to see all three time frames at once. I put together a quick cut and paste of a con­cept screen that shows this sort of lay­out:
Screen Con­cept: 3 Clouds for Dif­fer­ent Time Frames

In an example of the rapid morphing of memes and definitions to fit shifting usage contexts (as in Thomas Vander Wal’s observations on the shifting usage of folksonomy), NYTimes.com kept the label tag cloud, although this is more properly a weighted list: the tags shown are in fact search terms, and not labels applied to a focus of some kind by taggers.
It’s plain from the limited presence and visibility of clouds within the overall site that the staff at NYTimes.com are still exploring the value of tag clouds for their specific needs (which I think is a mature approach); otherwise I imagine the new design concept and navigation model would utilize and emphasize tag clouds to a greater degree. So far, the Times uses tag clouds only in the new “Most Popular” section, and they are offered as an alternative to the default list style presentation of popular search terms. This position within the site structure places them a few steps in, and off the standard front page-to-an-article user flow that must be one of the core scenarios supported by the site’s information architecture.
NYTimes.com User Flow to Tag Cloud

Still, I do think it’s a clear sign of increas­ing aware­ness of the poten­tial strength of tag clouds as a way of visu­al­iz­ing seman­tic infor­ma­tion. The Times is an estab­lished entity (occa­sion­ally serv­ing as the def­i­n­i­tion of ‘the estab­lish­ment’), and so is less likely to endan­ger estab­lished rela­tion­ships with cus­tomers by chang­ing its core prod­uct across any of the many chan­nels used for deliv­ery.
Questions of risk aside, tag clouds (here I mean any visualization of semantic metadata) could be a very effective way to scan the headlines for a sense of what’s happening at the moment, and the shifting importance of topics in relation to one another. With a tag cloud highlighting “immigration”, “duke”, and “judas”, visitors can immediately begin to understand what is newsworthy, at least in the minds of NYTimes.com readers.
At first glance, lowering the amount of time spent reading the news could seem like a strong business disincentive for using tag clouds to streamline navigation and user flow. With more consideration, I think it points to a new potential application of tag clouds to enhance comprehension and findability by giving busy customers powerful tools to increase the speed and quality of their judgments about what to devote their attention to in order to achieve understanding in greater depth. In the case of publications like the NYTimes.com, tag clouds may be well suited for conveying snapshots or summaries of complex and deep domains that change quickly (what’s the news?), and offering rapid navigation to specific areas or topics.
A new user experience that offers a variety of tag clouds in more places might allow different kinds of movement or flow through the larger environment, enabling new behaviors and supporting different goals than the current information architecture and user experience.
Pos­si­ble Screen Flow Incor­po­rat­ing Clouds

Stepping back from the specifics of the design, a broader question is “Why tag clouds now?” They’re certainly timely, but that’s not a business model. This is just speculation, but I recall job postings for an Information Architect position within the NYTimes.com group that appeared on several recruiting websites a few months ago; maybe the new team members wanted or were directed to include tag clouds in this design? If any of those involved are allowed to share insights, I’d very much like to hear the thoughts of the IAs / designers / product managers or other team members responsible for including tag clouds in the new design and structure.
And in light of Mathew Patterson’s comments here about customer acceptance of multiple clouds in other settings and contexts (Priceline Europe), I’m curious about any usability testing or other user research that might have been done around the new design, and any findings related to tag clouds.

Comment » | Ideas
