Category: Tag Clouds


Joining the Tag Team At Tagsonomy.com

July 22nd, 2007 — 3:11pm

I’ll be writing about tagging, tag clouds, folksonomies, and related topics over at Tagsonomy.com going forward. As Christian Crumlish observed, it’s been quiet at Tagsonomy.com for a while, but that doesn’t mean that tagging is anywhere close to being fully figured out.
To help kickstart the conversation, I’ve put up two posts since officially joining the Tag Team: The Tagging Hype Cycle, and Is Tagging a Disruptive Innovation?
Com­ments are already flow­ing in — be sure to join the discussion.

Comment » | Tag Clouds

Watching Ideas Bloom: Text Clouds of the Republican Debate At Democrats.org

May 4th, 2007 — 8:07pm

A meme is emerging for the use of text clouds as a visualization of, and a source of insight into, political speeches and speakers.
Text clouds of the Repub­li­can Pres­i­den­tial can­di­dates’ debate appear front and cen­ter on the DNC blog democrats.org, in Tag Clouds Can Tell Us a Lot.… (sourced from media analy­sis firm Upstream Analy­sis via Pollster.com).
GiulianiTag400.png
BrownbackTag400.png
As you can see in the quote from the writeup below, we’re quickly devel­op­ing sophis­ti­cated read­ings of the (com­par­a­tively) sim­ple visu­al­iza­tion meth­ods used to gen­er­ate text clouds.
But some­times a cloud also reflects con­cerns that vot­ers share about a can­di­date. This is because the can­di­date gets asked about the issue–a lot–and then has to talk about it.
Check out the large “Pro-Life” tag in flip-flopping Romney’s cloud, or the large “Think” tag in Giuliani’s cloud–the can­di­date noto­ri­ous for leap­ing first and think­ing later.

Polit­i­cal inter­pre­ta­tions aside, this is a nuanced read­ing of the result­ing clouds: it rec­og­nizes the dynamic feed­back link between inten­tions and responses that becomes vis­i­ble in the ren­dered clouds. For a visu­al­iza­tion geek, these clouds show the dif­fer­ing agen­das of can­di­dates and audi­ence as they played out, a nice exam­ple of social mech­a­nisms in action.
Note to the tool builders of the world
How about putting together a visu­al­iza­tion toolset that shows evolv­ing text clouds as the debate pro­gresses? I’m imag­in­ing a time­line plus tran­script plus cloud view of the accu­mu­lat­ing text cloud for each can­di­date, with options for mov­ing for­ward or back in the stream of words.
What could be better than watching words and ideas bloom over time, the same way we see flowers in a garden blossom, open, and close in time-lapse photography? I’d like to think we can grow something poetic and beautiful, as well as useful, from the (sadly debased) soil of politicized sound bites surrounding us.
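
As a rough illustration of the timeline-plus-transcript-plus-cloud idea, here is a minimal Python sketch. It assumes the transcript is already available as a list of (speaker, utterance) turns; the function name `cumulative_clouds` and the speaker labels are hypothetical, and no existing tool is known to work this way.

```python
import re
from collections import Counter, defaultdict

def cumulative_clouds(turns):
    """Yield (turn_index, speaker, snapshot) for each turn of a transcript.

    `snapshot` maps every speaker seen so far to a copy of their running
    word counts, so earlier snapshots are not changed by later turns.
    """
    running = defaultdict(Counter)
    for index, (speaker, utterance) in enumerate(turns):
        # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
        running[speaker].update(re.findall(r"[a-z']+", utterance.lower()))
        yield index, speaker, {name: Counter(c) for name, c in running.items()}

# Hypothetical transcript fragment, for illustration only.
turns = [
    ("Candidate A", "Taxes, taxes and more taxes"),
    ("Candidate B", "Security first"),
    ("Candidate A", "Jobs and taxes"),
]
snapshots = list(cumulative_clouds(turns))
```

Moving forward or back in the stream of words then amounts to picking a different index in `snapshots` and drawing each speaker's counts as a cloud.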

Or, with a nod to the brutal competition built into most natural systems, you may choose to watch the struggle of water lilies for sunlight, in this clip from The Amazing Life of Plants.

Comment » | Tag Clouds

Text Clouds and Advertising: Microsoft's Community Buzz Project

April 28th, 2007 — 2:47pm

Thanks to Datamining for posting a writeup and screenshot of a prototype of Community Buzz, which features a text cloud. Community Buzz is a Microsoft Research project, and this is a perfect use of a text cloud to visualize concepts and further comprehension in a body of text.
More interesting than the text cloud is the space in the screenshot that looks like a placeholder for advertising driven by the contents of the text cloud. The annotation reads “Contextual ads based on the Buzz cloud keywords”, implying an advertising-based revenue mechanism driven by creation and analysis of a text cloud.
Com­mu­nity Buzz Screen­shot

The description of Community Buzz posted on the TechFest 2007 page includes the following, making the connection to an advertising model explicit:
Com­mu­nity Buzz com­bines text min­ing, social account­ing (Netscan/MSR-Halo), and new visu­al­iza­tion tech­niques to study and present the con­tent of com­mu­ni­ca­tion threads in online dis­cus­sion groups. The merg­ing of these research tech­nolo­gies results in a sys­tem that gives great value to com­mu­nity par­tic­i­pants, enables highly directed adver­tis­ing, and sup­plies rich met­rics to prod­uct man­agers.
Assuming it’s possible to provide highly directed advertising and rich metrics based on text clouds, I can see the benefits for advertisers, product managers, and researchers of many kinds. Yet I’m not convinced of the benefits for community participants. Where will the text clouds come from, and how will their content reflect the needs of the community? How will social dynamics shape or affect these text clouds, to make it possible for them to leverage network effects, differential participation, and the scale benefits of connected social systems?
Text clouds — at least at this stage of devel­op­ment — sup­port rapid but shal­low com­pre­hen­sion: maybe this is per­fect for adver­tis­ing pur­poses…
Like a pile of dry bones that used to make up a skele­ton, text clouds lack the spe­cific struc­ture and con­text of their source, and so can­not replace com­pre­hen­sion. Text clouds decon­struct the word ele­ments that make up a body of text the same way spec­trum analy­sis iden­ti­fies the dif­fer­ent wave­lengths of light from a dis­tant star. It’s a bit like using sta­tis­ti­cal analy­sis to read King Lear, instead of using a vari­ety of tools to learn more about what Lear might have to say.
A better use of text clouds, or any other type of deconstructive method (a variant of semiotics), is as a tool for enhancing comprehension. Text clouds seem to bypass distinctions between high context and low context that present barriers to understanding deep context, by focusing on the raw content of the source, on the level of its constituent elements.
The goal of exam­in­ing the fun­da­men­tal or essen­tial makeup of some­thing we’re explor­ing — as a way of bet­ter under­stand­ing that thing over­all — is an epis­te­mo­log­i­cal method pur­sued by Plato and a host of other West­ern philoso­phers and nat­ural sci­en­tists. We should be cau­tious with new tools, how­ever, as the urge to illu­mi­nate and dis­sect the fun­da­men­tal makeup of that which is com­plex and nuanced can go too far, cross­ing from the insight­ful to the ster­ile domain of soul­less reduc­tivism. Wit­ness the responses of cor­rupt offi­cials to Javier Bardem’s char­ac­ter Agustín, in John Malkovich’s direc­to­r­ial debut The Dancer Upstairs.
Agustín is a police hero who saves his coun­try from a crim­i­nal and oppres­sive gov­ern­ment, social dis­in­te­gra­tion, and guerilla takeover. He then sur­ren­ders all prospects of win­ning the pres­i­dency and lead­ing his strug­gling nation to pros­per­ity for the unre­quited love of a woman who aided the same guerilla leader he helped cap­ture. Agustín strikes a secret bar­gain to secure her free­dom with the cor­rupt pow­ers that be, on con­di­tion that he with­draw from pub­lic life. His choice is incom­pre­hen­si­ble to the soul­less offi­cials in power. To these peo­ple, who buy, sell, and exe­cute hun­dreds with­out a thought, Agustín’s lover “…is just a girl — 70% water.“
For ref­er­ence, the overview of Com­mu­nity Buzz:

  • Com­mu­nity Buzz com­bines analy­sis of the con­tent of online dis­cus­sions and social struc­ture of the com­mu­ni­ties to iden­tify hot top­ics and visu­al­ize how they evolve over time.
  • Through search and Buzz cloud users can access rel­e­vant dis­cus­sion threads and adverts linked to the search results and Buzz keywords.
  • Visualization of keyword trends enables the users to monitor the popularity of selected topics. Messages can be filtered based on the ‘social status’ of the author in the community.

And the com­plete descrip­tion of the demo men­tioned by Dat­a­min­ing:

Com­mu­nity Buzz is a new win­dow into online com­mu­ni­ties! Inter­est­ing and use­ful con­ver­sa­tions, authors, and groups are dis­cov­ered eas­ily using this tool, jointly devel­oped by Microsoft Research Redmond’s Com­mu­nity Tech­nolo­gies group and Microsoft Research Cambridge’s Inte­grated Sys­tems team, with spon­sor­ship from Live Labs. Com­mu­nity Buzz com­bines text min­ing, social account­ing (Netscan/MSR-Halo), and new visu­al­iza­tion tech­niques to study and present the con­tent of com­mu­ni­ca­tion threads in online dis­cus­sion groups. The merg­ing of these research tech­nolo­gies results in a sys­tem that gives great value to com­mu­nity par­tic­i­pants, enables highly directed adver­tis­ing, and sup­plies rich met­rics to prod­uct man­agers.

Comment » | Tag Clouds

Text Clouds of the Democratic Debate

April 28th, 2007 — 1:36pm

Mark Blu­men­thal, of Pollster.com, recently posted a set of text clouds show­ing the words used by each can­di­date in the Demo­c­ra­tic pres­i­den­tial debate Thurs­day night. The clouds were gen­er­ated from tran­scripts of the debate, using Daniel Steinbock’s Tag Crowd tool.
Can­di­dates’ Text Clouds

In the screenshot of Mark’s posting, it’s easy to see this is a great example of a collection of text clouds used for comparative visualization and interpretation. The goal is to enhance understanding of the meaning and content of the candidates’ overall conversations during the debate, an idea I explored briefly last year.
Just a month ago, in a post that iden­ti­fied text clouds as a new and dis­tinct tag cloud vari­ant, I sug­gested:

text clouds may become a gen­er­ally applied tool for man­ag­ing grow­ing infor­ma­tion over­load by using auto­mated syn­the­sis and sum­ma­riza­tion. In the infor­ma­tion sat­u­rated future (or the infor­ma­tion sat­u­rated present), text clouds are the com­mon exec­u­tive sum­mary on steroids

Sup­port­ing the com­par­i­son and inter­pre­ta­tion of polit­i­cal speeches is an inven­tive, timely, and resource­ful appli­ca­tion that could make text clouds a reg­u­lar part of the new per­sonal and pro­fes­sional toolkit for effec­tively han­dling the tor­rents of infor­ma­tion over­whelm­ing peo­ple in impor­tant sit­u­a­tions like vet­ting polit­i­cal can­di­dates.
I espe­cially like the way this use of text clouds helps neatly side­step the dis­heart­en­ing ubiq­uity of the sound­bite, by aggre­gat­ing, dis­till­ing, and sum­ma­riz­ing all the things the can­di­dates said. I sus­pect few — if any — of the cam­paigns real­ize the poten­tial for text clouds, but they def­i­nitely know the detri­men­tal power of sound­bites:

“It’s a mess,” said an exasperated-sounding Mr. Prince, Mr. Edwards’s deputy campaign manager. “Debates are important, but in these big multicandidate races they end up not being an exchange of ideas, but just an exchange of sound bites. They have become a distraction.”

From Debates Los­ing a Bit of Lus­ter in a Big Field

The value of a collection of soundbites over an insightful dialog is (apologies for the pun) debatable. But even if a simple exchange of soundbites is what the new shortened formats of many debates yield us, text clouds may help derive some value and insight from the results. The combined deconstructive and reconstructive approach that text clouds employ should make it possible to balance the weight of single remarks of candidates by placing them in a larger and more useful context.
His­tory Repeats Itself
In the longer term view of the history of our responses to the problems of information overload, the appearance of text clouds may mark the emergence of a new general-purpose tool for visualizing ever greater quantities of information to support some qualitatively beneficial end (like picking a good candidate for President, which we sorely need).
The under­ly­ing pat­tern — a con­sis­tent oscil­la­tion between man­ag­ing effec­tively and inef­fec­tively cop­ing, depend­ing on the bal­ance between infor­ma­tion quan­tity and tool qual­ity — remains the same. Yet there is also value in know­ing the cycles that shape our expe­ri­ence of han­dling the infor­ma­tion cru­cial to mak­ing deci­sions, espe­cially deci­sions as impor­tant as who leads the coun­try.
The NY Times tran­script of the debate is avail­able here.

Comment » | Tag Clouds

Text Clouds: A New Form of Tag Cloud?

March 15th, 2007 — 12:04am

Dur­ing 2006, tag clouds moved beyond their well-known role as nav­i­ga­tion mech­a­nisms and indi­ca­tors of activ­ity within social media expe­ri­ences, emerg­ing as a stan­dard visu­al­iza­tion tech­nique for texts and tex­tual data in gen­eral.
This use of tag clouds does not com­monly involve tags, social net­works, emer­gent archi­tec­tures, folk­sonomies, or meta­data.
“Text cloud” might be a more accu­rate label for these visu­al­iza­tions than tag cloud. In addi­tion to rec­og­niz­ing fun­da­men­tal dif­fer­ences — text clouds dif­fer from tag clouds in com­po­si­tion (no tags at all) and pur­pose (pre­dom­i­nantly com­pre­hen­sion, rather than access or nav­i­ga­tion) — dis­tin­guish­ing the two types of clouds will make it much eas­ier to assess their abil­i­ties to sup­port user expe­ri­ence needs and busi­ness goals.
The emer­gence of this new form of text cloud looks like a good exam­ple of spe­ci­a­tion in action (though it’s too early to tell whether the end result will be clado­ge­n­e­sis or ana­ge­n­e­sis).
Major and minor pub­li­ca­tions feature(d) text clouds as visu­al­iza­tions in 2006, both per­ma­nently and temporarily:

The Economist’s Text cloud

In 2006, sev­eral free and pub­lic tools for gen­er­at­ing text clouds locally on the desk­top or via a ser­vice avail­able through the Web were released. The increase in the num­ber and vari­ety of spe­cific text cloud tools reflects embrace and enthu­si­asm for text clouds in com­mu­ni­ties of inter­est for infor­ma­tion visu­al­iza­tion, lan­guage pro­cess­ing, and seman­tics.
Some of the bet­ter known exam­ples of text cloud tools include:

The Many Eyes Cloud

The text clouds cre­ated with these tools range across a wide spec­trum of speeches and writing:

Text clouds are meant to facilitate rapid understanding and comprehension of a body of words, links, phrases, etc. Any block of information composed of text is open to analysis as a text cloud, as these screen captures of text clouds for restaurant menus, ingredients, Wikipedia, magazine covers, and even poems demonstrate.
Tim O’Reilly uses text clouds for a num­ber of pur­poses:

We used them a bunch to ana­lyze the top­ics, com­pa­nies and peo­ple at the last FOO Camp, and they were the most use­ful of the visu­al­iza­tions we did. They helped us see where we were under– and over-represented in terms of com­pa­nies and par­tic­u­lar tech­nolo­gies we were want­ing to explore. …So they have many uses beyond just show­ing what we nor­mally think of as tags.

Non-linear Access
The emer­gence of text clouds shows con­tin­u­ing explo­ration and refine­ment of cloud style dis­plays as a new form of user inter­face, adapted to spe­cific con­texts. Con­tin­ued refine­ment of text clouds in this direc­tion may indi­cate an expand­ing role for com­monly avail­able and sophis­ti­cated text visu­al­iza­tion tools to sup­port spe­cial­ized goals for infor­ma­tion dis­play and under­stand­ing.
Remember that Google is busy right now scanning thousands of books per day from several of the world’s major academic libraries, as part of its self-appointed labor of organizing the world’s information. That’s a lot of new text. How will people work effectively with such an overwhelming amount of text, of so many different kinds, from so many different sources?
Con­sider the fol­low­ing, from Ulysses’ With­out Guilt by Stacy Schiff (in the New York Times):
Recently Cath­leen Black, pres­i­dent of Hearst Mag­a­zines, urged a group of pub­lish­ing exec­u­tives to think of their audi­ence as con­sumers rather than read­ers. She’s onto some­thing: arguably the very def­i­n­i­tion of read­ing has changed. So Google asserts in defend­ing its right to scan copy­righted mate­ri­als. The process of dig­i­tiz­ing books trans­forms them, the com­pany con­tends, into some­thing else; our engage­ment with a text is dif­fer­ent when we call it up online. We are no longer read­ing. We’re search­ing — a func­tion that con­ve­niently did not exist when the con­cept of copy­right was estab­lished.
On a larger scale, the growing use of text clouds hints at a (potential) deeper cultural shift in the way we go about reading and comprehension: a shift from linear modes based on reading words and sentences, to nonlinear modes based on viewing summaries of content in aggregate as a way of discovering concepts and patterns. (Finally, a legitimate use for Twitter…) Experimenting with text clouds for non-linear reading and comprehension (now that’s a sexy term…) is a natural evolution of the role cloud style displays play as an alternative / complement / supplement to the list based navigation now dominant in user experiences.
A Text Cloud of Twit­ter Posts (A Twit­ter­Cloud?)

cre­ated at TagCrowd.com


I’m not pre­dict­ing the end of read­ing as we know it, nor the end of nav­i­ga­tion as we know it: both will be with us for a long, long time. But I do believe that text clouds might con­sti­tute an emerg­ing method for aug­ment­ing com­pre­hen­sion and dis­play of text, with broad poten­tial uses.
Enter­pris­ing Clouds
What about some­one lack­ing time to fully read a Shake­speare play, or a fad­dish busi­ness book, but who needs to under­stand some­thing about that book’s mean­ing and sub­stance? A text cloud cre­ation tool could extract the most com­monly men­tioned terms, and oth­er­wise pro­file the words that make up the text. It would be risky to rely on a shal­low text cloud (and Tim O’Reilly men­tions this specif­i­cally) for deep com­pre­hen­sion, but it would be enough to under­stand the con­cepts that appear, and allow polite con­ver­sa­tion at a net­work­ing event, or lunch with that cer­tain man­ager who rec­om­mended the book.
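
For a sense of how little machinery the "most commonly mentioned terms" step needs, here is a minimal Python sketch. It is not how TagCrowd or any other named tool works; the tiny stop-word list and the function name `top_terms` are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real tools use much longer ones).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "or", "be", "not"}

def top_terms(text, limit=10, stop_words=STOP_WORDS):
    """Return the `limit` most frequent non-stop-words as (term, count) pairs."""
    # Crude tokenizer: lowercase runs of letters and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words and len(w) > 1)
    return counts.most_common(limit)

# Usage: feed in the full text of a book or play and keep the top few dozen terms.
sample = "Tag clouds and text clouds: the cloud shows tags, the text cloud shows words."
print(top_terms(sample, limit=5))
```

The counts say which words dominate, but nothing about how they relate to one another, which is exactly why a cloud built this way supports polite conversation rather than deep comprehension.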
If I were entrepreneurial, I’d source a set of free electronic versions of classic texts, process them with one of the free text cloud tools, apply some XSLT and other transformations to generate consistent readable formatting, and sell the results as a line of ebooks called “Cloud Notes”. Of course, someone’s beaten me to it already.
What’s in store for the future?
In this fash­ion, text clouds may become a gen­er­ally applied tool for man­ag­ing grow­ing infor­ma­tion over­load by using auto­mated syn­the­sis and sum­ma­riza­tion. In the infor­ma­tion sat­u­rated future (or the infor­ma­tion sat­u­rated present), text clouds are the com­mon exec­u­tive sum­mary on steroids and acid simul­ta­ne­ously; assem­bled with mus­cu­lar syn­tac­ti­cal and seman­tic pro­cess­ing, and fed to reading-fatigued post-literates as swirling blobs of giant words in wild col­ors, it con­sists of sig­ni­fiers for rei­fied con­cepts that tweak the eye-brain-language con­duit directly.

12 comments » | Tag Clouds

2.3% Of Chinese Internet Users Tag, Baidu Reports

March 4th, 2007 — 7:54pm

A post­ing from China Web2.0 Review shared results of a report on Chi­nese tag­ging rates released by Baidu, China’s lead­ing search engine.
I was not able to locate a trans­la­tion of the orig­i­nal report from Baidu, so I’ll quote the sum­mary from China Web2.0 Review:

Accord­ing to the report, only 2.3% of inter­net users have ever used tag, they mainly use tags in social book­mark­ing and blogs. I don’t know the meth­ods of data col­lec­tion, but the report said about 15 mil­lion Chi­nese web­pages were book­marked by users, on aver­age each user has saved 40 online book­marks. Among them, over 90% users add less than two tags for a book­mark.
And based on the tags of user saved book­marks, the most used tags are “soft­ware down­load”, “BBS”, “enter­tain­ment”, “game” and “learn­ing”.

We don’t know which ser­vices are included for analy­sis in the report, so I have no idea to which extent I can trust it. But based on my obser­va­tion, I agree with the basic find­ing of the report, even though more and more ser­vices have embod­ied tag­ging fea­ture, only a very small part of early-adopters in China indeed use it.
Two things come to mind right away:

  1. The maturity, structure, and usage patterns of the Internet in China are not directly comparable to the maturity, structure, and usage patterns of the Internet elsewhere (largely due to substantial restrictions and censorship by the Chinese government).
  2. Official Chinese positions are not fully reliable, and so the numbers, context, and usage described could be very different from real practices.

Still, even with the absence of solid qual­i­fy­ing, cor­rob­o­rat­ing, or con­tex­tual infor­ma­tion, this rate of adop­tion for tag­ging seems con­sis­tent with the rest of the very rapid pace of mod­ern­iza­tion in China.
And as the First Prin­ci­ple of Tag Clouds — “Where there’s tags, there’s a tag cloud” — says, this means there are quite a few tag clouds on the way in China.

2 comments » | Tag Clouds

10 Best Practices For Displaying Tag Clouds

February 25th, 2007 — 1:31am

This is a short list of best practices for rendering and displaying tag clouds that I originally circulated on the IXDG mailing list, and am now posting in response to several requests. These best practices are not in order of priority; they’re simply enumerated.

  1. Use a sin­gle color for the tags in the ren­dered cloud: this will allow vis­i­tors to iden­tify finer dis­tinc­tions in the size dif­fer­ences. Employ more than one color with dis­cre­tion. If using more than one color, offer the capa­bil­ity to switch between sin­gle color and mul­ti­ple color views of the cloud.
  2. Use a sin­gle sans serif font fam­ily: this will improve the over­all read­abil­ity of the ren­dered cloud.
  3. If accu­rate com­par­i­son of rel­a­tive weight (see­ing the size dif­fer­ences amongst tags) is more impor­tant than over­all read­abil­ity, use a mono­space font.
  4. If com­pre­hen­sion of tags and under­stand­ing the mean­ing is more impor­tant, use a vari­ably spaced font that is easy to read.
  5. Use consistent and proportional spacing to separate the tags in the rendered tag cloud. Proportional means that the spacing between tags varies based on their size; typically more space is used for larger sizes. Consistent means that for each tag of a certain size, the spacing remains the same. In HTML, spacing is often determined by setting style parameters like padding or margins for the individual tags.
  6. Avoid separator characters between tags: they can be confused for small tags.
  7. Carefully consider rendering in Flash, or another vector-based method, if your users will experience the cloud largely through older browsers / agents: the font rendering in older browsers is not always good or consistent, but it is important that the cloud offer text that is readily digestible by search and indexing engines, both locally and publicly.
  8. If rendering the cloud in HTML, set the font size of rendered tags using whole percentages, rather than pixel sizes or decimals: this gives the display agent more freedom to adjust its final rendering.
  9. Do not insert line breaks: this allows the rendering agent to adjust the placement of line breaks to suit the rendering context.
  10. Offer the ability to change the order between at least two options: alphabetical, and one variable dimension (overall weight, frequency, recency, etc.).
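
Several of these practices (single font family, proportional spacing, no separators, whole-percentage sizes, no line breaks, switchable ordering) can be sketched in a few lines of Python that emit HTML. This is one possible reading of the list, not a reference implementation; the function names, the 100% to 300% size range, and the 0.4em margin are all assumptions.

```python
from html import escape

def font_percent(count, lo, hi, min_pct=100, max_pct=300):
    """Map a count linearly onto a whole-number font-size percentage (practice 8)."""
    if hi == lo:
        return min_pct
    return round(min_pct + (count - lo) * (max_pct - min_pct) / (hi - lo))

def render_cloud(tag_counts, order="alpha"):
    """Render {tag: count} as one line of HTML, ordered by "alpha" or "weight" (practice 10)."""
    if not tag_counts:
        return '<div style="font-family: sans-serif"></div>'
    lo, hi = min(tag_counts.values()), max(tag_counts.values())
    if order == "alpha":
        items = sorted(tag_counts.items(), key=lambda kv: kv[0].lower())
    else:  # "weight": heaviest first, ties broken alphabetically
        items = sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0].lower()))
    # Margins in em units scale with each span's own font size, so spacing is
    # proportional to tag size and identical for tags of the same size (practice 5).
    spans = [
        f'<span style="font-size: {font_percent(n, lo, hi)}%; margin: 0 0.4em">{escape(tag)}</span>'
        for tag, n in items
    ]
    # One sans serif family (practice 2), plain spaces only: no separator
    # characters (practice 6) and no inserted line breaks (practice 9).
    return '<div style="font-family: sans-serif">' + " ".join(spans) + "</div>"

by_weight = render_cloud({"font": 10, "color": 5, "size": 1}, order="weight")
by_alpha = render_cloud({"font": 10, "color": 5, "size": 1}, order="alpha")
```

Color is left untouched here, so the tags inherit a single color from the page, in line with the first practice.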

For fun, I’ve run these 10 best prac­tices through Tagcrowd. The major con­cepts show up well — font, color, and size are promi­nent — but obvi­ously the specifics of the things dis­cussed remain opaque.
Best Prac­tices For Dis­play as a Text Cloud
best_practices_textcloud.jpg

12 comments » | Tag Clouds

PEW Report Shows 28% Of Internet Users Have Tagged

February 1st, 2007 — 2:30pm

The Pew Inter­net & Amer­i­can Life Project just released a report on tag­ging that finds
28% of inter­net users have tagged or cat­e­go­rized con­tent online such as pho­tos, news sto­ries or
blog posts. On a typ­i­cal day online, 7% of inter­net users say they tag or cat­e­go­rize online con­tent.

The authors note “This is the first time the Project has asked about tagging, so it is not clear exactly how fast the trend is growing.”
Wow. I’d say it’s growing extremely quickly. Though I am on record as a believer in the bright future of tag clouds, I admit I’m surprised by these results. The fact that 7% of internet users tag daily is what’s most significant: it’s an indication of very rapid adoption for the practice of tagging in many different contexts and many different kinds of audiences, given its brief history.
I’d guess this adop­tion rate com­pares to the rates of adop­tion for other new network-dependent or emer­gent archi­tec­tures like P2P music shar­ing or on-line music buy­ing.
You’re cor­rect if you’re think­ing there is a dif­fer­ence between tag­ging and tag clouds. And if you’ve read the report and the accom­pa­ny­ing inter­view with Dr. Wein­berger, you’ve likely real­ized that nei­ther Dr. Weinberger’s inter­view nor the report specif­i­cally addresses tag cloud usage. But remem­ber the First Prin­ci­ple of Tag Clouds: “Where there’s tags, there’s a tag cloud.” By def­i­n­i­tion, any item with an asso­ci­ated col­lec­tion of tags has a tag cloud, regard­less of whether that tag cloud is directly vis­i­ble and action­able in the user expe­ri­ence. So that 7% of inter­net users who tag daily are by default cre­at­ing and work­ing with tag clouds daily.
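
The First Principle can be read as a statement about data rather than display: once tagging events exist, the counts behind a cloud exist too, whether or not anything draws them. A minimal Python sketch, assuming tagging events arrive as (user, item, tag) tuples; the function name and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def clouds_by_item(tagging_events):
    """Group (user, item, tag) events into per-item tag counts.

    Each item's Counter is its implicit tag cloud, whether or not it is ever rendered.
    """
    clouds = defaultdict(Counter)
    for _user, item, tag in tagging_events:
        # Normalizing case and whitespace is an assumption, not a universal rule.
        clouds[item][tag.strip().lower()] += 1
    return dict(clouds)

# Hypothetical events, for illustration only.
events = [
    ("ann", "photo-1", "sunset"),
    ("bob", "photo-1", "Sunset"),
    ("bob", "photo-1", "beach"),
    ("cy", "story-7", "politics"),
]
clouds = clouds_by_item(events)
```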
It might be time for tag clouds to look into get­ting some sun­glasses.

Comment » | Tag Clouds

Cartograms, Tag Clouds and Visualization

May 22nd, 2006 — 10:56pm

I was enjoy­ing some of the engag­ing car­tograms avail­able from Worldmap­per, when I real­ized tag clouds might have some strong par­al­lels with car­tograms. After a quick sub­sti­tu­tion exer­cise, I’ve come to believe tag clouds could be to lists of meta­data what car­tograms are to maps; attempted solu­tions to sim­i­lar visu­al­iza­tion prob­lems dri­ven by com­mon and his­tor­i­cally con­sis­tent infor­ma­tion needs.
Here’s the train of thought behind the analogy. Cartograms are the distorted but captivating maps that change the familiar shapes of places on a map to visually show data about geographic locations. Cartograms change the way locations appear to make a point or communicate relative differences in the underlying data; for example, by making countries with higher GDP (gross domestic product) bigger, and those with lower GDP smaller. In the example below, Japan’s size is much larger than its geographic area, because its GDP is so high (it’s the dark green blob on the far right, much larger than China or India), while Africa is nearly invisible.
Gross Domes­tic Prod­uct

Tag clouds pursue the same goal: to enhance our understanding by communicating contextual meaning through changes in the way a set of things is visualized, relying on additional dimensions of information to make context explicit. Where cartograms change geographic units, tag clouds change the display of a list of labels (the end point of a chain of linkages connecting concepts to focuses) to communicate the semantic importance or context of the underlying concepts shown in the list.
Visu­ally, the rela­tion­ship of clouds to lists is sim­i­lar to that of maps and car­tograms; com­pare these two ren­der­ings of the most pop­u­lar search terms recorded by nytimes.com, one a sim­ple list and the other a tag cloud.
List Ren­der­ing of Search Terms

Cloud Ren­der­ing of Search Terms

This explanation of cartograms from Cartogram Central, a site supported by the U.S. Geological Survey and the National Center for Geographic Information and Analysis, makes the parallels clearer, in greater detail.
“A cartogram is a type of graphic that depicts attributes of geographic objects as the object’s area. Because a cartogram does not depict geographic space, but rather changes the size of objects depending on a certain attribute, a cartogram is not a true map. Cartograms vary on their degree in which geographic space is changed; some appear very similar to a map, however some look nothing like a map at all.”
Now for the cut and paste. Substitute ‘tag cloud’ for cartogram, ‘semantic’ for geographic, and ‘list’ for map, and the same explanation reads:
“A tag cloud is a type of graphic that depicts attributes of semantic objects as the object’s area. Because a tag cloud does not depict semantic space, but rather changes the size of objects depending on a certain attribute, a tag cloud is not a true list. Tag Clouds vary on their degree in which semantic space is changed; some appear very similar to a list, however some look nothing like a list at all.”
This is a good match for the cur­rent under­stand­ing of tag clouds.
Div­ing in deeper, Car­togram Cen­tral offers an excerpt from Car­tog­ra­phy: The­matic Map Design, that goes into more detail about the spe­cific char­ac­ter­is­tics of car­tograms.
“Erwin Raisz called cartograms ‘diagrammatic maps.’ Today they might be called cartograms, value-by-area maps, anamorphated images or simply spatial transformations. Whatever their name, cartograms are unique representations of geographical space. Examined more closely, the value-by-area mapping technique encodes the mapped data in a simple and efficient manner with no data generalization or loss of detail. Two forms, contiguous and non-contiguous, have become popular. Mapping requirements include the preservation of shape, orientation, contiguity, and data that have suitable variation. Successful communication depends on how well the map reader recognizes the shapes of the internal enumeration units, the accuracy of estimating these areas, and effective legend design. Complex forms include the two-variable map. Cartogram construction may be by manual or computer means. In either method, a careful examination of the logic behind the use of the cartogram must first be undertaken.”
Doing the same sub­sti­tu­tion exer­cise on this excerpt with the addi­tion of ‘rel­e­vance’ for value, ‘size’ for area, and ‘term’ for shape, yields sim­i­lar results:
“Erwin Raisz called tag clouds ‘diagrammatic lists.’ Today they might be called tag clouds, relevance-by-size lists, anamorphated images or simply spatial transformations. Whatever their name, tag clouds are unique representations of semantic space. Examined more closely, the relevance-by-size listing technique encodes the listed data in a simple and efficient manner with no data generalization or loss of detail. Two forms, contiguous and non-contiguous, have become popular. Listing requirements include the preservation of term, orientation, contiguity, and data that have suitable variation. Successful communication depends on how well the list reader recognizes the terms (of the internal enumeration units), the accuracy of estimating these sizes, and effective legend design. Complex forms include the two-variable list. Tag cloud construction may be by manual or computer means. In either method, a careful examination of the logic behind the use of the tag cloud must first be undertaken.”
The cor­re­spon­dence here is strong as well.
Sta­ble Need
The fact that cartograms and tag clouds show close parallels means that while the tag cloud may be a new user interface element emerging for the Web (and major desktop applications like Outlook, in the case of Taglocity), tag clouds as a type of visualization have strong precedents in other much more mature user experience contexts, such as the display of multiple dimensions of information within geographic or geospatial frames of reference. Instances of strong correspondence in problem-solving approach between mature and emerging contexts could indicate the simple application of parallel framing (from the mature context to the emerging context) as an untested conditional, until the true extent of the divergence separating the two contexts is understood. This is very common in new media.
Instead, in the case of tag clouds, I think it points at sta­ble needs dri­ving struc­turally sim­i­lar solu­tions to the basic prob­lem of how to visu­ally com­mu­ni­cate impor­tant rela­tion­ships and addi­tional dimen­sions of mean­ing under the lim­i­ta­tions of inher­ent flat­ness. The par­al­lels between car­tograms and tag clouds place the appear­ance of the tag cloud within the larger his­tory of con­tin­u­ing explo­ration of new ways of visu­al­iz­ing infor­ma­tion. In this view, tag clouds are a recent man­i­fes­ta­tion of the sta­ble need to cre­ate strong and effec­tive visual ways of con­vey­ing more than mem­ber­ship in a one-dimensional set (the list), or loca­tion and extent within a two-dimensional coör­di­nate sys­tem (the map).

1 comment » | Ideas, Tag Clouds

Tag Clouds: "A New User Interface?"

May 3rd, 2006 — 10:58pm

In Pivoting on tags to create better navigation UI, Matt McAllister offers the idea that we’re seeing “a new user interface evolving out of tag data,” and uses Wikio as an example. For context, he places tag clouds within a continuum of the evolution of web navigation, from list views to the new tag-based navigation emerging now.
It’s an insight­ful post, and it allows me to build on strong ground­work to talk more about why and how tag clouds dif­fer from ear­lier forms of nav­i­ga­tion, and will become [part of] a new user inter­face.
Matt iden­ti­fies five ‘leaps’ in web nav­i­ga­tion inter­faces that I’ll summarize:

  1. List view; a list of links
  2. Left-hand col­umn; a stan­dard loca­tion for lists of links used to navigate
  3. Search boxes and results pages; mak­ing very large lists manageable
  4. Tab nav­i­ga­tion; a list of other nav­i­ga­tion lists
  5. Tag nav­i­ga­tion; tag clouds

A Les­son in ‘Lis­tory’
As Matt mentions, all four predecessors to tag-based navigation are really variations on the underlying form of the list. There’s useful history in the evolution of lists as web navigation tools. Early lists used for navigation were static, chosen by a site owner, ordered, and flat: recall the lists of favorite sites that appeared at the bottom of so many early personal home pages.
These basic navigation lists evolved a variety of ordering schemes (alphabetical, numeric), began to incorporate hierarchy (shown as sub-menus in navigation systems, or as indenting in the left-hand column Matt mentions), and allowed users to change their ordering, for example by sorting on a variety of fields or columns in search results.
From static lists whose contents do not change rapidly and reflect a single point of view, the lists employed for web navigation and search results then became dynamic, personalized, and reflective of multiple points of view. Amazon and other e-commerce destinations offered recently viewed items (yours or others’), things most requested, sets bounded by date (published last year), sets driven by varying parameters (related articles), and lists determined by the navigation choices of others who followed similar paths.
But they remained fun­da­men­tally lists. They item­ized or enu­mer­ated choices of one kind or another, indi­cated implicit or explicit prece­dence through order­ing or the absence of order­ing, and were designed for lin­ear inter­ac­tion pat­terns: start at the begin­ning (or the end, if you pre­ferred an alter­na­tive per­spec­tive — I still habit­u­ally read mag­a­zines from back to front…) and work your way through.
Tag clouds are different from lists, often in contents and presentation, and more importantly in the basic assumption they make about the kind of interaction they encourage. On tag-based navigation Matt says, “This is a new layer that preempts the search box in a way. The visual representation of it is a tag cloud, but the interaction is more like a pivot.” Matt’s mention of the interaction hits on an aspect that’s key to understanding the differences between clouds and lists: clouds are not linear, and are not designed for linear consumption in the fashion of lists.
I’m not say­ing that no one will read clouds left to right (with Roman alpha­bets), or right to left if they’re in Hebrew, or in any other way. I’m say­ing that tag clouds are not meant for ‘read­ing’ in the same way that lists are. As they’re com­monly visu­al­ized today, clouds sup­port mul­ti­ple entry points using visual dif­fer­en­tia­tors such as color and size.
Start­ing in the mid­dle of a list and wan­der­ing around just increases the amount of visual and cog­ni­tive work involved, since you need to remem­ber where you started to com­plete your sur­vey. Start­ing in the “mid­dle” of a tag cloud — if there is such a loca­tion — with a brightly col­ored and big juicy visual morsel is *exactly* what you’re sup­posed to do. It could save you quite a lot of time and effort, if the cloud is well designed and prop­erly ren­dered.
Kunal Anand cre­ated a visu­al­iza­tion of the inter­sec­tions of his del.icio.us tags that shows the dif­fer­ences between a cloud and a list nicely. This is at heart a pic­ture, and accord­ingly you can start look­ing at it any­where / any­way you pre­fer.
Visu­al­iz­ing My Del.icio.us Tags

We all know what a list looks like…
iTunes Play Lists

What’s In a Name?
Describ­ing a tag cloud as a weighted list (I did until I’d thought about it fur­ther) misses this impor­tant qual­i­ta­tive dif­fer­ence, and reflects our early stages of under­stand­ing of tag clouds. The term “weighted list” is a list-centered view of tag clouds that comes from the pre­ced­ing frame of ref­er­ence. It’s akin to describ­ing a com­puter as an “arith­metic engine”, or the print­ing press as “mov­able type”.
[Aside: The label for tag clouds will probably change, as we develop concepts and language to frame the new user experiences and information environments that include clouds. For example, the language Matt uses in the posting and his screencast (the word ‘pivot’ when he talks about the experience of navigating via the tag cloud in Wikio, not the word ‘follow’, which is a default for describing navigation) reflects a possible shift in framing.]
A Cam­era Obscura For the Seman­tic Land­scape
I’ve come to think of a tag cloud as some­thing like the image pro­duced by a cam­era obscura.
Cam­era Obscura
Where the cam­era obscura ren­ders a real-world land­scape, a tag cloud shows a seman­tic land­scape like those cre­ated by Amber Frid-Jimenez at MIT.
Seman­tic Land­scape

Like a cam­era obscura image, a tag cloud is a fil­tered visu­al­iza­tion of a mul­ti­di­men­sional world. Unlike a cam­era obscura image, a tag cloud allows move­ment within the land­scape. And unlike a list, tag clouds can and do show rela­tion­ships more com­plex than one-dimensional lin­ear­ity (expe­ri­enced as prece­dence). This abil­ity to show more than one dimen­sion allows clouds to reflect the struc­ture of the envi­ron­ment they visu­al­ize, as well as the con­tents of that envi­ron­ment. This frees tag clouds from the lim­i­ta­tion of sim­ply item­iz­ing or enu­mer­at­ing the con­tents of a set, which is the fun­da­men­tal achieve­ment of a list.
Ear­lier, I shared some obser­va­tions on the struc­tural evo­lu­tion — from sta­tic and flat to hier­ar­chi­cal and dynamic — of the lists used as web nav­i­ga­tion mech­a­nisms. As I’ve ven­tured else­where, we may see a sim­i­lar evo­lu­tion in tag clouds.
It is already clear that we’re witnessing evolution in the presentation of tag clouds in step with their greater visualization capabilities. Clouds now rely on an expanding variety of visual cues to show an increasingly detailed view of the underlying semantic landscape: proximity, depth, brightness, intensity, color of item, color of field around item. I expect clouds will develop other cues to help depict the many connections (permanent or temporary) linking the labels in a tag cloud. It’s possible that tag clouds will offer a user experience similar to some of the ontology management tools available now.
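To show how more than one visual cue can carry meaning at once, here is a small Python sketch that renders a cloud as HTML, using size for one variable and brightness for another. Everything in it is an assumption made for illustration: the field names `count` and `days_since_used`, the helper names `scale` and `render_cloud`, the pixel and lightness ranges, and the fixed blue hue are invented, not drawn from any of the sites discussed here.

```python
from html import escape


def scale(value, lo, hi, out_lo, out_hi):
    """Linearly rescale value from [lo, hi] to [out_lo, out_hi]."""
    if hi == lo:
        return (out_lo + out_hi) / 2
    return out_lo + (value - lo) / (hi - lo) * (out_hi - out_lo)


def render_cloud(tags):
    """Render tags as HTML spans: count drives size, recency drives lightness."""
    if not tags:
        return ""
    counts = [t["count"] for t in tags]
    ages = [t["days_since_used"] for t in tags]
    spans = []
    # Alphabetical order keeps terms findable; the cues carry the emphasis.
    for tag in sorted(tags, key=lambda t: t["label"].lower()):
        size = round(scale(tag["count"], min(counts), max(counts), 12, 36))
        # Recently used tags are drawn darker (lower HSL lightness).
        lightness = round(
            scale(tag["days_since_used"], min(ages), max(ages), 20, 70)
        )
        spans.append(
            f'<span style="font-size:{size}px;color:hsl(210,60%,{lightness}%)">'
            f"{escape(tag['label'])}</span>"
        )
    return " ".join(spans)


# Hypothetical data, invented for this example.
example_tags = [
    {"label": "maps", "count": 2, "days_since_used": 30},
    {"label": "Tagging", "count": 10, "days_since_used": 0},
]
print(render_cloud(example_tags))
```

This is roughly the tag cloud counterpart of the "two-variable" form mentioned in the cartogram excerpt: each additional cue (proximity, field color, and so on) would need its own mapping from data to presentation.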
Is this “a new user inter­face”? That depends on how you define new. In Shap­ing Things, author and futur­ist Bruce Ster­ling sug­gests, “the future com­posts the past” — mean­ing that new ele­ments are sub­sumed into the accu­mu­la­tion of lay­ers past and present. In the con­text of nav­i­ga­tion sys­tems and tag clouds, that implies that we’ll see mix­tures of lists from the four pre­vi­ous stages of nav­i­ga­tion inter­face, and clouds from the lat­est leap; a fusion of old and new.
Examples of this composting abound, from 30daytags.com to Wikio, which Matt McAllister examined.
30DayTags.com Tag Clouds

Wikio Tag Cloud

As lists encouraged linear interactions as a result of their structure, it’s possible that new user interfaces relying on tag clouds will encourage different kinds of seeking or finding behaviors within information experiences. In “The endangered joy of serendipity” William McKeen bemoans the decline of serendipity as a result of precisely directed and targeted media, searching, and interactions. Tag clouds, by offering many connections and multiple entry paths simultaneously, may help rejuvenate the serendipity endangered by a world of closely focused lists.

2 comments » | Ideas, Tag Clouds
