Text Clouds: A New Form of Tag Cloud?

Dur­ing 2006, tag clouds moved beyond their well-known role as nav­i­ga­tion mech­a­nisms and indi­ca­tors of activ­ity within social media expe­ri­ences, emerg­ing as a stan­dard visu­al­iza­tion tech­nique for texts and tex­tual data in gen­eral.
This use of tag clouds does not com­monly involve tags, social net­works, emer­gent archi­tec­tures, folk­sonomies, or meta­data.
“Text cloud” might be a more accu­rate label for these visu­al­iza­tions than tag cloud. In addi­tion to rec­og­niz­ing fun­da­men­tal dif­fer­ences — text clouds dif­fer from tag clouds in com­po­si­tion (no tags at all) and pur­pose (pre­dom­i­nantly com­pre­hen­sion, rather than access or nav­i­ga­tion) — dis­tin­guish­ing the two types of clouds will make it much eas­ier to assess their abil­i­ties to sup­port user expe­ri­ence needs and busi­ness goals.
The emer­gence of this new form of text cloud looks like a good exam­ple of spe­ci­a­tion in action (though it’s too early to tell whether the end result will be clado­ge­n­e­sis or ana­ge­n­e­sis).
Major and minor pub­li­ca­tions feature(d) text clouds as visu­al­iza­tions in 2006, both per­ma­nently and temporarily:

The Economist’s Text cloud

In 2006, sev­eral free and pub­lic tools for gen­er­at­ing text clouds locally on the desk­top or via a ser­vice avail­able through the Web were released. The increase in the num­ber and vari­ety of spe­cific text cloud tools reflects embrace and enthu­si­asm for text clouds in com­mu­ni­ties of inter­est for infor­ma­tion visu­al­iza­tion, lan­guage pro­cess­ing, and seman­tics.
Some of the bet­ter known exam­ples of text cloud tools include:

The Many Eyes Cloud

The text clouds cre­ated with these tools range across a wide spec­trum of speeches and writing:

Text clouds are meant to facil­i­tate rapid under­stand­ing and com­pre­hen­sion of a body of words, links, phrases, etc. Any block of infor­ma­tion com­posed of text is open to analy­sis as a text cloud, as these screen cap­tures of text clouds for restau­rant menus, ingre­di­ents, wikipedia, mag­a­zine cov­ers, and even poems demon­strate.
Tim O’Reilly uses text clouds for a num­ber of pur­poses:

We used them a bunch to analyze the topics, companies and people at the last FOO Camp, and they were the most useful of the visualizations we did. They helped us see where we were under- and over-represented in terms of companies and particular technologies we were wanting to explore. …So they have many uses beyond just showing what we normally think of as tags.

Non-linear Access
The emer­gence of text clouds shows con­tin­u­ing explo­ration and refine­ment of cloud style dis­plays as a new form of user inter­face, adapted to spe­cific con­texts. Con­tin­ued refine­ment of text clouds in this direc­tion may indi­cate an expand­ing role for com­monly avail­able and sophis­ti­cated text visu­al­iza­tion tools to sup­port spe­cial­ized goals for infor­ma­tion dis­play and under­stand­ing.
Remember that Google is busy right now scanning thousands of books per day from several of the world’s major academic libraries, as part of its self-appointed labor of organizing the world’s information. That’s a lot of new text. How will people work effectively with such an overwhelming amount of text, of so many different kinds, from so many different sources?
Con­sider the fol­low­ing, from Ulysses’ With­out Guilt by Stacy Schiff (in the New York Times):
Recently Cath­leen Black, pres­i­dent of Hearst Mag­a­zines, urged a group of pub­lish­ing exec­u­tives to think of their audi­ence as con­sumers rather than read­ers. She’s onto some­thing: arguably the very def­i­n­i­tion of read­ing has changed. So Google asserts in defend­ing its right to scan copy­righted mate­ri­als. The process of dig­i­tiz­ing books trans­forms them, the com­pany con­tends, into some­thing else; our engage­ment with a text is dif­fer­ent when we call it up online. We are no longer read­ing. We’re search­ing — a func­tion that con­ve­niently did not exist when the con­cept of copy­right was estab­lished.
On a larger scale, the growing use of text clouds hints at a (potential) deeper cultural shift in the way we go about reading and comprehension: a shift from linear modes based on reading words and sentences, to nonlinear modes based on viewing summaries of content in aggregate as a way of discovering concepts and patterns. (Finally, a legitimate use for Twitter…) Experimenting with text clouds for non-linear reading and comprehension (now that’s a sexy term…) is a natural evolution of the role cloud style displays play as an alternative / complement / supplement to the list based navigation now dominant in user experiences.
A Text Cloud of Twit­ter Posts (A Twit­ter­Cloud?)

cre­ated at

I’m not pre­dict­ing the end of read­ing as we know it, nor the end of nav­i­ga­tion as we know it: both will be with us for a long, long time. But I do believe that text clouds might con­sti­tute an emerg­ing method for aug­ment­ing com­pre­hen­sion and dis­play of text, with broad poten­tial uses.
Enter­pris­ing Clouds
What about some­one lack­ing time to fully read a Shake­speare play, or a fad­dish busi­ness book, but who needs to under­stand some­thing about that book’s mean­ing and sub­stance? A text cloud cre­ation tool could extract the most com­monly men­tioned terms, and oth­er­wise pro­file the words that make up the text. It would be risky to rely on a shal­low text cloud (and Tim O’Reilly men­tions this specif­i­cally) for deep com­pre­hen­sion, but it would be enough to under­stand the con­cepts that appear, and allow polite con­ver­sa­tion at a net­work­ing event, or lunch with that cer­tain man­ager who rec­om­mended the book.
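As a rough illustration of what "extract the most commonly mentioned terms" could mean in practice, here is a minimal Python sketch. It is not how any of the tools mentioned above actually work: the function names, the tiny stop-word list, and the linear mapping from word count to font size are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real tool would use a longer one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that", "be", "not"}


def term_frequencies(text, top_n=10):
    """Return the top_n most common non-stop-word terms as (term, count) pairs."""
    # Lowercase the text and pull out words, allowing one internal apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


def font_sizes(frequencies, min_size=10, max_size=48):
    """Map each term's count linearly onto a font size (assumed scaling rule)."""
    if not frequencies:
        return {}
    counts = [count for _, count in frequencies]
    low, high = min(counts), max(counts)
    span = (high - low) or 1  # avoid dividing by zero when all counts are equal
    return {
        term: round(min_size + (count - low) * (max_size - min_size) / span)
        for term, count in frequencies
    }


if __name__ == "__main__":
    sample = "cloud cloud cloud tag tag text the the the the"
    for term, size in font_sizes(term_frequencies(sample)).items():
        print(term, size)
```

The most frequent term gets the largest size and the least frequent the smallest, which is the basic visual logic of a cloud. Everything a reader would need for deeper comprehension (context, word order, meaning) is thrown away, which is exactly why relying on a cloud alone is risky.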
If I were entrepreneurial, I’d source a set of free electronic versions of classic texts, process them with one of the free text cloud tools, apply some XSLT and other transformations to generate consistent readable formatting, and sell the results as a line of ebooks called “Cloud Notes”. Of course, someone’s beaten me to it already.
What’s in store for the future?
In this fashion, text clouds may become a generally applied tool for managing growing information overload by using automated synthesis and summarization. In the information saturated future (or the information saturated present), text clouds are the common executive summary on steroids and acid simultaneously; assembled with muscular syntactical and semantic processing, and fed to reading-fatigued post-literates as swirling blobs of giant words in wild colors, they consist of signifiers for reified concepts that tweak the eye-brain-language conduit directly.

Designers, Meet Systems (Recommended Reading)

2007 looks to be the year that the user expe­ri­ence, infor­ma­tion archi­tec­ture, and design com­mu­ni­ties embrace sys­tems think­ing and con­cepts.
It’s a meet­ing that’s been in the mak­ing for a while -
At the 2006 IA Sum­mit, Karl Fast and D. Grant Camp­bell pre­sented From Pace Lay­er­ing to Resilience The­ory: the Com­plex Impli­ca­tions of Tag­ging for Infor­ma­tion Archi­tec­ture.
Gene Smith has been writ­ing about sys­tems for a while. At the 2007 sum­mit Gene and Matthew Milan will dis­cuss some prac­ti­cal tech­niques in their pre­sen­ta­tion Rich map­ping and soft sys­tems: new tools for cre­at­ing con­cep­tual mod­els.
Peter Merholz has been posting and talking about the implications of some of these ideas often.
– and seems to have reached crit­i­cal mass recently:

Here’s a set of read­ing rec­om­men­da­tions related to sys­tems and sys­tem think­ing. These books, feeds, and arti­cles either talk about sys­tems and the ideas and con­cepts behind this way of think­ing, or con­tain work that is heav­ily informed by sys­tems think­ing. Either way, they’re good resources for learn­ing more.
Resilience Sci­ence recently fea­tured three excel­lent essays on the work of C.S. Holling

And for a lighter read, try anything by author Bruce Sterling that features his recurring character Leggy Starlitz, a self-described systems analyst (likely the first example of one in a work of fiction that’s even moderately well known…). His stories Hollywood Kremlin, Are You for 86?, and The Littlest Jackal (two in the short story collection Globalhead) are good places to start. The novel Zeitgeist focuses on Starlitz.
Sus­tain­abil­ity, Sta­bil­ity, and Resilience
For a long time, we’ve needed to bridge the gulf between views of design rooted in static notions of form and function and the fluid reality of life. I hope this new friendship lasts a while.

2.3% Of Chinese Internet Users Tag, Baidu Reports

A post­ing from China Web2.0 Review shared results of a report on Chi­nese tag­ging rates released by Baidu, China’s lead­ing search engine.
I was not able to locate a trans­la­tion of the orig­i­nal report from Baidu, so I’ll quote the sum­mary from China Web2.0 Review:

Accord­ing to the report, only 2.3% of inter­net users have ever used tag, they mainly use tags in social book­mark­ing and blogs. I don’t know the meth­ods of data col­lec­tion, but the report said about 15 mil­lion Chi­nese web­pages were book­marked by users, on aver­age each user has saved 40 online book­marks. Among them, over 90% users add less than two tags for a book­mark.
And based on the tags of user saved book­marks, the most used tags are “soft­ware down­load”, “BBS”, “enter­tain­ment”, “game” and “learn­ing”.

We don’t know which ser­vices are included for analy­sis in the report, so I have no idea to which extent I can trust it. But based on my obser­va­tion, I agree with the basic find­ing of the report, even though more and more ser­vices have embod­ied tag­ging fea­ture, only a very small part of early-adopters in China indeed use it.
Two things come to mind right away:

  1. The maturity, structure, and usage patterns of the Internet in China are not directly comparable to the maturity, structure, and usage patterns of the Internet elsewhere (largely due to substantial restrictions and censorship by the Chinese government)
  2. Official Chinese positions are not fully reliable, and so the numbers, context, and usage described could be very different from real practices

Still, even with the absence of solid qual­i­fy­ing, cor­rob­o­rat­ing, or con­tex­tual infor­ma­tion, this rate of adop­tion for tag­ging seems con­sis­tent with the rest of the very rapid pace of mod­ern­iza­tion in China.
And as the First Prin­ci­ple of Tag Clouds — “Where there’s tags, there’s a tag cloud” — says, this means there are quite a few tag clouds on the way in China.

