Endeca Guided Navigation vs. Facets In Search Experiences

February 26th, 2007 — 5:49pm

A recent ques­tion on the mail­ing list for the Tax­on­omy Com­mu­nity of Prac­tice asked about search ven­dors whose prod­ucts han­dle faceted nav­i­ga­tion, and men­tioned Endeca. Because ven­dor mar­ket­ing dis­torts the mean­ing of accepted terms too often, it’s worth point­ing out that Endeca’s tools dif­fer from faceted nav­i­ga­tion and orga­ni­za­tion sys­tems in a num­ber of key ways. These dif­fer­ences should affect strat­egy and pur­chase deci­sions on the best approach to pro­vid­ing high qual­ity search expe­ri­ences for users.
The Endeca model is based on Guided Nav­i­ga­tion, a prod­uct con­cept that blends ele­ments of user expe­ri­ence, admin­is­tra­tion, func­tion­al­ity, and pos­si­ble infor­ma­tion struc­tures. In prac­tice, guided nav­i­ga­tion feels sim­i­lar to facets, in that sets of results are nar­rowed or fil­tered by suc­ces­sive choices from avail­able attrib­utes (Endeca calls them dimen­sions).
But at heart, Endeca’s approach is dif­fer­ent in key ways.

  • Facets are orthog­o­nal, whereas Endeca’s dimen­sions can overlap.
  • Facets are ubiq­ui­tous, so always apply, whereas Endeca’s dimen­sions can be con­di­tional, some­times apply­ing and some­times not.
  • Facets reflect a fun­da­men­tal char­ac­ter­is­tic or aspect of the pool of items. Endeca’s Dimen­sions may reflect some aspect of the pool of items (pri­mary prop­er­ties), they may be inferred (sec­ondary prop­er­ties), they may be out­side cri­te­ria, etc.
  • The values possible for an individual facet are flat and equivalent. Endeca’s dimensions can contain various kinds of structures (unless I’m mistaken), and may not be equivalent.
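To make the facet side of this comparison concrete, here is a minimal sketch in Python of faceted narrowing. The catalog, the facet names, and the helper functions are all hypothetical and invented for illustration; nothing here reflects Endeca's product or API. It assumes every item has a value for every facet, and that values within a facet are flat.

```python
from collections import Counter

# Hypothetical catalog: every item carries a value for every facet
# (facets always apply), and the values within a facet are flat.
ITEMS = [
    {"name": "Trail shoe", "color": "red", "size": "M", "material": "mesh"},
    {"name": "Road shoe", "color": "blue", "size": "M", "material": "mesh"},
    {"name": "Hiking boot", "color": "red", "size": "L", "material": "leather"},
    {"name": "Sandal", "color": "blue", "size": "L", "material": "leather"},
]
FACETS = ("color", "size", "material")


def narrow(items, selections):
    """Keep the items that match every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def remaining_values(items, selections):
    """Count the values still available in each facet not yet selected."""
    matched = narrow(items, selections)
    return {
        facet: Counter(item[facet] for item in matched)
        for facet in FACETS
        if facet not in selections
    }


def widen(items, item, facet):
    """Expand from one item to all items sharing its value for a facet."""
    return [other for other in items if other[facet] == item[facet]]
```

Because the facets in this sketch are independent of one another, selections can be applied in any order with the same result, and `widen` walks the opposite direction, from a single item out to a larger set.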

In terms of appli­ca­tion to var­i­ous kinds of busi­ness needs and user expe­ri­ences, facets can offer great power and util­ity for quickly iden­ti­fy­ing and manip­u­lat­ing large num­bers of sim­i­lar or sym­met­ri­cal items, typ­i­cally in nar­rower domains. Endeca’s guided nav­i­ga­tion is well suited to broader domains (though there is still a sin­gle root at the base of the tree), with fuzzier struc­tures than facets.
Operatively, facets often don’t serve well as a unifying solution to the need for providing structure and access to heterogeneous collections, and can encounter scaling difficulties when used for homogeneous collections. Faceted experiences can offer genuine bidirectional navigation for users, meaning they work equally well for navigation paths that narrow from larger collections to a single item and for paths that expand item sets from a single item to larger collections of similar items, because of the symmetry built in to faceted systems.
Guided navigation is better able to handle heterogeneous collections, but is not as precise for identification, does not reflect structure, and requires attention to define correctly (in ways that are not confusing or conflicting) and to manage over time. Endeca’s dimensions do not offer bidirectional navigation by default because of their structural differences, though it is possible to create user experiences that support bidirectional navigation using Endeca.
In sum, these differences should help explain two things: the popularity of Endeca in ecommerce contexts, where every architectural incentive to increase the total value of customer purchases is significant (even incentives that may not align with user goals), and the relevance of facets to searching and information retrieval experiences that support a broader set of user goals within narrower information domains.

10 Best Practices For Displaying Tag Clouds

February 25th, 2007 — 1:31am

This is a short list of best practices for rendering and displaying tag clouds that I originally circulated on the IXDG mailing list, and am now posting in response to several requests. These best practices are not in order of priority; they’re simply enumerated.

  1. Use a sin­gle color for the tags in the ren­dered cloud: this will allow vis­i­tors to iden­tify finer dis­tinc­tions in the size dif­fer­ences. Employ more than one color with dis­cre­tion. If using more than one color, offer the capa­bil­ity to switch between sin­gle color and mul­ti­ple color views of the cloud.
  2. Use a sin­gle sans serif font fam­ily: this will improve the over­all read­abil­ity of the ren­dered cloud.
  3. If accu­rate com­par­i­son of rel­a­tive weight (see­ing the size dif­fer­ences amongst tags) is more impor­tant than over­all read­abil­ity, use a mono­space font.
  4. If com­pre­hen­sion of tags and under­stand­ing the mean­ing is more impor­tant, use a vari­ably spaced font that is easy to read.
  5. Use con­sis­tent and pro­por­tional spac­ing to sep­a­rate the tags in the ren­dered tag cloud. Pro­por­tional means that the spac­ing between tags varies based on their size; typ­i­cally more space is used for larger sizes. Con­sis­tent means that for each tag of a cer­tain size, the spac­ing remains the same. In html, spac­ing is often deter­mined by set­ting style para­me­ters like padding or mar­gins for the indi­vid­ual tags.
  6. Avoid sep­a­ra­tor char­ac­ters between tags: they can be con­fused for small tags.
  7. Carefully consider rendering in Flash, or another vector-based method, if your users will experience the cloud largely through older browsers / agents: the font rendering in older browsers is not always good or consistent, but it is important that the cloud offer text that is readily digestible by search and indexing engines, both locally and publicly.
  8. If ren­der­ing the cloud in html, set the font size of ren­dered tags using whole per­cent­ages, rather than pixel sizes or dec­i­mals: this gives the dis­play agent more free­dom to adjust its final rendering.
  9. Do not insert line breaks: this allows the ren­der­ing agent to adjust the place­ment of line breaks to suit the ren­der­ing context.
  10. Offer the abil­ity to change the order between at least two options — alpha­bet­i­cal, and one vari­able dimen­sion (over­all weight, fre­quency, recency, etc.)
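Several of these practices can be sketched together in a small HTML renderer. This is only one possible approach, written in Python; the function names, the 80% to 240% size range, the color value, and the 0.25em margin are assumptions chosen for illustration, not recommendations from the list above.

```python
from html import escape


def font_percent(weight, min_weight, max_weight, smallest=80, largest=240):
    """Map a tag weight to a whole-number font-size percentage."""
    if max_weight == min_weight:
        return smallest
    scale = (weight - min_weight) / (max_weight - min_weight)
    return round(smallest + scale * (largest - smallest))


def render_cloud(weights, order="alpha"):
    """Render tags as one line of spans; order is 'alpha' or 'weight'."""
    if not weights:
        return '<div style="font-family: sans-serif; color: #336;"></div>'
    if order == "weight":
        tags = sorted(weights, key=lambda tag: (-weights[tag], tag))
    else:
        tags = sorted(weights)
    low, high = min(weights.values()), max(weights.values())
    spans = []
    for tag in tags:
        size = font_percent(weights[tag], low, high)
        # em units scale with each span's own font size, so spacing is
        # proportional to tag size and consistent for equal sizes.
        spans.append(
            f'<span style="font-size: {size}%; margin: 0 0.25em;">'
            f"{escape(tag)}</span>"
        )
    # One color and one sans serif family are set once on the container.
    # Spans are joined with plain spaces: no separator characters and no
    # forced line breaks, so the browser decides where lines wrap.
    return (
        '<div style="font-family: sans-serif; color: #336;">'
        + " ".join(spans)
        + "</div>"
    )
```

Calling `render_cloud({"font": 10, "color": 6, "size": 2})` gives the alphabetical view, and passing `order="weight"` gives the second ordering that practice 10 asks for.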

For fun, I’ve run these 10 best prac­tices through Tagcrowd. The major con­cepts show up well — font, color, and size are promi­nent — but obvi­ously the specifics of the things dis­cussed remain opaque.
Best Prac­tices For Dis­play as a Text Cloud

Smart Scoping For Content Management: Use The Content Scope Cycle

February 19th, 2007 — 4:01pm

Con­tent man­age­ment efforts are justly infa­mous for exceed­ing bud­gets and time­lines, despite mak­ing con­sid­er­able accom­plish­ments. Exag­ger­ated expec­ta­tions for tool capa­bil­i­ties (ven­dors promise a world of automagic sim­plic­ity, but don’t believe the hype) and the poten­tial value of cost and effi­ciency improve­ments from man­ag­ing con­tent cre­ation and dis­tri­b­u­tion play a sub­stan­tial part in this. But unre­al­is­tic esti­mates of the scope of the con­tent to be man­aged make a more impor­tant con­tri­bu­tion to most cost and time over­runs.
Scope in this sense is a com­bi­na­tion of the quan­tity and the qual­ity of con­tent; smaller amounts of very com­plex con­tent sub­stan­tially increase the over­all scope of needs a CM solu­tion must man­age effec­tively. By anal­ogy, imag­ine build­ing an assem­bly line for toy cars, then decid­ing it has to han­dle the assem­bly of just a few full size auto­mo­biles at the same time.
Early and inac­cu­rate esti­mates of con­tent scope have a cas­cad­ing effect, decreas­ing the accu­racy of bud­gets, time­lines, and resource fore­casts for all the activ­i­ties that fol­low.
In a typ­i­cal con­tent man­age­ment engage­ment, the activ­i­ties affected include:

  • tak­ing a con­tent inventory
  • defin­ing con­tent models
  • choos­ing a new con­tent man­age­ment system
  • design­ing con­tent struc­tures, work­flows, and metadata
  • migrat­ing con­tent from one sys­tem to another
  • refresh­ing and updat­ing content
  • estab­lish­ing sound gov­er­nance mechanisms

The Root of the Prob­lem
Two mis­con­cep­tions — and two com­mon but unhealthy prac­tices, dis­cussed below — drive most con­tent scope esti­mates. First: the scope of con­tent is know­able in advance. Sec­ond, and more mis­lead­ing, scope remains fixed once defined. Nei­ther of these assump­tions is valid: iden­ti­fy­ing the scope of con­tent with accu­racy is unlikely with­out a com­pre­hen­sive audit, and con­tent scope (ini­tial, revised, actual) changes con­sid­er­ably over the course of the CM effort.
Together, these assumptions make it very difficult for program directors, project managers, and business sponsors to set accurate and detailed budget and timeline expectations. The uncertain or shifting scope of most CM efforts conflicts directly with business imperatives to carefully manage IT capital investment and spending, a necessity in most funding processes, and especially at the enterprise level. Instead of estimating specific numbers long in advance of reality (as with the Iraq war budget), a better approach is to embrace fluidity, and plan to refine scope estimates at punctuated intervals, according to the natural cycle of content scope change.
Under­stand­ing the Con­tent Scope Cycle
Con­tent scope changes accord­ing to a pre­dictable cycle that is largely inde­pen­dent of the specifics of a project, sys­tem, orga­ni­za­tional set­ting, and scale. This cycle seems con­sis­tent at the level of local CM efforts for a sin­gle busi­ness unit or iso­lated process, and at the level of enter­prise scale con­tent man­age­ment efforts. Under­stand­ing the cycle makes it pos­si­ble to pre­pare for shifts in a qual­i­ta­tive sense, account­ing for the kind of vari­a­tion to expect while plan­ning and set­ting expec­ta­tions with stake­hold­ers, solu­tion users, spon­sors, and con­sumers of the man­aged con­tent.
The Con­tent Scope Cycle
The high peak and ele­vated moun­tain val­ley shape in this illus­tra­tion tell the story of scope changes through the course of most con­tent man­age­ment efforts. From the ini­tial inac­cu­rate esti­mate, scope climbs con­sis­tently and steeply dur­ing the dis­cov­ery phase, peak­ing in poten­tial after all dis­cov­ery activ­i­ties con­clude. Scope then declines quickly, but not to the orig­i­nal level, as assess­ments cull unneeded con­tent. Scope lev­els out dur­ing sys­tem / solu­tion / infra­struc­ture cre­ation, and climbs mod­estly dur­ing revi­sion and replace­ment activ­i­ties. At this point, the actual scope is known. Mea­sured increases dri­ven by the incor­po­ra­tion of sup­ple­men­tal mate­r­ial then increase scope in stages.
Local and Enter­prise Cycles
Apply­ing the context-independent view of the cycle to a local level reveals a close match with the activ­i­ties and mile­stones for a con­tent man­age­ment effort for a small body of con­tent, a sin­gle busi­ness unit of a larger orga­ni­za­tion, or a self-contained busi­ness process.
Local Con­tent Man­age­ment Scope Cycle
At the enterprise level, the cycle is the same. This illustration shows activities and milestones for a content management effort for a large and diverse body of content, multiple business units of a larger organization, or multiple and interconnected business processes.
Enter­prise Con­tent Man­age­ment Scope Cycle
Scope Cycle Changes
This graph shows the amount of scope change at each mile­stone, ver­sus its pre­de­ces­sor. Look­ing at the changes for any pat­terns of clus­ter­ing and fre­quency, it’s easy to see the cycle breaks down into three major phases: an ini­tial period of dynamic insta­bil­ity, a sta­tic and sta­ble phase, and a con­clud­ing (and ongo­ing, if the effort is suc­cess­ful) phase of dynamic sta­bil­ity.
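The per-milestone change described above is a difference between each milestone and the one before it. A minimal sketch of that calculation in Python follows; the milestone names are paraphrased from the cycle description, and the scope values are purely illustrative placeholders, not figures from the graphs.

```python
# Illustrative numbers only: relative scope at each milestone, with the
# initial estimate set to 100. These values are NOT taken from the graphs.
MILESTONES = [
    ("initial estimate", 100),
    ("discovery complete", 260),
    ("assessment complete", 170),
    ("solution built", 170),
    ("revision and replacement", 185),
    ("supplemental material added", 200),
]


def scope_changes(milestones):
    """Return (milestone, change versus its predecessor) pairs."""
    return [
        (name, scope - previous_scope)
        for (_, previous_scope), (name, scope) in zip(milestones, milestones[1:])
    ]


for name, change in scope_changes(MILESTONES):
    print(f"{name}: {change:+d}")
```

With placeholder values shaped like the cycle, the large swings cluster at the early milestones, the middle shows no change, and the later milestones show small steady increases.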
Scope Cycle Phases
Where does the extra scope come from? In other words, what’s the source of the unexpected quantity and complexity of content behind the spikes and drops in expected scope in the first two phases? And what drives the shifts from one phase to another?
Bad CM Habits
Two common approaches account for a majority of the dramatic shifts in content scope. Most significantly, those people with immediate knowledge of the content quantity and complexity rarely have a direct voice in setting the scope and timeline expectations. Too often, stakeholders with expertise in other areas (IT, enterprise architecture, application development) frame the problem and the solution far in advance. The content creators, publishers, distributors, and consumers are not involved early enough.
Secondly, those who frame the problem make assumptions about quantity and complexity that trend low. (This is the companion to the exaggeration of tool capabilities.) Each new business unit, content owner, and system administrator whose items are included in the effort will increase the scope of the content in quantity, complexity, or both. Ongoing identification of new or unknown types of content, workflows, business rules, usage contexts, storage modes, applications, formats, syndication instances, systems, and repositories will continue to increase the scope until all relevant parties (creators, consumers, administrators, etc.) are engaged, and their needs and content collections fully understood.
The result is clear: a series of substantial scope errors, of both under- and over-estimation in comparison to the actual scope, concentrated in the first phase of the scope cycle.
Scope Errors
Smart Scop­ing
The scope cycle seems to be a fundamental pattern; likely an emergent aspect of the environments and systems underlying it, but that’s another discussion entirely. Failing to allow for the natural changes in scope over the course of a content management effort ties your success to inaccurate estimates, and thus to false expectations.
Smart scop­ing means allow­ing for and antic­i­pat­ing the inher­ent mar­gins of error when set­ting expec­ta­tions and mak­ing esti­mates. The most straight­for­ward way to put this into prac­tice and account for the likely mar­gins of error is to adjust the tim­ing of a scope esti­mate to the nec­es­sary level of accu­racy.
Rel­a­tive Scope Esti­mate Accu­racy
Scop­ing and Bud­get­ing
Esti­ma­tion prac­tices that respond to the con­tent scope cycle can still sat­isfy busi­ness needs. At the enter­prise CM level, IT spend­ing plans and invest­ment frame­works (often part of enter­prise archi­tec­ture plan­ning processes) should allow for nat­ural cycles by defin­ing classes or kinds of esti­mates based on com­par­a­tive degree of accu­racy, and the estimator’s lee­way for meet­ing or exceed­ing implied com­mit­ments. Enter­prise frame­works will iden­tify when more or less accu­rate esti­mates are needed to move through fund­ing and approval gate­ways, based on each organization’s invest­ment prac­tices.
And at the local CM level, project planning and resource forecasting methods should allow for incremental allocation of resources to meet task and activity needs. Taking a content inventory is a substantial labor on its own, for example. The same is true of migrating a body of content from one or more sources to a new CM solution that incorporates changed content structures such as workflows and information architectures. The architectural, technical, and organizational capabilities and staff needed for inventorying and migrating content can often be met by relying on content owners and stakeholders, or hiring contractors for short and medium-term assistance.
Par­al­lels To CM Spend­ing Pat­terns
The con­tent scope cycle strongly par­al­lels the spend­ing pat­terns dur­ing CMS imple­men­ta­tion James Robert­son iden­ti­fied in June of 2005. I think the scope cycle cor­re­lates with the spend­ing pat­tern James found, and it may even be a dri­ving fac­tor.
Scop­ing and Matu­rity
Unre­al­is­tic scope esti­ma­tion that does not take the con­tent scope cycle into account is typ­i­cal of orga­ni­za­tions under­tak­ing a first con­tent man­age­ment effort. It is also com­mon in orga­ni­za­tions with con­tent man­age­ment expe­ri­ence, but low lev­els of con­tent man­age­ment matu­rity.
Two (infor­mal) sur­veys of CMS prac­ti­tion­ers span­ning the past three years show the preva­lence of scop­ing prob­lems. In 2004, Vic­tor Lom­bardi reported: “Of all tasks in a con­tent man­age­ment project, the cre­ation, edit­ing, and migra­tion of con­tent are prob­a­bly the most fre­quently under­es­ti­mated on the project plan.” [in Man­ag­ing the Com­plex­ity of Con­tent Man­age­ment].
And two weeks ago, Rita War­ren of CMSWire shared the results of a recent sur­vey on chal­lenges in con­tent man­age­ment (Things That Go Bump In Your CMS).

The top 5 chal­lenges (most often ranked #1) were:

  1. Clar­i­fy­ing busi­ness goals
  2. Gain­ing and main­tain­ing exec­u­tive support
  3. Redesigning/optimizing busi­ness processes
  4. Gain­ing con­sen­sus among stakeholders
  5. Prop­erly scop­ing the project

…“Prop­erly scop­ing the project” was actu­ally the most pop­u­lar answer, show­ing up in the top 5 most often.
Accu­rate scop­ing is much eas­ier for orga­ni­za­tions with high lev­els of con­tent man­age­ment matu­rity. As the error mar­gins inher­ent in early and inac­cu­rate scope esti­mates demon­strate, there is con­sid­er­able ben­e­fit in cre­at­ing mech­a­nisms and tools for effec­tively under­stand­ing the quan­tity and qual­ity of con­tent requir­ing man­age­ment, as well as the larger busi­ness con­text, solu­tion gov­er­nance, and orga­ni­za­tional cul­ture concerns.

PEW Report Shows 28% Of Internet Users Have Tagged

February 1st, 2007 — 2:30pm

The Pew Internet & American Life Project just released a report on tagging that finds 28% of internet users have tagged or categorized content online such as photos, news stories or blog posts. On a typical day online, 7% of internet users say they tag or categorize online content.

The authors note “This is the first time the Project has asked about tagging, so it is not clear exactly how fast the trend is growing.”
Wow, I’d say it’s growing extremely quickly. Though I am on record as a believer in the bright future of tag clouds, I admit I’m surprised by these results. The fact that 7% of internet users tag daily is what’s most significant: it’s an indication of very rapid adoption for the practice of tagging in many different contexts and many different kinds of audiences, given its brief history.
I’d guess this adop­tion rate com­pares to the rates of adop­tion for other new network-dependent or emer­gent archi­tec­tures like P2P music shar­ing or on-line music buy­ing.
You’re cor­rect if you’re think­ing there is a dif­fer­ence between tag­ging and tag clouds. And if you’ve read the report and the accom­pa­ny­ing inter­view with Dr. Wein­berger, you’ve likely real­ized that nei­ther Dr. Weinberger’s inter­view nor the report specif­i­cally addresses tag cloud usage. But remem­ber the First Prin­ci­ple of Tag Clouds: “Where there’s tags, there’s a tag cloud.” By def­i­n­i­tion, any item with an asso­ci­ated col­lec­tion of tags has a tag cloud, regard­less of whether that tag cloud is directly vis­i­ble and action­able in the user expe­ri­ence. So that 7% of inter­net users who tag daily are by default cre­at­ing and work­ing with tag clouds daily.
It might be time for tag clouds to look into get­ting some sun­glasses.

