Archives: February 2007

Endeca Guided Navigation vs. Facets In Search Experiences

February 26, 2007 05:49 PM | Posted in: User Experience (UX)

A recent question on the mailing list for the Taxonomy Community of Practice asked about search vendors whose products handle faceted navigation, and mentioned Endeca. Because vendor marketing distorts the meaning of accepted terms too often, it's worth pointing out that Endeca's tools differ from faceted navigation and organization systems in a number of key ways. These differences should affect strategy and purchase decisions on the best approach to providing high quality search experiences for users.

The Endeca model is based on Guided Navigation, a product concept that blends elements of user experience, administration, functionality, and possible information structures. In practice, guided navigation feels similar to facets, in that sets of results are narrowed or filtered by successive choices from available attributes (Endeca calls them dimensions).

But at heart, Endeca's approach is different in key ways.

In terms of application to various kinds of business needs and user experiences, facets can offer great power and utility for quickly identifying and manipulating large numbers of similar or symmetrical items, typically in narrower domains. Endeca's guided navigation is well suited to broader domains (though there is still a single root at the base of the tree), with fuzzier structures than facets.

Operatively, facets often don't serve well as a unifying solution to the need for providing structure and access to heterogeneous collections, and can encounter scaling difficulties when used for homogeneous collections. Faceted experiences can offer genuine bidirectional navigation for users, meaning they work equally well for navigation paths that narrow large item sets down to a single item and for paths that expand from a single item to larger collections of similar items, because of the symmetry built into faceted systems.
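As a rough illustration of that narrowing and expanding, here is a minimal sketch in Python. The catalog records, facet names, and helper functions (`narrow`, `facet_counts`, `expand_from`) are all hypothetical, invented for this example; they do not represent Endeca's product or any vendor's API.

```python
from collections import Counter

# Hypothetical catalog: every item carries the same facets (the symmetry
# that faceted systems rely on).
ITEMS = [
    {"id": 1, "type": "camera", "brand": "Acme", "color": "black"},
    {"id": 2, "type": "camera", "brand": "Acme", "color": "silver"},
    {"id": 3, "type": "camera", "brand": "Bolt", "color": "black"},
    {"id": 4, "type": "lens", "brand": "Acme", "color": "black"},
]


def narrow(items, **choices):
    """Keep only the items that match every chosen facet value."""
    return [it for it in items if all(it.get(f) == v for f, v in choices.items())]


def facet_counts(items, facet):
    """Count the remaining items per value of one facet."""
    return Counter(it[facet] for it in items)


def expand_from(item, items, facet):
    """Go the other way: from one item to all items sharing its facet value."""
    return [it for it in items if it[facet] == item[facet]]


# Narrow by successive choices.
cameras = narrow(ITEMS, type="camera")
black_cameras = narrow(cameras, color="black")

# Expand from a single item to its larger set of similar items.
same_brand = expand_from(ITEMS[0], ITEMS, "brand")
```

Because every item exposes the same facets, the same attribute that filters a set down can also lead from one item back out to its neighbors.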

Guided navigation is better able to handle heterogeneous collections, but it is not as precise for identification, does not reflect structure, and requires attention to define correctly (in ways that are neither confusing nor conflicting) and to manage over time. Endeca's dimensions do not offer bidirectional navigation by default because of their structural differences, though it is possible to create user experiences that support bidirectional navigation using Endeca.

In sum, these differences should help explain two things: the popularity of Endeca in ecommerce contexts, where every architectural incentive (even those that may not align with user goals) to increase the total value of customer purchases is significant; and the relevance of facets to search and information retrieval experiences that support a broader set of user goals within narrower information domains.

local tags: endeca, enterprise, facets, ia, information_retrieval, navigation, search, ux

10 Best Practices For Displaying Tag Clouds

February 25, 2007 01:31 AM | Posted in: Tag Clouds

This is a short list of best practices for rendering and displaying tag clouds that I originally circulated on the IXDG mailing list, and am now posting in response to several requests. These best practices are not in order of priority - they're simply enumerated.

  1. Use a single color for the tags in the rendered cloud: this will allow visitors to identify finer distinctions in the size differences. Employ more than one color with discretion. If using more than one color, offer the capability to switch between single color and multiple color views of the cloud.
  2. Use a single sans serif font family: this will improve the overall readability of the rendered cloud.
  3. If accurate comparison of relative weight (seeing the size differences amongst tags) is more important than overall readability, use a monospace font.
  4. If comprehension of tags and understanding the meaning is more important, use a variably spaced font that is easy to read.
  5. Use consistent and proportional spacing to separate the tags in the rendered tag cloud. Proportional means that the spacing between tags varies based on their size; typically more space is used for larger sizes. Consistent means that for each tag of a certain size, the spacing remains the same. In HTML, spacing is often determined by setting style parameters like padding or margins for the individual tags.
  6. Avoid separator characters between tags: they can be confused for small tags.
  7. Carefully consider rendering in Flash, or another vector-based method, if your users will experience the cloud largely through older browsers / agents: the font rendering in older browsers is not always good or consistent, but it is important that the cloud offer text that is readily digestible by search and indexing engines, both locally and publicly.
  8. If rendering the cloud in HTML, set the font size of rendered tags using whole percentages, rather than pixel sizes or decimals: this gives the display agent more freedom to adjust its final rendering.
  9. Do not insert line breaks: this allows the rendering agent to adjust the placement of line breaks to suit the rendering context.
  10. Offer the ability to change the order between at least two options - alphabetical, and one variable dimension (overall weight, frequency, recency, etc.).
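
Several of these practices (whole-percentage font sizes, no separator characters, no inserted line breaks, and a choice of sort order) can be sketched in a few lines of Python. The function name `render_cloud`, the 100% to 250% size range, the CSS class names, and the sample tag counts are assumptions made for illustration, not a reference implementation.

```python
from html import escape


def render_cloud(counts, order="alpha", min_pct=100, max_pct=250):
    """Return an HTML tag cloud as one unbroken string of <a> elements."""
    if not counts:
        return '<div class="tag-cloud"></div>'
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all weights match

    # Practice 10: offer alphabetical order plus one variable dimension.
    if order == "weight":
        tags = sorted(counts, key=lambda t: (-counts[t], t.lower()))
    else:
        tags = sorted(counts, key=str.lower)

    links = []
    for tag in tags:
        # Practice 8: scale linearly, then round to a whole percentage.
        pct = round(min_pct + (counts[tag] - lo) / span * (max_pct - min_pct))
        links.append(f'<a class="tag" style="font-size:{pct}%">{escape(tag)}</a>')

    # Practices 6 and 9: plain spaces only, no separators, no line breaks.
    return '<div class="tag-cloud">' + " ".join(links) + "</div>"


sample = {"font": 8, "color": 5, "size": 5, "cloud": 2}
html_alpha = render_cloud(sample)
html_weight = render_cloud(sample, order="weight")
```

Giving the hypothetical `.tag` class an em-based margin in the stylesheet would make the spacing scale with each tag's font size, which is one way to get the consistent and proportional spacing described in practice 5.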

For fun, I've run these 10 best practices through Tagcrowd. The major concepts show up well - font, color, and size are prominent - but obviously the specifics of the things discussed remain opaque.

Best Practices For Display as a Text Cloud
best_practices_textcloud.jpg

local tags: tagclouds, tagging, visualization

Smart Scoping For Content Management: Use The Content Scope Cycle

February 19, 2007 04:01 PM | Posted in: Ideas

Content management efforts are justly infamous for exceeding budgets and timelines, despite making considerable accomplishments. Exaggerated expectations for tool capabilities (vendors promise a world of automagic simplicity, but don't believe the hype) and the potential value of cost and efficiency improvements from managing content creation and distribution play a substantial part in this. But unrealistic estimates of the scope of the content to be managed make a more important contribution to most cost and time overruns.

Scope in this sense is a combination of the quantity and the quality of content; smaller amounts of very complex content substantially increase the overall scope of needs a CM solution must manage effectively. By analogy, imagine building an assembly line for toy cars, then deciding it has to handle the assembly of just a few full size automobiles at the same time.

Early and inaccurate estimates of content scope have a cascading effect, decreasing the accuracy of budgets, timelines, and resource forecasts for all the activities that follow.

In a typical content management engagement, the activities affected include:

The Root of the Problem

Two misconceptions - and two common but unhealthy practices, discussed below - drive most content scope estimates. First: the scope of content is knowable in advance. Second, and more misleading, scope remains fixed once defined. Neither of these assumptions is valid: identifying the scope of content with accuracy is unlikely without a comprehensive audit, and content scope (initial, revised, actual) changes considerably over the course of the CM effort.

Together, these assumptions make it very difficult for program directors, project managers, and business sponsors to set accurate and detailed budget and timeline expectations. The uncertain or shifting scope of most CM efforts conflicts directly with business imperatives to carefully manage IT capital investment and spending, a necessity in most funding processes, and especially at the enterprise level. Instead of estimating specific numbers long in advance of reality (as with the Iraq war budget), a better approach is to embrace fluidity, and plan to refine scope estimates at punctuated intervals, according to the natural cycle of content scope change.

Understanding the Content Scope Cycle

Content scope changes according to a predictable cycle that is largely independent of the specifics of a project, system, organizational setting, and scale. This cycle seems consistent at the level of local CM efforts for a single business unit or isolated process, and at the level of enterprise scale content management efforts. Understanding the cycle makes it possible to prepare for shifts in a qualitative sense, accounting for the kind of variation to expect while planning and setting expectations with stakeholders, solution users, sponsors, and consumers of the managed content.

The Content Scope Cycle
cm_scope_cycle.png

The high peak and elevated mountain valley shape in this illustration tell the story of scope changes through the course of most content management efforts. From the initial inaccurate estimate, scope climbs consistently and steeply during the discovery phase, peaking in potential after all discovery activities conclude. Scope then declines quickly, but not to the original level, as assessments cull unneeded content. Scope levels out during system / solution / infrastructure creation, and climbs modestly during revision and replacement activities. At this point, the actual scope is known. Measured increases driven by the incorporation of supplemental material then increase scope in stages.

Local and Enterprise Cycles

Applying the context-independent view of the cycle to a local level reveals a close match with the activities and milestones for a content management effort for a small body of content, a single business unit of a larger organization, or a self-contained business process.

Local Content Management Scope Cycle
cm_scope_local.png

At the enterprise level, the cycle is the same. This illustration shows activities and milestones for a content management effort for a large and diverse body of content, multiple business units of a larger organization, or multiple and interconnected business processes.

Enterprise Content Management Scope Cycle
cm_enterprise_cycle.png

Scope Cycle Changes
cm_scope_changes.png

This graph shows the amount of scope change at each milestone, versus its predecessor. Looking at the changes for any patterns of clustering and frequency, it's easy to see the cycle breaks down into three major phases: an initial period of dynamic instability, a static and stable phase, and a concluding (and ongoing, if the effort is successful) phase of dynamic stability.
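
The change-versus-predecessor calculation behind a graph like this is simple arithmetic. In the Python sketch below, the milestone labels and the scope figures (relative units, with the initial estimate set to 100) are invented placeholders that only follow the general shape of the cycle described earlier; they are not data from any project.

```python
# Hypothetical scope values in relative units (initial estimate = 100).
# These numbers are made up purely to illustrate the calculation.
milestones = [
    ("initial estimate", 100),
    ("discovery complete", 260),
    ("assessment complete", 150),
    ("solution built", 150),
    ("revision and replacement", 165),
    ("supplemental material added", 175),
]


def scope_changes(points):
    """Return (label, change) pairs: each milestone's scope minus its predecessor's."""
    return [
        (label, value - prev_value)
        for (_, prev_value), (label, value) in zip(points, points[1:])
    ]


changes = scope_changes(milestones)
for label, delta in changes:
    print(f"{label:30s} {delta:+d}")
```

Large swings in either direction mark the unstable early phase, a zero change marks the static phase, and small positive steps mark the ongoing phase of dynamic stability.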

Scope Cycle Phases
cm_scope_phases.png

Where does the extra scope come from? In other words, what's the source of the unexpected quantity and complexity of content behind the spikes and drops in expected scope in the first two phases? And what drives the shifts from one phase to another?

Bad CM Habits

Two common approaches account for a majority of the dramatic shifts in content scope. Most significantly, those people with immediate knowledge of the content quantity and complexity rarely have a direct voice in setting the scope and timeline expectations. Too often, stakeholders with expertise in other areas (IT, enterprise architecture, application development) frame the problem and the solution far in advance. The content creators, publishers, distributors, and consumers are not involved early enough.

Secondly, those who frame the problem make assumptions about quantity and complexity that trend low. (This is the companion to the exaggeration of tool capabilities.) Each new business unit, content owner, and system administrator whose items are included in the effort will increase the scope of the content in quantity, complexity, or both. Ongoing identification of new or unknown types of content, workflows, business rules, usage contexts, storage modes, applications, formats, syndication instances, systems, and repositories will continue to increase the scope until all relevant parties (creators, consumers, administrators, etc.) are engaged, and their needs and content collections fully understood.

The result is clear: a series of substantial scope errors, both under- and over-estimation in comparison to the actual scope, concentrated in the first phase of the scope cycle.

Scope Errors
cm_scope_error.png

Smart Scoping

The scope cycle seems to be a fundamental pattern; likely an emergent aspect of the environments and systems underlying it, but that's another discussion entirely. Failing to allow for the natural changes in scope over the course of a content management effort ties your success to inaccurate estimates, and thus to false expectations.

Smart scoping means allowing for and anticipating the inherent margins of error when setting expectations and making estimates. The most straightforward way to put this into practice and account for the likely margins of error is to adjust the timing of a scope estimate to the necessary level of accuracy.

Relative Scope Estimate Accuracy
cm_estimate_accuracy.png


Scoping and Budgeting

Estimation practices that respond to the content scope cycle can still satisfy business needs. At the enterprise CM level, IT spending plans and investment frameworks (often part of enterprise architecture planning processes) should allow for natural cycles by defining classes or kinds of estimates based on comparative degree of accuracy, and the estimator's leeway for meeting or exceeding implied commitments. Enterprise frameworks will identify when more or less accurate estimates are needed to move through funding and approval gateways, based on each organization's investment practices.

And at the local CM level, project planning and resource forecasting methods should allow for incremental allocation of resources to meet task and activity needs. Taking a content inventory is a substantial labor on its own, for example. The same is true of migrating a body of content from one or more sources to a new CM solution that incorporates changed content structures such as workflows and information architectures. The architectural, technical, and organizational capabilities and staff needed for inventorying and migrating content can often be met by relying on content owners and stakeholders, or hiring contractors for short and medium-term assistance.

Parallels To CM Spending Patterns

The content scope cycle strongly parallels the spending patterns during CMS implementation James Robertson identified in June of 2005. I think the scope cycle correlates with the spending pattern James found, and it may even be a driving factor.

Scoping and Maturity

Unrealistic scope estimation that does not take the content scope cycle into account is typical of organizations undertaking a first content management effort. It is also common in organizations with content management experience, but low levels of content management maturity.

Two (informal) surveys of CMS practitioners spanning the past three years show the prevalence of scoping problems. In 2004, Victor Lombardi reported: "Of all tasks in a content management project, the creation, editing, and migration of content are probably the most frequently underestimated on the project plan." [in Managing the Complexity of Content Management].

And two weeks ago, Rita Warren of CMSWire shared the results of a recent survey on challenges in content management (Things That Go Bump In Your CMS).


The top 5 challenges (most often ranked #1) were:

  1. Clarifying business goals
  2. Gaining and maintaining executive support
  3. Redesigning/optimizing business processes
  4. Gaining consensus among stakeholders
  5. Properly scoping the project

..."Properly scoping the project" was actually the most popular answer, showing up in the top 5 most often.

Accurate scoping is much easier for organizations with high levels of content management maturity. As the error margins inherent in early and inaccurate scope estimates demonstrate, there is considerable benefit in creating mechanisms and tools for effectively understanding the quantity and quality of content requiring management, as well as the larger business context, solution governance, and organizational culture concerns.

local tags: content_management, cycles, ecm, enterprise, project_management

PEW Report Shows 28% Of Internet Users Have Tagged

February 1, 2007 02:30 PM | Posted in: Tag Clouds

The Pew Internet & American Life Project just released a report on tagging that finds 28% of internet users have tagged or categorized content online such as photos, news stories or blog posts. On a typical day online, 7% of internet users say they tag or categorize online content.

The authors note "This is the first time the Project has asked about tagging, so it is not clear exactly how fast the trend is growing."

Wow - I'd say it's growing extremely quickly. Though I am on record as a believer in the bright future of tag clouds, I admit I'm surprised by these results. The fact that 7% of internet users tag daily is what's most significant: it's an indication of very rapid adoption for the practice of tagging in many different contexts and among many different kinds of audiences, given its brief history.

I'd guess this adoption rate compares to the rates of adoption for other new network-dependent or emergent architectures like P2P music sharing or on-line music buying.

You're correct if you're thinking there is a difference between tagging and tag clouds. And if you've read the report and the accompanying interview with Dr. Weinberger, you've likely realized that neither Dr. Weinberger's interview nor the report specifically addresses tag cloud usage. But remember the First Principle of Tag Clouds: "Where there's tags, there's a tag cloud." By definition, any item with an associated collection of tags has a tag cloud, regardless of whether that tag cloud is directly visible and actionable in the user experience. So that 7% of internet users who tag daily are by default creating and working with tag clouds daily.

It might be time for tag clouds to look into getting some sunglasses.

local tags: pew, social_systems, tagclouds, tagging

©2008 by Joe Lamantia :: joe [at] joelamantia.com