Archive for June 2014

Empirical Discovery: Concept and Workflow Model

June 20th, 2014 — 12:53pm

Concept models are a powerful tool for articulating the essential elements and relationships that define new or complex things we need to understand. We’ve previously defined empirical discovery as a new method, looking at antecedents, and also comparing and contrasting the distinctive characteristics of Empirical Discovery with other knowledge creation and insight seeking methods. I’m now sharing our concept model of Empirical Discovery, which identifies the most important actors, activities, and outcomes of empirical discovery efforts, to complement the written definition by illustrating how the method works in practice.

Empir­i­cal dis­cov­ery con­cept model from Joe Laman­tia

In this model, we illus­trate the activ­i­ties of the three kinds of peo­ple most cen­tral to dis­cov­ery efforts: Insight Con­sumers, Data Sci­en­tists, and Data Engi­neers.  We have robust def­i­n­i­tions of all the major actors involved in dis­cov­ery (used to drive prod­uct devel­op­ment), and may share some of these var­i­ous per­sonas, pro­files, and snap­shots sub­se­quently.  For read­ing this model, under­stand Insight Con­sumers as the peo­ple who rely on insights from dis­cov­ery efforts to effect and man­age the oper­a­tions of the busi­ness.  Data Sci­en­tists are the sense­mak­ers who achieve insights, and cre­ate data prod­ucts, and ana­lyt­i­cal mod­els through dis­cov­ery efforts.  Data Engi­neers enable dis­cov­ery efforts by build­ing the enter­prise data analy­sis infra­struc­ture nec­es­sary for dis­cov­ery, and often imple­ment the out­comes of empir­i­cal dis­cov­ery by build­ing new tools based on the insights and mod­els Data Sci­en­tists create.

A key assump­tion of this model is that dis­cov­ery is by def­i­n­i­tion an iter­a­tive and serendip­i­tous method, rely­ing on fre­quent back-steps and unpre­dictable rep­e­ti­tion of activ­i­ties as a nec­es­sary aspect of how dis­cov­ery efforts unfold.  This model also assumes the data, meth­ods, and tools shift dur­ing dis­cov­ery efforts, in keep­ing with the evo­lu­tion of moti­vat­ing ques­tions, and the achieve­ment of interim out­comes.  Sim­i­larly, dis­cov­ery efforts do not always involve all of these elements.

To keep the essen­tial struc­ture and rela­tion­ships between ele­ments clear and in the fore­ground, we have not shown all of the pos­si­ble iter­a­tive loops or repeated steps.  Some closely related con­cepts are grouped together, to allow read­ing the model on two lev­els of detail.

For a simplified view, follow the links between named actors and groups of concepts shown with colored backgrounds and labels. In this reading, an Insight Consumer articulates questions to a Data Scientist, who combines domain knowledge with the Empirical Discovery Method (yellow) to direct the application of Analytical Tools (blue) and Models (salmon) to Data Sets (green) drawn from Data Sources (magenta). The Data Scientist shares Insights resulting from discovery efforts with the Insight Consumer, while Data Engineers may implement the models or data products created by the Data Scientist by turning them into tools and infrastructure for the rest of the business. For a more detailed view of the specific concepts and activities common to empirical discovery efforts, follow the links between the individual concepts within these named groups. (Note: there are two kinds of connections: solid arrows indicating definite relationships and, for the Data Sets and Models groups, dashed arrows indicating possible paths of evolution. More on this to follow.)

Another way to interpret the two levels of detail in this model is as descriptions of formal vs. informal implementations of the empirical discovery method. People and organizations who take a more formal approach to empirical discovery may require explicitly defined artifacts and activities that address each major concept, such as predictions and experimental results. In less formal approaches, Data Scientists may implicitly address each of the major concepts and activities, such as framing hypotheses, or tracking the states of data sets they are working with, without any formal artifact or decision gateway. This situational flexibility follows from the applied nature of the empirical discovery method, which does not require scientific standards of proof and reproducibility to generate valued outcomes.

The story begins in the upper right corner, when an Insight Consumer articulates a belief or question to a Data Scientist, who then translates this motivating statement into a planned discovery effort that addresses the business goal. The Data Scientist applies the Empirical Discovery Method (concepts in yellow), possibly generating a hypothesis and accompanying predictions to be tested by experiments, choosing data from the range of available data sources (grouped in magenta), and selecting initial analytical methods consistent with the domain, the data sets (green), and the analytical or reference models (salmon) they will work with. Given the particulars of the data and the analytical methods, the Data Scientist employs specific analytical tools (blue) such as algorithms and statistical or other measures, based on factors such as expected accuracy, and speed or ease of use. As the effort progresses through iterations, or insights emerge, experiments may be added or revised, based on the conclusions the Data Scientist draws from the results and their impact on starting predictions or hypotheses.

For example, consider an Insight Consumer who works in a product management capacity for an online social network, with a business goal of increasing users’ level of engagement with the service. This Insight Consumer wishes to identify opportunities to recommend that users establish new connections with other similar and possibly known users, based on unrecognized affinities in their posted profiles. The Data Scientist translates this business goal into a series of experiments investigating predictions about which aspects of user profiles more effectively predict the likelihood of creating new connections in response to system-generated recommendations for similarity. The Data Scientist frames experiments that rely on data from the accumulated logs of user activities within the network that have been anonymized to comply with privacy policies, selecting specific working sets of data to analyze based on awareness of the shape and nature of the attributes that appear directly in users’ profiles, both across the entire network and among pools of similar but unconnected users. The Data Scientist plans to begin with analytical methods useful for predictive modeling of the effectiveness of recommender systems in network contexts, such as measurements of the affinity of users’ interests based on semantic analysis of social objects shared by users within this network and also publicly in other online media, and also structural or topological measures of relative position and distance from the field of network science. The Data Scientist chooses a set of standard social network analysis algorithms and measures, combined with custom models for interpreting user activity and interest unique to this network. The Data Scientist has predefined scripts and open source libraries (MLlib, Gephi, Weka, Pandas, etc.) available for ready application to data in the form of Analytical Tools, which she will combine in sequences according to the desired analytical flow for each experiment.
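As a rough illustration of the kind of affinity measurement described in this example, the sketch below scores pairs of unconnected users by the overlap of their profile attributes. The use of Jaccard similarity, the function names, the threshold, and the sample profiles are all assumptions made for illustration; the example above does not name a specific measure.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of attributes two profiles have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(profiles, connections, threshold=0.5):
    """Return unconnected user pairs whose profile affinity meets the threshold."""
    connected = {frozenset(pair) for pair in connections}
    results = []
    for u, v in combinations(sorted(profiles), 2):
        if frozenset((u, v)) in connected:
            continue  # already connected, nothing to recommend
        score = jaccard(profiles[u], profiles[v])
        if score >= threshold:
            results.append((u, v, score))
    # Strongest affinities first
    return sorted(results, key=lambda r: -r[2])


# Hypothetical, anonymized profile attributes (illustrative data only).
profiles = {
    "u1": {"cycling", "jazz", "python"},
    "u2": {"cycling", "jazz", "cooking"},
    "u3": {"chess", "opera"},
    "u4": {"cycling", "jazz", "python", "hiking"},
}
connections = [("u1", "u2")]

pairs = candidate_pairs(profiles, connections)
print(pairs)
```

In a real effort this score would be only one candidate predictor among several, to be tested against whether users actually accept the recommendations.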

The nature of analytical engagement with data sets varies during the course of discovery efforts, with different types of data sets playing different roles at specific stages of the discovery workflow. Our concept map simplifies the lifecycle of data for purposes of description, identifying five distinct and recognizable ways data are used by the Data Scientist, with five corresponding types of data sets. In some cases, formal criteria on data quality, completeness, accuracy, and content govern which stage of the data lifecycle any given data set is at. In most discovery efforts, however, Data Scientists themselves make a series of judgements about when and how the data in hand is suitable for use. The dashed arrows linking the five types of data sets capture the approximate and conditional nature of these different stages of evolution. In practice, discovery efforts begin with exploration of data that may or may not be relevant for focused analysis, but which requires some direct engagement and attention to rule in or out of consideration. Focused analytical investigation of the relevant data follows, made possible by the iterative addition, refinement and transformation (wrangling; more on this in later posts) of the exploratory data in hand. At this stage, the Data Scientist applies analytical tools identified by their chosen analytical method. The model building stage seeks to create explicit, formal, and reusable models that articulate the patterns and structures found during investigation. When validation of newly created analytical models is necessary, the Data Scientist uses appropriate data, typically data that was not part of explicit model creation. Finally, training data is sometimes necessary to put models into production, either using them for further steps in analytical workflows (which can be very complex), or in business operations outside the analytical context.
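The point that validation typically relies on data that was not part of model creation can be sketched as a simple holdout split. This is a minimal illustration under assumed names and an assumed 25% validation fraction, not a description of how any particular team partitions its data.

```python
import random


def holdout_split(records, validation_fraction=0.25, seed=0):
    """Shuffle records and set aside a fraction that model building never sees."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical record ids standing in for rows of a working data set.
records = list(range(20))
build, validate = holdout_split(records)

print(len(build), "records for model building")
print(len(validate), "records held back for validation")
```

Keeping the two sets disjoint is what lets the validation step say something about the model rather than about the data it was fitted to.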

Because so much discovery activity requires transformation of the data before or during analysis, there is great interest in the Data Science and business analytics industries in how Data Scientists and sensemakers work with data at these various stages. Much of this attention focuses on the need for better tools for transforming data in order to make analysis possible. This model does not explicitly represent wrangling as an activity, because it is not directly a part of the empirical discovery method; transformation is done only as and when needed to make analysis possible. However, understanding the nature of wrangling and transformation activities is a very important topic for grasping discovery, so I’ll address it in later postings. (We have a good model for this too…)

Empir­i­cal dis­cov­ery efforts aim to cre­ate one or more of the three types of out­comes shown in orange: insights, mod­els, and data prod­ucts.  Insights, as we’ve defined them pre­vi­ously, are dis­cov­er­ies that change people’s per­spec­tive or under­stand­ing, not sim­ply the results of ana­lyt­i­cal activ­ity, such as the end val­ues of ana­lyt­i­cal cal­cu­la­tions, the gen­er­a­tion of reports, or the retrieval and aggre­ga­tion of stored information.

One of the most valuable outcomes of discovery efforts is the creation of externalized models that describe behavior, structure or relationships in clear and quantified terms. The models that result from empirical discovery efforts can take many forms (google ‘predictive model’ for a sense of the tremendous variation in what people active in business analytics consider to be a useful model), but their defining characteristic is that a model always describes aspects of a subject of discovery and analysis that are not directly present in the data itself. For example, if given the node and edge data identifying all of the connections between people in the social network above, one possible model resulting from analysis of the network structure is a descriptive readout of the topology of the network as scale-free, with some set of subgraphs, a range of node centrality values, a matrix of possible shortest paths between nodes or subgraphs, etc. It is possible to make sense of, interpret, or circulate a model independently of the data it describes and is derived from.
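To make the distinction between data and model concrete, the sketch below derives two of the descriptive quantities mentioned above, node centrality values and shortest path lengths, from nothing but an edge list. The edge data and function names are hypothetical, and degree centrality is only one of many centrality measures a Data Scientist might choose.

```python
from collections import deque


def build_adjacency(edges):
    """Turn undirected edge pairs into a node -> neighbours mapping."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    if n <= 1:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


# Hypothetical connection data for five people.
edges = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("dee", "eli")]
adj = build_adjacency(edges)

# The "model": quantities that appear nowhere in the edge list itself.
model = {
    "centrality": degree_centrality(adj),
    "path_lengths": {node: shortest_path_lengths(adj, node) for node in adj},
}
print(model["centrality"])
```

The resulting dictionary can be read, compared, or passed along without the original edge list, which is the sense in which a model circulates independently of its data.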

Data Scientists also engage with models in distinct and recognizable ways during discovery efforts. Reference models, determined by the domain of investigation, often guide exploratory analysis of discovery subjects by providing Data Scientists with general explanations and quantifications for processes and relationships common to the domain. And the models generated as insight and understanding accumulate during discovery evolve in stages from initial articulation through validation to readiness for production implementation, which means being put into effect directly on the operations of the business.

Data products are best understood as ‘packages’ of data which have utility for other analytical or business purposes, such as a list of users in the social network who will form new connections in response to system-generated suggestions of other similar users. Data products are not literally finished products that the business offers for external sale or consumption. And as background, we assume operationalization or ‘implementation’ of the outcomes of empirical discovery efforts to change the functioning of the business is the goal of different business processes, such as product development. While empirical discovery focuses on achieving understanding, rather than making things, this is not the only thing Data Scientists do for the business. The classic definition of Data Science, as aimed at creating new products based on data which impact the business, is a broad mandate, and many of the position descriptions for data science jobs require participation in product development efforts.

Two or more kinds of out­comes are often bun­dled together as the results of a gen­uinely suc­cess­ful dis­cov­ery effort; for exam­ple, an insight that two appar­ently uncon­nected busi­ness processes are in fact related through mutual feed­back loops, and a model explic­itly describ­ing and quan­ti­fy­ing the nature of the rela­tion­ships as dis­cov­ered through analysis.

There’s more to the story, but as one trip through the essen­tial ele­ments of empir­i­cal dis­cov­ery, this is a log­i­cal point to pause and ask what might be miss­ing from this model? And how can it be improved?



The Sensemaking Spectrum for Business Analytics: Translating from Data to Business Through Analysis

June 10th, 2014 — 8:33am

One of the most com­pelling out­comes of our strate­gic research efforts over the past sev­eral years is a grow­ing vocab­u­lary that artic­u­lates our cumu­la­tive under­stand­ing of the deep struc­ture of the domains of dis­cov­ery and busi­ness analytics.

Modes are one example of the deep structure we’ve found. After looking at discovery activities across a very wide range of industries, question types, business needs, and problem solving approaches, we’ve identified distinct and recurring kinds of sensemaking activity, independent of context. We label these activities Modes; Explore, Compare, and Comprehend are three of the nine recognizable modes. Modes describe *how* people go about realizing insights. (Read more about the programmatic research and formal academic grounding and discussion of the modes here: ) By analogy to languages, modes are the ‘verbs’ of discovery activity. When applied to the practical questions of product strategy and development, the modes of discovery allow one to identify what kinds of analytical activity a product, platform, or solution needs to support across a spread of usage scenarios, and then make concrete and well-informed decisions about every aspect of the solution, from high-level capabilities, to which specific types of information visualizations better enable these scenarios for the types of data users will analyze.

The modes are a powerful generative tool for product making, but if you’ve spent time with young children, or had a really bad hangover (or both at the same time…), you understand the difficulty of communicating using only verbs.

So I’m happy to share that we’ve found trac­tion on another facet of the deep struc­ture of dis­cov­ery and busi­ness ana­lyt­ics.  Con­tin­u­ing the lan­guage anal­ogy, we’ve iden­ti­fied some of the ‘nouns’ in the lan­guage of dis­cov­ery: specif­i­cally, the con­sis­tently recur­ring aspects of a busi­ness that peo­ple are look­ing for insight into.  We call these dis­cov­ery Sub­jects, since they iden­tify *what* peo­ple focus on dur­ing dis­cov­ery efforts, rather than *how* they go about dis­cov­ery as with the Modes.

Defining the collection of Subjects people repeatedly focus on allows us to understand and articulate sensemaking needs and activity in a more specific, consistent, and complete fashion. In combination with the Modes, we can use Subjects to concretely identify and define scenarios that describe people’s analytical needs and goals. For example, a scenario such as ‘Explore [a Mode] the attrition rates [a Measure, one type of Subject] of our largest customers [Entities, another type of Subject]’ clearly captures the nature of the activity (exploration of trends vs. deep analysis of underlying factors) and the central focus (attrition rates for customers above a certain set of size criteria), from which follow many of the specifics needed to address this scenario in terms of data, analytical tools, and methods.
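One way to see how a Mode and a set of Subjects combine into a scenario is to write the combination down as a small data structure. The class and field names below are hypothetical, and only the three Modes named in this post are listed; the sketch is an aid to reading, not a schema we use.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    # Only the three Modes named in this post; the full set has nine.
    EXPLORE = "Explore"
    COMPARE = "Compare"
    COMPREHEND = "Comprehend"


@dataclass(frozen=True)
class Subject:
    kind: str   # the type of Subject, e.g. "Measure" or "Entities"
    label: str  # the specific thing people want insight into


@dataclass(frozen=True)
class Scenario:
    mode: Mode
    subjects: tuple

    def describe(self):
        """Render the scenario as an annotated sentence."""
        parts = " of ".join(f"{s.label} [{s.kind}]" for s in self.subjects)
        return f"{self.mode.value} [Mode] the {parts}"


scenario = Scenario(
    mode=Mode.EXPLORE,
    subjects=(
        Subject("Measure", "attrition rates"),
        Subject("Entities", "our largest customers"),
    ),
)
print(scenario.describe())
```

Writing scenarios this way makes it easy to enumerate which Mode and Subject combinations a product needs to support.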

We can also use Subjects to translate effectively between the different perspectives that shape discovery efforts, reducing ambiguity and increasing impact on both sides of the perspective divide. One such translation runs from the language of business, which often motivates analytical work by asking questions in business terms, to the perspective of analysis. The question posed to a Data Scientist or analyst may be something like “Why are sales of our new kinds of potato chips to our largest customers fluctuating unexpectedly this year?” or “Where can we innovate, by expanding our product portfolio to meet unmet needs?”. Analysts translate questions and beliefs like these into one or more empirical discovery efforts that more formally and granularly indicate the plan, methods, tools, and desired outcomes of analysis. From the perspective of analysis this second question might become, “Which customer needs of type ‘A’, identified and measured in terms of ‘B’, that are not directly or indirectly addressed by any of our current products, offer ‘X’ potential for ‘Y’ positive return on the investment ‘Z’ required to launch a new offering, in time frame ‘W’? And how do these compare to each other?”. Translation also happens from the perspective of analysis to the perspective of data, in terms of availability, quality, completeness, format, volume, etc.

By impli­ca­tion, we are propos­ing that most work­ing orga­ni­za­tions — small and large, for profit and non-profit, domes­tic and inter­na­tional, and in the major­ity of indus­tries — can be described for ana­lyt­i­cal pur­poses using this col­lec­tion of Sub­jects.  This is a bold claim, but sim­pli­fied artic­u­la­tion of com­plex­ity is one of the pri­mary goals of sense­mak­ing frame­works such as this one.  (And, yes, this is in fact a frame­work for mak­ing sense of sense­mak­ing as a cat­e­gory of activ­ity — but we’re not con­sid­er­ing the recur­sive aspects of this exer­cise at the moment.)

Compellingly, we can place the collection of Subjects on a single continuum (we call it the Sensemaking Spectrum) that simply and coherently illustrates some of the most important relationships between the different types of Subjects, and also illuminates several of the fundamental dynamics shaping business analytics as a domain. As a corollary, the Sensemaking Spectrum also suggests innovation opportunities for products and services related to business analytics.

The first illus­tra­tion below shows Sub­jects arrayed along the Sense­mak­ing Spec­trum; the sec­ond illus­tra­tion presents exam­ples of each kind of Sub­ject.  Sub­jects appear in col­ors rang­ing from blue to reddish-orange, reflect­ing their place along the Spec­trum, which indi­cates whether a Sub­ject addresses more the view­point of sys­tems and data (Data cen­tric and blue), or peo­ple (User cen­tric and orange).  This axis is shown explic­itly above the Spec­trum.  Anno­ta­tions sug­gest how Sub­jects align with the three sig­nif­i­cant per­spec­tives of Data, Analy­sis, and Busi­ness that shape busi­ness ana­lyt­ics activ­ity.  This ren­der­ing makes explicit the trans­la­tion and bridg­ing func­tion of Ana­lysts as a role, and analy­sis as an activity.


Subjects are best understood as fuzzy categories, rather than tightly defined buckets. For each Subject, we suggest some of the most common examples: Entities may be physical things such as named products, or locations (a building, or a city); they could be Concepts, such as satisfaction; or they could be Relationships between entities, such as the variety of possible connections that define linkage in social networks. Likewise, Events may indicate a time and place in the dictionary sense; or they may be Transactions involving named entities; or take the form of Signals, such as ‘some Measure had some value at some time’, which many enterprises understand as alerts.

The central story of the Spectrum is that though consumers of analytical insights (represented here by the Business perspective) need to work in terms of Subjects that are directly meaningful to their perspective, such as Themes, Plans, and Goals, the working realities of data (condition, structure, availability, completeness, cost) and the changing nature of most discovery efforts make direct engagement with source data in this fashion impossible. Accordingly, business analytics as a domain is structured around the fundamental assumption that sensemaking depends on analytical transformation of data. Analytical activity incrementally synthesizes more complex and larger scope Subjects from data in its starting condition, accumulating insight (and value) by moving through a progression of stages in which increasingly meaningful Subjects are iteratively synthesized from the data, and recombined with other Subjects. The end goal of ‘laddering’ successive transformations is to enable sensemaking from the business perspective, rather than the analytical perspective.

Syn­the­sis through lad­der­ing is typ­i­cally accom­plished by spe­cial­ized Ana­lysts using ded­i­cated tools and meth­ods. Begin­ning with some moti­vat­ing ques­tion such as seek­ing oppor­tu­ni­ties to increase the effi­ciency (a Theme) of ful­fill­ment processes to reach some level of prof­itabil­ity by the end of the year (Plan), Ana­lysts will iter­a­tively wran­gle and trans­form source data Records, Val­ues and Attrib­utes into rec­og­niz­able Enti­ties, such as Prod­ucts, that can be com­bined with Mea­sures or other data into the Events (ship­ment of orders) that indi­cate the work­ings of the business.
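The laddering described here can be sketched as a chain of small transformations, each producing a more meaningful Subject from the previous one. Everything in the sketch is hypothetical: the field names, the sample rows, and the choice of units shipped as the Measure.

```python
from collections import defaultdict

# Hypothetical source Records: flat rows of Attributes and Values.
records = [
    {"order_id": 1, "sku": "A-100", "name": "Widget", "qty": 2, "shipped": "2014-06-02"},
    {"order_id": 2, "sku": "A-100", "name": "Widget", "qty": 1, "shipped": "2014-06-03"},
    {"order_id": 3, "sku": "B-200", "name": "Gadget", "qty": 5, "shipped": None},
]


def to_entities(rows):
    """Rung 1: synthesize Product Entities from repeated sku/name Values."""
    return {row["sku"]: {"sku": row["sku"], "name": row["name"]} for row in rows}


def to_events(rows, products):
    """Rung 2: combine Entities with a Measure (qty) into shipment Events."""
    return [
        {
            "event": "shipment",
            "product": products[row["sku"]]["name"],
            "units": row["qty"],
            "date": row["shipped"],
        }
        for row in rows
        if row["shipped"] is not None  # unshipped orders are not yet Events
    ]


def units_shipped(events):
    """Rung 3: roll Events up into a Measure per product."""
    totals = defaultdict(int)
    for event in events:
        totals[event["product"]] += event["units"]
    return dict(totals)


products = to_entities(records)
events = to_events(records, products)
totals = units_shipped(events)
print(totals)
```

Each rung discards detail that matters only at the data perspective and keeps what the next perspective needs, which is the translation work the Spectrum attributes to Analysts.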

More complex Subjects (to the right of the Spectrum) are composed of or make reference to less complex Subjects: a business Process such as Fulfillment will include Activities such as confirming, packing, and then shipping orders. These Activities occur within or are conducted by organizational units such as teams of staff or partner firms (Networks), composed of Entities which are structured via Relationships, such as supplier and buyer. The fulfillment process will involve other types of Entities, such as the products or services the business provides. The success of the fulfillment process overall may be judged according to a sophisticated operating efficiency Model, which includes tiered Measures of business activity and health for the transactions and activities included. All of this may be interpreted through an understanding of the operational domain of the business’s supply chain (a Domain).

We’ll dis­cuss the Spec­trum in more depth in suc­ceed­ing posts.

