Artificially Intelligent Specification and Analysis of Context-Dependent Attribute Preferences

2 The competitive set

The data sets used to test the models reported in this paper cover at least 65 and up to nearly 200 brands of alcoholic beverage. The individual brands are types of whisky (Scotch, Irish, Canadian, Bourbon, etc.), white spirits (gin and various sorts of vodka), fortified wines, brandies and liqueurs. Choosing any one brand in such a data set, it is by no means obvious which of the other brands are in its set of competitors. A Scotch whisky, for example, could compete with brandies, other types of whisky or even liqueurs and fortified wines in different contexts.

Using algorithms driven by a knowledge-based system as reported by [3], we inferred from data on prices and sales volumes for all brands in the data set the chief price-competitors of the brand we chose as the focus of each simulation run with the CDAP model. These algorithms have been developed and integrated ad hoc and the demonstration of their formal properties is reserved for further research. Nonetheless, they do guide and inform the specification of the competitive sets by the marketing professionals and, so, we use them here in full recognition of their possible formal weaknesses. Our justification is that the knowledge-base describes the actual (though ad hoc) procedures used on actual data sets to inform the development of marketing strategies in earnest.

The determination of the competitive set of a focus brand proceeds in three stages:

1. the determination of a plausible superset of the competitive set by linear regression of market share of the focus brand on some transform of the price of each of the other brands for which data is held as well as regressing the shares of the other brands on the same price variable of the focus brand;*1

2. the elimination of some brands from that superset by a set of multiple regressions developed from the AIDS algorithm but without the symmetry restriction mentioned above;

3. some further elimination of brands from the competitive set together with analysis of the changes in competitive structures over the data period based on a non-linear (local-regression-based) generalization of the second stage.

Because the number of brands is often large in relation to the number of observations (up to nearly 200 brands with no more than 200 observations), there are insufficient degrees of freedom to begin with the second or third stages, which involve multiple regressions over all other brands.

Effectively, the first cut at the competitive set was to include all brands for which all of the relevant regression coefficients were of the correct sign with t-statistics greater than 5 in magnitude. The relevant coefficients were the OLS coefficients on price with focus-brand value share as the dependent variable and, where these were of appropriate sign and magnitude, the coefficient on the focus brand price with the competing brand's value share as the dependent variable. The point here is to ensure that the competition goes both ways even if the cross-price elasticities are not symmetrical.
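The two-way screening just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the knowledge-based procedure of [3]: the function names `simple_ols` and `first_cut`, the use of a log transform of price, the reading of "correct sign" as a positive cross-price coefficient, and the synthetic data are all ours.

```python
import math

def simple_ols(x, y):
    """Slope and its t-statistic for the regression y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    t = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, t

def first_cut(focus, shares, log_prices, t_min=5.0):
    """Keep brands whose price raises the focus share and whose own share
    rises with the focus price, both with t-statistics above t_min."""
    kept = []
    for brand in shares:
        if brand == focus:
            continue
        # Focus-brand share regressed on the other brand's price.
        b1, t1 = simple_ols(log_prices[brand], shares[focus])
        if not (b1 > 0 and t1 > t_min):
            continue
        # Only then: the other brand's share regressed on the focus price.
        b2, t2 = simple_ols(log_prices[focus], shares[brand])
        if b2 > 0 and t2 > t_min:
            kept.append(brand)
    return kept

# Synthetic illustration only: three mutually orthogonal price patterns.
def pattern(half, n=24):
    return [1.0 if (i // half) % 2 == 0 else -1.0 for i in range(n)]

A, B, C = pattern(1), pattern(2), pattern(4)
log_prices = {
    "focus": [0.1 * a for a in A],
    "rival": [0.1 * b for b in B],
    "mixer": [0.1 * c for c in C],
}
shares = {
    "focus": [0.30 + 0.04 * b - 0.01 * a - 0.02 * c for a, b, c in zip(A, B, C)],
    "rival": [0.25 + 0.04 * a - 0.01 * b for a, b in zip(A, B)],
    "mixer": [0.10 - 0.02 * a for a in A],
}
kept = first_cut("focus", shares, log_prices)
print(kept)
```

The second regression is run only for brands that pass the first, which mirrors the "where these were of appropriate sign and magnitude" condition in the text.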

The second cut used similar criteria but in an OLS regression of the value share of the focus brand against a log transform of all of the price variables in the first-cut competitive set. Brands were winnowed out of this set one at a time either because in the regressions their coefficients had the wrong sign (indicating that they were complements rather than competitors) or, if all coefficients were of the appropriate sign, because their coefficients were the least significant.
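A sketch of this one-at-a-time winnowing is given below. The stopping rule (`target_size`), the choice of which wrong-signed brand to drop first when there are several, and all names and data are our assumptions for illustration; the source does not specify them.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a]

def ols(columns, y):
    """OLS of y on an intercept plus columns; returns slopes and t-statistics."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(row[p] * row[q] for row in X) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[p][q] * xty[q] for q in range(k)) for p in range(k)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    s2 = rss / (n - k)
    t = [beta[p] / math.sqrt(s2 * inv[p][p]) for p in range(k)]
    return beta[1:], t[1:]

def second_cut(focus_share, log_prices, target_size):
    """Drop one brand per round: wrong-signed brands first, then the least
    significant, until target_size brands with positive coefficients remain."""
    kept = list(log_prices)
    while kept:
        coefs, t = ols([log_prices[b] for b in kept], focus_share)
        wrong = [i for i, c in enumerate(coefs) if c <= 0]
        if wrong:
            drop = min(wrong, key=lambda i: t[i])   # assumption: most negative t first
        elif len(kept) > target_size:
            drop = min(range(len(kept)), key=lambda i: t[i])
        else:
            break
        kept.pop(drop)
    return kept

# Synthetic illustration only: orthogonal price patterns plus a small disturbance.
def pattern(half, n=32):
    return [1.0 if (i // half) % 2 == 0 else -1.0 for i in range(n)]

P1, P2, P3, P4 = pattern(1), pattern(2), pattern(4), pattern(8)
log_prices = {
    "strong": [0.1 * v for v in P1],
    "medium": [0.1 * v for v in P2],
    "weak": [0.1 * v for v in P3],
    "mixer": [0.1 * v for v in P4],
}
focus_share = [0.30 + 0.04 * a + 0.02 * b + 0.004 * c - 0.03 * d + 0.005 * a * b
               for a, b, c, d in zip(P1, P2, P3, P4)]
kept = second_cut(focus_share, log_prices, target_size=2)
print(kept)
```

The model is refitted after every removal, because dropping one price variable changes the coefficients and t-statistics of those that remain.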

At this stage, it is usual for some surprising brands to be left in the competitive set. If the set has been cut down too far, some brands that the domain experts would expect to be in the competitive set are left out. Since we want to be sure that no actual competitors are left out of further consideration, we typically make the second-cut competitive set rather larger than the size of the set we intend to end up with. In general, the marketing professionals are interested in the half-dozen or so most important competitors. If we leave 15 to 20 brands in the second-cut competitive set, then they have some confidence that a set of that size will include all of the most important six to eight competitors.

The reason that inappropriate brands are left in the second-cut competitive set is that some ephemeral strategy has brought them temporarily into competition with the focus brand. Usually, this will be because of some special offer which increases their sales while the offer remains in force but does not lead to a long-term increase in market share. The problem here is that the assumption of a linear relation between market share and competitors' prices may yield a spurious result in which a few large and systematic fluctuations in volumes and prices are averaged out over all observations and make the constant coefficients and t-statistics larger and apparently more significant than would be the case without those few fluctuations.

To identify both the brands that belong in the competitive set but might be missed by linear regressions and their interpretation, and the brands that linear regressions capture but that do not belong, we employ non-linear, local regression. This procedure produces a regression coefficient for each observation and each regressor.*2
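The idea of one coefficient per observation can be illustrated with a kernel-weighted regression centred on each observation in turn. The sketch below is our own simplification and not necessarily the local-regression procedure used for the results reported here: it handles one regressor at a time, and the Gaussian kernel, the `bandwidth` value and the synthetic data are assumptions.

```python
import math

def local_slopes(x, y, bandwidth=6.0):
    """Slope of a Gaussian-weighted regression of y on x, centred on each
    observation in turn, giving one coefficient per observation."""
    n = len(x)
    slopes = []
    for centre in range(n):
        # Observations close in time to `centre` receive the most weight.
        w = [math.exp(-0.5 * ((i - centre) / bandwidth) ** 2) for i in range(n)]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slopes.append(sxy / sxx)
    return slopes

# Synthetic illustration only: a brand that starts to compete at observation 30.
n = 60
log_price = [0.1 if i % 2 == 0 else -0.1 for i in range(n)]
share = [0.3 + (0.5 if i >= 30 else 0.0) * log_price[i] for i in range(n)]
path = local_slopes(log_price, share)
print(round(path[0], 3), round(path[-1], 3))
```

A single constant-coefficient regression over all 60 observations would report one intermediate value, whereas the local coefficients start near zero and end near the later-period value.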

The interpretation of the time patterns of local regression coefficients is determined by rules and is entirely declarative. The rules "look" at the patterns of levels and first and second differences in the coefficients on each regressor over the data period. The coefficients of interest are those on the price variables where the dependent variable is the focus brand's value share. These will be significant and positive for competing brands and insignificant or negative for non-competing brands. A positive coefficient indicates that the focus brand will lose (resp. gain) share if the price of another brand falls (resp. rises). Because the price variable used is a log transform of the actual price, the coefficient is the elasticity of focus-brand value share with respect to the price of the other brand. Thus, a high coefficient value indicates a high elasticity and, therefore, more competitiveness.

The aim of the rulebase in this regard is threefold: to identify brands which are consistently strong competitors of the focus brand, to include brands which become competitors over the data period, and to eliminate from consideration those which have ceased to be competitors during the data period. The time pattern of the local regression price coefficients identifies which brands fall into these categories.
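A rule table of this general kind might look like the following sketch. The rules, the `LEVEL` threshold, the category labels and the feature names are hypothetical and stand in for the actual rulebase, which is not reproduced here. For brevity the sketch uses only levels and first differences; second differences could be added as a further feature in the same way.

```python
from statistics import mean

LEVEL = 0.1  # hypothetical threshold for a "significant and positive" coefficient

def features(path):
    """Summarise a coefficient time path (at least two observations) by
    early and late levels and by the mean first difference."""
    third = max(1, len(path) // 3)
    first_diffs = [b - a for a, b in zip(path, path[1:])]
    return {
        "early": mean(path[:third]),
        "late": mean(path[-third:]),
        "trend": mean(first_diffs),
    }

# Rules are data: (label, condition) pairs tried in order; the first match wins.
RULES = [
    ("consistent competitor", lambda f: f["early"] > LEVEL and f["late"] > LEVEL),
    ("emerging competitor", lambda f: f["late"] > LEVEL and f["trend"] > 0),
    ("former competitor", lambda f: f["early"] > LEVEL and f["trend"] < 0),
    ("non-competitor", lambda f: True),
]

def classify(path):
    f = features(path)
    return next(label for label, condition in RULES if condition(f))

# Synthetic coefficient paths for illustration only.
steady = [0.4] * 12
rising = [0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5]
falling = list(reversed(rising))
flat = [0.0] * 12
labels = [classify(p) for p in (steady, rising, falling, flat)]
print(labels)
```

Keeping the rules in a table rather than in nested conditionals keeps the interpretation step declarative in spirit: a rule can be added, removed or reordered without touching the code that applies it.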

Artificially Intelligent Specification and Analysis of Context-Dependent Attribute Preferences - 03 NOV 97
