Optimal Segments and Optimal Messaging Rise and Fall

(Bayes Theorem, Autoregressive Modeling, and Optimal Messaging)

(How to Keep Up with the Ever-Changing, Dynamic World of Consumers' Messaging Needs)

Key words: prior distribution, updated distribution (posterior distribution), autoregressive modeling with covariates, optimal message to consumers

Nethra Sambamoorthi, PhD, Sr. Consultant, CRMportals Inc.

Bayes' theorem provides a way to calculate the probability that an element (a unit) belongs to a specific group, given the attributes of that element. It requires two inputs: the probability distribution of the attributes within each group, and the probability distribution of the groups themselves. Because the calculation reverses the direction of the conditioning, going from "attributes given group" to "group given attributes", it has historically been called inverse probability.

To calculate such probabilities, the question needs to be cast in a particular framework. First, we need the steady-state probabilities of occurrence of the groups, based on all observations up to the most recent one. This distribution is called the prior; it is modified as new observations accumulate, and the newly calculated distribution is called the posterior. Second, we need the probability distribution of the attributes of elements within each group, that is, the conditional distribution of the attributes given the group. Then, when we observe the attributes of an individual, the posterior probability calculation gives the probability that the person belongs to a specific group. The characteristics of that group can be overlaid on that individual, and the messaging can be devised and served accordingly.
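For reference, the calculation described above can be written compactly. The symbols are assumptions introduced here for illustration and do not appear in the original text: G_k denotes the k-th group and x the observed attributes of an individual.

```latex
P(G_k \mid x) \;=\; \frac{P(x \mid G_k)\, P(G_k)}{\sum_{j} P(x \mid G_j)\, P(G_j)}
```

Here P(G_k) is the prior probability of the group, P(x | G_k) is the distribution of the attributes within the group, and P(G_k | x) is the posterior probability that the individual belongs to the group.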

An Example:

Say we have three consumer segments: people who buy mostly mutual funds, people who buy mostly bonds, and people who buy mostly stocks. The chance that a person in the population belongs to each of these groups is 5 in 10, 3 in 10, and 2 in 10, respectively.

 

                 Mostly MF    Mostly Bonds    Mostly Stocks
Probabilities    0.5          0.3             0.2

This is the prior distribution.

We also need to know how the demographic characteristics are distributed within each group, as follows (this example will get a little more complex after this basic discussion):

 

                       Average income < 50000    Education > 12 Yrs
Mostly Mutual Funds    0.3                       0.6
Mostly Bonds           0.3                       0.3
Mostly Stocks          0.6                       0.3

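Given these two tables, finding the segment of an individual is a straightforward application of Bayes' theorem. For example, for a consumer whose average income is below 50000, the prior-weighted probabilities are 0.5 × 0.3 = 0.15, 0.3 × 0.3 = 0.09, and 0.2 × 0.6 = 0.12, which sum to 0.36. The posterior probabilities of mostly mutual funds, mostly bonds, and mostly stocks are therefore about 0.42, 0.25, and 0.33, respectively.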
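The same calculation can be sketched in Python for a consumer who shows both attributes. This is a minimal illustration, not the author's implementation. It assumes that the two attributes are conditionally independent within each segment, which the original text does not state, and the names `prior`, `likelihood`, and `posterior` are hypothetical.

```python
# Prior probabilities of the three segments (from the first table).
prior = {"Mostly MF": 0.5, "Mostly Bonds": 0.3, "Mostly Stocks": 0.2}

# P(attribute | segment), taken from the second table.
likelihood = {
    "Mostly MF":     {"income_lt_50k": 0.3, "educ_gt_12": 0.6},
    "Mostly Bonds":  {"income_lt_50k": 0.3, "educ_gt_12": 0.3},
    "Mostly Stocks": {"income_lt_50k": 0.6, "educ_gt_12": 0.3},
}


def posterior(prior, likelihood, observed):
    """Return P(segment | observed attributes) via Bayes' theorem.

    ASSUMPTION: attributes are conditionally independent within a
    segment, so their probabilities can be multiplied.
    """
    joint = {}
    for segment, p in prior.items():
        for attribute in observed:
            p *= likelihood[segment][attribute]
        joint[segment] = p
    total = sum(joint.values())  # P(observed attributes)
    return {segment: value / total for segment, value in joint.items()}


post = posterior(prior, likelihood, ["income_lt_50k", "educ_gt_12"])
for segment, p in post.items():
    print(f"{segment}: {p:.3f}")
# Mostly MF: 0.588
# Mostly Bonds: 0.176
# Mostly Stocks: 0.235
```

Under these assumptions the consumer most likely belongs to the mutual funds segment, so messaging designed for that segment would be served.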

Key Points:

  • Each group has its own distribution of attributes
  • The probability distribution of the groups is the prior distribution
  • The starting distribution can be a uniform distribution
  • The prior distribution is modified dynamically as new data come in
  • The posterior distribution follows an autoregressive process
  • The autoregressive model may additionally involve third-party data and internal data, including both time-varying and time-invariant factors
  • The autoregressive model's response variable can be continuous or discrete, with correspondingly different modeling steps
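The dynamic updating in these key points can be sketched as a simple recursion in which each posterior becomes the prior for the next observation. This is only an illustration under stated assumptions: it starts from a uniform distribution, reuses the attribute probabilities from the example tables, treats successive observations as conditionally independent within a segment, and does not attempt to fit an autoregressive model with covariates. The names `bayes_update`, `belief`, and `history` are hypothetical.

```python
def bayes_update(prior, likelihoods):
    """One Bayes step: combine a prior with P(observation | segment)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


segments = ["Mostly MF", "Mostly Bonds", "Mostly Stocks"]

# Start from a uniform distribution over the three segments.
belief = [1 / 3, 1 / 3, 1 / 3]

# P(observation | segment) for each observation, in order of arrival.
observations = [
    [0.3, 0.3, 0.6],  # average income < 50000
    [0.6, 0.3, 0.3],  # education > 12 years
]

history = [belief]
for likelihoods in observations:
    # The previous posterior serves as the prior for the new observation.
    belief = bayes_update(belief, likelihoods)
    history.append(belief)

for step, dist in enumerate(history):
    print(step, [round(p, 3) for p in dist])
# 0 [0.333, 0.333, 0.333]
# 1 [0.25, 0.25, 0.5]
# 2 [0.4, 0.2, 0.4]
```

Each entry of `history` depends on the one before it, which is the sense in which the sequence of distributions evolves over time as data accumulate.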

