Optimal Segments
and Optimal Messaging Rise and Fall
(Bayes' Theorem,
Autoregressive Modeling, and Optimal
Messaging)
(How to Keep Up with the Ever-Changing,
Dynamic World of Consumers' Messaging Needs)
Key words: prior distribution, updated distribution (posterior
distribution), autoregressive modeling with covariates,
optimal message to consumers
Nethra Sambamoorthi, PhD, Sr. Consultant,
CRMportals Inc.
Bayes' Theorem provides a way to find out
the group from which an element comes, that is, the probability
that an element belongs to a group given the attributes of
the element. More explicitly, Bayes' Theorem provides a method
of calculating the probability that a unit (element) comes
from a specific group given its attributes, provided we know
the probability distribution of the attributes of a unit
within each group and the probability distribution of
the groups themselves. Because the calculation reverses the
direction of conditioning (from attributes given the group to
group given the attributes), it has historically been called
inverse probability.
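In symbols, the calculation described above can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the original: $G_1, \dots, G_K$ denote the groups and $x$ denotes the observed attributes of the element.

```latex
P(G_k \mid x) \;=\; \frac{P(x \mid G_k)\, P(G_k)}{\sum_{j=1}^{K} P(x \mid G_j)\, P(G_j)}
```

Here $P(G_k)$ is the probability distribution of the groups, $P(x \mid G_k)$ is the distribution of the attributes within group $G_k$, and the denominator ensures that the resulting probabilities sum to one over the groups.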
To answer this question, it needs
to be cast in a particular framework in which such
probabilities can be calculated. First, we need the
probabilities of steady-state occurrence of the groups up to
the last observation. This probability distribution is called
the prior; it gets modified as new observations accumulate,
and the newly calculated distribution is called the
posterior. Second, we need the probability distribution
of the attributes of elements within each group, that is, the
conditional distribution of the attributes given the group
(the likelihood), not a marginal distribution.
Then, if we observe the attributes of an individual, the
posterior probability calculation gives the probability
that the person belongs to each specific group. The
characteristics of the most probable group can be overlaid on
that particular person, and the messaging can be devised
and served accordingly.
An Example:
Say we have three consumer segments,
namely people who buy mostly mutual funds, people who buy
mostly bonds, and people who buy mostly stocks, and the chance
that a person in the population belongs to each of these groups is
5 in 10, 3 in 10, and 2 in 10, respectively.

                 Mostly MF    Mostly Bonds    Mostly Stocks
Probabilities    0.5          0.3             0.2
This is the prior distribution.
We also want to know, within each of
these groups, the proportion of members with each demographic
characteristic, as follows (this example will get a little
more complex after this basic discussion):

                       Average income < 50000    Education > 12 yrs
Mostly Mutual Funds    0.3                       0.6
Mostly Bonds           0.3                       0.3
Mostly Stocks          0.6                       0.3
Finding a person's segment from these attributes is a
straightforward application of Bayes' theorem.
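The arithmetic for this example can be sketched in a few lines of Python. The sketch reads the entries of the second table as P(attribute | segment), which is an interpretation rather than something the text states outright. When both attributes are observed, it further assumes they are conditionally independent within each segment; that assumption is not in the original and is made only to keep the illustration simple. The function and variable names are hypothetical.

```python
# Prior distribution over segments (from the first table).
PRIOR = {"Mostly MF": 0.5, "Mostly Bonds": 0.3, "Mostly Stocks": 0.2}

# P(attribute | segment), read from the second table (an interpretation).
LIKELIHOOD = {
    "income_lt_50000": {"Mostly MF": 0.3, "Mostly Bonds": 0.3, "Mostly Stocks": 0.6},
    "education_gt_12": {"Mostly MF": 0.6, "Mostly Bonds": 0.3, "Mostly Stocks": 0.3},
}


def posterior(prior, observed_attributes, likelihood=LIKELIHOOD):
    """Return P(segment | observed attributes) via Bayes' theorem.

    Assumption (not in the original): when several attributes are observed,
    they are treated as conditionally independent within each segment.
    """
    unnormalized = {}
    for segment, p_segment in prior.items():
        weight = p_segment
        for attribute in observed_attributes:
            weight *= likelihood[attribute][segment]
        unnormalized[segment] = weight
    total = sum(unnormalized.values())
    return {segment: w / total for segment, w in unnormalized.items()}


low_income = posterior(PRIOR, ["income_lt_50000"])
both = posterior(PRIOR, ["income_lt_50000", "education_gt_12"])

for label, result in [("income < 50000", low_income), ("both attributes", both)]:
    print(label, {segment: round(p, 3) for segment, p in result.items()})
```

For a person with income below 50000, the unnormalized weights are 0.5 × 0.3 = 0.15, 0.3 × 0.3 = 0.09, and 0.2 × 0.6 = 0.12, which sum to 0.36. The posterior probabilities are therefore about 0.417 (mutual funds), 0.250 (bonds), and 0.333 (stocks): the stock segment becomes more likely than its prior of 0.2, although mutual funds remain the most probable segment.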
Key Points:
- The groups are characterized by the
attributes of interest.
- The probability distribution of the groups
is the prior distribution.
- The starting prior can be a
uniform distribution.
- The prior distribution gets modified
dynamically as new data come in.
- The posterior distribution follows an
autoregressive process.
- The autoregressive model may
additionally involve third-party data and internal data,
which can include both time-varying and time-invariant factors.
- The autoregressive model's response
variable can be continuous or discrete, with correspondingly
different modeling steps.
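The points about starting from a uniform prior and revising it as data arrive can be illustrated with a minimal sketch in which each posterior becomes the prior for the next observation. This is one possible reading, applied to a single consumer whose attributes are observed one at a time. It reuses the example table as P(attribute | segment), assumes the observations are conditionally independent given the segment, and uses a hypothetical observation stream. It does not implement the autoregressive model itself, which the text does not specify.

```python
# P(attribute | segment), taken from the example table (an interpretation).
LIKELIHOOD = {
    "income_lt_50000": {"Mostly MF": 0.3, "Mostly Bonds": 0.3, "Mostly Stocks": 0.6},
    "education_gt_12": {"Mostly MF": 0.6, "Mostly Bonds": 0.3, "Mostly Stocks": 0.3},
}
SEGMENTS = ["Mostly MF", "Mostly Bonds", "Mostly Stocks"]


def update(prior, attribute):
    """One Bayes step: combine the current prior with one observed attribute."""
    weights = {s: prior[s] * LIKELIHOOD[attribute][s] for s in SEGMENTS}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Start from a uniform prior over the three segments.
belief = {s: 1.0 / len(SEGMENTS) for s in SEGMENTS}
history = [belief]

# Hypothetical stream of observations arriving over time.
for attribute in ["income_lt_50000", "education_gt_12"]:
    belief = update(belief, attribute)  # the posterior becomes the next prior
    history.append(belief)

for step, dist in enumerate(history):
    print(step, {s: round(p, 3) for s, p in dist.items()})
```

Starting from 1/3 for each segment, the first observation moves the distribution to 0.25, 0.25, and 0.5, and the second moves it to 0.4, 0.2, and 0.4. The sequence stored in `history` is the kind of time series of segment probabilities that an autoregressive model, possibly with covariates, could then be fitted to.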
