The previous post addressed how to measure the accuracy of predictive models (i.e., Decision Tree, Naïve Bayes, Neural Network, and Clustering). This post focuses on measuring the accuracy of nonpredictive models.
1. Measuring the Accuracy of Association Rules
You use three measures—Support, Probability, and Importance—to estimate the quality of the rules that the Association Rules algorithm finds.
a. Support for Itemsets
Support measures the number of cases in which the itemset is included. It simply tells you how many times items were found together in the basket. However, there is typically a direction in purchasing habits. For example, in the United States, customers who buy a frozen pizza typically buy a soda as well; however, customers who buy a soda do not always buy a frozen pizza. Thus, to get the direction, you have to measure the probability of the rule, not the probability of the itemset.
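If you want to examine the support counts outside the viewers, you can read them directly from the model content with a DMX query. This is a minimal sketch against a hypothetical Association Rules model named [TK448 Ch09 Association]; it assumes that in association model content, itemsets are the nodes with NODE_TYPE = 7 and their support is exposed in NODE_SUPPORT:
// Itemsets (NODE_TYPE = 7) and their support counts
SELECT NODE_CAPTION, NODE_SUPPORT
FROM [TK448 Ch09 Association].CONTENT
WHERE NODE_TYPE = 7
ORDER BY NODE_SUPPORT DESC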
b. Probability for Rules
Rule
is directional. You express a rule by using a conditional sentence such as, “If
a customer purchases a frozen pizza, the customer purchases a soda as well.”
You can see the probabilities of the rules by using the Rules tab in the Mining
Model Viewer. If the probability for the rule “If a customer purchases a frozen
pizza, the customer purchases a soda as well.” is 1.00, it means that if a
customer has purchased a frozen pizza, he has always purchased a frozen pizza.
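The same kind of content query returns the rule probabilities. A sketch against the same hypothetical model, assuming rules are the nodes with NODE_TYPE = 8 and their probability is exposed in NODE_PROBABILITY:
// Rules (NODE_TYPE = 8) and their probabilities
SELECT NODE_CAPTION, NODE_PROBABILITY
FROM [TK448 Ch09 Association].CONTENT
WHERE NODE_TYPE = 8
ORDER BY NODE_PROBABILITY DESC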
c. Importance for Rules
Importance is the score of a rule. A positive Importance tells you that the probability that product B will be in the basket increases when product A is in the basket. A negative Importance means that the probability for product B goes down if product A is in the basket. A zero Importance means that there is no association between products A and B.
Importance is different from probability and is designed to measure the usefulness of a rule. In some cases, although the probability of a rule may be high, the rule itself may be of little use. For example, if every itemset contains a specific state of an attribute, a rule that predicts that state is trivial, even though its probability is very high.
Importance is also referred to as lift. For an association rule, Importance is calculated as the log likelihood of the right-hand side of the rule, given the left-hand side of the rule. For example, in the rule If {A} Then {B}, Analysis Services calculates the ratio of cases with A and B over cases with B but without A, and then normalizes that ratio by using a logarithmic scale.
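To rank rules by usefulness rather than by confidence, you can sort the rule nodes by their score. A sketch against the same hypothetical model, assuming that for rule nodes the Importance value is surfaced in the MSOLAP_NODE_SCORE column:
// Rules ranked by Importance (assumed to be stored in MSOLAP_NODE_SCORE)
SELECT NODE_CAPTION, NODE_PROBABILITY, MSOLAP_NODE_SCORE
FROM [TK448 Ch09 Association].CONTENT
WHERE NODE_TYPE = 8
ORDER BY MSOLAP_NODE_SCORE DESC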
2. Measuring the Accuracy of Clustering and Sequence Clustering
a. Rule 1 – Use business sense rather than mathematics
Even if a Clustering algorithm or Sequence Clustering algorithm model gives you a good mathematical score for quality, it might not be useful in production. It could, for example, have clusters that contain input variable values that are difficult to use in the real world. Therefore, you need to analyze the clusters generated by different models by using the Clustering viewers and then decide which model to implement from a business perspective.
b. Rule 2 – Look at the MSOLAP_NODE_SCORE for the model
If you really need to evaluate your clusters mathematically, Microsoft Clustering and Sequence Clustering models have a specific score that tells you how well the training data fits the clusters detected during training. You can obtain that score with a query such as the following, or from the Microsoft Generic Content Tree Viewer.
SELECT MSOLAP_NODE_SCORE
FROM [TK448 Ch09 Cube Clustering].CONTENT
WHERE NODE_UNIQUE_NAME = '000'
/*
MSOLAP_NODE_SCORE
0.950970832045941
*/
The node with the unique name '000' is the top content node of a clustering model. The score is a number between 0 and 1, the higher the better, and it measures the quality of the detected clusters. A score of 1 tells you that each of the training cases fits perfectly into at least one cluster detected by the algorithm.
3. Measuring the Accuracy of Time Series
How can we measure the quality of values forecast with the Time Series algorithm when we do not yet have the actual data? By using a specific number of periods from the past, we can try to forecast present values. If the model performs well for forecasting present values, there is a better probability that it will perform well for forecasting future values. We control the creation of historical models by using two algorithm parameters:
HISTORICAL_MODEL_COUNT - controls the number of historical models to build.
HISTORICAL_MODEL_GAP - specifies the time increment between the historical models that are built.
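For reference, both parameters can be set in the USING clause when you create a model in DMX. This is a minimal sketch with hypothetical model and column names (in BIDS you would set the same values in the Algorithm Parameters dialog):
CREATE MINING MODEL [Sales Forecast]
(
    [Time Index]   LONG KEY TIME,           // time key
    [Model Region] TEXT KEY,                // one series per region and model
    [Quantity]     LONG CONTINUOUS PREDICT  // value to forecast
)
USING Microsoft_Time_Series (HISTORICAL_MODEL_COUNT = 2, HISTORICAL_MODEL_GAP = 1)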
Let’s see some examples. The figures below show the historical predictions of the sales quantity of the M-200 model in the Pacific region.
(HISTORICAL_MODEL_COUNT = 1, HISTORICAL_MODEL_GAP = 1): one model, one-month increment.
(HISTORICAL_MODEL_COUNT = 2, HISTORICAL_MODEL_GAP = 1): two models, one-month increment. This means that when predicting the present value on 200806, the algorithm uses the 2nd model (the one between the blue dots); when predicting the value on 200805, it uses the 1st model.
(HISTORICAL_MODEL_COUNT = 2, HISTORICAL_MODEL_GAP = 2): two models, two-month increment. A-B-C is the 1st model, C-D-E is the 2nd model.
(HISTORICAL_MODEL_COUNT = 2, HISTORICAL_MODEL_GAP = 3): two models, three-month increment. From A to B is one model and from B to C is another model.
In general, the more historical models we build, the more past predictions we can observe, and the better we can gauge the accuracy of future predictions.
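Once the historical models are built, you can retrieve the historical predictions themselves by passing a negative starting time slice to PredictTimeSeries. A sketch against the hypothetical model above; the assumption here is that the negative start must stay within HISTORICAL_MODEL_COUNT * HISTORICAL_MODEL_GAP periods:
// Historical predictions for the last two periods, followed by the forecast
SELECT FLATTENED [Model Region],
       PredictTimeSeries([Quantity], -2, 2) AS PredictedQuantity
FROM [Sales Forecast]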