SQL Server Administration, Development and B.I. Development related
The accuracy of the results in Data Mining Viewer
While working through Exercise 6 in the 70-448 self-paced training kit (p. 391), I found that my results in the Data Mining Viewer for steps 2 and 3 (with total children instead of age) differ from the book's sample. I later realized that my case sample has a different composition from the book's (p. 383): 6536 vs. 6403, compared with 6509 vs. 6430 in the book. Both pairs add up to the same 12,939 training cases, so only the mix differs. Since I did not change the vTargetMail data, I suspect the differences come from the 70% random sampling used when the mining structure is created. The question now is: if the results are affected by the random sampling of the training set, how reliable are the model's results?
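To get a feel for how much a 70% random split can move the class counts on its own, here is a minimal Python sketch. It is not how Analysis Services draws its sample; it only mimics the idea. The total of 18,484 cases is an assumption inferred from 12,939 being 70% of the data, the even class split is hypothetical, and `training_composition` is a name made up for this illustration.

```python
import random

TOTAL_CASES = 18484     # assumption: 12939 training cases / 0.70, rounded
POSITIVE_CASES = 9242   # hypothetical: an even split between the two classes
TRAIN_FRACTION = 0.70   # the 70% training share mentioned in the exercise


def training_composition(seed):
    """Return (positives, negatives) for one 70% random training sample."""
    # Build the full label list: 1 = positive class, 0 = negative class.
    labels = [1] * POSITIVE_CASES + [0] * (TOTAL_CASES - POSITIVE_CASES)
    rng = random.Random(seed)  # seeded generator, so a given seed is repeatable
    train_size = round(TOTAL_CASES * TRAIN_FRACTION)
    sample = rng.sample(labels, train_size)  # sampling without replacement
    positives = sum(sample)
    return positives, train_size - positives


if __name__ == "__main__":
    # Different seeds give slightly different class mixes for the same data.
    for seed in (1, 2, 3):
        print(seed, training_composition(seed))
    # The same seed always reproduces the same split.
    print(training_composition(1) == training_composition(1))
```

Running it with a few seeds should show the two counts shifting a little from run to run while the total stays fixed, which is the same pattern as 6536/6403 against 6509/6430. Fixing the seed makes the split repeatable; in Analysis Services the mining structure has a HoldoutSeed property that appears to serve this purpose.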