Can someone please assist?
I have no problem using the provided algorithms (Naive Bayes, Decision Trees, etc.) in SQL Server 2005 Data Mining. For example, if I want to predict whether customers will buy a bike from the following data, I use Age, Salary, and Gender as the input attributes and the BuyBike column as the "Predict" column.
Table
Age Salary Gender BuyBike
However, say that I have 10,000 types of bikes to predict. How do I do that?
Age Salary Gender BuyBike1 BuyBike2 BuyBike3 ...... BuyBike10000
Are there any online resources discussing this issue? I am desperately trying to solve this problem. Please assist!
Mary
You can create a nested table. Based on the data above, the column definition of the model would look like:
(
[CustKey] LONG KEY, -- data type assumed here; DMX requires one for each column
[Age] DOUBLE CONTINUOUS,
[Gender] TEXT DISCRETE,
[BikeModels] TABLE PREDICT
(
[Model] TEXT KEY
)
)
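For reference, here is a rough sketch of how such a model might be trained from a case table plus a nested table. It assumes the column list above was used in a CREATE MINING MODEL statement for a model called [BikeBuyerModel], and that the purchases are stored one row per customer and bike in a separate table instead of in 10,000 BuyBike columns. The names [BikeBuyerModel], [MyDataSource], Customers, and CustomerBikes are hypothetical and not part of the original question.

```sql
-- Sketch only: model, data source, and table names are assumptions.
INSERT INTO [BikeBuyerModel]
(
    [CustKey], [Age], [Gender],
    [BikeModels] (SKIP, [Model])   -- SKIP ignores the foreign key column of the nested rowset
)
SHAPE
{
    -- Case table: one row per customer
    OPENQUERY([MyDataSource],
        'SELECT CustKey, Age, Gender FROM Customers ORDER BY CustKey')
}
APPEND
(
    {
        -- Nested table: one row per (customer, bike model) purchase
        OPENQUERY([MyDataSource],
            'SELECT CustKey, Model FROM CustomerBikes ORDER BY CustKey')
    }
    RELATE [CustKey] TO [CustKey]
) AS [BikeModels]
```

Both queries are ordered by CustKey because SHAPE expects the parent and child rowsets to be sorted on the related column.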
Then query the model (Naive Bayes, Decision Trees, etc.) with a prediction statement like the one below:
SELECT Predict( BikeModels[, 5]) FROM Model
If you use the optional ", 5", it will return the 5 most likely predictions.
More details:
http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/1090.aspx (details on the nested table concept)
http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/1061.aspx (impact of nested table on the model attributes)
http://msdn2.microsoft.com/en-us/library/ms132190.aspx (documentation for the DMX Predict function)
|||
Hello,
The provided solution (see the previous message) gave me the following error:
Query(2, 25) Parse: The syntax for '[,5]' is incorrect.
Please assist!
Mary
|||
Perhaps this is the right solution:
SELECT Predict( [BikeModels], 5) FROM Model
instead of
SELECT Predict( BikeModels[, 5]) FROM Model
Mary
|||
By [, 5] I meant that ", 5" is optional.
You can use either
SELECT Predict(BikeModels) FROM Model -- for all predictions
or
SELECT Predict(BikeModels, 5) FROM Model -- for top 5 predictions
Sorry, I should have made that clear.
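To get recommendations for one particular customer, the same Predict call can be combined with a prediction join. The following is a minimal sketch of a singleton query, assuming a model named [BikeBuyerModel] with the Age and Gender input columns described earlier in this thread; the model name and the input values are made up for illustration.

```sql
-- Sketch only: [BikeBuyerModel] and the input values are assumptions.
SELECT FLATTENED
    Predict([BikeModels], 5)       -- the 5 most likely bike models
FROM [BikeBuyerModel]
NATURAL PREDICTION JOIN
(
    SELECT 35 AS [Age], 'M' AS [Gender]
) AS t
```

FLATTENED turns the nested result into ordinary rows, which is easier to consume from clients that cannot read nested rowsets.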