Data transformation services: Data Mining Problem: Is it possible to predict many, many columns?

Thursday, March 8, 2012

Hello,

Can someone please assist?
I have no problem using the algorithms provided by SQL Server 2005 Data Mining (Naive Bayes, Decision Trees, etc.). For example, if I want to predict whether customers will buy a bike from the following data, I use Age, Salary, and Gender as input columns (attributes/features) and the BuyBike column as the "Predict" column.

Table
Age Salary Gender BuyBike

However, say that I have 10,000 types of bikes to predict, one column per bike type. How do I do that?
Age Salary Gender BuyBike1 BuyBike2 BuyBike3 ...... BuyBike10000

Are there any online resources discussing this issue? I am desperately trying to solve this problem. Please assist!

Mary

You can create a nested table. Based on the data above, the model definition would look like this:

(
    [CustKey] LONG KEY, -- key data type is an assumption; use the type of your customer key
    [Age] DOUBLE CONTINUOUS,
    [Gender] TEXT DISCRETE,
    [BikeModels] TABLE PREDICT
    (
        [Model] TEXT KEY
    )
)
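The nested table expects one row per customer and purchased bike model, rather than 10,000 BuyBike columns. As a rough illustration of that reshaping, here is a minimal Python sketch. The sample rows, the `BuyBike` prefix handling, and the function name `split_case_and_nested` are all hypothetical. In practice the same split would more likely be done with a query or view over the source tables.

```python
# Hypothetical wide rows: one BuyBike<N> flag column per bike model.
wide_rows = [
    {"CustKey": 1, "Age": 34, "Gender": "F", "BuyBike1": 1, "BuyBike2": 0, "BuyBike3": 1},
    {"CustKey": 2, "Age": 51, "Gender": "M", "BuyBike1": 0, "BuyBike2": 0, "BuyBike3": 0},
]


def split_case_and_nested(rows, prefix="BuyBike"):
    """Split wide rows into a case table and a nested (CustKey, Model) table."""
    case_table = []
    nested_table = []
    for row in rows:
        # Case table: one row per customer, without the BuyBike<N> columns.
        case_table.append({k: v for k, v in row.items() if not k.startswith(prefix)})
        # Nested table: one row per bike model the customer actually bought.
        for column, flag in row.items():
            if column.startswith(prefix) and flag:
                nested_table.append(
                    {"CustKey": row["CustKey"], "Model": column[len(prefix):]}
                )
    return case_table, nested_table


case_table, nested_table = split_case_and_nested(wide_rows)
```

With this layout, adding a new bike model means adding rows to the nested table, not another column to the case table.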

Then query the model (Naive Bayes, Decision Trees, etc.) with a prediction statement like the one below:

SELECT Predict( BikeModels[, 5]) FROM Model

If you include the optional ", 5" argument, the query returns the 5 most likely predictions.

More details:

http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/1090.aspx (details on the nested table concept)

http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/1061.aspx (impact of nested table on the model attributes)

http://msdn2.microsoft.com/en-us/library/ms132190.aspx (documentation for the DMX Predict function)

|||

Hello,

The provided solution (see the previous message) gave me the following error:

Query(2, 25) Parse: The syntax for '[,5]' is incorrect.

Please assist!

Mary

|||

Perhaps this is the right solution:

SELECT Predict( [BikeModels], 5) FROM Model

instead of

SELECT Predict( BikeModels[, 5]) FROM Model

Mary

|||

By [, 5] I meant that ", 5" is optional.

You can use either

SELECT Predict(BikeModels) FROM Model -- for all predictions

SELECT Predict(BikeModels, 5) FROM Model -- for the top 5 predictions

Sorry, I should have made that clear.
