Wednesday, March 7, 2012

Data Mining

Hi,

I've come a long way learning about databases and SQL. I can write basic queries now, even some with subqueries. What I need to learn is how to approach data mining. Can someone suggest the best path to follow to learn how to do data mining against a very large database?
I don't just need to produce reports of acquired data. I need to go in, grab data, and look for patterns against known result sets. I hope that makes sense.

Thanks,
Milfredo

Well, do you have a basic grounding in statistics?
TSQL lends itself fairly well to exception reporting, but unfortunately it does not have a rich function set for correlational studies. I've written my own code for simple linear regression, but I haven't attempted true multi-linear analysis.
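For anyone wondering what a simple linear regression in plain SQL can look like, here is a minimal sketch. It is not the code mentioned above, just one way to do it with nothing but COUNT and SUM aggregates. To keep it self-contained it runs the query against an in-memory SQLite database from Python; the table `points`, the columns `x` and `y`, and the helper `fit_line` are hypothetical names made up for the example. The SELECT should carry over to T-SQL with little change, though integer columns would need a cast to float first to avoid integer division.

```python
import sqlite3

# Hypothetical sample data: (x, y) pairs lying roughly on a straight line.
SAMPLE_POINTS = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8), (5.0, 11.0)]

# Ordinary least squares fit of y = slope * x + intercept.
# The inner query collects the sums; the outer query applies the
# closed-form formulas:
#   slope     = (n*Sxy - Sx*Sy) / (n*Sxx - Sx*Sx)
#   intercept = (Sy - slope*Sx) / n
REGRESSION_SQL = """
SELECT (n * sxy - sx * sy) / (n * sxx - sx * sx) AS slope,
       (sy - (n * sxy - sx * sy) / (n * sxx - sx * sx) * sx) / n AS intercept
FROM (
    SELECT COUNT(*)   AS n,
           SUM(x)     AS sx,
           SUM(y)     AS sy,
           SUM(x * y) AS sxy,
           SUM(x * x) AS sxx
    FROM points
) AS s
"""


def fit_line(points):
    """Load (x, y) pairs into a scratch table and return (slope, intercept)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE points (x REAL, y REAL)")
        conn.executemany("INSERT INTO points (x, y) VALUES (?, ?)", points)
        slope, intercept = conn.execute(REGRESSION_SQL).fetchone()
        return slope, intercept
    finally:
        conn.close()


if __name__ == "__main__":
    print(fit_line(SAMPLE_POINTS))
```

The query assumes at least two distinct x values; otherwise the denominator is zero. Anything beyond a single predictor (multiple regression) gets awkward in pure SQL fairly quickly, which is where an external statistical tool earns its keep.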
SQL Server 2005 has CLR integration that allows you to build custom aggregate functions, so it may prove more useful for data analysis, and I wouldn't be surprised to see third-party developers marketing more advanced statistical functions. In the meantime, you will need to spin off subsets of data for further analysis in a stronger statistical tool, such as Excel or SAS.

Thanks Blindman,

I'm looking into some third-party software. There seem to be several products on the market that will allow me to drill down into my data and do some analysis. For single users the price isn't bad.

Milfredo

If you are just looking at pivoting and drill-downs, a simple Excel pivot table linked to a SQL Server view might get you started.

Thanks. I actually oversimplified my project. I have a fairly extensive database that I need to work with. I need to select certain data sets and see what factors I can add that will make them more predictive. I'm test-driving a software package called Purple Mineset at present.

Milfredo
