Top 10 challenging problems in data mining
- Developing a unifying theory of data mining
- Scaling up for high dimensional data and high speed data streams
- Mining sequence data and time series data
- Mining complex knowledge from complex data
- Data mining in a network setting
- Distributed data mining and mining multi-agent data
- Data mining for biological and environmental problems
- Data mining process-related problems
- Security, privacy and data integrity
- Dealing with non-static, unbalanced and cost-sensitive data
I sometimes receive emails from master's students or practitioners interested in data mining. The usual question is “What can I do as research in data mining?”. Of course, the answer depends on what you like and on the opportunities of the moment. However, this paper may give some hints on possible directions for research.
As usual, the issue of automating the data mining process is mentioned. It is worth noting that researchers argue that they need to find a way to automate data mining, while practitioners say that they can do it (KXEN, for example). Finally, I think that one of the most important issues is pointed out by the following sentence in the paper:
“[...] they’re [data mining systems] unable to relate the results of mining to the real-world decisions they affect [...]”
In my opinion, ranking the top problems is more subjective than ranking the top algorithms. Most people will certainly agree on the selected data mining algorithms, but the choice of problems is more debatable, since some of them may be relevant only to certain fields of research.