What is a good bachelor’s thesis topic in data mining?
The requirements for a good bachelor’s thesis are about the same as those for a good PhD thesis.
(1) You are given some data set to analyze (or sometimes you have to collect data). The data is often in service of some real world problem.
(2) You formulate some possible algorithmic approaches to solving the problem. There should be a set of both simple and more advanced approaches in your plan. If you can solve the problem more simply, you should. If an advanced approach has utility, you should show its benefit over the simpler approaches.
(3) You iterate and optimize your algorithms as needed based on analysis and re-analysis of the data, while showing good data mining methodology (e.g. avoiding overfitting, keeping the false positive rate down, appropriately visualizing the data, etc).
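Steps (2) and (3) can be sketched roughly as follows. This is only an illustration, not a prescribed method: the data is synthetic, and the function names (`make_data`, `majority_baseline`, `knn_classifier`, `cross_val_accuracy`) are hypothetical. It compares a trivial baseline against a somewhat more advanced approach, and scores both on held-out folds so that an overfitted model cannot look better than it is.

```python
import random
from collections import Counter


def make_data(n=200, seed=0):
    """Synthetic 2-D points (assumption: class 1 is shifted away from class 0)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2  # perfectly balanced classes
        point = (rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data


def majority_baseline(train):
    """Simple approach: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda point: most_common


def knn_classifier(train, k=5):
    """More advanced approach: majority vote among the k nearest neighbours."""
    def predict(point):
        nearest = sorted(
            train,
            key=lambda item: (item[0][0] - point[0]) ** 2
            + (item[0][1] - point[1]) ** 2,
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def cross_val_accuracy(data, fit, folds=5):
    """Mean accuracy over held-out folds; a model is never scored on its own training data."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        predict = fit(train)
        correct = sum(predict(point) == label for point, label in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


data = make_data()
baseline_acc = cross_val_accuracy(data, majority_baseline)
knn_acc = cross_val_accuracy(data, knn_classifier)
print(f"baseline accuracy: {baseline_acc:.2f}, kNN accuracy: {knn_acc:.2f}")
```

If the advanced approach did not clearly beat the baseline on held-out data, the simpler one would be the better choice, which is the point of step (2).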
In terms of raw topics, data mining is often not done in a vacuum but applied to data you want to analyze because of some secondary interest. Maybe you have an interest in personalized medicine, biology, neuroscience, or music. I mostly work in robotics/engineering domains, so my PhD thesis is on all sorts of sensor data.
In general, as a data miner, you will gain an appreciation of techniques that are broadly applicable to a wide variety of future problems, not just the one you’re currently solving. It is worth noting that an exceptionally good thesis could advance data mining as a whole, rather than only applying data mining to a single domain. This can be difficult to scope for a bachelor’s thesis, and its utility can be hard to show, but it has been done (e.g. see the machine learning side of this field).
The following is about PhD theses in data mining; it is also worth a look.
How to choose a good thesis topic in Data Mining?
I have seen many people asking for help in data mining forums and on other websites about how to choose a good thesis topic in data mining. Therefore, in this post, I will address this question.
The first thing to consider is whether you want to design/improve data mining techniques, apply data mining techniques, or do both. Personally, I think that designing or improving data mining techniques is more challenging than using existing techniques. Moreover, you can make a more fundamental contribution if you work on improving data mining techniques instead of applying them. However, you need to be aware that improving data mining techniques may require stronger algorithmic and/or mathematical skills.
The second thing to consider is what kind of techniques you want to apply or design/improve. Data mining is a broad field consisting of many techniques such as neural networks, association rule mining algorithms, clustering and outlier detection. You should try to get an overview of the different techniques to see which ones interest you most. To get a rough overview of the field, you could read an introductory book on data mining such as the book by Tan, Steinbach & Kumar (Introduction to data mining) or read websites and articles related to data mining. If your goal is just to apply data mining techniques to achieve some other purpose (e.g. analysing cancer data) but you don’t know which technique yet, you could skip this question.
The third thing to consider is which problems you want to solve or what you want to improve. This requires more thought. A good way is to look at recent good data mining conferences (KDD, ICDM, PKDD, PAKDD, ADMA, DAWAK, etc.) and journals (TKDE, TKDD, KAIS, etc.), or to attend conferences, if possible, and talk with other researchers. This helps you see which topics are currently popular and what kinds of problems researchers are trying to solve. It does not mean that you need to work on the most popular topic. Working on a popular topic (e.g. social network mining) has several advantages: it is easier to get grants and, in some cases, to get your papers accepted in special issues, workshops, etc. However, there are also some “older” topics that remain interesting even if they are not the current flavor of the day. The most important thing is to find a topic that you like and will enjoy working on for perhaps a few years of your life. Finding a good problem to work on can require reading several articles to understand the limitations of current techniques and to decide what can be improved. So don’t worry: it is normal for it to take time to find a more specific topic.
Fourth, one should not forget that helping to choose a thesis topic is also the job of the professor who supervises Master’s or Ph.D. students. Therefore, if you are looking for a thesis topic, it is good to talk with your supervisor and ask for suggestions. They should help you. If you don’t have a supervisor yet, then try to get a rough idea of what you like, and try to meet and talk with professors who could become your supervisor. Some of them will perhaps have research projects and ideas that they could give you if you work with them. Choosing a supervisor is a very important and strategic decision that every graduate student has to make. For more information about choosing a supervisor, you can read this post: How to choose a research advisor for M.Sc. / Ph.D ?
Lastly, I would like to discuss the common question “please give me a Ph.D. topic in data mining”, which I read on websites and sometimes receive by e-mail. There are two problems with this question. The first problem is that it is too general. As mentioned, data mining is a very broad field. For example, I could suggest some very specific topics such as detecting outliers in imbalanced stock market data or optimizing the memory efficiency of subgraph mining algorithms for community detection in social networks. But will you like them? It is best to choose something yourself that you like. The second problem with the above question is that choosing a topic is work that a researcher should do or learn to do. In fact, in research, being able to find a good research problem is as important as finding a good solution. Therefore, I highly recommend trying to find a research topic by yourself, as developing this skill is important for becoming a successful researcher. If you are a student, when searching for a topic, you can ask your research advisor to guide you.
Also, just for fun, here is a Ph.D thesis title generator.
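A friend of mine happens to want to write a graduation thesis in this area, so I did some searching for them and found related questions on Quora. I don’t know much about this field (my own graduation project was on search), so I won’t add much myself; I’m reposting the material here and hope it helps (I’m rather busy, so I haven’t translated it).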