INSS 662 Project: You are required to use Weka or other open-source data mining software, not Watson.
1. Find an open dataset on the Internet (see the options below).
2. Conduct appropriate data mining activities and report the processes and outcomes. You need to use at least three competing algorithms from the same or different classes of data mining or machine learning techniques. For algorithms and techniques, see the Chapter 5 lecture slides under course materials.
3. Present your results.
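As an illustration of step 2, the sketch below shows one way to compare three algorithms fairly: each one is scored on the same cross-validation folds with the same metric. It is a minimal sketch only and not a substitute for Weka or another data mining tool. The data is a synthetic toy set generated in the code, the three classifiers (a majority-class baseline, 1-nearest neighbor, and nearest centroid) were picked for brevity rather than taken from the Chapter 5 slides, and every function name is hypothetical.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=30, seed=0):
    """Synthetic two-class, two-feature data (illustrative only, not a real dataset)."""
    rng = random.Random(seed)
    rows = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            rows.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(rows)
    return rows


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def majority_class(train, x):
    """Baseline: always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def one_nearest_neighbor(train, x):
    """Predict the label of the closest training point."""
    return min(train, key=lambda row: sq_dist(row[0], x))[1]


def nearest_centroid(train, x):
    """Predict the class whose feature mean is closest to x."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: tuple(sum(col) / len(col) for col in zip(*points))
        for label, points in groups.items()
    }
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def cross_val_accuracy(classifier, rows, k=5):
    """Mean accuracy over k folds; every classifier sees the same folds."""
    scores = []
    for i in range(k):
        test = rows[i::k]
        train = [row for j, row in enumerate(rows) if j % k != i]
        correct = sum(classifier(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k


def compare(rows, k=5):
    """Score each candidate algorithm on identical folds."""
    candidates = {
        "majority baseline": majority_class,
        "1-nearest neighbor": one_nearest_neighbor,
        "nearest centroid": nearest_centroid,
    }
    return {name: cross_val_accuracy(fn, rows, k) for name, fn in candidates.items()}


if __name__ == "__main__":
    for name, accuracy in compare(make_toy_data()).items():
        print(f"{name}: {accuracy:.3f}")
```

Including a trivial baseline alongside the real candidates makes it easier to judge whether the other algorithms are learning anything from the data. In an actual submission, the same protocol would be run inside the chosen tool on the chosen dataset.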
Possible datasets to choose from:
Data from hackathons such as Lord of the Machines – Data Science Hackathon, DataHack Premier League, McKinsey Analytics Online Hackathon, etc. (Be brave and take on a challenging problem …)
Possible sources are:
a. https://archive.ics.uci.edu/ml/index.php
b. https://www.kaggle.com/datasets
c. https://www.kdnuggets.com/datasets/index.html (Datasets for Data Mining and Data Science)
d. Open government datasets at https://catalog.data.gov/dataset?groups=local
e. Sample datasets from https://www.ibm.com/communities/analytics/watson-analytics-blog/guide-to-sample-datasets/
OR
f. Other datasets after getting approval from the instructors.