The purpose of this project is to have you demonstrate how the data mining methods you have learned
can be applied to real-world problems. It will also help you develop a deeper understanding of different
algorithms by implementing them yourself, and it will give you hands-on experience using data mining
methods on real data.
You’re asked to perform the following steps:
2. Select the data: Once you have selected your topic and your reply posting is ranked first, visit
the UCI Machine Learning Repository to select the dataset for your topic. Note that there may be
more than one dataset available for your problem; in that case, you are free to choose whichever
dataset you prefer.
3. Visualize the data: Using several visualization techniques, visualize and review your data (include
these visualizations in your report).
4. Preprocess your data using the techniques described in class. You will probably have to try different
techniques, assess their performance, and then make a final selection of preprocessing steps.
5. Select at least two or three different data mining algorithms and apply them to your data. Please
try as many different algorithms as possible (whenever the problem and data support the use of such
algorithms). Note that grading of the project will take into account the number and variety of the
algorithms implemented for your problem.
6. Analyze your results: State the performance of each algorithm implemented for your dataset
using common performance measures such as accuracy, recall, F-measure, sensitivity,
loss/error rate, ROC curves, etc.
7. Write a report about your implementation and analysis. Include results from every step:
preprocessing (what steps were taken, what impact they had, etc.),
visualization (plots, graphs, and comments based on these visualizations), data mining algorithms
(how the settings and parameters were determined, what difficulties were experienced, if any,
which performance metrics were used, and which strategies were used to prevent overfitting, e.g.
training-validation-testing splits), and your overall algorithm recommendation for the selected dataset.
Reports should not exceed 30 pages (including graphs, code snippets, and screenshots), must be typed
single-spaced in 12 pt Times New Roman or Arial, and must have the following sections: Overview
of the Problem (describe the problem and its importance), Dataset Overview (describe the data),
Data Preprocessing, Algorithm Selection, Analysis Results and Comparison, Conclusion.
8. In your submission, include both the code (either as an R file or as a separate
text/Word file) and the report.
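
As a rough illustration of steps 6 and 7, the sketch below shows one way to cut a dataset into training, validation, and testing parts and to compute a few of the performance measures named above for a binary classifier. It is written in plain Python only as an assumption (the brief itself mentions R for submissions); the function names train_val_test_split and binary_metrics, the 60/20/20 proportions, and the toy labels are all hypothetical and are not results from any real dataset.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows once, then cut them into train/validation/test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall (sensitivity), and F1 from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy row indices standing in for dataset records (illustrative only).
    train, val, test = train_val_test_split(range(10))
    print(len(train), len(val), len(test))

    # Made-up true labels and predictions (illustrative only).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

In a workflow like this, the validation part would typically be used to choose settings and parameters, and the testing part would be touched only once, to report the final figures for each algorithm.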
Return your code and the repor