EEE314 Introduction to Data Mining

Course Objectives:

  • Gain a solid understanding of the fundamental concepts and techniques of data mining.
  • Develop practical skills in data preprocessing, analysis, and visualization.
  • Learn to apply classification, clustering, association analysis, anomaly detection, and dimensionality reduction techniques to real-world datasets.
  • Complete a data mining project from start to finish, demonstrating the ability to extract valuable insights from data.

Textbook & Lecture Notes: “Introduction to Data Mining” by Tan, Steinbach, and Kumar (First Edition)

Week 1: Introduction to Data Mining

  • Overview of Data Mining: Scope, definitions, and fundamental concepts.
  • Importance and Applications of Data Mining.

Week 2: Data

  • Understanding Types of Data.
  • Data Quality and Preprocessing Techniques.
  • Lab Session: Data preprocessing with Python.

Week 3: Exploring Data

  • Techniques for Data Visualization.
  • Summary Statistics for Understanding Data.

Week 4: Classification: Basic Concepts, Decision Trees, and Model Evaluation

  • Introduction to Classification and Decision Trees.
  • Model Evaluation Metrics.
  • Lab Session: Implementing decision trees and evaluating model performance.

Weeks 5-6: Classification: Alternative Techniques

  • Advanced Classification Algorithms: k-NN, SVM, Neural Networks.
  • Comparison and Selection of Classification Techniques.

Week 7: Association Analysis: Basic Concepts and Algorithms

  • Market Basket Analysis and the Apriori Algorithm.
  • Project: Perform association analysis on a retail dataset (due two weeks after it is assigned).

Week 8: Midterm Exam

Week 9: Association Analysis: Advanced Concepts

  • Advanced Association Analysis Algorithms.
  • Enhancements in Association Analysis.
  • Lab Session: Implementing the FP-growth algorithm.

Week 10: Cluster Analysis: Basic Concepts and Algorithms

  • Clustering Techniques and their Applications.
  • Assignment: Clustering a dataset and analyzing the results.

Week 11: Anomaly Detection and Dimensionality Reduction

  • Anomaly Detection: Concepts, Applications, and Techniques.
  • Dimensionality Reduction: PCA, SVD, and t-SNE.
  • Lab Session: Anomaly detection and implementing PCA.

Weeks 12-14: Project Presentations

  • Students will present their final projects, which should incorporate the concepts and techniques learned throughout the course.
  • Each presentation will include a discussion of the problem statement, methodology, data analysis, results, and conclusions.