Syllabus




III Year II Semester   | L | T | P | C
Subject Code: 16CS6T16 | 4 | 0 | 0 | 3

DATA WAREHOUSING AND MINING
SYLLABUS

Learning Objectives:

1.  Students will be able to understand and implement classical models and algorithms in data warehousing and data mining.

2.  They will learn how to analyze the data, identify the problems, and choose the relevant models and algorithms to apply.

3.  They will further be able to assess the strengths and weaknesses of various methods and algorithms and to analyze their behavior.

Course Outcomes:

The student will be able to:

CO No. | Course Outcome | Bloom's Taxonomy Level
CO-1 | Implement data warehouse for heterogeneous data. | Applying
CO-2 | Analyze real-time datasets with basic summary statistics. | Analyzing
CO-3 | Apply different preprocessing methods, similarity and dissimilarity measures for any given raw data. | Applying
CO-4 | Construct a decision tree and resolve the problem of model overfitting. | Applying
CO-5 | Compare Apriori and FP-growth association rule mining algorithms for frequent itemset generation. | Analyzing
CO-6 | Apply a suitable clustering algorithm for the given data set. | Applying

The mapping of COs to POs on a 3-point scale (High-3, Medium-2, Low-1) is:

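CO   | PO-1  | PO-2  | PO-3  | PO-4  | PO-5  | PO-6  | PO-7  | PO-8  | PO-9  | PO-10 | PO-11 | PO-12 | PSO-1 | PSO-2 | PSO-3
CO-1 | 3     | 2     | 2     | 1     | 1     | -     | -     | -     | -     | -     | -     | -     | 2     | 1     | -
CO-2 | 3     | 3     | 3     | 1     | 1     | -     | -     | -     | -     | -     | -     | -     | 2     | 1     | -
CO-3 | 3     | 3     | 3     | 1     | 1     | -     | -     | -     | -     | -     | -     | -     | 2     | 1     | -
CO-4 | 3     | 3     | 3     | 1     | 1     | -     | -     | -     | -     | -     | -     | -     | 2     | 1     | -
CO-5 | 3     | 3     | 3     | 1     | 1     | -     | -     | -     | -     | -     | -     | -     | 2     | 1     | -
CO-6 | 3     | 3     | 3     | 1     | 1     | -     | -     | -     | -     | -     | -     | -     | 2     | 1     | -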

UNIT –I

Data Warehouse and OLAP Technology: An Overview: What Is a Data Warehouse? A Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, From Data Warehousing to Data Mining. (Han & Kamber)

UNIT –II

Data Mining: Introduction, What Is Data Mining? Motivating Challenges, The Origins of Data Mining, Data Mining Tasks, Types of Data, Data Quality, Exploring Data, The Iris Dataset, Summary Statistics. (Tan & Vipin)

UNIT –III

Data Preprocessing: Aggregation, Sampling, Dimensionality Reduction, Feature Subset Selection, Feature Creation, Discretization and Binarization, Variable Transformation, Measures of Similarity and Dissimilarity. (Tan & Vipin)
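
The similarity and dissimilarity measures listed in this unit can be illustrated with a minimal Python sketch. The function names and sample vectors below are illustrative assumptions, not taken from the syllabus or the textbooks, and the Jaccard function assumes binary (0/1) attribute vectors.

```python
import math


def euclidean_distance(x, y):
    """Dissimilarity: straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Similarity: cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def jaccard_similarity(x, y):
    """Similarity for binary vectors: 1-1 matches over all non 0-0 positions."""
    both_one = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    mismatches = sum(1 for a, b in zip(x, y) if a != b)
    total = both_one + mismatches
    return both_one / total if total else 0.0


# Hypothetical sample vectors, chosen only for illustration.
print(euclidean_distance([0, 0], [3, 4]))
print(cosine_similarity([1, 0], [0, 1]))
print(jaccard_similarity([1, 0, 0, 1], [1, 1, 0, 0]))
```

Euclidean distance grows as objects become less alike, whereas cosine and Jaccard grow as objects become more alike, which is the distinction between a dissimilarity and a similarity measure.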

UNIT –IV

Classification: Basic Concepts, General Approach to Solving a Classification Problem, Decision Tree Induction: Working of a Decision Tree, Building a Decision Tree, Methods for Expressing Attribute Test Conditions, Measures for Selecting the Best Split, Algorithm for Decision Tree Induction.
Model Overfitting: Due to Presence of Noise, Due to Lack of Representative Samples, Evaluating the Performance of a Classifier: Holdout Method, Random Subsampling, Cross-Validation, Bootstrap. (Tan & Vipin). Alternative Techniques: Bayes’ Theorem, Naïve Bayesian Classification.
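
As a rough illustration of the "measures for selecting the best split" topic, the sketch below computes entropy, the Gini index, and the information gain of a candidate split. The function names and the toy class labels are assumptions made for this example only.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a non-empty list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels: a perfectly mixed parent node split into two pure children.
parent = ["yes", "yes", "no", "no"]
children = [["yes", "yes"], ["no", "no"]]
print(entropy(parent), gini(parent), information_gain(parent, children))
```

A split that produces purer child nodes yields a larger gain, so a decision tree induction algorithm would prefer it over splits that leave the children mixed.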

UNIT –V

Association Analysis: Basic Concepts and Algorithms: Problem Definition, Frequent Itemset Generation using Apriori, Rule Generation, Compact Representation of Frequent Itemsets, FP-Growth Algorithm. (Tan & Vipin)
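
Frequent itemset generation with Apriori can be sketched as a level-wise search: join frequent (k-1)-itemsets into candidate k-itemsets, prune candidates that have an infrequent subset, then count support. The code below is a simplified sketch under those assumptions; the function name, the support threshold, and the toy market-basket transactions are illustrative and do not come from the syllabus.

```python
from itertools import combinations


def apriori(transactions, min_support_count):
    """Return a dict mapping each frequent itemset (frozenset) to its support count."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support_count}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support_count}
    return frequent


# Hypothetical toy transactions, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_support_count=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what separates Apriori from brute-force enumeration: a candidate is discarded without counting as soon as one of its subsets is known to be infrequent. FP-Growth avoids candidate generation altogether by mining a compressed prefix tree, which this sketch does not attempt to show.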

UNIT –VI

Cluster Analysis: Basic Concepts and Algorithms: Overview, What Is Cluster Analysis? Different Types of Clustering, Different Types of Clusters; K-means: The Basic K-means Algorithm, K-means Additional Issues, Bisecting K-means, Strengths and Weaknesses; Agglomerative Hierarchical Clustering: Basic Agglomerative Hierarchical Clustering Algorithm; DBSCAN: Traditional Density: Center-Based Approach, DBSCAN Algorithm, Strengths and Weaknesses. (Tan & Vipin)
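
The basic K-means algorithm alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points, until the centroids stop changing. The sketch below assumes the initial centroids are supplied by the caller (real implementations usually pick them randomly); the function name and sample points are illustrative only.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Basic K-means on tuples of numbers; returns (centroids, clusters)."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to the closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            sq_dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[sq_dists.index(min(sq_dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, initial_centroids=[(1, 1), (9, 8)])
print(centroids)
print(clusters)
```

Because the result depends on the starting centroids, different initial choices can lead to different clusterings, which is one of the additional K-means issues this unit covers.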

TEXT BOOKS:

1.  “Introduction to Data Mining,” Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson, 2nd edition, 2013.
2.  “Data Mining: Concepts and Techniques,” Jiawei Han, Micheline Kamber, Elsevier, 3rd edition, 2011.



REFERENCE BOOKS:

1.  “Data Mining Techniques and Applications: An Introduction,” Hongbo Du, Cengage Learning, 2010.
2.  “Data Mining: Introductory and Advanced Topics,” Dunham, Pearson, 3rd edition, 2008.
3.  “Data Warehousing, Data Mining & OLAP,” Alex Berson, Stephen Smith, TMH, 2008.
4.  “Data Mining Techniques,” Arun K Pujari, Universities Press, 2005.

Web Resources

1.  https://onlinecourses.nptel.ac.in/noc18_cs14/preview (Pabitra Mitra, IIT Kharagpur)

