Assignment Questions
Data Warehousing and Data Mining
Unit – I
Level – 1 (High)
1. List and explain the different schemas that can be built using dimension tables and fact tables.
2. What are the major distinguishing features between OLTP and OLAP systems?
Level – 2 (Medium)
3. What are the different OLAP operations on multi-dimensional data?
4. Briefly explain the different schemas of a data warehouse.
Level – 3 (Low)
5. What are the different types of OLAP servers? Give an example of each.
Unit – II
Level – 1 (High)
1. What are the issues in measurement and data collection with respect to data quality?
2. Explain summary statistics with an example.
Level – 2 (Medium)
3. What is a dataset? Explain different types of datasets in detail.
4. What is an attribute? Explain different attribute types in detail.
Level – 3 (Low)
5. What are the challenges that motivated the development of data mining?
6. Describe the data mining tasks in detail.
Unit – III
Level – 1 (High)
1. Explain the different techniques that are used to handle noisy data.
2. Explain the various methods that are used for discretization and concept hierarchy generation for numerical data.
Level – 2 (Medium)
3. What is data integration? Discuss the issues to be considered for data integration.
4. Explain the various data reduction techniques. Give the advantages of each.
Level – 3 (Low)
5. What is the need for data preprocessing? Explain the various techniques.
6. What are lossless and lossy dimensionality reduction?
Unit – IV
Level – 1 (High)
1. Explain the general approach to solving a classification problem.
2. Discuss the methods that are commonly used to evaluate the performance of a classifier.
Level – 2 (Medium)
3. Explain the important characteristics of decision tree induction algorithms.
4. What is meant by (i) model underfitting and (ii) model overfitting? Compare them.
Level – 3 (Low)
5. Explain Hunt’s algorithm for building a Decision Tree.
6. Explain the different attribute types that are used in attribute test conditions in a decision tree.
Unit – V
Level – 1 (High)
1. Explain the FP-Tree representation. Also explain how frequent itemsets are generated using the FP-growth algorithm.
2. Explain why we use support and confidence in association analysis.
Level – 2 (Medium)
3. Explain (i) maximal frequent itemsets and (ii) closed frequent itemsets.
4. Write an algorithm for finding frequent itemsets using candidate generation.
Level – 3 (Low)
5. Give the formal definitions of the support and confidence metrics in association analysis.
6. State the Apriori principle. Explain the Apriori algorithm with an example.
Unit – VI
Level – 1 (High)
1. Write the basic agglomerative hierarchical clustering algorithm.
2. Write the basic k-means algorithm. Mention the time and space complexity of the basic k-means algorithm.
Level – 2 (Medium)
3. Write the bisecting k-means algorithm with an example.
4. Briefly describe the strengths and weaknesses of the k-means clustering algorithm.
Level – 3 (Low)
5. Explain the DBSCAN algorithm in detail.
6. Explain the different types of clusterings.