Part III. Mining association rules from the US census dataset (Assignment 3 – WEKA part 1)

1.    Dataset description

This is the well-known 1994 US census dataset, which we used for the classification task. Analyze the dataset and determine which filters must be applied to make it suitable for FP-growth. Reminder: FP-growth accepts only categorical attributes in binary format.
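To illustrate what "binary format" means here, the following is a minimal Python sketch, independent of WEKA. The rows, attribute names, and age cut points are invented for illustration and are not taken from the actual census file. It first turns a numeric attribute into a nominal one and then expands every attribute=value pair into its own 0/1 column.

```python
# Toy rows; attribute names and values are illustrative assumptions.
rows = [
    {"age": 39, "education": "Bachelors", "income": "<=50K"},
    {"age": 52, "education": "HS-grad", "income": ">50K"},
    {"age": 28, "education": "Bachelors", "income": "<=50K"},
]


def age_bin(age):
    """Map a numeric age to a nominal label (hypothetical cut points)."""
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "senior"


def to_binary(rows):
    """Return (columns, table) where each column is one attribute=value pair."""
    # Step 1: make every attribute nominal.
    nominal = [dict(r, age=age_bin(r["age"])) for r in rows]
    # Step 2: represent each row as a set of attribute=value items.
    items = [{f"{k}={v}" for k, v in r.items()} for r in nominal]
    # Step 3: one binary column per distinct item.
    columns = sorted(set().union(*items))
    table = [{c: int(c in row_items) for c in columns} for row_items in items]
    return columns, table


columns, table = to_binary(rows)
for row in table:
    print([row[c] for c in columns])
```

Each row ends up with exactly one 1 per original attribute, which is the shape an itemset miner expects.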

2.    Preprocessing

Preprocess the dataset by applying the necessary filters.

3.    Default FP-growth

Run FP-growth with the default parameters. What are the results?

4.    Different parameters

Change the parameters to produce more rules. Sift through the rules to find non-trivial associations between attributes. Report 5 interesting association rules.
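When judging whether a rule is interesting, it can help to recall how the usual rule metrics are computed. The sketch below uses the standard definitions of support, confidence, and lift on a handful of made-up transactions; the items are assumptions for illustration, not results from the census data.

```python
# Made-up transactions; each is a set of attribute=value items.
transactions = [
    {"education=Bachelors", "marital=Married", "income=>50K"},
    {"education=HS-grad", "marital=Married", "income=<=50K"},
    {"education=Bachelors", "marital=Single", "income=<=50K"},
    {"education=Bachelors", "marital=Married", "income=>50K"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline frequency."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


antecedent = {"education=Bachelors", "marital=Married"}
consequent = {"income=>50K"}
print(support(antecedent | consequent, transactions))
print(confidence(antecedent, consequent, transactions))
print(lift(antecedent, consequent, transactions))
```

A lift well above 1 suggests the antecedent and consequent occur together more often than their individual frequencies would predict, which is one common signal of a non-trivial rule.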

5.    Rules which contain the class value

Change the parameters to generate only rules that contain the class attribute, Income. Report 5 rules. What is the main difference between the classification rules in Assignment 1 and these association rules?
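As a rough illustration of what restricting rules to the class attribute amounts to, here is a small Python sketch that filters a list of rules after the fact. The rules and confidence values are invented, and the sketch assumes that "contain the class" means the class attribute appears in the consequent; adapt the check if the antecedent should count as well.

```python
# Hypothetical rules as (antecedent, consequent, confidence) triples.
# The confidence values are placeholders, not measured results.
rules = [
    ({"education=Bachelors", "marital=Married"}, {"income=>50K"}, 0.71),
    ({"sex=Male"}, {"marital=Married"}, 0.62),
    ({"income=<=50K"}, {"marital=Single"}, 0.55),
]


def has_class_consequent(rule, class_attribute="income"):
    """True if any consequent item belongs to the class attribute."""
    _, consequent, _ = rule
    # Split on the first '=' only, since values such as '<=50K' contain '='.
    return any(item.split("=", 1)[0] == class_attribute for item in consequent)


class_rules = [r for r in rules if has_class_consequent(r)]
for antecedent, consequent, conf in class_rules:
    print(sorted(antecedent), "=>", sorted(consequent), conf)
```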

End of part III