Wednesday, December 10, 2014

Competency 8.2

Competency 8.2: Build and evaluate models using alternative feature spaces.

I used the different feature spaces that I saved in the previous exercise to build models. My data set was very small and intended only for testing. Comparing the models, the metrics improved significantly when moving from POS features to unigram and bigram features; in my data, the word n-grams were the most predictive of the categories.
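To make this concrete, here is a minimal sketch of how the two kinds of feature spaces could be extracted, assuming scikit-learn and NLTK rather than the course tooling; the two example sentences are made up purely for illustration.

```python
# A minimal sketch, assuming scikit-learn and NLTK (not the course's own workflow);
# the two example sentences below are invented purely for illustration.
# NLTK may need: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
import nltk
from sklearn.feature_extraction.text import CountVectorizer

docs = ["I really liked this tutorial",
        "the examples were confusing and too short"]

# Word unigrams + bigrams ("12 grams")
word_vec = CountVectorizer(ngram_range=(1, 2))
X_words = word_vec.fit_transform(docs)

# POS grams: replace each token by its part-of-speech tag, then count tag n-grams
def pos_string(text):
    return " ".join(tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text)))

pos_vec = CountVectorizer(ngram_range=(1, 2))
X_pos = pos_vec.fit_transform([pos_string(d) for d in docs])

print(X_words.shape, X_pos.shape)  # the two feature spaces differ in size and content
```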



Many of the basic features did not give significant improvements in the model metrics. I used Naive Bayes as the classification algorithm; I also tried other algorithms, but the metrics did not differ much. A few of the feature spaces I tried, along with the metrics of their models, are listed below (a small sketch of this comparison follows the table):
                             
Feature Space       Accuracy   Kappa
POS grams           42%        0.12
12 grams_count      58%        0.36
1 grams_pairs       61%        0.41
12 grams_length     61%        0.41
12 POS grams        65%        0.47
12 grams_no stop    69%        0.52
12 grams            73%        0.59
123 grams           73%        0.59
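As a rough illustration of the workflow behind the table, the sketch below trains Naive Bayes on a few alternative n-gram spaces and reports cross-validated accuracy and Cohen's kappa using scikit-learn; the ten toy sentences and their labels are invented and have nothing to do with the numbers above.

```python
# A sketch of the evaluation step, assuming scikit-learn; toy data only,
# not the actual data behind the table above.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, cohen_kappa_score

docs = ["great clear lesson", "loved the examples", "very helpful notes",
        "nice and concise", "well explained steps",
        "confusing and vague", "too short to follow", "poorly organised",
        "hard to understand", "examples were missing"]
labels = [1] * 5 + [0] * 5

for name, rng in [("1 grams", (1, 1)), ("12 grams", (1, 2)), ("123 grams", (1, 3))]:
    X = CountVectorizer(ngram_range=rng).fit_transform(docs)
    pred = cross_val_predict(MultinomialNB(), X, labels, cv=5)
    print(name, accuracy_score(labels, pred), cohen_kappa_score(labels, pred))
```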

To test with a real data set, I tried the hands-on text feature extraction activity given in Prosolo, using the sentiment_sentences data set. I extracted different feature spaces from the basic feature set and used logistic regression; the metrics improved significantly as the feature set was expanded.
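A similar sketch for that experiment, again using scikit-learn's LogisticRegression rather than the course tooling; the file path and the tab-separated label/sentence format are assumptions about how the sentiment_sentences data might be stored, not a description of the actual activity files.

```python
# A sketch only: the path and line format below are placeholders, not the
# actual Prosolo files.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def load_sentences(path):
    """Assumed format: one '<label>\t<sentence>' pair per line."""
    labels, texts = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            label, text = line.rstrip("\n").split("\t", 1)
            labels.append(label)
            texts.append(text)
    return texts, labels

texts, labels = load_sentences("sentiment_sentences.txt")  # placeholder path

# Expand the feature set from unigrams to unigrams+bigrams+trigrams
for name, rng in [("1 grams", (1, 1)), ("12 grams", (1, 2)), ("123 grams", (1, 3))]:
    X = CountVectorizer(ngram_range=rng).fit_transform(texts)
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=5)
    print(name, scores.mean())
```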



All materials are based on the EdX course - Data, Analytics and Learning
This work is licensed under a Creative Commons Attribution 4.0 International License.