Identifying more bloggers: Towards large scale personality classification of personal weblogs
Tuesday, March 27, 2007
Scott Nowson and Jon Oberlander
We report new results on the relatively novel task of automatic classification of blog author personality. Promisingly high classification accuracies have recently been reported for four important personality traits (Extraversion, Neuroticism, Agreeableness and Conscientiousness). But the blog corpus used in that work required careful preparation, and was consequently quite small (fewer than a hundred authors and less than half a million words). Here, we provide an initial report on the classification accuracies that can be achieved when classifiers conditioned on the small corpus are applied to a larger, automatically acquired blog corpus (over a thousand bloggers and approximately five million words), using lower-granularity personality data and substantially less manual preparation. Predictably, results on the larger corpus are not as impressive as those on the smaller; nevertheless, they point the way forward for further work.
Full Paper
posted by ICWSM at 1:07 PM