1st Tuesday Journal-Paper Club: May 2015 meeting


This month's paper is:

Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?

Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, and Dinani Amorim. Journal of Machine Learning Research, 15(Oct):3133–3181, 2014.

http://jmlr.org/papers/v15/delgado14a.html 

Sam Hames has kindly agreed to lead the discussion, and the venue will again be the Brisbane Brewhouse (address details below).

ABSTRACT: We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multivariate adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and Matlab, and including all the relevant classifiers available today. We use 121 data sets, representing the whole UCI database (excluding the large-scale problems) plus some real-world problems of our own, in order to reach significant conclusions about classifier behavior that do not depend on the data set collection. The classifiers most likely to be the best are the random forest (RF) versions, the best of which (implemented in R and accessed via caret) achieves 94.1% of the maximum accuracy and exceeds 90% on 84.3% of the data sets. However, the difference from the second best, the SVM with Gaussian kernel implemented in C using LibSVM (92.3% of the maximum accuracy), is not statistically significant. A few models are clearly better than the rest: random forest, SVM with Gaussian and polynomial kernels, extreme learning machine with Gaussian kernel, C5.0 and avNNet (a committee of multi-layer perceptrons implemented in R with the caret package). Random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
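
If you would like to experiment before the meeting, here is a minimal R sketch (not from the paper) of the kind of comparison it runs at scale: using caret to train its two headline classifiers, a random forest and a Gaussian-kernel SVM, on shared cross-validation folds. The iris data set, fold count and seed are illustrative stand-ins; the paper's study spans 121 data sets, and its winning SVM was the C/LibSVM implementation rather than caret's kernlab backend.

  library(caret)   # also requires the randomForest and kernlab packages

  # Fix one set of cross-validation folds so both models are scored on
  # identical train/test splits and their accuracies are comparable
  set.seed(42)
  folds <- createFolds(iris$Species, k = 5, returnTrain = TRUE)
  ctrl  <- trainControl(method = "cv", index = folds)

  # Random forest: caret method "rf" (the paper's top-ranked family)
  rf_fit  <- train(Species ~ ., data = iris, method = "rf", trControl = ctrl)

  # Gaussian (RBF) kernel SVM: caret method "svmRadial", via kernlab
  svm_fit <- train(Species ~ ., data = iris, method = "svmRadial", trControl = ctrl)

  # Resampled accuracy and kappa for the two models, side by side
  summary(resamples(list(RF = rf_fit, svmRadial = svm_fit)))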

*About the 1st Tuesday Journal-Paper Club:* the idea is to meet regularly, usually on the 1st Tuesday of the month, as the name suggests (inspired by the ABC TV series "First Tuesday Book Club"). Each month, the participants agree on a highly cited, 'top ten' or major-prize-winning article in an SPS or ComSoc journal (but not one of our own!) and select a Discussion Leader. Over the month, each participant reads the article. At the next meeting, the Discussion Leader leads a discussion of the article, starting with his/her own appraisal. In this way, we hope to broaden our understanding of the field and further develop a sense of community. 1st rule of 1st Tuesday Journal-Paper Club: tell everyone about 1st Tuesday Journal-Paper Club.

Date and Time
  • Date: 05 May 2015
  • Time: 06:00 PM to 08:00 PM

Location
  • Brisbane Brewhouse
  • 601 Stanley St.
  • Woolloongabba, Queensland 4102, Australia

Registration
  • Opens: 13 April 2015, 06:00 AM
  • Closes: 04 May 2015, 12:00 PM
  • No admission charge

All times are (UTC+10:00) Brisbane.