We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speedup. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain "summation form," which allows them to be easily parallelized on multicore computers. We adapt Google's map-reduce [7] paradigm to demonstrate this parallel speedup technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, Gaussian discriminant analysis (GDA), EM, and backpropagation (NN). Our experimental results show basically linear speedup with an increasing number of processors.

Selected for Oral Presentation

December 2006. PDF: http://www.willowgarage.com/sites/default/files/ChuCT_etal_2006.pdf