SMT: Sparse multivariate tree

Title	SMT: Sparse multivariate tree
Publication Type	Journal Article
Year of Publication	2014
Authors	Deng, H., M. Gokce Baydogan, and G. Runger
Journal	Statistical Analysis and Data Mining
Volume	7
Issue	1
Pagination	53-69
Date Published	02/2014
ISSN	1932-1872
Keywords	decision tree, feature extraction, fused Lasso, Lasso, time series classification
Abstract	A multivariate decision tree attempts to improve upon the single-variable split in a traditional tree. With the increase in datasets with many features and a small number of labeled instances in a variety of domains (bioinformatics, text mining, etc.), a traditional tree-based approach with a greedy variable selection at a node may omit important information. Therefore, the recursive partitioning idea of a simple decision tree combined with the intrinsic feature selection of L1-regularized logistic regression (LR) at each node is a natural choice for a multivariate tree model that is simple, but broadly applicable. This natural solution leads to the sparse multivariate tree (SMT) considered here. SMT can naturally handle non-time-series data and is extended to handle time-series classification problems with the power of extracting interpretable temporal patterns (e.g., means, slopes, and deviations). Binary L1-regularized LR models are used here for binary classification problems. However, SMT may be extended to solve multiclass problems with multinomial LR models. The accuracy and computational efficiency of SMT are compared to those of a large number of competitors on time-series and non-time-series data.
URL	http://dx.doi.org/10.1002/sam.11208
DOI	10.1002/sam.11208
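
The core idea in the abstract, recursive partitioning with a binary L1-regularized LR model fitted at each node, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a proximal-gradient solver (soft-thresholding), a split at predicted probability 0.5, majority-vote leaves, and illustrative stopping rules and defaults (`lam`, `lr`, `n_iter`, `max_depth`, `min_size`), none of which are taken from the paper. All function names are hypothetical, and the time-series extension with fused Lasso is not shown.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def score(w, b, x):
    """Linear score w.x + b; its sign decides the side of the split."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_l1_logistic(X, y, lam=0.05, lr=0.5, n_iter=200):
    """Binary L1-regularized LR via proximal gradient descent (assumed solver)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        errs = [sigmoid(score(w, b, xi)) - yi for xi, yi in zip(X, y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * xi[j] for e, xi in zip(errs, X)) / n for j in range(d)]
        b -= lr * grad_b  # the intercept is left unpenalized
        # Gradient step followed by soft-thresholding yields exact zeros in w.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def build_tree(X, y, depth=0, max_depth=3, min_size=10, lam=0.05):
    """Recursively partition the data with a sparse linear split at each node."""
    majority = int(2 * sum(y) >= len(y))
    if depth >= max_depth or len(y) < min_size or len(set(y)) == 1:
        return {"leaf": majority}
    w, b = fit_l1_logistic(X, y, lam=lam)
    side = [score(w, b, xi) > 0 for xi in X]  # probability > 0.5 goes right
    if all(side) or not any(side):
        return {"leaf": majority}  # the split separates nothing, so stop here
    left = [i for i, s in enumerate(side) if not s]
    right = [i for i, s in enumerate(side) if s]
    return {
        "w": w,
        "b": b,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           depth + 1, max_depth, min_size, lam),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            depth + 1, max_depth, min_size, lam),
    }


def predict(tree, x):
    """Follow the sparse linear splits down to a leaf label."""
    while "leaf" not in tree:
        tree = tree["right"] if score(tree["w"], tree["b"], x) > 0 else tree["left"]
    return tree["leaf"]


def make_toy_data(n=200, d=10, seed=0):
    """Synthetic data (an assumption): only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [1 if x[0] + x[1] > 0 else 0 for x in X]
    return X, y


X, y = make_toy_data()
tree = build_tree(X, y)
accuracy = sum(predict(tree, xi) == yi for xi, yi in zip(X, y)) / len(y)
print("training accuracy:", round(accuracy, 3))
print("nonzero root weights:", sum(wj != 0.0 for wj in tree.get("w", [])))
```

Because each node's weight vector is sparse, a split can be read as a short linear rule over a few features, which is the sense in which a multivariate split can stay interpretable. In practice an established solver, such as a library implementation of L1-penalized logistic regression, would likely replace the hand-written loop.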