Multiple Kernel Learning.
A support vector machine based method for use with multiple kernels. In Multiple Kernel Learning (MKL), in addition to the SVM coefficients $\alpha$ and the bias term $b$, the kernel weights $\beta$ are estimated in training. The resulting kernel method can be stated as

\[ f({\bf x}) = \mathrm{sign}\left( \sum_{i=1}^{N} \alpha_i \sum_{k=1}^{K} \beta_k\, k_k({\bf x}_i, {\bf x}) + b \right) \]

where $N$ is the number of training examples, $\alpha_i$ are the weights assigned to each training example, $\beta_k$ are the weights assigned to each sub-kernel, $k_k(x, x')$ are the sub-kernels, and $b$ is the bias.
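As a usage illustration, the sketch below trains an MKL classifier on a combination of two Gaussian kernels. It is a minimal sketch assuming Shogun's C++ API of the same vintage as this page (CMKLClassification is the binary-classification subclass of CMKL); the feature and label objects are hypothetical inputs, and reference counting is omitted for brevity.

```cpp
// Minimal sketch, assuming the Shogun C++ API; `feats` and `labels`
// are hypothetical inputs, and SG_REF/SG_UNREF handling is omitted.
#include <shogun/features/CombinedFeatures.h>
#include <shogun/kernel/CombinedKernel.h>
#include <shogun/kernel/GaussianKernel.h>
#include <shogun/labels/BinaryLabels.h>
#include <shogun/classifier/mkl/MKLClassification.h>

using namespace shogun;

CMKLClassification* train_mkl(CFeatures* feats, CBinaryLabels* labels)
{
	// One sub-kernel k_k per entry; the weights beta_k are learned in training.
	CCombinedKernel* kernel = new CCombinedKernel();
	kernel->append_kernel(new CGaussianKernel(10, 0.5));
	kernel->append_kernel(new CGaussianKernel(10, 2.0));

	// Each sub-kernel gets its own (here: identical) feature object.
	CCombinedFeatures* cfeats = new CCombinedFeatures();
	cfeats->append_feature_obj(feats);
	cfeats->append_feature_obj(feats);
	kernel->init(cfeats, cfeats);

	CMKLClassification* mkl = new CMKLClassification();
	mkl->set_kernel(kernel);
	mkl->set_labels(labels);
	mkl->set_C(1.0, 1.0);  // SVM regularization constant C
	mkl->train();          // jointly solves for alpha, b and beta
	return mkl;
}
```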
Kernels have to be chosen a priori. In MKL, $\alpha_i$, $\beta$ and the bias are determined by solving the following optimization program

\begin{eqnarray*}
\min && \gamma - \sum_{i=1}^{N} \alpha_i \\
\mbox{w.r.t.} && \gamma \in \mathbb{R},\; \alpha \in \mathbb{R}^{N} \\
\mbox{s.t.} && 0 \le \alpha \le C, \quad \sum_{i=1}^{N} \alpha_i y_i = 0 \\
&& \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i y_i \alpha_j y_j\, k_k({\bf x}_i, {\bf x}_j) \le \gamma \quad \forall\, k = 1, \ldots, K
\end{eqnarray*}

where $C$ is a pre-specified regularization parameter.
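To connect this program back to the kernel weights (a reasoning step made explicit in the Sonnenburg et al. reference below, stated here as a sketch): writing $S_k(\alpha) = \frac{1}{2}\sum_{i,j=1}^{N} \alpha_i y_i \alpha_j y_j\, k_k({\bf x}_i, {\bf x}_j)$, the program is the epigraph form of minimizing $\max_k S_k(\alpha) - \sum_i \alpha_i$, and the sub-kernel weights are recovered as the Lagrange multipliers of the $K$ constraints $S_k(\alpha) \le \gamma$, satisfying

\[ \beta_k \ge 0, \qquad \sum_{k=1}^{K} \beta_k = 1 . \]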
Within shogun this optimization problem is solved using semi-infinite programming. For 1-norm MKL, one of the two approaches described in the following reference is used:

Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. Large Scale Multiple Kernel Learning. Journal of Machine Learning Research, 7:1531-1565, July 2006.
The first approach (also called the wrapper algorithm) wraps around a single-kernel SVM, alternately solving for $\alpha$ and $\beta$. It uses a traditional SVM to generate newly violated constraints; it therefore only requires a single-kernel SVM, and any of the SVMs contained in shogun can be used. In the MKL step, either a linear program is solved (via glpk or cplex), $\beta$ is computed analytically, or a Newton step is performed (for norms > 1).
The second approach, which performs interleaved optimization, is much faster but also more memory demanding; it is integrated into the chunking-based SVMlight.
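The choice between the two strategies maps onto this class's API roughly as follows; a sketch, where the default solver behavior of the no-argument constructor is an assumption:

```cpp
// Sketch: wrapper vs. interleaved optimization (assumed defaults).
#include <shogun/classifier/svm/LibSVM.h>
#include <shogun/classifier/mkl/MKLClassification.h>

// Wrapper: any single-kernel SVM acts as the constraint generator.
CMKLClassification* wrapper_mkl = new CMKLClassification(new CLibSVM());
wrapper_mkl->set_interleaved_optimization_enabled(false);

// Interleaved: faster but more memory demanding; relies on the
// chunking-based SVMlight solver (assumed default when no SVM is given).
CMKLClassification* interleaved_mkl = new CMKLClassification();
interleaved_mkl->set_interleaved_optimization_enabled(true);
```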
In addition, the sparsity of MKL can be controlled by the choice of the $L_p$-norm regularizing $\beta$, as described in:

Marius Kloft, Ulf Brefeld, Sören Sonnenburg, and Alexander Zien. Efficient and Accurate lp-Norm Multiple Kernel Learning. In Advances in Neural Information Processing Systems 21. MIT Press, Cambridge, MA, 2009.
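Continuing the sketch above, the norm is set through the corresponding member listed below; the effect on sparsity depends on the data:

```cpp
// Sketch: controlling sparsity via the L_p-norm on beta.
mkl->set_mkl_norm(1.0);  // p = 1: sparse sub-kernel weights (classic 1-norm MKL)
mkl->set_mkl_norm(2.0);  // p > 1: non-sparse weights; solved via Newton steps
```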
An alternative way to control the sparsity is the elastic-net regularization, which can be formulated as the following optimization problem:

\begin{eqnarray*}
\min && C \sum_{i=1}^{N} \ell\left( \sum_{k=1}^{K} f_k({\bf x}_i) + b,\; y_i \right) + (1-\lambda) \left( \sum_{k=1}^{K} \|f_k\|_{\mathcal{H}_k} \right)^{2} + \lambda \sum_{k=1}^{K} \|f_k\|_{\mathcal{H}_k}^{2} \\
\mbox{w.r.t.} && f_1 \in \mathcal{H}_1,\, \ldots,\, f_K \in \mathcal{H}_K,\; b \in \mathbb{R}
\end{eqnarray*}

where $\ell$ is a loss function. Here $\lambda \in [0, 1]$ controls the trade-off between the two regularization terms: $\lambda = 0$ corresponds to $L_1$-MKL, whereas $\lambda = 1$ corresponds to the uniform-weighted combination of kernels ($L_\infty$-MKL). This approach was studied by Shawe-Taylor (2008), "Kernel Learning for Novelty Detection," NIPS MKL Workshop 2008, and by Tomioka and Suzuki (2009), "Sparsity-accuracy trade-off in MKL," NIPS MKL Workshop 2009.
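In terms of this class, the trade-off is exposed through set_elasticnet_lambda(); a sketch, assuming $\lambda$ follows the convention of the formulation above:

```cpp
// Sketch: elastic-net trade-off (lambda convention as in the formulation above).
mkl->set_elasticnet_lambda(0.0);  // pure L1-MKL: sparse kernel weights
mkl->set_elasticnet_lambda(0.5);  // mixed regularization
mkl->set_elasticnet_lambda(1.0);  // uniform-weighted combination of kernels
```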
Definition at line 93 of file MKL.h.
Public Member Functions

CMKL (CSVM *s=NULL)
virtual ~CMKL ()
void set_constraint_generator (CSVM *s)
void set_svm (CSVM *s)
CSVM * get_svm ()
void set_C_mkl (float64_t C)
void set_mkl_norm (float64_t norm)
void set_elasticnet_lambda (float64_t elasticnet_lambda)
void set_mkl_block_norm (float64_t q)
void set_interleaved_optimization_enabled (bool enable)
bool get_interleaved_optimization_enabled ()
float64_t compute_mkl_primal_objective ()
virtual float64_t compute_mkl_dual_objective ()
float64_t compute_elasticnet_dual_objective ()
void set_mkl_epsilon (float64_t eps)
float64_t get_mkl_epsilon ()
int32_t get_mkl_iterations ()
virtual bool perform_mkl_step (const float64_t *sumw, float64_t suma)
virtual float64_t compute_sum_alpha ()=0
virtual void compute_sum_beta (float64_t *sumw)
virtual const char * get_name () const
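A brief sketch of how the convergence-related members above might be combined; the epsilon value and call order are illustrative assumptions:

```cpp
// Sketch: tightening the MKL stopping criterion and inspecting convergence.
#include <cstdio>

mkl->set_mkl_epsilon(1e-4);  // MKL convergence tolerance (illustrative value)
mkl->train();

std::printf("MKL iterations: %d\n", mkl->get_mkl_iterations());
std::printf("primal: %f, dual: %f\n",
            mkl->compute_mkl_primal_objective(),
            mkl->compute_mkl_dual_objective());
```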
Static Public Member Functions

static bool perform_mkl_step_helper (CMKL *mkl, const float64_t *sumw, const float64_t suma)
Protected Member Functions

virtual bool train_machine (CFeatures *data=NULL)
virtual void init_training ()=0
void perform_mkl_step (float64_t *beta, float64_t *old_beta, int num_kernels, int32_t *label, int32_t *active2dnum, float64_t *a, float64_t *lin, float64_t *sumw, int32_t &inner_iters)
float64_t compute_optimal_betas_via_cplex (float64_t *beta, const float64_t *old_beta, int32_t num_kernels, const float64_t *sumw, float64_t suma, int32_t &inner_iters)
float64_t compute_optimal_betas_via_glpk (float64_t *beta, const float64_t *old_beta, int num_kernels, const float64_t *sumw, float64_t suma, int32_t &inner_iters)
float64_t compute_optimal_betas_elasticnet (float64_t *beta, const float64_t *old_beta, const int32_t num_kernels, const float64_t *sumw, const float64_t suma, const float64_t mkl_objective)
void elasticnet_transform (float64_t *beta, float64_t lmd, int32_t len)
void elasticnet_dual (float64_t *ff, float64_t *gg, float64_t *hh, const float64_t &del, const float64_t *nm, int32_t len, const float64_t &lambda)
float64_t compute_optimal_betas_directly (float64_t *beta, const float64_t *old_beta, const int32_t num_kernels, const float64_t *sumw, const float64_t suma, const float64_t mkl_objective)
float64_t compute_optimal_betas_block_norm (float64_t *beta, const float64_t *old_beta, const int32_t num_kernels, const float64_t *sumw, const float64_t suma, const float64_t mkl_objective)
float64_t compute_optimal_betas_newton (float64_t *beta, const float64_t *old_beta, int32_t num_kernels, const float64_t *sumw, float64_t suma, float64_t mkl_objective)
virtual bool converged ()
void init_solver ()
bool init_cplex ()
void set_qnorm_constraints (float64_t *beta, int32_t num_kernels)
bool cleanup_cplex ()
bool init_glpk ()
bool cleanup_glpk ()
bool check_lpx_status (LPX *lp)
Protected Attributes

CSVM * svm
float64_t C_mkl
float64_t mkl_norm
float64_t ent_lambda
float64_t mkl_block_norm
float64_t * beta_local
int32_t mkl_iterations
float64_t mkl_epsilon
bool interleaved_optimization
float64_t * W
float64_t w_gap
float64_t rho
CTime training_time_clock
CPXENVptr env
CPXLPptr lp_cplex
LPX * lp_glpk
bool lp_initialized