I don't think there is solid evidence proving that one is better than the other. The two algorithms are built on different principles and come with different hyper-parameters to tune. Therefore, the right approach is to understand the pros and cons of each, and recall them when solving your specific problem.
SVM
- Kernel-based: the kernel trick turns a non-linearly separable problem into a linearly separable one in a higher-dimensional space (see the sketch after this list)
- Works well in high-dimensional feature spaces
- Slow to train on large training sets
- Hard to interpret
- Hard to find a good kernel
- Not robust to noisy training sets
- Sensitive to missing data
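As a minimal sketch of the kernel point above, assuming scikit-learn (the dataset and parameter choices are illustrative, not a recommendation):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two interleaving half-moons: not linearly separable in the original 2-D space.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel implicitly maps points into a higher-dimensional space
# where the two classes become (close to) linearly separable.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

Swapping in `kernel="linear"` on the same data should score noticeably worse, which illustrates why finding a good kernel matters.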
Decision tree
- Good at handling non-linear problems
- Handles categorical data well
- Easy to interpret
- Can handle some noise in the data
- Prone to overfitting (see the sketch after this list)
- Does not work well with a large feature space
- Does not work well when some features are highly correlated
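A minimal sketch of the overfitting point, again assuming scikit-learn (the dataset and depth values are illustrative): an unconstrained tree memorizes the training set, and capping its depth is one simple remedy.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_depth=None lets the tree grow until every leaf is pure (overfits);
# a small max_depth acts as regularization.
for depth in (None, 3):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.3f}, "
          f"test={tree.score(X_test, y_test):.3f}")
```

In practice, pruning or ensembling (e.g., random forests) are the more common fixes for overfitting.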
Logistic regression
- Assumptions: a linear relationship between the features and the log-odds of the outcome, little or no multicollinearity, and little or no auto-correlation (observations are independent); note that, unlike linear regression, it does not assume normally distributed or homoscedastic residuals
- Handles linear problems well
- Robust to noise; use L1/L2 regularization for model selection and to avoid overfitting
- Computationally efficient
- Variations with regularization (see the sketch after this list):
  - Lasso (L1): penalizes the absolute size of coefficients; offers automatic feature selection
  - Ridge (L2): penalizes the squared size of coefficients; offers automatic feature shrinkage
  - Elastic Net: a combination of Lasso and Ridge
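A minimal sketch of the three variants, assuming scikit-learn, where the penalties are spelled `penalty="l1"`, `"l2"`, and `"elasticnet"` (the C value, solver choices, and synthetic dataset are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    # L1 (Lasso-style): drives some coefficients exactly to zero.
    "l1": LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
    # L2 (Ridge-style): shrinks coefficients toward zero without zeroing them.
    "l2": LogisticRegression(penalty="l2", solver="liblinear", C=0.1),
    # Elastic Net: a weighted mix of the two (requires the saga solver).
    "elasticnet": LogisticRegression(penalty="elasticnet", solver="saga",
                                     l1_ratio=0.5, C=0.1, max_iter=5000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    n_zero = (model.coef_ == 0).sum()
    print(f"{name}: test={model.score(X_test, y_test):.3f}, zero coefs={n_zero}")
```

The zero-coefficient counts make the difference concrete: L1 and Elastic Net zero out some weights (feature selection), while L2 only shrinks them (feature shrinkage).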
Naive Bayes
- Computationally efficient when the number of features P is large, because the conditional-independence assumption alleviates the curse of dimensionality (see the sketch after this list)
- Works surprisingly well in many cases even when the independence assumption does not hold (text classification is a common example)
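For example, a quick sketch with Gaussian Naive Bayes, assuming scikit-learn (the dataset choice is illustrative): the conditional-independence assumption means the model fits one 1-D distribution per feature per class, so the number of parameters grows linearly with P rather than exponentially.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)   # 30 features, many correlated
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# GaussianNB fits one 1-D Gaussian per feature per class (O(P) parameters)
# instead of modeling the full joint distribution over all P features.
nb = GaussianNB().fit(X_train, y_train)
print("test accuracy:", nb.score(X_test, y_test))
```

These features are far from independent, yet the model typically still scores well here, which illustrates the second point above.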