A library of useful tools and extensions for day-to-day data science tasks.
- This open source project is released under a permissive new BSD license and is commercially usable
Sebastian Raschka 2014-2016
- Documentation: http://rasbt.github.io/mlxtend/
- Source code repository: https://github.com/rasbt/mlxtend
- PyPI: https://pypi.python.org/pypi/mlxtend
- Changelog: http://rasbt.github.io/mlxtend/changelog
- Contributing: http://rasbt.github.io/mlxtend/contributing
- Questions? Check out the Google Groups mailing list
## Recent changes
- Sequential Feature Selection algorithms: SFS, SFFS, and SFBS (a usage sketch follows this list)
- Neural Network / Multilayer Perceptron classifier
- Ordinary least squares regression using different solvers (gradient descent, stochastic gradient descent, and the closed-form solution)
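As a quick illustration of the first item above, the sketch below runs sequential forward selection on the iris data with a k-nearest-neighbors estimator. It assumes the unified `SequentialFeatureSelector` class in `mlxtend.feature_selection`; older releases may expose SFS, SFFS, and SFBS as separate classes, so the exact names and parameters can differ between versions.

```python
# Minimal sketch of sequential forward selection (SFS); class name and
# signature are assumed from mlxtend.feature_selection and may vary by version.
from sklearn.neighbors import KNeighborsClassifier
from mlxtend.feature_selection import SequentialFeatureSelector
from mlxtend.data import iris_data

X, y = iris_data()
knn = KNeighborsClassifier(n_neighbors=3)

# forward=True, floating=False corresponds to plain SFS;
# floating=True gives SFFS, and forward=False gives backward selection (SBS/SFBS).
sfs = SequentialFeatureSelector(knn, k_features=2, forward=True,
                                floating=False, scoring='accuracy', cv=5)
sfs = sfs.fit(X, y)
print(sfs.k_feature_idx_)  # indices of the selected feature columns
```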
To install mlxtend, just execute

```
pip install mlxtend
```
The mlxtend version on PyPI may always be one step behind; you can install the latest development version from this GitHub repository by executing

```
pip install git+git://github.com/rasbt/mlxtend.git#egg=mlxtend
```
Alternatively, you can download the package manually from the Python Package Index https://pypi.python.org/pypi/mlxtend, unzip it, navigate into the package directory, and use the command:

```
python setup.py install
```
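As a usage example, the following snippet combines three scikit-learn classifiers with the `EnsembleVoteClassifier` (soft voting) and plots the decision regions of each model. Note that in later mlxtend versions `plot_decision_regions` lives in `mlxtend.plotting` rather than `mlxtend.evaluate`.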
```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import itertools
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from mlxtend.classifier import EnsembleVoteClassifier
from mlxtend.data import iris_data
from mlxtend.evaluate import plot_decision_regions

# Initializing Classifiers
clf1 = LogisticRegression(random_state=0)
clf2 = RandomForestClassifier(random_state=0)
clf3 = SVC(random_state=0, probability=True)
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[2, 1, 1], voting='soft')

# Loading some example data (sepal length and petal length of the iris dataset)
X, y = iris_data()
X = X[:, [0, 2]]

# Plotting Decision Regions in a 2x2 grid, one subplot per classifier
gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))

for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         ['Logistic Regression', 'Random Forest', 'SVM', 'Ensemble'],
                         itertools.product([0, 1], repeat=2)):
    clf.fit(X, y)
    ax = plt.subplot(gs[grd[0], grd[1]])
    plot_decision_regions(X=X, y=y, clf=clf, legend=2)
    plt.title(lab)

plt.show()
```
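Running this example fits the three individual classifiers and the soft-voting ensemble on the two selected iris features and displays a 2x2 grid of decision-region plots, one per model.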