Merged
6 changes: 3 additions & 3 deletions doc/multioutput.rst
@@ -12,7 +12,7 @@
MIMO (Multi-Input Multi-Output) data. For classification, it can be used for
multilabel data. Actually, for multiclass classification, which has one output with
multiple categories, multioutput feature selection can also be useful. The multiclass
classification can be converted to multilabel classification by one-hot encoding
-target ``y``. The canonical correaltion coefficient between the features ``X`` and the
+target ``y``. The canonical correlation coefficient between the features ``X`` and the
one-hot encoded target ``y`` has equivalent relationship with Fisher's criterion in
LDA (Linear Discriminant Analysis) [1]_. Applying :class:`FastCan` to the converted
multioutput data may result in better accuracy in the following classification task
@@ -23,7 +23,7 @@ Relationship on multiclass data
Assume the feature matrix is :math:`X \in \mathbb{R}^{N\times n}`, the multiclass
target vector is :math:`y \in \mathbb{R}^{N\times 1}`, and the one-hot encoded target
matrix is :math:`Y \in \mathbb{R}^{N\times m}`. Then, the Fisher's criterion for
-:math:`X` and :math:`y` is denoted as :math:`J` and the canonical correaltion
+:math:`X` and :math:`y` is denoted as :math:`J` and the canonical correlation
coefficient between :math:`X` and :math:`Y` is denoted as :math:`R`. The relationship
between :math:`J` and :math:`R` is given by

@@ -36,7 +36,7 @@
or
R^2 = \frac{J}{1+J}

It should be noted that the number of the Fisher's criterion and the canonical
-correaltion coefficient is not only one. The number of the non-zero canonical
+correlation coefficient is not only one. The number of the non-zero canonical
correlation coefficients is no more than :math:`\min (n, m)`, and each canonical correlation
coefficient is in one-to-one correspondence with a Fisher's criterion value.

2 changes: 1 addition & 1 deletion doc/ols_and_omp.rst
@@ -39,7 +39,7 @@
it the following advantages over OLS and OMP:
and/or added some constants, the selection result given by :class:`FastCan` will be
unchanged. See :ref:`sphx_glr_auto_examples_plot_affinity.py`.
* Multioutput: as :class:`FastCan` use canonical correlation for feature ranking, it is
-naturally support feature seleciton on dataset with multioutput.
+naturally support feature selection on dataset with multioutput.


.. rubric:: References
2 changes: 1 addition & 1 deletion doc/pruning.rst
@@ -16,7 +16,7 @@
by sparse linear combinations of the atoms.
We use these atoms as the target :math:`Y` and select samples based on their correlation with :math:`Y`.

One challenge to use :class:`FastCan` for data pruning is that the number to select is much larger than in feature selection.
-Normally, this number is higher than the number of features, which will make the pruned data matrix singular.
+Normally, this number is greater than the number of features, which will make the pruned data matrix singular.
In other words, :class:`FastCan` will easily think the pruned data is redundant and no additional sample
should be selected, as any additional samples can be represented by linear combinations of the selected samples.
Therefore, the number to select has to be kept small.
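The singularity issue can be seen with plain NumPy (a toy illustration, not code from the library): once more samples are selected than there are features, the rows of the pruned matrix are linearly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # 100 samples, 5 features

# Any subset with more samples than features has linearly dependent rows,
# so every extra sample is a linear combination of those already selected.
subset = X[:10]
print(np.linalg.matrix_rank(subset))  # rank is capped at 5, the number of features
```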
6 changes: 3 additions & 3 deletions examples/plot_fisher.py
@@ -5,7 +5,7 @@

.. currentmodule:: fastcan

-In this examples, we will demonstrate the canonical correaltion coefficient
+In this examples, we will demonstrate the canonical correlation coefficient
In this examples, we will demonstrate the canonical correlation coefficient
between the features ``X`` and the one-hot encoded target ``y`` has equivalent
relationship with Fisher's criterion in LDA (Linear Discriminant Analysis).
"""
@@ -17,14 +17,14 @@
# Prepare data
# ------------
# We use ``iris`` dataset and transform this multiclass data to multilabel data by
-# one-hot encoding. Here, drop="first" is necessary, otherwise, the transformed target
+# one-hot encoding. Here, drop="first" is necessary; otherwise, the transformed target
# is not full column rank.

from sklearn import datasets
from sklearn.preprocessing import OneHotEncoder

X, y = datasets.load_iris(return_X_y=True)
-# drop="first" is necessary, otherwise, the transformed target is not full column rank
+# drop="first" is necessary; otherwise, the transformed target is not full column rank
y_enc = OneHotEncoder(
drop="first",
sparse_output=False,
4 changes: 2 additions & 2 deletions examples/plot_forecasting.py
@@ -7,7 +7,7 @@
In this examples, we will demonstrate how to use :func:`make_narx` to build (nonlinear)
AutoRegressive (AR) models for time-series forecasting.
-The time series used isthe monthly average atmospheric CO2 concentrations
+The time series used is the monthly average atmospheric CO2 concentrations
from 1958 to 2001.
The objective is to forecast the CO2 concentration till nowadays with
initial 18 months data.
@@ -94,7 +94,7 @@
# Nonlinear AR model
# ------------------
# We can use :func:`make_narx` to easily build a nonlinear AR model, which does not
-# has a input. Therefore, the input ``X`` is set as ``None``.
+# has an input. Therefore, the input ``X`` is set as ``None``.
# :func:`make_narx` will search 10 polynomial terms, whose maximum degree is 2 and
# maximum delay is 9.
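The idea behind an AR model can be sketched without the library (a toy AR(2) process, not the CO2 example): lagged outputs become the regressors, and the coefficients are found by least squares.

```python
import numpy as np

# Simulate a toy AR(2) process: y(k) = 0.6*y(k-1) - 0.2*y(k-2) + noise
rng = np.random.default_rng(0)
y = np.zeros(2000)
for k in range(2, len(y)):
    y[k] = 0.6 * y[k - 1] - 0.2 * y[k - 2] + rng.normal(scale=0.1)

# Regressors are the time-shifted outputs y(k-1) and y(k-2)
target = y[2:]
lagged = np.column_stack([y[1:-1], y[:-2]])
coef, *_ = np.linalg.lstsq(lagged, target, rcond=None)
print(coef)  # close to the true coefficients [0.6, -0.2]
```

:func:`make_narx` extends this linear setup with nonlinear (polynomial) terms and an automatic search for which terms to keep.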

14 changes: 7 additions & 7 deletions examples/plot_narx.py
@@ -47,7 +47,7 @@
X = np.c_[u0[max_delay:], u1[max_delay:]]

# %%
-# Build term libriary
+# Build term library
# -------------------
# To build a reduced polynomial NARX model, there are normally two steps:
#
@@ -56,14 +56,14 @@
#
# #. Learn the coefficients of the terms.
#
-# To search the structure of the model, the candidate term libriary should be
+# To search the structure of the model, the candidate term library should be
# constructed by the following two steps.
#
# #. Time-shifted variables: the raw input-output data, i.e., :math:`u_0(k)`,
# :math:`u_1(k)`, and :math:`y(k)`, are converted into :math:`u_0(k-1)`,
# :math:`u_1(k-2)`, etc.
#
-# #. Nonlinear terms: the time-shifted variables are onverted to nonlinear terms
+# #. Nonlinear terms: the time-shifted variables are converted to nonlinear terms
# via polynomial basis functions, e.g., :math:`u_0(k-1)^2`,
# :math:`u_0(k-1)u_0(k-3)`, etc.
#
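The two construction steps above can be sketched with NumPy and scikit-learn (the data and delay here are illustrative, not the example's actual signals):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
u0 = rng.normal(size=10)
y = rng.normal(size=10)
max_delay = 2

# Step 1: time-shifted variables u0(k-1), u0(k-2), y(k-1), y(k-2)
shifted = np.column_stack([u0[1:-1], u0[:-2], y[1:-1], y[:-2]])

# Step 2: nonlinear terms via polynomial basis functions (degree 2),
# e.g. u0(k-1)^2, u0(k-1)*y(k-2), ...
library = PolynomialFeatures(degree=2, include_bias=False).fit_transform(shifted)
print(library.shape)  # (10 - max_delay) rows, 14 candidate terms
```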
@@ -124,8 +124,8 @@
# %%
# Build NARX model
# ----------------
-# As the reduced polynomial NARX is a linear function of the nonlinear tems,
-# the coefficient of each term can be easily estimated by oridnary least squares.
+# As the reduced polynomial NARX is a linear function of the nonlinear terms,
+# the coefficient of each term can be easily estimated by ordinary least squares.
# In the printed NARX model, it is found that :class:`FastCan` selects the correct
# terms and the coefficients are close to the true values.
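Estimating the coefficients of already-selected terms is a single least-squares solve. A generic sketch with synthetic terms (not the example's actual model):

```python
import numpy as np

rng = np.random.default_rng(0)
selected_terms = rng.normal(size=(100, 3))  # hypothetical selected nonlinear terms
true_coef = np.array([0.5, -1.2, 0.3])
output = selected_terms @ true_coef + rng.normal(scale=1e-3, size=100)

# Ordinary least squares recovers the term coefficients
coef, *_ = np.linalg.lstsq(selected_terms, output, rcond=None)
print(coef)  # close to [0.5, -1.2, 0.3]
```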

@@ -143,9 +143,9 @@

print_narx(narx_model)
# %%
-# Automaticated NARX modelling workflow
+# Automated NARX modelling workflow
# -------------------------------------
-# We provide :meth:`narx.make_narx` to automaticate the workflow above.
+# We provide :meth:`narx.make_narx` to automate the workflow above.

from fastcan.narx import make_narx

4 changes: 2 additions & 2 deletions examples/plot_narx_multi.py
@@ -1,6 +1,6 @@
"""
=======================
-Mulit-output NARX model
+Multi-output NARX model
=======================
.. currentmodule:: fastcan
@@ -64,7 +64,7 @@


# %%
-# Identify the mulit-output NARX model
+# Identify the multi-output NARX model
# ------------------------------------
# We provide :meth:`narx.make_narx` to automatically find the model
# structure. `n_terms_to_select` can be a list to indicate the number
6 changes: 3 additions & 3 deletions fastcan/_refine.py
@@ -38,10 +38,10 @@ def refine(selector, drop=1, max_iter=None, verbose=1):
In the refining process, the selected features will be dropped, and
the vacancy positions will be refilled from the candidate features.

-The processing of a vacany position is refilled after searching all
+The processing of a vacant position is refilled after searching all
candidate features is called an `iteration`.

-The processing of a vacany position is refilled by a different features
+The processing of a vacant position is refilled by a different features
from the dropped one, which increase the SSC of the selected features
is called a `valid iteration`.

@@ -51,7 +51,7 @@
FastCan selector.

drop : int or array-like of shape (n_drops,) or "all", default=1
-The number of the selected features dropped for the consequencing
+The number of the selected features dropped for the consequent
The number of the selected features dropped for the consequent
reselection.

max_iter : int, default=None
2 changes: 1 addition & 1 deletion fastcan/narx/_utils.py
@@ -217,7 +217,7 @@ def make_narx(
The verbosity level of refine.

refine_drop : int or "all", default=None
-The number of the selected features dropped for the consequencing
+The number of the selected features dropped for the consequent
The number of the selected features dropped for the consequent
reselection. If `drop` is None, no refining will be performed.

refine_max_iter : int, default=None
4 changes: 2 additions & 2 deletions fastcan/narx/tests/test_narx.py
@@ -263,7 +263,7 @@ def make_data(multi_output, nan, rng):
).fit(X, y)


-def test_mulit_output_warn():
+def test_multi_output_warn():
X = np.random.rand(10, 2)
y = np.random.rand(10, 2)
for i in range(2):
@@ -342,7 +342,7 @@ def test_fit_intercept():
assert_array_equal(narx.intercept_, [0.0, 0.0])


-def test_mulit_output_error():
+def test_multi_output_error():
X = np.random.rand(10, 2)
y = np.random.rand(10, 2)
time_shift_ids = np.array([[0, 1], [1, 1]])