Logistic Regression

Some of the docstrings for this module have been automatically extracted from the scikit-learn library and are covered by their respective licenses.

class node_regression.LogisticRegression

Logistic regression of a categorical dependent variable

Configuration:
  • penalty

    Specifies the norm used in the penalization. The ‘newton-cg’, ‘sag’ and ‘lbfgs’ solvers support only l2 penalties.

  • dual

    Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.

  • C

    Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.
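To make the role of C concrete, the following sketch evaluates an L2-penalized logistic loss of the form 0.5 * ||w||^2 + C * sum(log(1 + exp(-y_i * (x_i . w + b)))), with labels in {-1, +1}. This form is assumed here for illustration; the function name and the toy data are hypothetical and not part of this module.

```python
import math

def penalized_logistic_loss(w, b, X, y, C):
    """L2-penalized logistic loss; labels in y are assumed to be -1 or +1."""
    penalty = 0.5 * sum(wj * wj for wj in w)
    data_term = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        data_term += math.log1p(math.exp(-margin))
    # C scales the data term, so a small C lets the penalty dominate.
    return penalty + C * data_term

# Hypothetical toy data and weights, for illustration only.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
y = [-1, -1, 1, 1]
w, b = [1.0, 0.5], -2.0

penalty_only = 0.5 * (1.0 ** 2 + 0.5 ** 2)
loss_small_c = penalized_logistic_loss(w, b, X, y, C=0.01)
loss_large_c = penalized_logistic_loss(w, b, X, y, C=100.0)
```

With C=0.01 the loss is almost entirely the penalty term, which is why smaller values of C pull the coefficients more strongly towards zero.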

  • fit_intercept

    Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

  • intercept_scaling

    Useful only when the solver ‘liblinear’ is used and self.fit_intercept is set to True. In this case, x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic_feature_weight.

    Note: the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.
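The augmentation described above can be sketched in a few lines. The helper names and the numeric values are hypothetical; the sketch only shows that evaluating the decision function on [x, intercept_scaling] is the same as adding an intercept equal to intercept_scaling * synthetic_feature_weight.

```python
def augment(x, intercept_scaling):
    """Append the constant synthetic feature to one instance vector."""
    return list(x) + [intercept_scaling]

def decision_value(x, weights, intercept_scaling):
    """Dot product on the augmented vector; the last weight belongs to the synthetic feature."""
    x_aug = augment(x, intercept_scaling)
    return sum(w * v for w, v in zip(weights, x_aug))

# Hypothetical values, for illustration only.
intercept_scaling = 10.0
weights = [0.4, -0.2, 0.05]  # last entry: synthetic feature weight
x = [1.0, 2.0]

effective_intercept = intercept_scaling * weights[-1]
explicit = 0.4 * 1.0 + (-0.2) * 2.0 + effective_intercept
```

Because the synthetic feature weight is penalized like any other weight, a larger intercept_scaling lets a smaller (and therefore less penalized) weight produce the same intercept.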

  • class_weight

    Weights associated with classes in the form {class_label: weight}. If not given, all classes are assumed to have weight one.

    The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).

    Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

    New in version 0.17: class_weight=‘balanced’ instead of deprecated class_weight=‘auto’.
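As a worked example of the “balanced” formula n_samples / (n_classes * np.bincount(y)), the sketch below computes the same quantity in plain Python. It uses collections.Counter instead of np.bincount so that it runs without NumPy; the function names and labels are hypothetical.

```python
from collections import Counter

def balanced_class_weights(y):
    """Return {class_label: n_samples / (n_classes * count_of_label)}."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

def effective_sample_weights(y, class_weight, sample_weight):
    """Multiply each sample weight by the weight of the sample's class."""
    return [class_weight[label] * sw for label, sw in zip(y, sample_weight)]

# Hypothetical imbalanced labels: six samples of class 0, two of class 1.
y = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(y)
combined = effective_sample_weights(y, weights, [1.0] * len(y))
```

Here class 0 receives weight 8 / (2 * 6) and class 1 receives weight 8 / (2 * 2), so the rarer class counts more per sample and both classes contribute the same total weight.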

  • tol

    Tolerance for stopping criteria.

  • multi_class

    Multiclass option, either ‘ovr’ or ‘multinomial’. If ‘ovr’ is chosen, a binary problem is fit for each label. Otherwise the loss minimised is the multinomial loss fit across the entire probability distribution. The ‘multinomial’ option works only with the ‘newton-cg’, ‘sag’ and ‘lbfgs’ solvers.

    New in version 0.18: Stochastic Average Gradient descent solver for ‘multinomial’ case.
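The difference between the two schemes can be illustrated by how per-class scores are turned into probabilities. The sketch below assumes the scores are already available and that the one-versus-rest probabilities are obtained by normalizing independent sigmoids, while the multinomial probabilities come from a softmax; the function names and scores are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ovr_probabilities(scores):
    """One independent binary sigmoid per class, normalized to sum to 1."""
    raw = [sigmoid(s) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]

def multinomial_probabilities(scores):
    """Softmax over all class scores (shifted by the maximum for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical decision scores for three classes.
scores = [2.0, 0.5, -1.0]
p_ovr = ovr_probabilities(scores)
p_multi = multinomial_probabilities(scores)
```

Both variants rank the classes in the same order for these scores, but the probability values differ because the multinomial form models all classes jointly.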

  • max_iter

    Maximum number of iterations taken for the solvers to converge. Used only by the newton-cg, sag and lbfgs solvers.
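How tol and max_iter interact can be sketched with a deliberately simple optimizer. The code below is not one of the solvers listed in this document: it is plain gradient descent on a one-dimensional L2-penalized logistic loss, and it assumes the stopping criterion is the gradient magnitude falling below tol. The actual criterion depends on the solver. All names and data are hypothetical.

```python
import math

def fit_1d(xs, ys, C=1.0, tol=1e-4, max_iter=100, step=0.1):
    """Gradient descent for one weight, no intercept; labels in ys are 0 or 1.

    Returns (weight, iterations_used, converged).
    """
    w = 0.0
    for n_iter in range(1, max_iter + 1):
        grad = w  # gradient of the penalty 0.5 * w**2
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-w * x))
            grad += C * (p - y) * x  # gradient of the logistic loss term
        if abs(grad) < tol:
            return w, n_iter, True   # stopped by the tolerance
        w -= step * grad
    return w, max_iter, False        # stopped by the iteration limit

# Hypothetical toy data.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w_full, iters_full, ok_full = fit_1d(xs, ys, max_iter=1000)
w_short, iters_short, ok_short = fit_1d(xs, ys, max_iter=3)
```

A run that stops because max_iter was reached returns a usable but less accurate weight; raising max_iter or loosening tol are the two ways to change that outcome.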

  • solver

    Algorithm to use in the optimization problem.

    • For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ is
      faster for large ones.
    • For multiclass problems, only ‘newton-cg’, ‘sag’ and ‘lbfgs’ handle
      multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes.
    • ‘newton-cg’, ‘lbfgs’ and ‘sag’ only handle L2 penalty.

    Note that the fast convergence of ‘sag’ is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    New in version 0.17: Stochastic Average Gradient descent solver.
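The solver restrictions listed in this document (penalty, multiclass scheme, dual formulation) can be collected into a small consistency check. The helper below is hypothetical and not part of this module or of scikit-learn; it only encodes the rules as they are stated here.

```python
L2_ONLY_SOLVERS = {"newton-cg", "lbfgs", "sag"}
MULTINOMIAL_SOLVERS = {"newton-cg", "lbfgs", "sag"}
KNOWN_SOLVERS = L2_ONLY_SOLVERS | {"liblinear"}

def check_configuration(solver, penalty="l2", multi_class="ovr", dual=False):
    """Return a list of problems; an empty list means the combination
    is consistent with the rules documented on this page."""
    problems = []
    if solver not in KNOWN_SOLVERS:
        problems.append("unknown solver: %r" % solver)
    if solver in L2_ONLY_SOLVERS and penalty != "l2":
        problems.append("%s only handles the l2 penalty" % solver)
    if multi_class == "multinomial" and solver not in MULTINOMIAL_SOLVERS:
        problems.append("%s is limited to one-versus-rest schemes" % solver)
    if dual and not (solver == "liblinear" and penalty == "l2"):
        problems.append("dual formulation needs liblinear with l2 penalty")
    return problems

ok = check_configuration("liblinear", penalty="l1")
bad_penalty = check_configuration("sag", penalty="l1")
bad_multiclass = check_configuration("liblinear", multi_class="multinomial")
```

Running such a check before fitting turns an invalid combination into an explicit message instead of an error raised deep inside the solver.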

  • n_jobs

    Number of CPU cores used when parallelizing over classes in the one-versus-rest (‘ovr’) scheme. If given a value of -1, all cores are used.

  • random_state

    The seed of the pseudo random number generator to use when shuffling the data. Used only in solvers ‘sag’ and ‘liblinear’.

  • warm_start

    When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. Has no effect with the liblinear solver.

    New in version 0.17: warm_start to support lbfgs, newton-cg, sag solvers.

Attributes:
  • n_iter_

    Actual number of iterations for all classes. If binary or multinomial, it returns only 1 element. For the liblinear solver, only the maximum number of iterations across all classes is given.

  • coef_

    Coefficient of the features in the decision function.

  • intercept_

    Intercept (a.k.a. bias) added to the decision function. If fit_intercept is set to False, the intercept is set to zero.
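To show how coef_ and intercept_ enter the decision function, the sketch below evaluates a binary model by hand. The coefficient and intercept values are hypothetical stand-ins for a fitted model, and the function names are chosen for illustration only.

```python
import math

def decision_function(x, coef, intercept):
    """Linear score: dot product of coefficients and features, plus intercept."""
    return sum(c * v for c, v in zip(coef, x)) + intercept

def positive_class_probability(x, coef, intercept):
    """Logistic (sigmoid) transform of the linear score."""
    return 1.0 / (1.0 + math.exp(-decision_function(x, coef, intercept)))

# Hypothetical fitted values for a two-feature binary problem.
coef = [1.5, -0.5]
intercept = -0.25
x = [1.0, 2.0]

score = decision_function(x, coef, intercept)
proba = positive_class_probability(x, coef, intercept)

# With fit_intercept set to False the intercept is zero.
score_no_intercept = decision_function(x, coef, 0.0)
```

A positive score corresponds to a probability above 0.5 for the positive class, so the sign of the decision function determines the predicted label in the binary case.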

Inputs:
Outputs:
model : model

Model