As an R user, I wanted to also get up to speed on scikit-learn.
Creating linear regression models is fine, but I can't seem to find a reasonable way to get a standard summary of the regression output.
    # Linear Regression
    import numpy as np
    from sklearn import datasets
    from sklearn.linear_model import LinearRegression

    # Load the diabetes dataset
    dataset = datasets.load_diabetes()

    # Fit a linear regression model to the data
    model = LinearRegression()
    model.fit(dataset.data, dataset.target)
    print(model)

    # Make predictions
    expected = dataset.target
    predicted = model.predict(dataset.data)

    # Summarize the fit of the model
    mse = np.mean((predicted - expected) ** 2)
    print(model.intercept_, model.coef_, mse)
    print(model.score(dataset.data, dataset.target))
- It seems like the intercept and coefficients are built into the model, and I just print `model.intercept_` and `model.coef_` to see them.
- What about all the other standard regression output like R^2, adjusted R^2, p-values, etc.? If I read the examples correctly, it seems like you have to write a function/equation for each of these and then print it.
- So, is there no standard summary output for linear regression models?
- Also, in my printed array of coefficients, there are no variable names associated with the values; I just get the numeric array. Is there a way to print the coefficients together with the variables they go with?
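To illustrate what I mean by writing these by hand, here is a minimal sketch of the kind of thing I would otherwise end up doing. It assumes the loaded dataset exposes a `feature_names` attribute (which appears to be the case for the diabetes data) and uses the textbook adjusted R^2 formula; `coef_by_name`, `r2` and `adj_r2` are just names I picked.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

dataset = datasets.load_diabetes()
X, y = dataset.data, dataset.target
model = LinearRegression().fit(X, y)

# Pair each coefficient with its variable name.
# Assumption: the dataset provides feature_names in column order.
coef_by_name = dict(zip(dataset.feature_names, model.coef_))
for name, coef in coef_by_name.items():
    print(f"{name:>5s} {coef:12.4f}")

# R^2 comes from model.score; adjusted R^2 is computed by hand:
# adj_R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
n, p = X.shape  # n observations, p predictors (intercept not counted)
r2 = model.score(X, y)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print("R^2:", r2)
print("adjusted R^2:", adj_r2)
```

This covers names and adjusted R^2, but p-values would still need standard errors computed separately, which is the part I was hoping a built-in summary would handle.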
My printed output:
    LinearRegression(copy_X=True, fit_intercept=True, normalize=False)
    152.133484163 [ -10.01219782 -239.81908937  519.83978679  324.39042769 -792.18416163
      476.74583782  101.04457032  177.06417623  751.27932109   67.62538639] 2859.69039877 0.517749425413
Notes: I started off with Linear, Ridge and Lasso, and I have gone through the examples. The code above is for basic OLS.