Regression metrics
Open-source regression quality metrics.
1. Model Quality Summary Metrics
Evidently calculates a few standard model quality metrics: Mean Error (ME), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE).
For each quality metric, Evidently also shows one standard deviation of its value (in brackets) to estimate the stability of the performance.
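As a rough illustration of these three metrics, here is a minimal sketch in plain Python. It is not Evidently's implementation. The function name `summary_metrics` is hypothetical, the error is taken as predicted minus actual (as in the Error plot), and the use of the sample standard deviation of the per-row values is an assumption.

```python
from statistics import mean, stdev


def summary_metrics(actual, predicted):
    """Return {metric: (mean, standard deviation)} for ME, MAE and MAPE.

    Assumptions (not confirmed by the Evidently docs):
    - error = predicted - actual
    - the spread is the sample standard deviation of per-row values
    - no actual value is zero (MAPE is undefined otherwise)
    """
    errors = [p - a for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    abs_pct_errors = [100.0 * abs(e) / abs(a) for e, a in zip(errors, actual)]
    return {
        "ME": (mean(errors), stdev(errors)),
        "MAE": (mean(abs_errors), stdev(abs_errors)),
        "MAPE": (mean(abs_pct_errors), stdev(abs_pct_errors)),
    }


if __name__ == "__main__":
    metrics = summary_metrics(actual=[100, 200, 400], predicted=[110, 180, 400])
    for name, (value, spread) in metrics.items():
        print(f"{name}: {value:.2f} ({spread:.2f})")
```

A large standard deviation relative to the metric itself suggests that the error varies a lot from row to row.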
To support the model performance analysis, Evidently also generates interactive visualizations. They help analyze where the model makes mistakes and come up with improvement ideas.
2. Predicted vs Actual
Predicted versus actual values in a scatter plot.
3. Predicted vs Actual in Time
Predicted and Actual values over time or by index, if no datetime is provided.
4. Error (Predicted - Actual)
Model error values over time or by index, if no datetime is provided.
5. Absolute Percentage Error
Absolute percentage error values over time or by index, if no datetime is provided.
6. Error Distribution
Distribution of the model error values.
7. Error Normality
Quantile-quantile plot (Q-Q plot) to assess whether the model error values follow a normal distribution.
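To make the idea behind a Q-Q plot concrete, the sketch below pairs each sorted error with the quantile a standard normal distribution would produce at the same rank. The function name `qq_points` and the plotting position `(i - 0.5) / n` are assumptions chosen for illustration; Evidently may compute the plot differently.

```python
from statistics import NormalDist


def qq_points(errors):
    """Return (theoretical_quantile, observed_error) pairs for a normal Q-Q plot.

    Assumption: plotting positions are (i - 0.5) / n for i = 1..n.
    """
    ordered = sorted(errors)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (standard_normal.inv_cdf((i - 0.5) / n), value)
        for i, value in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    for theoretical, observed in qq_points([3.0, -1.5, 0.2, 1.1, -0.4]):
        print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the errors are close to normal, the points fall roughly on a straight line when plotted against each other.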
Next, Evidently explores in detail two segments of the dataset: the 5% of predictions with the highest negative errors and the 5% with the highest positive errors. We refer to them as the “underestimation” and “overestimation” groups. We refer to the rest of the predictions as the “majority”.
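The grouping described above can be sketched as follows. This is an illustrative, rank-based approximation, not Evidently's code: the function name `split_by_error` is hypothetical, and taking exactly `int(n * share)` rows from each tail is an assumption about how the 5% cut-off is applied.

```python
def split_by_error(errors, share=0.05):
    """Label each prediction as UNDER, OVER or MAJORITY.

    errors: predicted - actual for each row.
    share: fraction of rows assigned to each extreme group (assumed 5%).
    """
    n = len(errors)
    k = int(n * share)  # number of rows in each extreme group
    order = sorted(range(n), key=lambda i: errors[i])
    under = set(order[:k])      # most negative errors: model underestimates
    over = set(order[n - k:])   # most positive errors: model overestimates
    labels = []
    for i in range(n):
        if i in under:
            labels.append("UNDER")
        elif i in over:
            labels.append("OVER")
        else:
            labels.append("MAJORITY")
    return labels


if __name__ == "__main__":
    errors = list(range(-50, 50))  # 100 synthetic error values
    labels = split_by_error(errors)
    print({g: labels.count(g) for g in ("UNDER", "MAJORITY", "OVER")})
```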
8. Mean Error per Group
A summary of the model quality metrics for each of the two segments: Mean Error (ME), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE).
9. Predicted vs Actual per Group
Prediction plots that visualize the regions where the model underestimates and overestimates the target function.
10. Error Bias: Mean/Most Common Feature Value per Group
This table helps you quickly see the differences in feature values between the 3 groups:
- OVER (top 5% of predictions with overestimation)
- UNDER (top 5% of predictions with underestimation)
- MAJORITY (the remaining 90%)
For the numerical features, it shows the mean value per group. For the categorical features, it shows the most common value.
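The per-group summary can be illustrated with a short sketch. The function name `summarize_feature` and the group labels passed in are hypothetical; the rule it applies (mean for numerical values, most common value otherwise) follows the description above, while the way a feature is recognized as numerical is an assumption.

```python
from collections import Counter
from statistics import mean


def summarize_feature(values, groups):
    """Return one summary value per group for a single feature.

    Numerical features get the mean; anything else gets the most common value.
    Assumption: a feature is numerical if all its values are int or float.
    """
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)

    summary = {}
    for group, vals in by_group.items():
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in vals
        )
        if is_numeric:
            summary[group] = mean(vals)
        else:
            summary[group] = Counter(vals).most_common(1)[0][0]
    return summary


if __name__ == "__main__":
    groups = ["OVER", "OVER", "UNDER", "UNDER", "MAJORITY", "MAJORITY"]
    print(summarize_feature([30, 34, 8, 12, 20, 22], groups))
    print(summarize_feature(
        ["summer", "summer", "winter", "winter", "spring", "spring"], groups
    ))
```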
If you have two datasets, the table displays the values for both REF (reference) and CURR (current).
If you observe a large difference between the groups, it means that the model error is sensitive to the values of a given feature.
To search for cases like this, you can sort the table using the column “Range(%)”. It increases when either or both of the “extreme” groups are different from the majority.
Here is the formula used to calculate the Range(%):
$$\text{Range} = 100 \cdot \left| \frac{V_{over} - V_{under}}{V_{max} - V_{min}} \right|$$
Where: $V_{over}$ = average feature value in the OVER group; $V_{under}$ = average feature value in the UNDER group; $V_{max}$ = maximum feature value; $V_{min}$ = minimum feature value.
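A worked example may help when reading the Range(%) column. The sketch below applies the formula directly; the function name `range_pct` is hypothetical, and returning 0 for a constant feature (where the maximum equals the minimum) is an assumption made to avoid division by zero.

```python
def range_pct(v_over, v_under, v_max, v_min):
    """Range(%) = 100 * |(V_over - V_under) / (V_max - V_min)|."""
    if v_max == v_min:
        # Assumption: a constant feature cannot separate the groups.
        return 0.0
    return 100.0 * abs((v_over - v_under) / (v_max - v_min))


if __name__ == "__main__":
    # Mean temperature is 30 in OVER and 10 in UNDER; the feature spans 0 to 40.
    print(range_pct(v_over=30, v_under=10, v_max=40, v_min=0))  # 50.0
```

In this example the two extreme groups differ by half of the feature's full range, which would place the feature near the top of the sorted table.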
11. Error Bias per Feature
For each feature, Evidently shows a histogram to visualize the distribution of its values in the segments with extreme errors and in the rest of the data. You can visually explore if there is a relationship between the high error and the values of a given feature.
Here is an example where extreme errors are dependent on the “temperature” feature.
12. Predicted vs Actual per Feature
For each feature, Evidently also shows the Predicted vs Actual scatter plot. It helps visually detect and explore underperforming segments that might be sensitive to the values of the given feature.