Regression metrics

1. Model Quality Summary Metrics

Evidently calculates a few standard model quality metrics: Mean Error (ME), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE).

For each quality metric, Evidently also shows one standard deviation of its value (in brackets) to estimate the stability of the performance.
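As a rough illustration of what these numbers mean, here is a minimal sketch that computes ME, MAE, and MAPE together with a standard deviation of the per-row values. It is not Evidently's implementation: the function name `summary_metrics`, the use of the population standard deviation, and the decision to skip rows where the actual value is zero are all assumptions made for this example.

```python
from statistics import mean, pstdev


def summary_metrics(actual, predicted):
    """Return {metric: (mean, std)} for ME, MAE and MAPE (illustrative only)."""
    # Error follows the convention used on this page: Predicted - Actual.
    errors = [p - a for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    # Assumption: rows with actual == 0 are skipped, since the percentage
    # error is undefined there.
    pct_errors = [100 * abs(e) / abs(a) for e, a in zip(errors, actual) if a != 0]
    return {
        "ME": (mean(errors), pstdev(errors)),
        "MAE": (mean(abs_errors), pstdev(abs_errors)),
        "MAPE": (mean(pct_errors), pstdev(pct_errors)),
    }


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
metrics = summary_metrics(actual, predicted)
for name, (value, std) in metrics.items():
    print(f"{name}: {value:.2f} ({std:.2f})")
```

A ME close to zero combined with a noticeably larger MAE, as in this toy data, suggests that over- and underestimation cancel each other out on average.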

To support the model performance analysis, Evidently also generates interactive visualizations. They help analyze where the model makes mistakes and come up with improvement ideas.

2. Predicted vs Actual

Predicted versus actual values in a scatter plot.

3. Predicted vs Actual in Time

Predicted and Actual values over time or by index, if no datetime is provided.

4. Error (Predicted - Actual)

Model error values over time or by index, if no datetime is provided.

5. Absolute Percentage Error

Absolute percentage error values over time or by index, if no datetime is provided.

6. Error Distribution

Distribution of the model error values.

7. Error Normality

Quantile-quantile plot (Q-Q plot) to assess how closely the distribution of the error values follows a normal distribution.
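To make the idea behind a Q-Q plot concrete, the sketch below pairs each sorted error with the quantile a fitted normal distribution would predict at the same position. The helper name `qq_points` and the plotting positions `(i + 0.5) / n` are assumptions for illustration. Other conventions exist, and this is not necessarily how Evidently builds its plot.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(errors):
    """Return (theoretical_quantile, observed_error) pairs for a Q-Q plot."""
    n = len(errors)
    ordered = sorted(errors)
    # Fit a normal distribution to the sample (requires non-constant errors).
    fitted = NormalDist(mean(errors), pstdev(errors))
    # Assumption: plotting positions (i + 0.5) / n, one common convention.
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


points = qq_points([-2.0, -1.0, 0.0, 1.0, 2.0])
for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the errors are roughly normal, the pairs fall close to the diagonal. Systematic curvature at the ends points to heavy tails or skew.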

Next, Evidently explores in detail two segments of the dataset: the 5% of predictions with the largest negative errors and the 5% with the largest positive errors. We refer to them as the “underestimation” and “overestimation” groups. We refer to the rest of the predictions as the “majority”.
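The split described above can be sketched as follows. This is a simplified illustration, not Evidently's code: the function name `split_by_error` and the rule for rounding 5% of the rows to a whole number are assumptions.

```python
def split_by_error(errors, share=0.05):
    """Split row indices into UNDER, OVER and MAJORITY groups by error value."""
    n = len(errors)
    # Assumption: each extreme group holds floor(n * share) rows, at least one.
    k = max(1, int(n * share))
    order = sorted(range(n), key=lambda i: errors[i])
    return {
        "UNDER": order[:k],       # most negative errors (Predicted < Actual)
        "OVER": order[-k:],       # most positive errors (Predicted > Actual)
        "MAJORITY": order[k:-k],  # everything in between
    }


# Twenty toy errors from -10 to 9, so each extreme group holds one row.
errors = [float(e) for e in range(-10, 10)]
groups = split_by_error(errors)
print({name: len(idx) for name, idx in groups.items()})
```

The sketch assumes the dataset has enough rows for the three groups not to overlap.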

8. Mean Error per Group

A summary of the model quality metrics for each of the two segments: Mean Error (ME), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE).

9. Predicted vs Actual per Group

Prediction plots that visualize the regions where the model underestimates and overestimates the target function.

10. Error Bias: Mean/Most Common Feature Value per Group

This table helps you quickly see the differences in feature values between the three groups:

OVER (top 5% of predictions with overestimation)
UNDER (top 5% of predictions with underestimation)
MAJORITY (the remaining 90%)

For the numerical features, it shows the mean value per group. For the categorical features, it shows the most common value.
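A minimal sketch of this per-group summary, assuming the group membership is already available as lists of row indices. The helper `describe_feature` and the toy “temperature” and “season” data are hypothetical and only illustrate the mean versus most-common-value rule.

```python
from collections import Counter
from statistics import mean


def describe_feature(values, groups):
    """Mean per group for numerical values, most common value otherwise."""
    summary = {}
    for name, idx in groups.items():
        selected = [values[i] for i in idx]
        if all(isinstance(v, (int, float)) for v in selected):
            summary[name] = mean(selected)
        else:
            summary[name] = Counter(selected).most_common(1)[0][0]
    return summary


# Hypothetical group membership (row indices) and feature columns.
groups = {"UNDER": [0, 1], "OVER": [4, 5], "MAJORITY": [2, 3, 6]}
temperature = [30.0, 32.0, 20.0, 22.0, 5.0, 7.0, 21.0]
season = ["summer", "summer", "spring", "fall", "winter", "winter", "spring"]

temp_summary = describe_feature(temperature, groups)
season_summary = describe_feature(season, groups)
print(temp_summary)
print(season_summary)
```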

If you have two datasets, the table displays the values for both REF (reference) and CURR (current).

If you observe a large difference between the groups, it means that the model error is sensitive to the values of a given feature.

To search for cases like this, you can sort the table using the column “Range(%)”. It increases when either or both of the “extreme” groups are different from the majority.

Here is the formula used to calculate the Range %:

$$\mathrm{Range} = 100 \cdot \left| \frac{V_{over} - V_{under}}{V_{max} - V_{min}} \right|$$

Where: $V_{over}$ = average feature value in the OVER group; $V_{under}$ = average feature value in the UNDER group; $V_{max}$ = maximum feature value; $V_{min}$ = minimum feature value.
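The formula can be checked with a short worked example. The function name `range_pct` and the handling of a constant feature (where the maximum equals the minimum) are assumptions added for this sketch. The source does not say how that case is treated.

```python
def range_pct(v_over, v_under, v_max, v_min):
    """Range % = 100 * |(V_over - V_under) / (V_max - V_min)|."""
    if v_max == v_min:
        # Assumption: a constant feature cannot separate the groups.
        return 0.0
    return 100 * abs((v_over - v_under) / (v_max - v_min))


# Toy values: OVER mean 6, UNDER mean 31, feature spans -5 to 35.
range_value = range_pct(v_over=6.0, v_under=31.0, v_max=35.0, v_min=-5.0)
print(range_value)
```

Here the two extreme groups differ by 25 units on a feature that spans 40 units, so the difference covers 62.5% of the feature's range.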

11. Error Bias per Feature

For each feature, Evidently shows a histogram to visualize the distribution of its values in the segments with extreme errors and in the rest of the data. You can visually explore if there is a relationship between the high error and the values of a given feature.

Here is an example where extreme errors are dependent on the “temperature” feature.

12. Predicted vs Actual per Feature

For each feature, Evidently also shows the Predicted vs Actual scatter plot. It helps you visually detect and explore underperforming segments that might be sensitive to the values of the given feature.
