Telemetry
What data is collected when you use Evidently open-source.
Telemetry refers to the collection of usage data. We collect some data to understand how many users we have and how they interact with Evidently open-source.
This helps us improve the tool and prioritize new features. Below we describe what is collected, how to opt out, and why we’d appreciate it if you kept telemetry on.
What data is collected?
Telemetry is collected in Evidently starting from version 0.4.0.
We only collect telemetry when you use the Evidently Monitoring UI. We DO NOT collect any telemetry when you use the tool as a library, for instance, when you run it in a Jupyter notebook or a Python script to generate Evidently Reports.
We only collect anonymous usage data. We DO NOT collect personal data.
We only collect data about the environment and service usage. Our telemetry is intentionally limited in scope. We DO NOT collect any sensitive information or data about the datasets you process. We DO NOT have access to the dataset schema, parameters, variable names, or anything related to the contents of the data or your code.
We collect the following types of data.
Environment data. Basic information about the environment in which you run Evidently:
- timestamp
- user_id
- os_name
- os_version
- python_version
- tool_name
- tool_version
- source_ip
The source_ip is NOT your IP address. We use Jitsu, an open-source tool for event collection, and always apply the strict ip_policy, which obscures the exact IP. You can read more in the Jitsu docs.
The user_id is anonymized. It only lets us tell that a set of actions was performed by the same user.
Service usage data. The following actions performed in the service, which help us understand which features are being used:
- Startup
- Index
- List_projects
- Get_project_info
- Project_dashboard
- List_reports
- List_test_suites
- Get_snapshot_download
- Add_project
- Search_projects
- Update_project_info
- Get_snapshot_graph_data
- Get_snapshot_data
- List_project_dashboard_panels
- Add_snapshot
How to enable/disable telemetry?
By default, telemetry is enabled.
After starting up the service, you will see a message in the terminal confirming that telemetry is enabled.
To disable telemetry, use the DO_NOT_TRACK environment variable.
Set it to any value, for instance:
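As a minimal sketch, assuming you start the service from a Python wrapper script, the variable can be set in the environment passed to the child process (in a POSIX shell, the equivalent is typically `export DO_NOT_TRACK=1`). The helper name `env_with_telemetry_disabled` is hypothetical, and the `evidently ui` launch command in the comments is an assumption about how the service is started; adapt it to your setup.

```python
import os


def env_with_telemetry_disabled(base_env=None):
    """Return a copy of the environment with DO_NOT_TRACK set.

    The value "1" is arbitrary: any value disables telemetry.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DO_NOT_TRACK"] = "1"
    return env


# Build the environment for the child process from a copy of os.environ.
service_env = env_with_telemetry_disabled()

# Assumption: the service is launched with a command such as `evidently ui`.
# Uncomment and adapt to however you actually start it:
# import subprocess
# subprocess.run(["evidently", "ui"], env=service_env, check=True)
```

Passing a copied environment keeps the opt-out scoped to the service process rather than changing the environment of the calling script.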
After doing that and starting the service, you will see a message confirming that telemetry is disabled.
To re-enable telemetry, unset the environment variable:
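A minimal sketch of unsetting the variable from Python, assuming the service is later started from the same process (in a POSIX shell, the equivalent is typically `unset DO_NOT_TRACK`). The variable name `telemetry_opted_out` is only illustrative.

```python
import os

# Remove DO_NOT_TRACK if it is set; the default avoids a KeyError if it is not.
os.environ.pop("DO_NOT_TRACK", None)

# Child processes started from this process afterwards inherit the updated
# environment, so telemetry falls back to its default (enabled) state.
telemetry_opted_out = "DO_NOT_TRACK" in os.environ
```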
Event log examples
Should I opt out?
Because Evidently is open-source, we have no visibility into how the tool is used unless someone actively reaches out to us or opens a GitHub issue.
We’d be grateful if you kept telemetry on, since it helps us answer questions like:
- How many people are actively using the tool?
- Which features are being used most?
- What is the environment you run Evidently in?
This helps us prioritize the development of new features and make sure we test performance in the most popular environments.
We understand that you might still prefer not to share any telemetry data, and we respect this wish. Follow the steps above to disable the data collection.