Synthetic data generation is an add-on available on some Evidently Cloud and Enterprise plans. See the pricing page for details. For extended trial access, request a demo or contact sales@evidentlyai.com.

To use the synthetic data feature:

  • Create a Project

  • Set up an API key for OpenAI

  • Open “Datasets” and choose “Generate Dataset.”

You can use synthetic data to augment your test scenarios as you evaluate the performance of your AI system.

Currently, the following features are available for self-service:

  • Synthetic inputs. Describe the task (like “healthcare chatbot”) and generate example inputs that users might send. Once you create an initial dataset, you can add more examples by selecting the ones you like and using the “More like this” option.

  • Synthetic RAG ground truth dataset. Provide a source document (a Markdown, CSV, or PDF file) and generate a set of questions with ground truth answers to validate your system's responses against.
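
To illustrate what validating against a ground truth dataset can look like, here is a minimal sketch in plain Python. It does not use the Evidently API. The CSV layout, the column names `question` and `ground_truth_answer`, the `my_rag_system` stub, and the token-overlap score are all assumptions made for illustration; a real evaluation would typically use a stronger correctness metric.

```python
import csv
import io
import re

# Hypothetical export of a generated dataset. The column names
# "question" and "ground_truth_answer" are assumptions for this sketch.
SAMPLE_CSV = """question,ground_truth_answer
What is the refund window?,Refunds are accepted within 30 days of purchase.
Who do I contact for billing issues?,Contact the billing team by email.
"""


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(answer, reference):
    """Fraction of reference tokens that also appear in the answer (0.0 to 1.0)."""
    reference_tokens = tokenize(reference)
    if not reference_tokens:
        return 0.0
    return len(reference_tokens & tokenize(answer)) / len(reference_tokens)


def my_rag_system(question):
    """Placeholder for the system under test; replace with a real call."""
    canned = {
        "What is the refund window?": "You can get a refund within 30 days of purchase.",
        "Who do I contact for billing issues?": "Please call our office.",
    }
    return canned.get(question, "")


def evaluate(csv_text, answer_fn, threshold=0.5):
    """Score each generated question and flag answers below the threshold."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        answer = answer_fn(row["question"])
        score = overlap_score(answer, row["ground_truth_answer"])
        results.append(
            {"question": row["question"], "score": score, "passed": score >= threshold}
        )
    return results


if __name__ == "__main__":
    for result in evaluate(SAMPLE_CSV, my_rag_system):
        status = "PASS" if result["passed"] else "FAIL"
        print(f"{status} {result['score']:.2f} {result['question']}")
```

Token overlap is used here only because it needs no external dependencies. It rewards shared wording rather than shared meaning, so treat the threshold as a rough filter rather than a verdict on correctness.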

Watch the video from our LLM evaluation course for a walkthrough of the basic flow: