Skip to content

Evaluate

Evaluate your queries offline.


The evaluation page allows offline evaluation of queries and provides a quick and easy way to test Inconvo’s capabilities on your data and your users’ queries, even before you deploy Inconvo to production.

Each row in the evaluations table is a test case. Add a new test case by inputting a question in the input box beneath the table.

EvalsTable

Test cases will initially be marked as for Review
They can then be marked as Passing or Failing by clicking on the test case row and opening the trace view.

You can use the evals table to debug failing test cases and build towards an acceptable test pass rate for your application.

For every evaluation query a trace is created. A trace shows the steps Inconvo took to create an answer.

Traces show the tools Inconvo used for the run, their input_options, the tool call and the output.

Play

How many new customers did we win in the last month?

Looking at the trace we see that Inconvo used the number of users created in the last month to answer the question.

FailingTrace

In our application users != customers so the answer is incorrect and we mark the run as failing.

The fix is simple, add a table prompt to the users table to inform Inconvo of what a customer is in our application.

TablePrompt

With this simple addition Inconvo retrieves the correct answer and we mark the test case as passing.

PassingTrace