Jason Durheim

Can you separate Einstein training data from the test data?

This may simply not be available yet; I didn't see it in any of the docs on training a model.

You can specify what percentage of the dataset gets used for training, with the remainder being used for testing. Is there any mechanism to explicitly define one dataset to use for training and another dataset to use for testing?

This seems like it would be valuable for tuning the model: you could keep a static test dataset to always test against while modifying the training dataset to increase the accuracy.
Michael Machado 22
Hi Jason - By default we use 10% of your data as a test set. This ratio can be customized to any value you prefer, but the split is only done as part of training, so to run a separate test you will have to write a script that calls our API to make predictions on a different set of data.
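
As a rough illustration of the "write a script" suggestion above, here is a minimal Python sketch that scores a model against a separately held test set. The endpoint URL, the request fields (modelId, document) and the response shape are assumptions based on the public Einstein Language docs and should be checked before use; predict_intent and evaluate are hypothetical helper names, not part of any Salesforce library.

```python
import json
import urllib.request
from collections import defaultdict

# Assumed endpoint for Einstein Language intent predictions; verify against current docs.
PREDICT_URL = "https://api.einstein.ai/v2/language/intent"


def predict_intent(text, model_id, token):
    """Hypothetical helper: return the top predicted label for one utterance."""
    body = json.dumps({"modelId": model_id, "document": text}).encode("utf-8")
    req = urllib.request.Request(
        PREDICT_URL,
        data=body,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # Assumes a response like {"probabilities": [{"label": ..., "probability": ...}, ...]}
    best = max(payload["probabilities"], key=lambda p: p["probability"])
    return best["label"]


def evaluate(test_set, predict):
    """Score (text, expected_label) pairs with any callable predict(text) -> label."""
    correct = 0
    per_label = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for text, expected in test_set:
        hit = predict(text) == expected
        correct += hit
        per_label[expected][0] += hit
        per_label[expected][1] += 1
    overall = correct / len(test_set) if test_set else 0.0
    return overall, {label: c / n for label, (c, n) in per_label.items()}
```

With real credentials this could be called as evaluate(test_set, lambda t: predict_intent(t, MODEL_ID, TOKEN)), where MODEL_ID and TOKEN are placeholders. Taking the predictor as a parameter keeps the scoring logic testable without network access, and the per-label breakdown is one way to track accuracy for specific groups of questions.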
Jason Durheim
Something I should have clarified first: we were using the Einstein Language - Intent APIs and not Einstein Vision :) We know those are still in Beta as well.

I understand that you can vary the percentage used for the test set. What I was getting at was more that I don't know (using the default as an example) which 10% of the data is used as the test set versus the other 90% as the training set. As it goes through the training iterations, is this 10% consistent? Or does it do some number of iterations on a particular 10% until optimal accuracy is achieved and then go through a new 10%, etc., to try to further refine the accuracy across that dataset?

To put it another way, with the current APIs it's hard to tune the accuracy for certain categories of questions if you aren't sure you have a consistently repeatable test set to compare changes in the training set against. In our light PoC, even when Einstein was getting the correct answer, it might have been only 60% sure, and we were looking to see if we could improve on that number. In other cases we saw that the ordering of the words in the question or the presence of specific words in the sentence might greatly change the probability, even to the point of changing the top returned answer. We also tried longer sentences versus shorter ones (even single words in some cases) to try to adjust the accuracy, but it was sometimes a little harder to see how those changes to the dataset affected the accuracy of certain groups of questions.

A large part of my question probably comes from the fact that I've worked internally on other AI platforms in the past, so I'm used to a setup with an explicitly separate training set and test set and knowing the accuracy results. Maybe that partially qualifies me as a "data scientist" who wants to see and work more with the guts of the training process, whereas Einstein is currently designed more for ease of accessibility without having to go to that level. So there's totally nothing wrong with these capabilities not being available (maybe even ever); I was mostly just curious whether there was any thought of 'advanced options' that might allow for a little more low-level work on tuning the accuracy of Einstein :)

Michael Machado 22
We apply consistent hashing to the training/test set split, so your test/train comparison will be consistent across multiple training jobs. In other words, changes to your dataset won't impact your existing test set, but new test examples can be added based on the ratio of new data.
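
To make the idea of a hash-based split concrete, here is a minimal Python sketch. It is not Einstein's actual implementation, which isn't described in this thread; it is plain deterministic bucketing by a hash of each example's text, which is one common way to get the behavior described above. The function names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def is_test_example(text, test_ratio=0.1):
    """Deterministically assign an example to the test set from a hash of its text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a roughly uniform number between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < test_ratio


def split(examples, test_ratio=0.1):
    """Split (text, label) pairs into (train, test) lists."""
    train, test = [], []
    for text, label in examples:
        target = test if is_test_example(text, test_ratio) else train
        target.append((text, label))
    return train, test
```

Because the assignment depends only on the example itself, retraining on the same data reproduces the same test set, and adding new examples never moves an existing example between sets; roughly test_ratio of the new examples land in the test set.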

As to your examples... Those are great ways to test and enhance your model. Since we use a neural net architecture, we aren't simply counting words; word order, synonyms, the size/length of the utterance, etc. are all variables that can be leveraged to enlarge your training data and enhance your models.

Glad to hear that we are checking off a lot of your requirements but do let us know how we can ensure success for your model tests.  We are constantly thinking of ways to provide more granular model metrics and visibility into potential errors in predictions. 


To follow up, for the language APIs, the default split is 0.8 or 80%. One way you can tweak it is to change the split ratio by passing an explicit value {"trainSplitRatio": 0.n} in the trainParams when you train the dataset. For more info on training, see
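
As a rough illustration of the trainParams value mentioned above, here is a small Python sketch that builds the form fields for a training request with an explicit split ratio. Only trainSplitRatio and trainParams come from this thread; the other field names (name, datasetId), the idea that trainParams is sent as a JSON string, and the helper name build_train_fields are assumptions to verify against the training docs.

```python
import json


def build_train_fields(dataset_id, model_name, train_split_ratio=0.8):
    """Hypothetical helper: build form fields for a training request."""
    if not 0.0 < train_split_ratio < 1.0:
        raise ValueError("train_split_ratio must be between 0 and 1 (exclusive)")
    return {
        "name": model_name,            # assumed field name
        "datasetId": str(dataset_id),  # assumed field name
        # trainParams carries the split ratio as a JSON string (assumed encoding).
        "trainParams": json.dumps({"trainSplitRatio": train_split_ratio}),
    }
```

For example, build_train_fields(1001, "intent-poc", 0.7) would request a 70% training / 30% test split; the dataset ID and model name here are made up.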