How To Split A Tensorflow Dataset Into Train, Test And Validation In A Python Script?
Solution 1:
tfds.Split.ALL.subsplit and tfds.Split.TRAIN.subsplit are apparently deprecated and no longer supported.
Some datasets are already split into train and test. For that case I found the following solution (using the Fashion-MNIST dataset as an example):
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True,
split=['train+test[:80]','train+test[80:90]', 'train+test[90:]'],
data_dir=filePath)
(train_examples, validation_examples, test_examples) = splits
EDIT AFTER COMMENTS
The previous code had some errors. First of all, the official documentation says:
Full dataset ('all'): 'all' is a special split name corresponding to the union of all splits (equivalent to 'train+test+...')
but when I tried it, it did not work. 'all' would be helpful, but there is an alternative.
The error in the previous code is that % must be used in the slices, and that a slice must be specified for each split being combined. I modified the code in this way:
import tensorflow_datasets as tfds
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True,
split=['train[:80%]+test[:80%]','train[80%:90%]+test[80%:90%]', 'train[90%:]+test[90%:]'],
data_dir='./')
# (train_examples, validation_examples, test_examples) = splits
for el in splits:
    print(el.cardinality())
which prints:
tf.Tensor(56000, shape=(), dtype=int64)
tf.Tensor(7000, shape=(), dtype=int64)
tf.Tensor(7000, shape=(), dtype=int64)
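The printed sizes can be checked by hand. The following is a minimal sketch, assuming the standard Fashion-MNIST sizes of 60,000 training and 10,000 test examples, and assuming every percent boundary falls on a whole example (true here, so rounding does not matter). The name expected_sizes is a hypothetical helper for illustration, not part of tfds:

```python
def expected_sizes(split_sizes, boundaries):
    """Return the example count of each percent range, summed over all splits.

    split_sizes: example counts of the original splits, e.g. [60000, 10000].
    boundaries:  percent cut points, e.g. [0, 80, 90, 100].
    """
    sizes = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        # Integer arithmetic; exact when each boundary lands on a whole example.
        sizes.append(sum(n * hi // 100 - n * lo // 100 for n in split_sizes))
    return sizes


# Assumed Fashion-MNIST sizes: 60000 train, 10000 test.
print(expected_sizes([60000, 10000], [0, 80, 90, 100]))  # [56000, 7000, 7000]
```

The 80% range takes 48,000 examples from train and 8,000 from test, which gives the 56,000 reported by cardinality(); each 10% range takes 6,000 + 1,000 = 7,000.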
Solution 2:
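In the case of the rock_paper_scissors dataset on tfds, the following works for me. Note that a slice without % counts examples rather than percent and applies only to the split it is attached to, so 'train+test[:80]' selects all of train plus the first 80 test examples: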
splits = ['train+test[:80]', 'train+test[80:90]', 'train+test[90:]']
splits, info = tfds.load( 'rock_paper_scissors', split=splits, as_supervised=True, with_info=True)
(train_examples, validation_examples, test_examples) = splits
num_examples = info.splits['train'].num_examples
num_classes = info.features['label'].num_classes
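Writing the split strings by hand is error-prone, as the corrections in Solution 1 show. A small helper can generate them instead. This is only a sketch: percent_splits is a hypothetical name introduced here, not a tfds function, and it assumes the same percent ranges should be cut from every named split:

```python
def percent_splits(split_names, boundaries):
    """Build split strings that slice every named split by the same percent ranges.

    split_names: names of the existing splits, e.g. ["train", "test"].
    boundaries:  percent cut points, e.g. [0, 80, 90, 100].
    """
    specs = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        # Leave the bound empty at 0% and 100%, matching the hand-written form.
        start = "" if lo == 0 else f"{lo}%"
        stop = "" if hi == 100 else f"{hi}%"
        specs.append("+".join(f"{name}[{start}:{stop}]" for name in split_names))
    return specs


for spec in percent_splits(["train", "test"], [0, 80, 90, 100]):
    print(spec)
```

The resulting list has the same form as the corrected split argument in Solution 1 and can be passed as split= to tfds.load.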