Task-based datasets, preprocessing, and evaluation for sequence models
SeqIO is a library for processing sequential data to be fed into downstream sequence models. It uses tf.data.Dataset to create scalable data pipelines but requires minimal use of TensorFlow. In particular, with one line of code, the returned dataset can be transformed into a NumPy iterator, so it is fully compatible with other frameworks such as JAX or PyTorch. Currently, SeqIO assumes that the dataset is a sequence, i.e., each feature is a one-dimensional array. Modalities such as text […]
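The NumPy conversion mentioned above can be sketched with plain tf.data. This is a minimal illustration, not SeqIO's own API: the toy `features` dictionary and its `inputs`/`targets` keys are assumptions standing in for whatever dataset a SeqIO pipeline returns, and the one-line conversion is assumed to be the standard `tf.data.Dataset.as_numpy_iterator()` method.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a dataset returned by a SeqIO pipeline:
# each example is a dict of one-dimensional integer features.
features = {
    "inputs": np.array([[3, 9, 4, 1], [7, 2, 1, 0]], dtype=np.int32),
    "targets": np.array([[5, 1, 0], [8, 6, 1]], dtype=np.int32),
}
ds = tf.data.Dataset.from_tensor_slices(features)

# The one-line conversion: iterate over dicts of NumPy arrays
# instead of TensorFlow tensors.
examples = list(ds.as_numpy_iterator())

for example in examples:
    # Each value is a plain np.ndarray, usable from JAX or PyTorch code.
    print({name: value.shape for name, value in example.items()})
```

Because the iterator yields ordinary NumPy arrays, downstream code does not need to import TensorFlow to consume the examples.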