Package ai.djl.basicdataset.nlp
Class GoEmotions
java.lang.Object
ai.djl.training.dataset.RandomAccessDataset
ai.djl.basicdataset.nlp.TextDataset
ai.djl.basicdataset.nlp.GoEmotions
- All Implemented Interfaces:
ai.djl.training.dataset.Dataset
GoEmotions is a corpus of 58k carefully curated comments extracted from Reddit, with human
annotations for 27 emotion categories or Neutral. This version of the data is filtered by
rater agreement on top of the raw data, and contains a train/test/validation split. The emotion
categories are: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity,
desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief,
joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise.
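As a rough illustration of how numeric labels can be mapped back to the category names listed above, the following sketch uses a hypothetical helper class, EmotionLabels, which is not part of DJL. It assumes that label indices follow the order of the list above and that Neutral is appended as the last index (27); verify that ordering against the data files before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical, not part of DJL) that maps label indices to emotion names. */
final class EmotionLabels {

    // Assumption: indices follow the order listed in the class description,
    // with "neutral" appended last (index 27).
    static final List<String> NAMES = List.of(
            "admiration", "amusement", "anger", "annoyance", "approval", "caring", "confusion",
            "curiosity", "desire", "disappointment", "disapproval", "disgust", "embarrassment",
            "excitement", "fear", "gratitude", "grief", "joy", "love", "nervousness", "optimism",
            "pride", "realization", "relief", "remorse", "sadness", "surprise", "neutral");

    private EmotionLabels() {}

    /** Converts label indices into their category names, rejecting out-of-range values. */
    static List<String> namesOf(int... indices) {
        List<String> result = new ArrayList<>();
        for (int index : indices) {
            if (index < 0 || index >= NAMES.size()) {
                throw new IllegalArgumentException("Unknown emotion index: " + index);
            }
            result.add(NAMES.get(index));
        }
        return result;
    }
}
```

Under the stated ordering assumption, EmotionLabels.namesOf(0, 17) returns the names admiration and joy.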
Nested Class Summary
Nested classes/interfaces inherited from class ai.djl.basicdataset.nlp.TextDataset
TextDataset.Sample
Nested classes/interfaces inherited from class ai.djl.training.dataset.RandomAccessDataset
ai.djl.training.dataset.RandomAccessDataset.BaseBuilder<T extends ai.djl.training.dataset.RandomAccessDataset.BaseBuilder<T>>
Nested classes/interfaces inherited from interface ai.djl.training.dataset.Dataset
ai.djl.training.dataset.Dataset.Usage
Field Summary
Fields inherited from class ai.djl.basicdataset.nlp.TextDataset
manager, mrl, prepared, samples, sourceTextData, targetTextData, usage
Fields inherited from class ai.djl.training.dataset.RandomAccessDataset
dataBatchifier, device, labelBatchifier, limit, pipeline, prefetchNumber, sampler, targetPipeline
Method Summary
Modifier and Type               Method                                             Description
protected long                  availableSize()                                    Returns the number of records available to be read in this Dataset.
static GoEmotions.Builder       builder()                                          Creates a builder to build a GoEmotions.
ai.djl.training.dataset.Record  get(ai.djl.ndarray.NDManager manager, long index)  Gets the Record for the given index from the dataset.
void                            prepare(ai.djl.util.Progress progress)             Prepares the dataset for use with tracked progress.
Methods inherited from class ai.djl.basicdataset.nlp.TextDataset
getProcessedText, getRawText, getSamples, getTextEmbedding, getVocabulary, preprocess
Methods inherited from class ai.djl.training.dataset.RandomAccessDataset
getData, getData, getData, getData, newSubDataset, newSubDataset, randomSplit, size, subDataset, subDataset, subDataset, subDataset, toArray
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ai.djl.training.dataset.Dataset
matchingTranslatorOptions, prepare
Method Details
prepare
public void prepare(ai.djl.util.Progress progress) throws IOException, ai.djl.modality.nlp.embedding.EmbeddingException
Prepares the dataset for use with tracked progress. In this method the TSV file will be parsed. All datasets will be preprocessed.
- Parameters:
progress - the progress tracker
- Throws:
IOException - for various exceptions depending on the dataset
ai.djl.modality.nlp.embedding.EmbeddingException
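To make the TSV parsing step more concrete, here is a minimal sketch of how a single row might be split into text and label ids. It is not the DJL implementation. The EmotionRow class and its parse method are hypothetical names, and the column layout (comment text, then comma-separated label ids, then a comment id) is an assumption about the data file rather than something stated on this page.

```java
import java.util.Arrays;

/** Hypothetical holder for one parsed row; not a DJL type. */
final class EmotionRow {
    final String text;
    final int[] labelIds;

    EmotionRow(String text, int[] labelIds) {
        this.text = text;
        this.labelIds = labelIds;
    }

    // Assumption: each row is "text<TAB>comma-separated label ids<TAB>comment id".
    static EmotionRow parse(String line) {
        // A limit of -1 keeps trailing empty columns instead of silently dropping them.
        String[] columns = line.split("\t", -1);
        if (columns.length < 2) {
            throw new IllegalArgumentException("Expected at least two tab-separated columns");
        }
        // A comment may carry several labels, so the second column is split on commas.
        int[] ids = Arrays.stream(columns[1].split(","))
                .map(String::trim)
                .mapToInt(Integer::parseInt)
                .toArray();
        return new EmotionRow(columns[0], ids);
    }
}
```

Splitting on the tab character only, rather than on general whitespace, keeps commas and spaces inside the comment text intact.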
get
public ai.djl.training.dataset.Record get(ai.djl.ndarray.NDManager manager, long index) throws IOException
Gets the Record for the given index from the dataset.
- Specified by:
get in class ai.djl.training.dataset.RandomAccessDataset
- Parameters:
manager - the manager used to create the arrays
index - the index of the requested data item
- Returns:
a Record that contains the data and label of the requested data item. The data NDList contains the embedded comment text, and the label NDList contains the emotion label(s) annotated for that comment.
- Throws:
IOException
availableSize
protected long availableSize()
Returns the number of records available to be read in this Dataset. In this implementation, the actual number of available records is the size of questionInfoList.
- Specified by:
availableSize in class ai.djl.training.dataset.RandomAccessDataset
- Returns:
the number of records available to be read in this Dataset
builder
public static GoEmotions.Builder builder()
Creates a builder to build a GoEmotions.
- Returns:
a new builder