PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams
Motivation

When dataset freshness is critical, annotating high-speed unlabelled data streams becomes essential, yet it remains an open problem. We propose PLStream, a novel Apache Flink-based framework for fast polarity labelling of massive data streams such as Twitter tweets or online product reviews.

Environment Requirements

The required Python packages are summarized in requirements.txt.
Flink v1.13
Python 3.7
Java 8

DataSource

Tweets
Yelp Reviews
Amazon Reviews

Quick Start

Quickly try PLStream on the Yelp review dataset.

Data Prepare

cd PLStream
wget https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polarity_csv.tgz
[…]
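Once the archive from the Data Prepare step has been downloaded and extracted, it can help to check the review data before streaming it. The following is a minimal sketch, not part of PLStream itself. It assumes the layout of the public Yelp review polarity CSV release: two quoted columns per row, a class index (1 = negative, 2 = positive) and the review text, with newlines inside reviews written as a literal backslash followed by "n". The function name read_reviews and the sample rows are hypothetical.

```python
import csv
import io

# Assumption: each row is "<class index>","<review text>", with 1 = negative
# and 2 = positive, as in the public Yelp review polarity CSV release.
LABELS = {"1": "negative", "2": "positive"}


def read_reviews(handle):
    """Yield (polarity, text) pairs from an open CSV file handle."""
    for row in csv.reader(handle):
        if len(row) != 2:
            continue  # skip rows that do not have exactly two columns
        label, text = row
        # Turn the escaped two-character sequence \n back into a real newline.
        yield LABELS.get(label, "unknown"), text.replace("\\n", "\n")


# Tiny in-memory sample standing in for the extracted CSV (hypothetical rows).
sample = io.StringIO(
    '"1","Cold food.\\nSlow service."\n'
    '"2","Great ""tacos"" and staff."\n'
)
reviews = list(read_reviews(sample))

for polarity, text in reviews:
    print(polarity, repr(text))
```

To run the same check on real data, pass a file opened with open(path, newline="", encoding="utf-8") instead of the in-memory sample, where path points at wherever the CSV files were extracted.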