An implementation of offline RL for recommender systems
@author: misajie
@update: 20220123

File organization:
- RecEnv
- ClassicalRL
- OfflineRL

In progress:
- Build classical off-policy models and apply them to existing environments (RecSim, Virtual Taobao)
- Reconstruct a simulator-free model, e.g. FeedRec
- Modify RecSim to fit the WeChat short video dataset, run the off-policy models on it, and evaluate the results
- Generate replay samples from the short video recommendation environment
- Build classical offline models
- Build the original offline model
- Evaluate the new model
- Add AutoML