Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

ABINet

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

The official code of ABINet (CVPR 2021, Oral).

ABINet uses a vision model and an explicit language model, trained end-to-end, to recognize text in the wild. The language model (BCN) learns a bidirectional language representation by simulating a cloze test, and additionally applies an iterative correction strategy.
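
To make the cloze-test idea concrete, here is a minimal, illustrative sketch in plain Python. It is not the BCN implementation from this repository. It assumes the cloze behaviour can be expressed as an additive attention mask that hides each character from itself, so every position is predicted from its left and right context only. The function names `cloze_attention_mask` and `cloze_contexts` are hypothetical and chosen for this example.

```python
import math


def cloze_attention_mask(length):
    """Build a length x length additive mask (assumed form, for illustration).

    Entry [i][j] is -inf when i == j (position i may not look at itself)
    and 0.0 otherwise (every other position is visible).
    """
    return [
        [-math.inf if i == j else 0.0 for j in range(length)]
        for i in range(length)
    ]


def cloze_contexts(text):
    """Return (target, visible_context) pairs, one per character.

    The visible context is the input string with the target position
    replaced by "_", i.e. what a cloze-style predictor would see.
    """
    mask = cloze_attention_mask(len(text))
    contexts = []
    for i, row in enumerate(mask):
        visible = "".join(
            ch if row[j] == 0.0 else "_" for j, ch in enumerate(text)
        )
        contexts.append((text[i], visible))
    return contexts


for target, visible in cloze_contexts("text"):
    print(f"predict {target!r} from {visible!r}")
```

Each printed line shows one blank to fill, with characters on both sides of the blank still visible, which is the sense in which the representation is bidirectional.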

Runtime Environment

We provide a pre-built Docker image, built from docker/Dockerfile.

Running in Docker

$ git clone git@github.com:FangShancheng/ABINet.git
$ docker run --gpus all --rm -ti --ipc=host -v $(pwd)/ABINet:/app fangshancheng/fastai:torch1.1 /bin/bash

(Untested) Alternatively, install the dependencies directly:
```
pip install -r requirements.txt
```
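
Since the pip route is marked untested, it can help to confirm that every listed package actually got installed. The following is a small sketch using only the Python standard library. It is not part of this repository, the helper names are hypothetical, and the parser is deliberately simple: it handles plain `name==version` style lines and skips comments and option lines such as `-r other.txt`.

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Extract distribution names from requirements-style text."""
    names = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options (e.g. "-r", "--index-url").
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Example usage, run from the repository root:
#   with open("requirements.txt") as f:
#       print(missing_packages(parse_requirement_names(f.read())))
```

An empty list means every named package is installed. The check does not verify version constraints.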

Datasets

Training datasets
1. MJSynth (MJ):
  