A collection of text processing tools for Python

ChirpText

ChirpText is a collection of text processing tools for Python 3.

It is not meant to be a powerful tank like the popular NLTK, but a small package that you can pip-install anywhere to process textual data in a few lines of code.

  • Simple file data manipulation using an enhanced open() function (txt, gz, binary, etc.)
  • CSV helper functions
  • Japanese text parsing with the MeCab library (the mecab-python3 package is not required, even on Windows; only a binary release such as mecab.exe is needed)
  • Built-in “lite” text annotation formats (texttaglib TTL/CSV and TTL/JSON)
  • Helper functions and useful data for processing English, Japanese, Chinese and Vietnamese
  • Application configuration file management that can make an educated guess about where config files are located
  • Quick text-based report generation
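
To make the "enhanced open()" idea in the list above concrete, here is a minimal sketch built only on the Python standard library. It is not ChirpText's actual API: the name `open_any` and its parameters are hypothetical and chosen purely for illustration. It shows one plausible way a single function can handle plain text, gzip and binary files by looking at the file extension and the mode.

```python
import gzip
from pathlib import Path


def open_any(path, mode="rt", encoding="utf-8"):
    """Hypothetical helper (NOT ChirpText's API).

    Opens plain or gzip-compressed files with one call, choosing the
    opener from the file extension.
    """
    path = Path(path)
    binary = "b" in mode
    # Default to text mode, e.g. "r" -> "rt", so gzip files behave like
    # ordinary text files unless binary mode is requested explicitly.
    if not binary and "t" not in mode:
        mode += "t"
    opener = gzip.open if path.suffix == ".gz" else open
    if binary:
        # Binary mode must not receive an encoding argument.
        return opener(path, mode)
    return opener(path, mode, encoding=encoding)
```

With a helper like this, `open_any("notes.txt")` and `open_any("notes.txt.gz")` can be used interchangeably in a `with` statement, which is the kind of convenience the feature list describes. Consult the ChirpText documentation for the function the package actually provides.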

chirptext is available on PyPI.