ChirpText
ChirpText is a collection of text processing tools for Python 3.
It is not meant to be a heavyweight toolkit like the popular NLTK, but a small package that you can pip-install anywhere and use to process textual data in a few lines of code.
- Simple file data manipulation using an enhanced `open()` function (txt, gz, binary, etc.)
- CSV helper functions
- Parse Japanese text with the MeCab library (does not require the `mecab-python3` package, even on Windows; only a binary release, i.e. `mecab.exe`, is required)
- Built-in “lite” text annotation formats (`texttaglib` TTL/CSV and TTL/JSON)
- Helper functions and useful data for processing English, Japanese, Chinese and Vietnamese
- Application configuration file management that can make an educated guess about where config files are located
- Quick text-based report generation
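
To illustrate the idea behind an "enhanced `open()`", here is a minimal sketch that uses only the Python standard library. It is not ChirpText's own API: the function name `open_text` and its parameters are hypothetical, chosen only to show how one call can read and write both plain and gzip-compressed files based on the file extension.

```python
import gzip
from pathlib import Path


def open_text(path, mode="r", encoding="utf-8"):
    """Open a plain or gzip-compressed file, chosen by file extension.

    Hypothetical helper for illustration only; not part of chirptext.
    """
    path = Path(path)
    # gzip.open() treats a bare "r"/"w" as binary, so make text mode explicit.
    if "b" not in mode and "t" not in mode:
        mode += "t"
    # An encoding may only be passed in text mode.
    enc = None if "b" in mode else encoding
    if path.suffix == ".gz":
        return gzip.open(path, mode, encoding=enc)
    return open(path, mode, encoding=enc)
```

With a helper like this, `open_text("notes.txt")` and `open_text("notes.txt.gz")` can be used interchangeably in a `with` block, and the calling code does not need to know whether the file is compressed.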
`chirptext` is available on PyPI and