A Python library that helps you read text from an unknown charset encoding

Charset Normalizer

A library that helps you read text from an unknown charset encoding. The project was motivated by chardet; I'm trying to resolve the same issue by taking another approach. All IANA character set names for which the Python core library provides codecs are supported.


Introduction

This library aims to assist you in finding the encoding that best suits your content. It DOES NOT try to uncover the originating encoding; in fact, this program does not care about it.

By originating, we mean the encoding that was precisely used to encode a text file.

Precisely

my_byte_str = "Bonjour, je suis à la recherche d'une aide sur les étoiles".encode('cp1252')

We ARE NOT looking for cp1252 BUT FOR "Bonjour, je suis à la recherche d'une aide sur les étoiles".
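To make the distinction concrete, here is a minimal sketch using only Python's built-in codecs. It is not how the library works internally. The candidate list is a hypothetical choice made for this illustration, and the comparison against the original string is only possible because we produced the bytes ourselves; a real detector has no such reference and must judge readability some other way.

```python
# The sample text from the example above, encoded with cp1252.
text = "Bonjour, je suis à la recherche d'une aide sur les étoiles"
my_byte_str = text.encode("cp1252")

# Hypothetical candidate encodings, picked only for this illustration.
candidates = ["cp1252", "latin_1", "utf_8", "cp1251"]

readable_with = []
for name in candidates:
    try:
        decoded = my_byte_str.decode(name)
    except UnicodeDecodeError:
        # The bytes are not valid in this encoding at all.
        continue
    # Some encodings decode without error but yield different characters.
    if decoded == text:
        readable_with.append(name)

print(readable_with)
```

More than one encoding recovers exactly the same text, so the name of the encoding originally used matters less than getting the readable content back.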

 

 

 
