path: root/contrib/unaccent/generate_unaccent_rules.py
Age | Commit message | Author
2019-02-01 | Add combining characters to unaccent.rules. | Thomas Munro
Strip certain classes of combining characters, so that accents encoded this way are removed.
Author: Hugh Ranalli
Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f%40postgresql.org
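The idea of stripping combining characters can be sketched in a few lines of Python. This is only an illustration, not the code in generate_unaccent_rules.py: the log entry does not say which classes of combining characters are covered, so the sketch assumes every character whose Unicode general category starts with "M" (Mn, Mc, Me) is dropped, and the function name `strip_combining` is hypothetical.

```python
import unicodedata


def strip_combining(text):
    """Drop every character whose general category is a mark (Mn, Mc, Me).

    Assumption: the real rules cover only certain classes of combining
    characters; this sketch removes all of them for simplicity.
    """
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("M")
    )


if __name__ == "__main__":
    # "e" followed by U+0301 COMBINING ACUTE ACCENT loses the accent.
    print(strip_combining("e\u0301"))
```

Text that uses precomposed characters such as U+00E9 is unaffected by this function, because a precomposed letter is a single code point in category Ll or Lu, not a mark.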
2019-01-10 | Update unaccent rules with release 34 of CLDR for Latin-ASCII.xml | Michael Paquier
This required an update of the Python script generating the rules, as the file format changed in release 29. This release also added new punctuation and symbols, and a new set of rules has been generated to include them. The way to find the newest versions of Latin-ASCII is also documented more clearly.
Author: Hugh Ranalli, Michael Paquier
Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f@postgresql.org
2019-01-04 | unaccent: Make generate_unaccent_rules.py Python 3 compatible | Peter Eisentraut
Python 2 is still supported.
Author: Hugh Ranalli <hugh@whtc.ca>
Discussion: https://www.postgresql.org/message-id/CAAhbUMNyZ+PhNr_mQ=G161K0-hvbq13Tz2is9M3WK+yX9cQOCw@mail.gmail.com
2018-09-02 | Add Greek characters to unaccent.rules. | Thomas Munro
Author: Tasos Maschalidis
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/153495048900.1368.11566580687623014380%40wrigleys.postgresql.org
Discussion: https://postgr.es/m/VI1PR01MB38537EBD529FE5EE3FE9A5FEB5370%40VI1PR01MB3853.eurprd01.prod.exchangelabs.com
2017-08-16 | Extend the default rules file for contrib/unaccent with Vietnamese letters. | Tom Lane
Improve generate_unaccent_rules.py to handle composed characters whose base is another composed character rather than a plain letter. The net effect of this is to add a bunch of multi-accented Vietnamese characters to unaccent.rules.
Original complaint from Kha Nguyen, diagnosis of the script's shortcoming by Thomas Munro.
Dang Minh Huong and Michael Paquier
Discussion: https://postgr.es/m/CALo3sF6EC8cy1F2JUz=GRf5h4LMUJTaG3qpdoiLrNbWEXL-tRg@mail.gmail.com
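The case described in this entry, a composed character whose base is itself composed, can be illustrated with Python's standard unicodedata module. This is a minimal sketch of the general idea and not the script's actual implementation; the function name `base_letter` is hypothetical, and the sketch assumes that following the first code point of each canonical decomposition is enough to reach the plain letter.

```python
import unicodedata


def base_letter(ch):
    """Follow canonical decompositions until a character has none left.

    Assumption: the first code point of each canonical decomposition is
    the base; compatibility decompositions (tagged "<...>") are ignored.
    """
    decomposition = unicodedata.decomposition(ch)
    if not decomposition or decomposition.startswith("<"):
        return ch
    first = chr(int(decomposition.split()[0], 16))
    # The base may itself be a composed character, so recurse.
    return base_letter(first)


if __name__ == "__main__":
    # U+1EBF decomposes to U+00EA U+0301, and U+00EA to U+0065 U+0302.
    print(base_letter("\u1ebf"))
```

A single-step lookup would stop at U+00EA for the Vietnamese letter U+1EBF; the recursion is what carries it through to the plain "e".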
2016-03-16 | fix typo in comment | Teodor Sigaev
2016-03-16 | Improve script generating unaccent rules | Teodor Sigaev
The script now uses the standard Unicode transliterator Latin-ASCII.
Author: Leonard Benedetti
2015-09-04 | Make unaccent handle all diacritics known to Unicode, and expand ligatures correctly | Teodor Sigaev
Add Python script for building unaccent.rules from Unicode data. Don't backpatch because unaccent changes may require tsvector/index rebuild.
Thomas Munro <thomas.munro@enterprisedb.com>