path: root/py/unicode.c
Age | Commit message | Author
2024-03-07 | all: Remove the "STATIC" macro and just use "static" instead. | Angus Gratton
The STATIC macro was introduced a very long time ago in commit d5df6cd44a433d6253a61cb0f987835fbc06b2de. The original reason for this was to have the option to define it to nothing so that all static functions become global functions and therefore visible to certain debug tools, so one could do function size comparison and other things.

This STATIC feature is rarely (if ever) used. And with the use of LTO and heavy inline optimisation, analysing the size of individual functions when they are not static is not a good representation of the size of code when fully optimised.

So the macro does not have much use and it's simpler to just remove it. Then you know exactly what it's doing. For example, newcomers don't have to learn what the STATIC macro is and why it exists. Reading the code is also less "loud" with a lowercase static.

One other minor point in favour of removing it, is that it stops bugs with `STATIC inline`, which should always be `static inline`.

Methodology for this commit was:

1) git ls-files | egrep '\.[ch]$' | \
   xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/"

2) Do some manual cleanup in the diff by searching for the word STATIC in comments and changing those back.

3) "git-grep STATIC docs/", manually fixed those cases.

4) "rg -t python STATIC", manually fixed codegen lines that used STATIC.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
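For readers who never saw the old convention, the following is a minimal sketch of what such a macro might look like and what the change amounts to. The macro definition and the function names `add_one_old` and `add_one_new` are illustrative assumptions, not copied from the MicroPython tree.

```c
// Hypothetical sketch of the old convention; not taken from MicroPython.
// Normal builds: STATIC expands to the keyword, giving internal linkage.
// An "analysis build" could define STATIC to nothing (e.g. -DSTATIC=)
// so that every function becomes global and visible to debug tools.
#ifndef STATIC
#define STATIC static
#endif

// Old style: the linkage depends on how STATIC was defined for this build.
STATIC int add_one_old(int x) {
    return x + 1;
}

// New style after this commit: the plain keyword, nothing to look up.
static int add_one_new(int x) {
    return x + 1;
}
```

With an empty STATIC, a declaration written as `STATIC inline` would turn into a bare `inline`, which has different linkage rules in C99 than `static inline`. That is presumably the class of bug the commit message refers to.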
2020-02-28 | all: Reformat C and Python source code with tools/codeformat.py. | Damien George
This is run with uncrustify 0.70.1, and black 19.10b0.
2020-01-12 | py/unicode: Add unichar_isalnum(). | Yonatan Goldschmidt
2018-11-26 | py/unicode: Fix check for valid utf8 being stricter about contn chars. | Damien George
2018-02-14 | py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions. | Damien George
This patch provides inline versions of the utf8 helper functions for the case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0). This saves code size. The unichar_charlen function is also renamed to utf8_charlen to match the other utf8 helper functions, and the signature of this function is adjusted for consistency (const char* -> const byte*, mp_uint_t -> size_t).
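As a rough illustration of the pattern described above, the sketch below switches between a real UTF-8 implementation and a trivial inline one. The config macro name and the `const byte *` / `size_t` signature come from the commit message; the function bodies, the `byte` typedef, and the default value of the macro are assumptions made for this example.

```c
#include <stddef.h>

typedef unsigned char byte; // assumed definition for this sketch

// Config switch named in the commit message; defaulting it to 0 here is
// an assumption so that the sketch compiles on its own.
#ifndef MICROPY_PY_BUILTINS_STR_UNICODE
#define MICROPY_PY_BUILTINS_STR_UNICODE (0)
#endif

#if MICROPY_PY_BUILTINS_STR_UNICODE
// Unicode enabled: count characters by skipping continuation bytes,
// which have the bit pattern 10xxxxxx.
static size_t utf8_charlen(const byte *str, size_t len) {
    size_t charlen = 0;
    for (size_t i = 0; i < len; i++) {
        if ((str[i] & 0xc0) != 0x80) {
            charlen++;
        }
    }
    return charlen;
}
#else
// Unicode disabled: every byte is one character, so the helper is
// trivial and can be inlined at each call site.
static inline size_t utf8_charlen(const byte *str, size_t len) {
    (void)str;
    return len;
}
#endif
```

Because the disabled variant collapses to `return len`, the compiler can remove the call entirely, which is where the code-size saving would come from.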
2017-09-06 | py/objstr: Add check for valid UTF-8 when making a str from bytes. | tll
This patch adds a function utf8_check() to check for a valid UTF-8 encoded string, and calls it when constructing a str from raw bytes. The feature is selectable at compile time via MICROPY_PY_BUILTINS_STR_UNICODE_CHECK and is enabled if unicode is enabled. It costs about 110 bytes on Thumb-2, 150 bytes on Xtensa and 170 bytes on x86-64.
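A check of this kind can be sketched as a single pass that tracks how many continuation bytes are still expected. The code below is an illustrative assumption, not the `utf8_check()` from this commit: the name `utf8_check_sketch` is made up, and it validates structure only, without rejecting overlong encodings or surrogate code points.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned char byte; // assumed definition for this sketch

// Structural UTF-8 check: each lead byte must be followed by the right
// number of continuation bytes (bit pattern 10xxxxxx).
static bool utf8_check_sketch(const byte *p, size_t len) {
    size_t need = 0; // continuation bytes still expected
    for (size_t i = 0; i < len; i++) {
        byte c = p[i];
        if (need > 0) {
            if ((c & 0xc0) != 0x80) {
                return false; // expected a continuation byte
            }
            need--;
        } else if (c < 0x80) {
            // ASCII byte, nothing to track
        } else if ((c & 0xe0) == 0xc0) {
            need = 1; // 110xxxxx starts a 2-byte sequence
        } else if ((c & 0xf0) == 0xe0) {
            need = 2; // 1110xxxx starts a 3-byte sequence
        } else if ((c & 0xf8) == 0xf0) {
            need = 3; // 11110xxx starts a 4-byte sequence
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
    }
    return need == 0; // a sequence cut off at the end is invalid
}
```

A caller constructing a str from raw bytes would run such a check first and raise an error when it returns false.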
2017-07-31 | all: Use the name MicroPython consistently in comments | Alexander Steffen
There were several different spellings of MicroPython present in comments, when there should be only one.
2016-12-28 | py/unicode: Comment-out unused function unichar_isprint. | Damien George
2016-02-17 | py/repl: Check for an identifier char after the keyword. | Alex March
- As described in #1850.
- Add cmdline tests.
2015-05-20 | py: Minor improvement to unichar_isxdigit | Dave Hylands
This drops the size of unichar_isxdigit from 0x1e + 0x02 filler to 0x14 bytes (a net code reduction of 12 bytes) and makes unichar_isxdigit slightly faster.
2015-05-20 | extmod: Add ubinascii.unhexlify | Dave Hylands
This also pulls hex_digit out of py/lexer.c and turns it into unichar_hex_digit.
2015-04-09 | py: Adjust some spaces in code style/format, purely for consistency. | Damien George
2015-01-01 | py: Move to guarded includes, everywhere in py/ core. | Damien George
Addresses issue #1022.
2014-12-10 | py: Tidy up a few function declarations. | Damien George
2014-07-03 | Rename machine_(u)int_t to mp_(u)int_t. | Damien George
See discussion in issue #50.
2014-06-28 | py: Make unichar_charlen() accept/return machine_uint_t. | Paul Sokolovsky
2014-06-28 | py: Small comments, name changes, use of machine_int_t. | Damien George
2014-06-27 | unicode: Make get_char()/next_char()/charlen() be 8-bit compatible. | Paul Sokolovsky
Based on config define.
2014-06-27 | unicode: Add utf8_ptr_to_index(). | Paul Sokolovsky
Useful when we have pointer to char inside string, but need to return char index. (E.g. str.find()).
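The pointer-to-index conversion described here can be sketched by counting the characters that start before the pointer. This is an assumed illustration only: the name `ptr_to_index_sketch` and the `byte` typedef are made up for the example, and the real `utf8_ptr_to_index()` may differ in signature and detail.

```c
#include <stddef.h>

typedef unsigned char byte; // assumed definition for this sketch

// Return the character index of ptr within the UTF-8 string starting at s.
// Assumes s <= ptr and that ptr points at the first byte of a character.
static size_t ptr_to_index_sketch(const byte *s, const byte *ptr) {
    size_t index = 0;
    while (s < ptr) {
        // Any byte that is not a continuation byte (10xxxxxx) starts a
        // new character, so it advances the character index.
        if ((*s & 0xc0) != 0x80) {
            index++;
        }
        s++;
    }
    return index;
}
```

For a method like str.find(), the search would produce a byte pointer to the match, and a helper of this shape would translate it into the character index returned to Python code.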
2014-06-27 | py: Implement basic unicode functions. | Chris Angelico
2014-06-21 | py: Include mpconfig.h before all other includes. | Paul Sokolovsky
It defines types used by all other headers. Fixes #691.
2014-06-14 | unicode: String API is const byte*. | Paul Sokolovsky
We still have that char vs byte dichotomy, but the majority of string operations now use byte.
2014-05-11 | py: Rename some unichar functions for consistency. | Damien George
2014-05-10 | objstr: Implement .lower() and .upper(). | Paul Sokolovsky
2014-05-03 | Add license header to (almost) all files. | Damien George
Blanket wide to all .c and .h files. Some files originating from ST are difficult to deal with (license wise) so it was left out of those. Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-04-10 | py: Make form-feed character a space (following C isspace). | Damien George
Eg, in CPython stdlib, email/header.py has a form-feed character.
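To make the behaviour concrete, here is a minimal sketch of a whitespace test that follows C's isspace in the "C" locale, including form feed. The function name `is_space_sketch` is hypothetical, and the actual unicode.c code may use a different mechanism (such as a lookup table) to reach the same result.

```c
#include <stdbool.h>

// Whitespace set of C's isspace() in the "C" locale:
// space, tab, newline, vertical tab, form feed, carriage return.
static bool is_space_sketch(unsigned int c) {
    return c == ' ' || c == '\t' || c == '\n'
        || c == '\v' || c == '\f' || c == '\r';
}
```

With form feed in the set, a lexer that skips whitespace would pass over a stray `\f` in a source file instead of reporting an invalid character.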
2014-02-12 | Replace global "static" -> "STATIC", to allow "analysis builds". Part 2. | Paul Sokolovsky
2014-01-22 | Implement octal and hex escapes in strings. | Paul Sokolovsky
2013-12-30 | Put unicode functions in unicode.c, and tidy their names. | Damien George