summaryrefslogtreecommitdiff
path: root/py/objarray.c
AgeCommit message (Collapse)Author
2024-04-22py/objarray: Fix use-after-free if extending a bytearray from itself.Angus Gratton
Two cases, one assigning to a slice. Closes https://github.com/micropython/micropython/issues/13283 Second is extending a slice from itself, similar logic. In both cases the problem occurs when m_renew causes realloc to move the buffer, leaving a dangling pointer behind. There are more complex and hard to fix cases when either argument is a memoryview into the buffer, currently resizing to a new address breaks memoryviews into that object. Reproducing this bug and confirming the fix was done by running the unix port under valgrind with GC-aware extensions. Note in default configurations with GIL this bug exists but has no impact (the free buffer won't be reused while the function is still executing, and is no longer referenced after it returns). Signed-off-by: Angus Gratton <angus@redyak.com.au>
2024-03-07all: Remove the "STATIC" macro and just use "static" instead.Angus Gratton
The STATIC macro was introduced a very long time ago in commit d5df6cd44a433d6253a61cb0f987835fbc06b2de. The original reason for this was to have the option to define it to nothing so that all static functions become global functions and therefore visible to certain debug tools, so one could do function size comparison and other things. This STATIC feature is rarely (if ever) used. And with the use of LTO and heavy inline optimisation, analysing the size of individual functions when they are not static is not a good representation of the size of code when fully optimised. So the macro does not have much use and it's simpler to just remove it. Then you know exactly what it's doing. For example, newcomers don't have to learn what the STATIC macro is and why it exists. Reading the code is also less "loud" with a lowercase static. One other minor point in favour of removing it, is that it stops bugs with `STATIC inline`, which should always be `static inline`. Methodology for this commit was: 1) git ls-files | egrep '\.[ch]$' | \ xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/" 2) Do some manual cleanup in the diff by searching for the word STATIC in comments and changing those back. 3) "git-grep STATIC docs/", manually fixed those cases. 4) "rg -t python STATIC", manually fixed codegen lines that used STATIC. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
2023-05-19py/objarray: Disallow memoryview addition.Damien George
Following CPython. This is important for subsequent commits to work correctly. Signed-off-by: Damien George <damien@micropython.org>
2023-01-20py/objarray: Raise error on out-of-bound memoryview slice start.Andrew Leech
32-bit platforms only support a slice offset start of 24 bit max due to the limited size of the mp_obj_array_t.free member. Similarly on 64-bit platforms the limit is 56 bits. This commit adds an OverflowError if the user attempts to slice a memoryview beyond this limit. Signed-off-by: Damien George <damien@micropython.org>
2022-11-08py/objarray: Detect bytearray(str) without an encoding.Jim Mussared
This prevents a very subtle bug caused by writing e.g. `bytearray('\xfd')` which gives you `(0xc3, 0xbd)`. This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19py/obj: Convert make_new into a mp_obj_type_t slot.Jim Mussared
Instead of being an explicit field, it's now a slot like all the other methods. This is a marginal code size improvement because most types have a make_new (100/138 on PYBV11), however it improves consistency in how types are declared, removing the special case for make_new. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19py/obj: Merge getiter and iternext mp_obj_type_t slots.Jim Mussared
The goal here is to remove a slot (making way to turn make_new into a slot) as well as reduce code size by the ~40 references to mp_identity_getiter and mp_stream_unbuffered_iter. This introduces two new type flags: - MP_TYPE_FLAG_ITER_IS_ITERNEXT: This means that the "iter" slot in the type is "iternext", and should use the identity getiter. - MP_TYPE_FLAG_ITER_IS_CUSTOM: This means that the "iter" slot is a pointer to a mp_getiter_iternext_custom_t instance, which then defines both getiter and iternext. And a third flag that is the OR of both, MP_TYPE_FLAG_ITER_IS_STREAM: This means that the type should use the identity getiter, and mp_stream_unbuffered_iter as iternext. Finally, MP_TYPE_FLAG_ITER_IS_GETITER is defined as a no-op flag to give the default case where "iter" is "getiter". Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19py/obj: Add accessors for type slots and use everywhere.Jim Mussared
This is a no-op, but sets the stage for changing the mp_obj_type_t representation. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19all: Remove unnecessary locals_dict cast.Jim Mussared
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19all: Fix #if inside MP_DEFINE_CONST_OBJ_TYPE for msvc.Jim Mussared
Changes: MP_DEFINE_CONST_OBJ_TYPE( ... #if FOO ... #endif ... ); to: MP_DEFINE_CONST_OBJ_TYPE( ... FOO_TYPE_ATTR ... ); Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19all: Make all mp_obj_type_t defs use MP_DEFINE_CONST_OBJ_TYPE.Jim Mussared
In preparation for upcoming rework of mp_obj_type_t layout. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-09-19all: Simplify buffer protocol to just a "get buffer" callback.Jim Mussared
The buffer protocol type only has a single member, and this existing layout creates problems for the upcoming split/slot-index mp_obj_type_t layout optimisations. If we need to make the buffer protocol more sophisticated in the future either we can rely on the mp_obj_type_t optimisations to just add additional slots to mp_obj_type_t or re-visit the buffer protocol then. This change is a no-op in terms of generated code. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-08-12py/objstr: Add hex/fromhex to bytes/memoryview/bytearray.Jim Mussared
These were added in Python 3.5. Enabled via MICROPY_PY_BUILTINS_BYTES_HEX, and enabled by default for all ports that currently have ubinascii. Rework ubinascii to use the implementation of these methods. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-08-11py/objstr: Consolidate methods for str/bytes/bytearray/array.Andrew Leech
This commit adds the bytes methods to bytearray, matching CPython. The existing implementations of these methods for str/bytes are reused for bytearray with minor updates to match CPython return types. For details on the CPython behaviour see https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations The work to merge locals tables for str/bytes/bytearray/array was done by @jimmo. Because of this merging of locals the change in code size for this commit is mostly negative: bare-arm: +0 +0.000% minimal x86: +29 +0.018% unix x64: -792 -0.128% standard[incl -448(data)] unix nanbox: -436 -0.078% nanbox[incl -448(data)] stm32: -40 -0.010% PYBV10 cc3200: -32 -0.017% esp8266: -28 -0.004% GENERIC esp32: -72 -0.005% GENERIC[incl -200(data)] mimxrt: -40 -0.011% TEENSY40 renesas-ra: -40 -0.006% RA6M2_EK nrf: -16 -0.009% pca10040 rp2: -64 -0.013% PICO samd: +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
2022-05-03all: Use mp_obj_malloc everywhere it's applicable.Jim Mussared
This replaces occurences of foo_t *foo = m_new_obj(foo_t); foo->base.type = &foo_type; with foo_t *foo = mp_obj_malloc(foo_t, &foo_type); Excludes any places where base is a sub-field or when new0/memset is used. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-05-18py/objarray: Fix constructing a memoryview from a memoryview.Damien George
Fixes issue #7261. Signed-off-by: Damien George <damien@micropython.org>
2021-05-18py/objarray: Use mp_obj_memoryview_init helper in mp_obj_new_memoryview.Damien George
Signed-off-by: Damien George <damien@micropython.org>
2021-05-13py/objarray: Implement more/less comparisons for array.stijn
2021-05-13py/objarray: Prohibit comparison of mismatching types.stijn
Array equality is defined as each element being equal but to keep code size down MicroPython implements a binary comparison. This can only be used correctly for elements with the same binary layout though so turn it into an NotImplementedError when comparing types for which the binary comparison yielded incorrect results: types with different sizes, and floating point numbers because nan != nan.
2021-03-26py: Rename remaining object types to be of the form mp_type_xxx.Damien George
For consistency with all other object types in the core. Signed-off-by: Damien George <damien@micropython.org>
2020-04-18py/objarray: Fix sign mismatch in comparison.stijn
Found when compiling with clang and -Wsign-compare.
2020-04-18Revert "all: Fix implicit casts of float/double, and signed comparison."stijn
This reverts commit a2110bd3fca59df8b16a2b5fe4645a4af30b06ed. There's nothing inherently wrong with it, but upcoming commits will apply similar fixes in a slightly different way.
2020-04-05all: Use MP_ERROR_TEXT for all error messages.Jim Mussared
2020-03-30all: Fix implicit casts of float/double, and signed comparison.David Lechner
These were found by buiding the unix coverage variant on macOS (so clang compiler). Mostly, these are fixing implicit cast of float/double to mp_float_t which is one of those two and one mp_int_t to size_t fix for good measure.
2020-02-28all: Reformat C and Python source code with tools/codeformat.py.Damien George
This is run with uncrustify 0.70.1, and black 19.10b0.
2020-02-21py/objarray: Turn on MP_TYPE_FLAG_EQ_CHECKS_OTHER_TYPE for memoryview.Jim Mussared
And add corresponding tests. Fixes #5674 (comparison of memoryview against bytes).
2020-02-11py: Expand type equality flags to 3 separate ones, fix bool/namedtuple.Damien George
Both bool and namedtuple will check against other types for equality; int, float and complex for bool, and tuple for namedtuple. So to make them work after the recent commit 3aab54bf434e7f025a91ea05052f1bac439fad8c they would need MP_TYPE_FLAG_NEEDS_FULL_EQ_TEST set. But that makes all bool and namedtuple equality checks less efficient because mp_obj_equal_not_equal() could no longer short-cut x==x, and would need to try __ne__. To improve this, this commit splits the MP_TYPE_FLAG_NEEDS_FULL_EQ_TEST flags into 3 separate flags to give types more fine-grained control over how their equality behaves. These new flags are then used to fix bool and namedtuple equality. Fixes issue #5615 and #5620.
2020-01-30py: Support non-boolean results for equality and inequality tests.Nicko van Someren
This commit implements a more complete replication of CPython's behaviour for equality and inequality testing of objects. This addresses the issues discussed in #5382 and a few other inconsistencies. Improvements over the old code include: - Support for returning non-boolean results from comparisons (as used by numpy and others). - Support for non-reflexive equality tests. - Preferential use of __ne__ methods and MP_BINARY_OP_NOT_EQUAL binary operators for inequality tests, when available. - Fallback to op2 == op1 or op2 != op1 when op1 does not implement the (in)equality operators. The scheme here makes use of a new flag, MP_TYPE_FLAG_NEEDS_FULL_EQ_TEST, in the flags word of mp_obj_type_t to indicate if various shortcuts can or cannot be used when performing equality and inequality tests. Currently four built-in classes have the flag set: float and complex are non-reflexive (since nan != nan) while bytearray and frozenszet instances can equal other builtin class instances (bytes and set respectively). The flag is also set for any new class defined by the user. This commit also includes a more comprehensive set of tests for the behaviour of (in)equality operators implemented in special methods.
2019-08-15py/objarray: Fix amount of free space in array when doing slice assign.Damien George
Prior to this patch the amount of free space in an array (including bytearray) was not being maintained correctly for the case of slice assignment which changed the size of the array. Under certain cases (as encoded in the new test) it was possible that the array could grow beyond its allocated memory block and corrupt the heap. Fixes issue #4127.
2019-05-21py/objarray: Add decode method to bytearray.stijn
Reuse the implementation for bytes since it works the same way regardless of the underlying type. This method gets added for CPython compatibility of bytearray, but to keep the code simple and small array.array now also has a working decode method, which is non-standard but doesn't hurt.
2019-05-14py/objarray: Add support for memoryview.itemsize attribute.stijn
This allows figuring out the number of bytes in the memoryview object as len(memview) * memview.itemsize. The feature is enabled via MICROPY_PY_BUILTINS_MEMORYVIEW_ITEMSIZE and is disabled by default.
2019-05-06py: remove "if (0)" and "if (false)" branches.Jun Wu
Prior to this commit, building the unix port with `DEBUG=1` and `-finstrument-functions` the compilation would fail with an error like "control reaches end of non-void function". This change fixes this by removing the problematic "if (0)" branches. Not all branches affect compilation, but they are all removed for consistency.
2019-02-12py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API.Damien George
These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are: - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE - MP_OBJ_FUN_MAKE_SIG - MP_DECLARE_CONST_xxx - MP_DEFINE_CONST_xxx These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.
2018-12-20py/objarray: Introduce "memview_offset" alias for "free" field of objectPaul Sokolovsky
Both mp_type_array and mp_type_memoryview use the same object structure, mp_obj_array_t, but for the case of memoryview, some fields, e.g. "free", have different meaning. As the "free" field is also a bitfield, assume that (anonymous) union can't be used here (for the concerns of possible compatibility issues with wide array of toolchains), and just add a field alias using a #define. As it's a define, it should be a selective identifier, so use verbose "memview_offset" to avoid any clashes.
2018-09-11py/objarray: bytearray: Allow 2nd/3rd arg to constructor.Paul Sokolovsky
If bytearray is constructed from str, a second argument of encoding is required (in CPython), and third arg of Unicode error handling is allowed, e.g.: bytearray("str", "utf-8", "strict") This is similar to bytes: bytes("str", "utf-8", "strict") This patch just allows to pass 2nd/3rd arguments to bytearray, but doesn't try to validate them to not impact code size. (This is also similar to how bytes constructor is handled, though it does a bit more validation, e.g. check that in case of str arg, encoding argument is passed.)
2018-08-14py/objarray: Allow to build again when bytearray is disabled.Damien George
2018-06-18py/objarray: Replace 0x80 with new MP_OBJ_ARRAY_TYPECODE_FLAG_RW macro.Damien George
2017-11-24py/runtime: Add MP_BINARY_OP_CONTAINS as reverse of MP_BINARY_OP_IN.Damien George
Before this patch MP_BINARY_OP_IN had two meanings: coming from bytecode it meant that the args needed to be swapped, but coming from within the runtime meant that the args were already in the correct order. This lead to some confusion in the code and comments stating how args were reversed. It also lead to 2 bugs: 1) containment for a subclass of a native type didn't work; 2) the expression "{True} in True" would illegally succeed and return True. In both of these cases it was because the args to MP_BINARY_OP_IN ended up being reversed twice. To fix these things this patch introduces MP_BINARY_OP_CONTAINS which corresponds exactly to the __contains__ special method, and this is the operator that built-in types should implement. MP_BINARY_OP_IN is now only emitted by the compiler and is converted to MP_BINARY_OP_CONTAINS by swapping the arguments.
2017-10-24all: Use NULL instead of "" when calling mp_raise exception helpers.Damien George
This is the established way of doing it and reduces code size by a little bit.
2017-10-04all: Remove inclusion of internal py header files.Damien George
Header files that are considered internal to the py core and should not normally be included directly are: py/nlr.h - internal nlr configuration and declarations py/bc0.h - contains bytecode macro definitions py/runtime0.h - contains basic runtime enums Instead, the top-level header files to include are one of: py/obj.h - includes runtime0.h and defines everything to use the mp_obj_t type py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h, and defines everything to use the general runtime support functions Additional, specific headers (eg py/objlist.h) can be included if needed.
2017-08-29all: Convert mp_uint_t to mp_unary_op_t/mp_binary_op_t where appropriateDamien George
The unary-op/binary-op enums are already defined, and there are no arithmetic tricks used with these types, so it makes sense to use the correct enum type for arguments that take these values. It also reduces code size quite a bit for nan-boxing builds.
2017-08-13all: Raise exceptions via mp_raise_XXXJavier Candeira
- Changed: ValueError, TypeError, NotImplementedError - OSError invocations unchanged, because the corresponding utility function takes ints, not strings like the long form invocation. - OverflowError, IndexError and RuntimeError etc. not changed for now until we decide whether to add new utility functions.
2017-07-31all: Use the name MicroPython consistently in commentsAlexander Steffen
There were several different spellings of MicroPython present in comments, when there should be only one.
2017-03-28py: Use mp_raise_TypeError/mp_raise_ValueError helpers where possible.Damien George
Saves 168 bytes on bare-arm.
2017-03-25py/objarray: Use mp_obj_str_get_str instead of mp_obj_str_get_data.Damien George
2017-03-23py: Use size_t as len argument and return type of mp_get_index.Damien George
These values are used to compute memory addresses and so size_t is the more appropriate type to use.
2017-02-27py/objarray: Disallow slice-assignment to read-only memoryview.Damien George
Also comes with a test for this. Fixes issue #2904.
2017-02-16py: De-optimise some uses of mp_getiter, so they don't use the C stack.Damien George
In these cases the heap is anyway used to create a new object so no real need to use the C stack for iterating. It saves a few bytes of code size.
2017-02-16py: Add iter_buf to getiter type method.Damien George
Allows to iterate over the following without allocating on the heap: - tuple - list - string, bytes - bytearray, array - dict (not dict.keys, dict.values, dict.items) - set, frozenset Allows to call the following without heap memory: - all, any, min, max, sum TODO: still need to allocate stack memory in bytecode for iter_buf.
2017-02-16py/objarray: Convert mp_uint_t to size_t where appropriate.Damien George