summaryrefslogtreecommitdiff
path: root/py/parse.c
AgeCommit message (Collapse)Author
2019-09-26py: Rename MP_QSTR_NULL to MP_QSTRnull to avoid intern collisions.Josh Lloyd
Fixes #5140.
2019-09-26py: Add support for matmul operator @ as per PEP 465.Damien George
To make progress towards MicroPython supporting Python 3.5, adding the matmul operator is important because it's a really "low level" part of the language, being a new token and modifications to the grammar. It doesn't make sense to make it configurable because 1) it would make the grammar and lexer complicated/messy; 2) no other operators are configurable; 3) it's not a feature that can be "dynamically plugged in" via an import. And matmul can be useful as a general purpose user-defined operator, it doesn't have to be just for numpy use. Based on work done by Jim Mussared.
2019-09-26py/parse: Use calculation instead of table to convert token to operator.Damien George
2019-09-26py/lexer: Reorder operator tokens to match corresponding binary ops.Damien George
2019-02-12py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API.Damien George
These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are: - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE - MP_OBJ_FUN_MAKE_SIG - MP_DECLARE_CONST_xxx - MP_DEFINE_CONST_xxx These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.
2018-09-20py: Shorten error messages by using contractions and some rewording.Damien George
2017-12-29py/parse: Fix macro evaluation by avoiding empty __VA_ARGS__.Damien George
Empty __VA_ARGS__ are not allowed in the C preprocessor so adjust the rule arg offset calculation to not use them. Also, some compilers (eg MSVC) require an extra layer of macro expansion.
2017-12-29py/parse: Update debugging code to compile on 64-bit arch.Damien George
2017-12-29py/parse: Compress rule pointer table to table of offsets.Damien George
This is the sixth and final patch in a series of patches to the parser that aims to reduce code size by compressing the data corresponding to the rules of the grammar. Prior to this set of patches the rules were stored as rule_t structs with rule_id, act and arg members. And then there was a big table of pointers which allowed to lookup the address of a rule_t struct given the id of that rule. The changes that have been made are: - Breaking up of the rule_t struct into individual components, with each component in a separate array. - Removal of the rule_id part of the struct because it's not needed. - Put all the rule arg data in a big array. - Change the table of pointers to rules to a table of offsets within the array of rule arg data. The last point is what is done in this patch here and brings about the biggest decreases in code size, because an array of pointers is now an array of bytes. Code size changes for the six patches combined is: bare-arm: -644 minimal x86: -1856 unix x64: -5408 unix nanbox: -2080 stm32: -720 esp8266: -812 cc3200: -712 For the change in parser performance: it was measured on pyboard that these six patches combined gave an increase in script parse time of about 0.4%. This is due to the slightly more complicated way of looking up the data for a rule (since the 9th bit of the offset into the rule arg data table is calculated with an if statement). This is an acceptable increase in parse time considering that parsing is only done once per script (if compiled on the target).
2017-12-28py/parse: Remove rule_t struct because it's no longer needed.Damien George
2017-12-28py/parse: Pass rule_id to push_result_token, instead of passing rule_t*.Damien George
2017-12-28py/parse: Pass rule_id to push_result_rule, instead of passing rule_t*.Damien George
Reduces code size by eliminating quite a few pointer dereferences.
2017-12-28py/parse: Break rule data into separate act and arg arrays.Damien George
Instead of each rule being stored in ROM as a struct with rule_id, act and arg, the act and arg parts are now in separate arrays and the rule_id part is removed because it's not needed. This reduces code size, by roughly one byte per grammar rule, around 150 bytes.
2017-12-28py/parse: Split out rule name from rule struct into separate array.Damien George
The rule name is only used for debugging, and this patch makes things a bit cleaner by completely separating out the rule name from the rest of the rule data.
2017-12-11py: Extend nan-boxing config to have 47-bit small integers.Damien George
The nan-boxing representation has an extra 16-bits of space to store small-int values, and making use of it allows to create and manipulate full 32-bit positive integers (ie up to 0xffffffff) without using the heap.
2017-11-16py/objstr: Make mp_obj_new_str_of_type check for existing interned qstr.Damien George
The function mp_obj_new_str_of_type is a general str object constructor used in many places in the code to create either a str or bytes object. When creating a str it should first check if the string data already exists as an interned qstr, and if so then return the qstr object. This patch makes the function have such behaviour, which helps to reduce heap usage by reusing existing interned data where possible. The old behaviour of mp_obj_new_str_of_type (which didn't check for existing interned data) is made available through the function mp_obj_new_str_copy, but should only be used in very special cases. One consequence of this patch is that the following expression is now True: 'abc' is ' abc '.split()[0]
2017-10-04all: Remove inclusion of internal py header files.Damien George
Header files that are considered internal to the py core and should not normally be included directly are: py/nlr.h - internal nlr configuration and declarations py/bc0.h - contains bytecode macro definitions py/runtime0.h - contains basic runtime enums Instead, the top-level header files to include are one of: py/obj.h - includes runtime0.h and defines everything to use the mp_obj_t type py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h, and defines everything to use the general runtime support functions Additional, specific headers (eg py/objlist.h) can be included if needed.
2017-07-31all: Use the name MicroPython consistently in commentsAlexander Steffen
There were several different spellings of MicroPython present in comments, when there should be only one.
2017-02-24py/parse: Simplify handling of errors by raising them directly.Damien George
The parser was originally written to work without raising any exceptions and instead return an error value to the caller. But it's now required that a call to the parser be wrapped in an nlr handler, so we may as well make use of that fact and simplify the parser so that it doesn't need to keep track of any memory errors that it had. The parser anyway explicitly raises an exception at the end if there was an error. This patch simplifies the parser by letting the underlying memory allocation functions raise an exception if they fail to allocate any memory. And if there is an error parsing the "<id> = const(<val>)" pattern then that also raises an exception right away instead of trying to recover gracefully and then raise.
2017-02-24py: Create str/bytes objects in the parser, not the compiler.Damien George
Previous to this patch any non-interned str/bytes objects would create a special parse node that held a copy of the str/bytes data. Then in the compiler this data would be turned into a str/bytes object. This actually lead to 2 copies of the data, one in the parse node and one in the object. The parse node's copy of the data would be freed at the end of the compile stage but nevertheless it meant that the peak memory usage of the parse/compile stage was higher than it needed to be (by an amount equal to the number of bytes in all the non-interned str/bytes objects). This patch changes the behaviour so that str/bytes objects are created directly in the parser and the object stored in a const-object parse node (which already exists for bignum, float and complex const objects). This reduces peak RAM usage of the parse/compile stage, simplifies the parser and compiler, and reduces code size by about 170 bytes on Thumb2 archs, and by about 300 bytes on Xtensa archs.
2017-02-24py/parse: Allow parser/compiler consts to be bignums.Damien George
This patch allows uPy consts to be bignums, eg: X = const(1 << 100) The infrastructure for consts to be a bignum (rather than restricted to small integers) has been in place for a while, ever since constant folding was upgraded to allow bignums. It just required a small change (in this patch) to enable it.
2017-02-16py/grammar: Group no-compile grammar rules together to shrink tables.Damien George
Grammar rules have 2 variants: ones that are attached to a specific compile function which is called to compile that grammar node, and ones that don't have a compile function and are instead just inspected to see what form they take. In the compiler there is a table of all grammar rules, with each entry having a pointer to the associated compile function. Those rules with no compile function have a null pointer. There are 120 such rules, so that's 120 words of essentially wasted code space. By grouping together the compile vs no-compile rules we can put all the no-compile rules at the end of the list of rules, and then we don't need to store the null pointers. We just have a truncated table and it's guaranteed that when indexing this table we only index the first half, the half with populated pointers. This patch implements such a grouping by having a specific macro for the compile vs no-compile grammar rules (DEF_RULE vs DEF_RULE_NC). It saves around 460 bytes of code on 32-bit archs.
2017-01-17py/parse: Refactor code to remove assert(0)'s.Damien George
This helps to improve code coverage. Note that most of the changes in this patch are just de-denting the cases of the switch statements.
2016-11-15py/parse: Add code to fold logical constants in or/and/not operations.Damien George
Adds about 200 bytes to the code size when constant folding is enabled.
2016-11-15py/parse: Make mp_parse_node_new_leaf an inline function.Damien George
It is split into 2 functions, one to make small ints and the other to make a non-small-int leaf node. This reduces code size by 32 bytes on bare-arm, 64 bytes on unix (x64-64) and 144 bytes on stmhal.
2016-11-15py/parse: Move function to check for const parse node to parse.[ch].Damien George
2016-11-02py: Fix wrong assumption that m_renew will not move if shrinkingColin Hogben
In both parse.c and qstr.c, an internal chunking allocator tidies up by calling m_renew to shrink an allocated chunk to the size used, and assumes that the chunk will not move. However, when MICROPY_ENABLE_GC is false, m_renew calls the system realloc, which does not guarantee this behaviour. Environments where realloc may return a different pointer include: (1) mbed-os with MBED_HEAP_STATS_ENABLED (which adds a wrapper around malloc & friends; this is where I was hit by the bug); (2) valgrind on linux (how I diagnosed it). The fix is to call m_renew_maybe with allow_move=false.
2016-09-23py/parse: Only replace constants that are standalone identifiers.Damien George
This fixes constant substitution so that only standalone identifiers are replaced with their constant value (if they have one). I.e. don't replace NAME in expressions like obj.NAME or NAME = expr.
2016-06-06py/parse: Treat constants that start with underscore as private.Damien George
Assignments of the form "_id = const(value)" are treated as private (following a similar CPython convention) and code is no longer emitted for the assignment to a global variable. See issue #2111.
2016-05-20py: Declare constant data as properly constant.Damien George
Otherwise some compilers (eg without optimisation) will put this read-only data in RAM instead of ROM.
2016-05-10py/parse: Add uerrno to list of modules to look for constants in.Damien George
2016-04-14py: Simplify "and" action within parser by making ident-rules explicit.Damien George
Most grammar rules can optimise to the identity if they only have a single argument, saving a lot of RAM building the parse tree. Previous to this patch, whether a given grammar rule could be optimised was defined (mostly implicitly) by a complicated set of logic rules. With this patch the definition is always specified explicitly by using "and_ident" in the rule definition in the grammar. This simplifies the logic of the parser, making it a bit smaller and faster. RAM usage in unaffected.
2016-04-13py: Fix constant folding and inline-asm to work with new async grammar.Damien George
2016-03-19py/parse: When looking up consts, check they exist before checking type.Damien George
2016-02-23py/parse: Use m_renew_maybe to ensure that memory is shrunk in-place.Damien George
The chunks of memory that the parser allocates contain parse nodes and are pointed to from many places, so these chunks cannot be relocated by the memory manager. This patch makes it so that when a chunk is shrunk to fit, it is not relocated.
2016-01-12py: unary_op enum type fix, and a cast to remove clang warningAntonin ENFRUN
2016-01-08py/parse: Include unistd.h for ssize_t definition.Damien George
In some cases ssize_t is not defined by already included headers.
2016-01-07py/parse: Improve constant folding to operate on small and big ints.Damien George
Constant folding in the parser can now operate on big ints, whatever their representation. This is now possible because the parser can create parse nodes holding arbitrary objects. For the case of small ints the folding is still efficient in RAM because the folded small int is stored inplace in the parse node. Adds 48 bytes to code size on Thumb2 architecture. Helps reduce heap usage because more constants can be computed at compile time, leading to a smaller parse tree, and most importantly means that the constants don't have to be computed at runtime (perhaps more than once). Parser will now be a little slower when folding due to calls to runtime to do the arithmetic.
2016-01-07py/parse: Optimise away parse node that's just parenthesis around expr.Damien George
Before this patch, (x+y)*z would be parsed to a tree that contained a redundant identity parse node corresponding to the parenthesis. With this patch such nodes are optimised away, which reduces memory requirements for expressions with parenthesis, and simplifies the compiler because it doesn't need to handle this identity case. A parenthesis parse node is still needed for tuples.
2015-12-18py: Add MICROPY_ENABLE_COMPILER and MICROPY_PY_BUILTINS_EVAL_EXEC opts.Damien George
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler, which is useful when only loading of pre-compiled bytecode is supported. It is enabled by default. MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin functions. By default they are only included if MICROPY_ENABLE_COMPILER is enabled. Disabling both options saves about 40k of code size on 32-bit x86.
2015-12-17py/parse: Replace mp_int_t/mp_uint_t with size_t etc, where appropriate.Damien George
2015-11-29py: Add support for 64-bit NaN-boxing object model, on 32-bit machine.Damien George
To use, put the following in mpconfigport.h: #define MICROPY_OBJ_REPR (MICROPY_OBJ_REPR_D) #define MICROPY_FLOAT_IMPL (MICROPY_FLOAT_IMPL_DOUBLE) typedef int64_t mp_int_t; typedef uint64_t mp_uint_t; #define UINT_FMT "%llu" #define INT_FMT "%lld" Currently does not work with native emitter enabled.
2015-11-29py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.Damien George
This allows the mp_obj_t type to be configured to something other than a pointer-sized primitive type. This patch also includes additional changes to allow the code to compile when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of mp_uint_t, and various casts.
2015-11-29py: Add MP_ROM_* macros and mp_rom_* types and use them.Damien George
2015-11-17py: Implement default and star args for lambdas.Damien George
2015-10-12py/parse: Make parser error handling cleaner, less spaghetti-like.Damien George
2015-10-12py: Move constant folding from compiler to parser.Damien George
It makes much more sense to do constant folding in the parser while the parse tree is being built. This eliminates the need to create parse nodes that will just be folded away. The code is slightly simpler and a bit smaller as well. Constant folding now has a configuration option, MICROPY_COMP_CONST_FOLDING, which is enabled by default.
2015-10-08py/parse: Factor logic when creating parse node from and-rule.Damien George
2015-10-02py: Allocate parse nodes in chunks to reduce fragmentation and RAM use.Damien George
With this patch parse nodes are allocated sequentially in chunks. This reduces fragmentation of the heap and prevents waste at the end of individually allocated parse nodes. Saves roughly 20% of RAM during parse stage.
2015-08-17unix-cpy: Remove unix-cpy. It's no longer needed.Damien George
unix-cpy was originally written to get semantic equivalent with CPython without writing functional tests. When writing the initial implementation of uPy it was a long way between lexer and functional tests, so the half-way test was to make sure that the bytecode was correct. The idea was that if the uPy bytecode matched CPython 1-1 then uPy would be proper Python if the bytecodes acted correctly. And having matching bytecode meant that it was less likely to miss some deep subtlety in the Python semantics that would require an architectural change later on. But that is all history and it no longer makes sense to retain the ability to output CPython bytecode, because: 1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode changes from version to version, and seems to have changed quite a bit in 3.5. There's no point in changing the bytecode output to match CPython anymore. 2. uPy and CPy do different optimisations to the bytecode which makes it harder to match. 3. The bytecode tests are not run. They were never part of Travis and are not run locally anymore. 4. The EMIT_CPYTHON option needs a lot of extra source code which adds heaps of noise, especially in compile.c. 5. Now that there is an extensive test suite (which tests functionality) there is no need to match the bytecode. Some very subtle behaviour is tested with the test suite and passing these tests is a much better way to stay Python-language compliant, rather than trying to match CPy bytecode.