|
This commit expands the implementation of Viper load/store operations
that are optimised for the Arm platform.
Now both load and store emitters should generate the shortest possible
sequence in all cases. Redundant specialised operation emitters have
been folded into the general case implementation; this was the case for
integer-indexed load/store operations with a fixed offset of zero.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
|
|
This commit performs a small refactoring of the Arm native emitter,
renaming all but one instance of ASM_ARM_REG_R8 to REG_TEMP.
ASM_ARM_REG_R8 is the temporary register used by the emitter when
operations cannot overwrite the value of a particular register and some
extra storage is needed.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
|
|
This commit extends the range for int-indexed load/store opcode
generators, making them emit correct code sequences for offsets that
span more than 12 bits.
This is necessary because those generators are also used by the Viper
emitter, which is more likely to reference offsets that cannot be
embedded in the LDR/STR opcodes.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
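
A hedged sketch of the shape this takes (the helper names below are
hypothetical stand-ins, not the real py/asmarm.h API):

    #include <stdint.h>

    typedef struct emitter emitter_t;  // stand-in for the emitter state
    void emit_ldr_imm(emitter_t *e, int rd, int rn, uint32_t off);  // LDR rd, [rn, #off]
    void emit_ldr_reg(emitter_t *e, int rd, int rn, int rm);        // LDR rd, [rn, rm]
    void emit_load_const(emitter_t *e, int reg, uint32_t value);    // materialise a constant
    #define REG_TEMP 8                                              // r8, the scratch register

    // An A32 LDR/STR immediate offset is 12 bits (0..4095); anything wider is
    // loaded into the scratch register and the register-offset form is used.
    void emit_load_word(emitter_t *e, int rd, int rn, uint32_t off) {
        if (off < 4096) {
            emit_ldr_imm(e, rd, rn, off);
        } else {
            emit_load_const(e, REG_TEMP, off);
            emit_ldr_reg(e, rd, rn, REG_TEMP);
        }
    }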
|
|
This commit lets the Viper code generator use optimised code sequences
for register-indexed load and store operations when generating Arm code.
The existing code defaulted to generic multi-operation code sequences
for Arm code in most cases. Now optimised implementations are provided
for register-indexed loads and stores of all data sizes, taking at most
two machine opcodes for each operation.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
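
For reference, a hedged sketch of the kind of sequences this refers to
(A32; it illustrates the instruction-set constraints, not necessarily the
exact opcodes the emitter picks):

    #include <assert.h>

    // byte:      LDRB rd, [rbase, ridx]            (1 opcode)
    // halfword:  ADD  r8, rbase, ridx, LSL #1      (2 opcodes: LDRH has no
    //            LDRH rd, [r8]                      scaled register-offset form)
    // word:      LDR  rd, [rbase, ridx, LSL #2]    (1 opcode)
    static int opcodes_for_indexed_load(int size_bytes) {
        assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 4);
        return size_bytes == 2 ? 2 : 1;  // only the halfword case needs the extra ADD
    }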
|
|
This commit fixes code generation for loading halfwords using an offset
greater than 255.
The old code blindly encoded the offset into a `LDRH Rd, [Rn, #imm]`
opcode, but only the lowest 8 bits would be put into the opcode itself.
This commit instead generates a two-opcode sequence: a constant load into
R8, followed by `LDRH Rd, [Rn, R8]`.
This fixes `tests/extmod/vfs_rom.py` for the qemu/SABRELITE board.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
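
The 255 limit comes from the A32 encoding: LDRH/STRH carry only an 8-bit
immediate, split into two nibbles of the opcode, unlike LDR/STR which have
a 12-bit immediate field. A small illustration (this encoding helper is
written for this note, not taken from the codebase):

    #include <stdint.h>

    // Insert an 8-bit halfword-transfer offset into an LDRH/STRH opcode:
    // imm[7:4] goes to opcode bits 11:8, imm[3:0] to bits 3:0.
    static uint32_t ldrh_with_imm8(uint32_t opcode, uint32_t imm) {
        // The caller must guarantee imm <= 255; larger offsets need the
        // register-offset form (the two-opcode sequence described above).
        return opcode | ((imm & 0xf0) << 4) | (imm & 0x0f);
    }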
|
|
This commit fixes code generation for loading a local's address if its
index is greater than 63.
The old code blindly encoded the offset into an `ADD Rd, Rn, #imm` opcode,
but only the lowest 8 bits would be put into the opcode itself. This
commit instead generates a two-opcode sequence: a constant load into R8,
followed by an `ADD Rd, Rn, R8` opcode.
This fixes `tests/float/math_domain.py` for the qemu/SABRELITE board.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
|
|
Co-authored-by: Alessandro Gatti <a.gatti@frob.it>
Signed-off-by: Damien George <damien@micropython.org>
|
|
Prior to this fix, the assembler generated `LDRH Rd, [Rn, #imm]!`, whose
write-back updates Rn, so a second `LDRH` using the same base register
would load from the wrong address.
Co-authored-by: Alessandro Gatti <a.gatti@frob.it>
Signed-off-by: Damien George <damien@micropython.org>
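
The difference between the two forms is the write-back (W) bit of the
encoding; a minimal illustration (not emitter code):

    #include <stdint.h>

    // Bit 21 (W) of an A32 load/store opcode selects write-back.  With W set,
    // `LDRH Rd, [Rn, #imm]!` updates Rn after the access; with W clear, the
    // plain offset form `LDRH Rd, [Rn, #imm]` leaves the base untouched.
    static uint32_t clear_writeback(uint32_t opcode) {
        return opcode & ~(UINT32_C(1) << 21);
    }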
|
|
ASM_NOT_REG is optional; it can be synthesised with xor(reg, -1).
ASM_NEG_REG can also be synthesised with a subtraction, but most
architectures have a dedicated instruction for it.
Signed-off-by: Damien George <damien@micropython.org>
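
The identities behind these fallbacks, in plain C (illustrative only):

    #include <stdint.h>

    int32_t not_via_xor(int32_t x) { return x ^ -1; }  // equivalent to ~x
    int32_t neg_via_sub(int32_t x) { return 0 - x; }   // equivalent to -x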
|
|
The STATIC macro was introduced a very long time ago in commit
d5df6cd44a433d6253a61cb0f987835fbc06b2de. The original reason for this was
to have the option to define it to nothing so that all static functions
become global functions and therefore visible to certain debug tools, so
one could do function size comparison and other things.
This STATIC feature is rarely (if ever) used. And with the use of LTO and
heavy inline optimisation, analysing the size of individual functions when
they are not static is not a good representation of the size of code when
fully optimised.
So the macro does not have much use and it's simpler to just remove it.
Then you know exactly what it's doing. For example, newcomers don't have
to learn what the STATIC macro is and why it exists. Reading the code is
also less "loud" with a lowercase static.
One other minor point in favour of removing it, is that it stops bugs with
`STATIC inline`, which should always be `static inline`.
Methodology for this commit was:
1) git ls-files | egrep '\.[ch]$' | \
       xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/"
2) Do some manual cleanup in the diff by searching for the word STATIC in
   comments and changing those back.
3) "git-grep STATIC docs/", manually fixed those cases.
4) "rg -t python STATIC", manually fixed codegen lines that used STATIC.
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
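
A short illustration of the `STATIC inline` point above, assuming STATIC is
defined to nothing (snippet written for this note, not from the codebase):

    #define STATIC

    // After expansion this is a plain C99 `inline` function.  Unlike
    // `static inline`, a plain inline definition does not provide an external
    // definition, so if the compiler decides not to inline a call, the link
    // fails unless some other translation unit supplies one.
    STATIC inline int twice(int x) {
        return 2 * x;
    }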
|
|
Signed-off-by: Damien George <damien@micropython.org>
|
|
Prior to this commit, cache flushing for ARM native code was done only in
the assembler code asm_thumb_end_pass()/asm_arm_end_pass(), at the last
pass of the assembler. But this misses flushing the cache when loading
native code from an .mpy file, i.e. in persistentcode.c.
The change here makes sure the cache is always flushed/cleaned/invalidated
when assigning native code on ARM architectures.
This problem was found running tests/micropython/import_mpy_native_gc.py on
the mimxrt port.
Signed-off-by: Damien George <damien@micropython.org>
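
The flush itself is a one-liner; a minimal sketch, assuming buf/len describe
a freshly written block of native code:

    #include <stddef.h>

    // After copying machine code into RAM it must be cleaned from the data
    // cache and invalidated in the instruction cache before being executed.
    static void flush_native_code(void *buf, size_t len) {
        char *start = (char *)buf;
        __builtin___clear_cache(start, start + len);  // available in both gcc and clang
    }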
|
|
The inline assembler code does not work for __ARM_ARCH == 7.
Signed-off-by: Damien George <damien@micropython.org>
|
|
Signed-off-by: Damien George <damien@micropython.org>
|
|
This is run with uncrustify 0.70.1, and black 19.10b0.
|
|
__clear_cache causes a compile error when using clang. Instead use
__builtin___clear_cache which is available under both gcc and clang.
Also replace tabs with spaces in this section of code (introduced by a
previous commit).
|
|
Comes from https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/caches-and-self-modifying-code
This fixes a crash when running MicroPython using qemu-arm.
|
|
This commit adds support for saving and loading .mpy files that contain
native code (native, viper and inline-asm). A lot of the ground work was
already done for this in the form of removing pointers from generated
native code. The changes here are mainly to link in qstr values to the
native code, and change the format of .mpy files to contain native code
blocks (possibly mixed with bytecode).
A top-level summary:
- @micropython.native, @micropython.viper and @micropython.asm_thumb/
asm_xtensa are now allowed in .py files when compiling to .mpy, and they
work transparently to the user.
- Entire .py files can be compiled to native via mpy-cross -X emit=native
and for the most part the generated .mpy files should work the same as
their bytecode version.
- The .mpy file format is changed to 1) specify in the header if the file
contains native code and if so the architecture (eg x86, ARMV7M, Xtensa);
2) for each function block the kind of code is specified (bytecode,
native, viper, asm).
- When native code is loaded from a .mpy file the native code must be
modified (in place) to link qstr values in, just like bytecode (see
py/persistentcode.c:arch_link_qstr() function).
In addition, this now defines a public, native ABI for dynamically loadable
native code generated by other languages, like C.
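
A conceptual sketch of the qstr-linking step (an illustration of the idea
only, not the actual arch_link_qstr() implementation; the link-record layout
here is an assumption):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    typedef struct {
        size_t offset;    // byte offset of the slot inside the native code block
        uint32_t qst;     // runtime qstr value to patch in
    } qstr_link_t;

    // Patch each recorded slot with the runtime qstr value, assuming qstrs
    // are linked in as 32-bit words.
    static void link_qstrs(uint8_t *code, const qstr_link_t *links, size_t n) {
        for (size_t i = 0; i < n; ++i) {
            memcpy(code + links[i].offset, &links[i].qst, sizeof(uint32_t));
        }
    }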
|
|
The maximum index into mp_fun_table is currently less than 1024 and should
stay that way to keep things efficient for all architectures, so there is
no need to handle loading the pointer directly via a literal in this
function.
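
A worked example of why the bound matters, assuming an A32 target with
4-byte table entries (other architectures have their own, similar immediate
limits):

    #include <assert.h>

    static void fun_table_index_fits_ldr_immediate(void) {
        enum { MAX_INDEX = 1023, ENTRY_SIZE = 4, LDR_IMM_MAX = 4095 };
        // 1023 * 4 = 4092, so the entry can be fetched with a single
        // LDR rd, [r_fun_table, #offset] and no literal-pool load of a pointer.
        assert(MAX_INDEX * ENTRY_SIZE <= LDR_IMM_MAX);
    }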
|
|
Useful for position independent code, and implementing state machines.
|
|
All callers of the asm entry function guarantee that num_locals>=0, so no
need to add an explicit check for it. Use an assertion instead.
Also, the signature of asm_x86_entry is changed to match the other asm
entry functions.
|
|
There were several different spellings of MicroPython present in comments,
when there should be only one.
|
|
|
|
|
|
|
|
For all but the last pass the assembler only needs to count how much space
is needed for the machine code, it doesn't actually need to emit anything.
The dummy_data buffer just uses unnecessary RAM, and without it the code is
no more complex (and code size does not increase for the Thumb and Xtensa
archs).
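
A sketch of the idea, with illustrative names rather than the real
mp_asm_base fields: on the sizing passes only the offset advances, so no
buffer (and hence no dummy_data) is needed; bytes are written only on the
final pass, once the buffer has been allocated.

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        int pass;        // current assembler pass
        int final_pass;  // the pass that actually emits machine code
        size_t offset;   // bytes "emitted" so far (the size after a sizing pass)
        uint8_t *buf;    // allocated only before the final pass
    } sized_emitter_t;

    static void emit_u8(sized_emitter_t *as, uint8_t b) {
        if (as->pass == as->final_pass) {
            as->buf[as->offset] = b;
        }
        as->offset++;
    }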
|
|
All assemblers should "derive" from mp_asm_base_t.
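
In C this is the usual struct-embedding convention; a sketch of the Arm
backend's layout (field list trimmed and illustrative):

    #include "py/asmbase.h"  // declares mp_asm_base_t

    typedef struct _asm_arm_t {
        mp_asm_base_t base;   // must be the first member so generic code can
                              // treat any backend as an mp_asm_base_t *
        // ... backend-specific state follows
    } asm_arm_t;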
|
|
|
|
The code was apparently broken after 9988618e0e0f5c319e31b135d993e22efb593093
"py: Implement full func arg passing for native emitter.". This attempts to
propagate those changes to ARM emitter.
|
|
|
|
Previously, memory was allocated at the end of PASS_COMPUTE, and since this
pass was run twice, the memory was allocated twice.
|
|
Addresses issue #1022.
|
|
|
|
Viper can now do: ptr8(buf)[0], which loads a byte from a buffer using
machine instructions.
|
|
|
|
|
|
This included a bit of restructuring of the assembler backends. Note
that the ARM backend is missing a few functions and won't compile.
|
|
|
|
This gets the ARM native emitter working again and addresses issue #858.
|
|
And move the MAP_ANON redefinition from py/asmx64.c to unix/alloc.c.
|
|
MP_PLAT_FREE_EXEC macros
Fixes issue #840
|
|
|
|
|