path: root/py/objstrunicode.c
Age | Commit message | Author
2016-08-07 | py/objstr,objstrunicode: Fix inconsistent #if indentation. | Paul Sokolovsky
2016-08-07 | py/objstr: Make .partition()/.rpartition() methods configurable. | Paul Sokolovsky
    Default is disabled, enabled for unix port. Saves 600 bytes on x86.
2016-07-25 | py/objstrunicode: str_index_to_ptr: Implement positive indexing properly. | Paul Sokolovsky
    Order the out-of-bounds check, completion check, and increment in the right way.
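The ordering named in this commit message (out-of-bounds check, then completion check, then increment) can be illustrated with a small sketch. This is not the code from py/objstrunicode.c; the names `is_cont_byte` and `utf8_char_at` are hypothetical, and the sketch assumes well-formed UTF-8 and a non-negative index.

```c
#include <assert.h>
#include <stddef.h>

/* True for UTF-8 continuation bytes (binary 10xxxxxx). */
int is_cont_byte(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: return a pointer to the start of character number
 * `index` (0-based) in the UTF-8 buffer s[0..len), or NULL if out of range. */
const unsigned char *utf8_char_at(const unsigned char *s, size_t len, size_t index) {
    assert(s != NULL);
    const unsigned char *end = s + len;
    for (;;) {
        /* 1. Out-of-bounds check: no character starts at or past the end. */
        if (s >= end) {
            return NULL;
        }
        /* 2. Completion check: we are standing on the requested character. */
        if (index == 0) {
            return s;
        }
        /* 3. Increment: skip the lead byte and any continuation bytes. */
        index--;
        s++;
        while (s < end && is_cont_byte(*s)) {
            s++;
        }
    }
}
```

Checking bounds before completion means an index equal to the character count is rejected rather than returning a pointer one past the data; a slicing variant would probably want to accept that case, which this sketch does not attempt.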
2016-07-25 | py/objstrunicode: str_index_to_ptr: Should handle bytes too. | Paul Sokolovsky
    There is a single str_index_to_ptr() function, called for both bytes and unicode objects, so it should handle each properly.
2016-05-22 | py/objstr*: Properly ifdef str.center(). | Dave Hylands
2016-05-22 | py/objstr: Implement str.center(). | Paul Sokolovsky
    Disabled by default, enabled in unix port. The need for this method easily pops up when working with text UI/reporting, and coding a workalike manually again and again is counter-productive.
2016-01-03 | py: Use polymorphic iterator type where possible to reduce code size. | Damien George
    Only types whose iterator instances still fit in 4 machine words have been changed to use the polymorphic iterator. Reduces Thumb2 arch code size by 264 bytes.
2015-11-29 | py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR. | Damien George
    This allows the mp_obj_t type to be configured to something other than a pointer-sized primitive type. This patch also includes additional changes to allow the code to compile when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of mp_uint_t, and various casts.
2015-11-29 | py: Add MP_ROM_* macros and mp_rom_* types and use them. | Damien George
2015-10-11 | py: Rename MP_BOOL() to mp_obj_new_bool() for consistency in naming. | Paul Sokolovsky
2015-09-03 | py: Use mp_not_implemented consistently for not implemented features. | Damien George
2015-05-17 | py: Clean up declarations of str type/funcs that are also in unicode. | Damien George
    Background: trying to make an amalgamation of all the code gave some errors with redefined types and inconsistent use of static.
2015-04-16 | py: Overhaul and simplify printf/pfenv mechanism. | Damien George
    Previous to this patch the printing mechanism was a bit of a tangled mess. This patch attempts to consolidate printing into one interface.
    All (non-debug) printing now uses the mp_print* family of functions, mainly mp_printf. All these functions take an mp_print_t structure as their first argument, and this structure defines the printing backend through the "print_strn" function of said structure.
    Printing from the uPy core can reach the platform-defined print code via two paths: either through mp_sys_stdout_obj (defined per port) in conjunction with mp_stream_write; or through the mp_plat_print structure, which uses the MP_PLAT_PRINT_STRN macro to define how strings are printed on the platform. The former is only used when MICROPY_PY_IO is defined.
    With this new scheme printing is generally more efficient (fewer layers to go through, fewer arguments to pass), and, given an mp_print_t* structure, one can call mp_print_str for efficiency instead of mp_printf("%s", ...). Code size is also reduced by around 200 bytes on Thumb2 archs.
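The idea of a print structure whose "print_strn" member selects the backend can be sketched roughly as follows. All names here (`sketch_print_t`, `sketch_buf_t`, `sketch_buf_strn`, `sketch_print_str`) are hypothetical stand-ins chosen for illustration, not the definitions in the MicroPython tree; the buffer backend is just one possible target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend callback: write `len` bytes of `str` somewhere,
 * using `data` as the backend's private context. */
typedef void (*sketch_print_strn_t)(void *data, const char *str, size_t len);

/* Hypothetical print structure: a context pointer plus the backend. */
typedef struct {
    void *data;
    sketch_print_strn_t print_strn;
} sketch_print_t;

/* Example backend: append into a fixed-size buffer, truncating silently. */
typedef struct {
    char buf[64];
    size_t used;
} sketch_buf_t;

void sketch_buf_strn(void *data, const char *str, size_t len) {
    sketch_buf_t *b = data;
    size_t room = sizeof(b->buf) - 1 - b->used;
    if (len > room) {
        len = room;
    }
    memcpy(b->buf + b->used, str, len);
    b->used += len;
    b->buf[b->used] = '\0';
}

/* Print a NUL-terminated string directly, with no format parsing. */
void sketch_print_str(const sketch_print_t *print, const char *str) {
    assert(print != NULL && print->print_strn != NULL);
    print->print_strn(print->data, str, strlen(str));
}
```

A caller would fill a `sketch_print_t` with `&buffer` and `sketch_buf_strn`, then pass it to every print helper; swapping in another callback redirects all output without touching the callers. A string-only helper like `sketch_print_str` skips format parsing entirely, which is the kind of saving the message attributes to mp_print_str.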
2015-04-04 | py: In str unicode, str_subscr will never be passed a bytes object. | Damien George
2015-04-04 | objstr: Add .splitlines() method. | Paul Sokolovsky
    splitlines() occurs ~179 times in the CPython3 standard library, so it was deemed worthy of implementing. The method has subtle semantic differences from just .split("\n"). It is also defined as working for any end-of-line combination, but this is currently not implemented: it works only with LF line endings (which should be OK for text strings on any platform, but not OK for bytes).
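One of the semantic differences between splitlines() and .split("\n") is how many items each returns: a trailing LF ends the last line for splitlines(), but opens a further empty item for split("\n"), and an empty string gives zero lines versus one empty item. The sketch below only counts items for LF-only input; the function names are hypothetical and it is not the MicroPython implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Number of items s.split("\n") would return: one more than the number
 * of LF bytes, so an empty string still yields one (empty) item. */
size_t count_split_lf(const char *s, size_t len) {
    assert(s != NULL);
    size_t n = 1;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\n') {
            n++;
        }
    }
    return n;
}

/* Number of items an LF-only splitlines() would return: a trailing LF
 * terminates the last line instead of opening a new empty one. */
size_t count_splitlines_lf(const char *s, size_t len) {
    assert(s != NULL);
    size_t n = 0;
    size_t line_start = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\n') {
            n++;
            line_start = i + 1;
        }
    }
    if (line_start < len) {
        n++; /* unterminated final line */
    }
    return n;
}
```

For the input "a\nb\n" the first function counts three items and the second counts two, matching Python's `['a', 'b', '']` versus `['a', 'b']`.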
2015-03-19 | py: Allow to compile with extra warnings (sign-compare, unused-param). | Damien George
2015-01-28 | py: Remove duplicated mp_obj_str_make_new function from objstrunicode.c. | Damien George
2015-01-23 | objstr: Remove code duplication and unbreak Windows build. | Paul Sokolovsky
    There was a really weird warning (promoted to error) when building the Windows port. The exact cause is still unknown, but it uncovered another issue: the 8-bit and unicode str_make_new implementations should be mutually exclusive, and not built at the same time. What we had is that bytes_decode() pulled in the 8-bit str_make_new() even for the unicode build.
2015-01-23 | objstr*: Use separate names for locals_dict of 8-bit and unicode str's. | Paul Sokolovsky
    To somewhat unbreak -DSTATIC="" compile.
2015-01-21 | py: Add mp_obj_new_str_from_vstr, and use it where relevant. | Damien George
    This patch allows vstr memory to be reused when creating a str/bytes object. This improves memory usage. Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88 bytes on unix x64.
2015-01-20 | py, unix: Allow to compile with -Wunused-parameter. | Damien George
    See issue #699.
2015-01-01 | py: Move to guarded includes, everywhere in py/ core. | Damien George
    Addresses issue #1022.
2014-10-31 | objstr: Allow to convert any buffer proto object to str. | Paul Sokolovsky
    The original motivation is to support converting bytearrays, but it is easier to just support the buffer protocol in general.
2014-09-25 | py: Simplify JSON str printing (while still conforming to JSON spec). | Damien George
    The JSON specs are relatively flexible and allow us to use one function to print strings, be they ascii, bytes or utf-8 encoded.
2014-09-17 | py: Add native json printing using existing print framework. | Damien George
    Also add start of ujson module with dumps implemented. Enabled in unix and stmhal ports. Test passes on both.
2014-08-30 | py: Change uint to mp_uint_t in runtime.h, stackctrl.h, binary.h. | Damien George
    Part of code cleanup, working towards resolving issue #50.
2014-08-30 | Change some parts of the core API to use mp_uint_t instead of uint/int. | Damien George
    Addressing issue #50, still some way to go yet.
2014-07-31 | py: Make MP_OBJ_NEW_SMALL_INT cast arg to mp_int_t itself. | Damien George
    Addresses issue #724.
2014-07-03 | Rename machine_(u)int_t to mp_(u)int_t. | Damien George
    See discussion in issue #50.
2014-06-28 | py: Make unichar_charlen() accept/return machine_uint_t. | Paul Sokolovsky
2014-06-28 | py: Small comments, name changes, use of machine_int_t. | Damien George
2014-06-27 | objstrunicode: Refactor str_index_to_ptr() following objstr. | Paul Sokolovsky
2014-06-27 | objstrunicode: Signedness issues. | Paul Sokolovsky
2014-06-27 | objstrunicode: Implement iterator. | Paul Sokolovsky
2014-06-27 | objstrunicode: Re-add buffer protocol back for now, required for io.StringIO. | Paul Sokolovsky
2014-06-27 | objstrunicode: Revamp len() handling for unicode, and optimize bool(). | Paul Sokolovsky
2014-06-27 | objstrunicode: Get rid of bytes checking, it's separate type. | Paul Sokolovsky
2014-06-27 | py: Prune unneeded code from objstrunicode, reuse code in objstr. | Paul Sokolovsky
2014-06-27 | objstrunicode: Basic implementation of unicode handling. | Chris Angelico
    Squashed commit of the following:

    commit 99dc21b67a895dc10d3c846bc158d27c839cee48
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Thu Jun 12 02:18:54 2014 +1000
        Optimize as per TODO (thanks Damien!)

    commit 5bf0153ecad8348443058d449d74504fc458fe51
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 08:42:06 2014 +1000
        Test a default (= UTF-8) encode and decode

    commit c962057ac340832c4fde60896f656a3fe3ad78a9
    Merge: e2c9782 195de32
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 05:23:03 2014 +1000
        Merge branch 'master' into unicode, resolving conflict on py/obj.h

    commit e2c9782a65eb57f481d441d40161de427e1940ba
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 05:05:57 2014 +1000
        More whitespace fixups

    commit 086a2a0f57afbc1f731697fd5d3a0cbbb80e5418
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 05:04:20 2014 +1000
        Properly implement string slicing

    commit 0d339a143e2b6442366145e7f3d64aada293eaa0
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 02:24:11 2014 +1000
        Support slicing in str_index_to_ptr, and fix a bounds error

    commit 24371c7267d360e77cf5eabc2e8ce9a73d2ee0da
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 02:10:22 2014 +1000
        Break out index-to-pointer calculation into a function

    commit 616c24ac014c3ca56008428c506034dd1bfff7a8
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 02:03:11 2014 +1000
        Add tests of string slicing, which currently fail

    commit a24d19f676fe8cc21dad512d91b826892e162a5b
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Tue Jun 10 01:56:53 2014 +1000
        Change string indexing to not precalculate the charlen, and add test for neg indexing

    commit 0bcc7ab89eafb2ae53195e94c9bea42a4e886b64
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 22:09:17 2014 +1000
        Clean up constant qstr declarations now that charlen isn't needed

    commit 5473e1a1dba2124b7b0c207f2964293cfbe80167
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 07:18:42 2014 +1000
        Remove the charlen field from strings, calculating it when required

    commit 5c1658ec71aefbdc88c261ce2e57dc7670cdc6ef
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 07:11:27 2014 +1000
        Get rid of mp_obj_str_get_data_len() which was used in only one place

    commit a019ba968b4e8daf7f3674f63c5cc400e304c509
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 06:58:26 2014 +1000
        Add a unichar_charlen() function to calculate length-in-characters from length-in-bytes

    commit 44b0d5cff846ba487c526ed95be1b3d1cd3d762a
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 06:32:44 2014 +1000
        Use utf8_get/next_char in building up a string's repr

    commit 30d1bad33f7af90f1971987c39864c8fcf3f5c21
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 06:10:45 2014 +1000
        Make utf8_get_char() and utf8_next_char() actually do what their names say

    commit bc990dad9afb8ec112f5e7f7f79d5ab415da0e72
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sun Jun 8 02:10:59 2014 +1000
        Revert "Add PEP 393-flags to strings and stub usage."
        This reverts commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba.

    commit f9bebb28ad52467f2f2d7a752bb033296b6c2f9b
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 15:41:48 2014 +1000
        Whitespace fixes

    commit 279de0c8eb3cb186914799ccc5ee94ea97f56de4
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 15:28:35 2014 +1000
        Formatting/layout improvements - introduce macros for UTF-8 byte detection, add braces. No functional changes.

    commit f1911f53d56da809c97b07245f5728a419e8fb30
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 11:56:02 2014 +1000
        Make chr() Unicode-aware

    commit f51ad737b48ac04c161197a4012821d50885c4c7
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 11:44:07 2014 +1000
        Make a string's repr Unicode-aware

    commit 01bd68684611585d437982dccdf05b33cbedc630
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 11:33:43 2014 +1000
        Expand the Unicode tests

    commit 7bc91904f899f8012089fc14a06495680a51e590
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 11:27:30 2014 +1000
        Record byte lengths for byte strings

    commit bb132120717cf176dcfb26f87fa309378f76ab5f
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 11:25:06 2014 +1000
        Make ord() Unicode-aware

    commit 03f0cbe9051b62192be97b59f84f63f9216668bf
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 10:24:35 2014 +1000
        Retain characters as UTF-8 encoded Unicode

    commit e924659b85c001916a5ff7f4d1d8b3ebe2bf0c2f
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 08:37:27 2014 +1000
        Add support for \u and \U escapes, but not \N (with explanatory comment)

    commit 231031ac5f0346e4ffcf9c4abec2bd33f566232c
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Sat Jun 7 05:09:35 2014 +1000
        Add character length to qstr

    commit 6df1b946fb17d8d5df3d91b21cde627c3d4556a8
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Fri Jun 6 13:48:36 2014 +1000
        Add test of UTF-8 encoded source file resulting in properly formed string

    commit 16429b81a8483cf25865ed11afd81a7d9c253c26
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Fri Jun 6 13:44:15 2014 +1000
        Make len(s) return character length (even though creation's still buggy)

    commit cd2cf6663cc47831dbc97819ad5c50ad33f939d3
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Fri Jun 6 13:15:36 2014 +1000
        HACK - When indexing a qstr, count its charlen. Stupidly inefficient but POC. All tests pass now, though string creation is still buggy.

    commit 47c234584d3358dfa6b4003d5e7264105d17b8f7
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Fri Jun 6 13:15:32 2014 +1000
        objstr: Record character length separately from byte length
        CAUTION: Buggy, may crash stuff - qstr needs equivalent functionality too

    commit b0f41c72af27d3b361027146025877b3d7e8785c
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Fri Jun 6 05:37:36 2014 +1000
        Beginnings of UTF-8 support - construct strings from that many UTF-8-encoded chars, and subscript bytes the same way

    commit 89452be641674601e9bfce86dc71c17c3140a6cf
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Fri Jun 6 05:28:47 2014 +1000
        Update comments - now aiming for UTF-8 rather than PEP 393 strings

    commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba
    Author: Chris Angelico <rosuav@gmail.com>
    Date: Wed Jun 4 05:28:12 2014 +1000
        Add PEP 393-flags to strings and stub usage.
        The test suite all passes, but nothing has actually been changed.
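Several of the squashed commits revolve around storing strings as UTF-8 and deriving the character length from the byte length on demand, with macros for UTF-8 byte detection. A minimal sketch of that calculation follows. The names `SKETCH_IS_CONT_BYTE` and `sketch_charlen` are hypothetical and are not the macros or the unichar_charlen() function from the tree; the sketch assumes well-formed UTF-8 and does no validation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: true for UTF-8 continuation bytes (binary 10xxxxxx). */
#define SKETCH_IS_CONT_BYTE(b) (((b) & 0xC0) == 0x80)

/* Count characters in s[0..len) by counting the bytes that start one.
 * Every character has exactly one non-continuation (lead) byte. */
size_t sketch_charlen(const unsigned char *s, size_t len) {
    assert(s != NULL);
    size_t chars = 0;
    for (size_t i = 0; i < len; i++) {
        if (!SKETCH_IS_CONT_BYTE(s[i])) {
            chars++;
        }
    }
    return chars;
}
```

Computing the count this way costs a pass over the bytes on each call, which is the trade the log describes when the stored charlen field was removed in favour of calculating it when required.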
2014-06-27 | objstrunicode: Complete copy of objstr, to be patched for unicode support. | Paul Sokolovsky