:mod:`ure` -- simple regular expressions
========================================

.. module:: ure
   :synopsis: regular expressions

|see_cpython_module| :mod:`python:re`.

This module implements regular expression operations. The supported syntax
is a subset of that of the CPython ``re`` module (and is in fact a subset
of POSIX extended regular expressions).

Supported operators are:

``'.'``
   Match any character.

``'[...]'``
   Match set of characters. Individual characters and ranges are supported,
   including negated sets (e.g. ``[^a-c]``).

``'^'``
   Match the start of the string.

``'$'``
   Match the end of the string.

``'?'``
   Match zero or one of the previous entity.

``'*'``
   Match zero or more of the previous entity.

``'+'``
   Match one or more of the previous entity.

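``'??'``
   Non-greedy version of ``?``: match zero or one, preferring zero.

``'*?'``
   Non-greedy version of ``*``: match zero or more, preferring as few as
   possible.

``'+?'``
   Non-greedy version of ``+``: match one or more, preferring as few as
   possible.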

``'|'``
   Match either the LHS or the RHS of this operator.

``'(...)'``
   Grouping. Each group is capturing (the substring it captures can be
   accessed with the `match.group()` method).

**NOT SUPPORTED**: Counted repetitions (``{m,n}``), more advanced assertions
(``\b``, ``\B``), named groups (``(?P<name>...)``), non-capturing groups
(``(?:...)``), etc.
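
As a minimal sketch, the supported operators can be exercised as shown below.
The ``try``/``except`` import falls back to CPython's ``re`` so the snippet
also runs off-device. That fallback, the patterns and the sample strings are
illustrative assumptions, not part of this module.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

# Character set with a range, followed by "one or more" repetition.
first_run = re.match("[a-c]+", "abcd").group(0)      # "abc"

# Alternation inside a capturing group, anchored at both ends.
animal = re.match("^(cat|dog)s?$", "dogs").group(1)  # "dog"

# Greedy versus non-greedy repetition on the same input.
greedy = re.match("<.+>", "<a><b>").group(0)         # "<a><b>"
lazy = re.match("<.+?>", "<a><b>").group(0)          # "<a>"
```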


Functions
---------

.. function:: compile(regex_str, [flags])

   Compile a regular expression and return a `regex <regex>` object.

.. function:: match(regex_str, string)

   Compile *regex_str* and match it against *string*. The match is always
   attempted at the start of the string.

.. function:: search(regex_str, string)

   Compile *regex_str* and search for it in *string*. Unlike `match`, this
   scans the string for the first position at which the regex matches (which
   may still be 0 if the regex is anchored).
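
The difference between ``match`` and ``search`` can be sketched as follows.
The sample text is an assumption chosen for illustration, and the CPython
``re`` fallback is only there so the snippet runs outside MicroPython.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

text = "temp=21"

# match() only succeeds if the pattern matches at position 0.
anchored = re.match("[0-9]+", text)   # None, because text starts with "t"

# search() scans forward for the first position that matches.
found = re.search("[0-9]+", text)
digits = found.group(0)               # "21"
```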

.. function:: sub(regex_str, replace, string, count=0, flags=0)

   Compile *regex_str* and search for it in *string*, replacing all matches
   with *replace*, and returning the new string.

   *replace* can be a string or a function.  If it is a string then escape
   sequences of the form ``\<number>`` and ``\g<number>`` can be used to
   expand to the corresponding group (or an empty string for unmatched groups).
   If *replace* is a function then it must take a single argument (the match)
   and should return a replacement string.

   If *count* is specified and non-zero then substitution will stop after
   this many substitutions are made.  The *flags* argument is ignored.

   Note: availability of this function depends on the `MicroPython port`.
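
A short sketch of both replacement styles, assuming a port that provides
``sub``. The helper ``shout`` and the sample strings are hypothetical, and
the CPython ``re`` fallback is only there so the snippet runs off-device.

```python
try:
    import ure as re  # MicroPython (assumes the port provides sub())
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

# String replacement using \<number> group references.
swapped = re.sub("([a-z]+)=([0-9]+)", r"\2=\1", "x=1 y=2")   # "1=x 2=y"

# The same idea with the \g<number> form.
reversed_pair = re.sub("(a)(b)", r"\g<2>\g<1>", "ab")        # "ba"

# Function replacement: the callable receives the match object.
def shout(m):
    # Hypothetical helper: upper-case whatever was matched.
    return m.group(0).upper()

loud = re.sub("[a-z]+", shout, "ab 12 cd")                   # "AB 12 CD"
```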

.. data:: DEBUG

   Flag value: display debug information about the compiled expression.
   (Availability depends on `MicroPython port`.)


.. _regex:

Regex objects
-------------

Compiled regular expression. Instances of this class are created using
`ure.compile()`.

.. method:: regex.match(string)
            regex.search(string)
            regex.sub(replace, string, count=0, flags=0)

   Similar to the module-level functions :func:`match`, :func:`search`
   and :func:`sub`.
   Using these methods is (much) more efficient when the same regex is
   applied to multiple strings.
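
A minimal sketch of compiling once and reusing the regex object across
several strings. The input list is made up for illustration, ``sub`` is
assumed to be available on the port, and the CPython ``re`` fallback is
only there so the snippet runs off-device.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

number = re.compile("[0-9]+")   # compiled once, reused below

lines = ["id 7", "none", "id 42"]
found = []
for line in lines:
    m = number.search(line)
    if m:                        # search() returns None when nothing matches
        found.append(m.group(0))
# found is now ["7", "42"]

# count is passed positionally; here at most two substitutions are made.
masked = number.sub("#", "a1 b22 c333", 2)   # "a# b# c333"
```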

.. method:: regex.split(string, max_split=-1)

   Split *string* using the regex. If *max_split* is given, it specifies the
   maximum number of splits to perform. Returns a list of strings (with at
   most *max_split+1* elements if *max_split* is specified).
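
Splitting can be sketched as below. The separator pattern and input are
assumptions for illustration, the limit is passed positionally, and the
CPython ``re`` fallback is only there so the snippet runs off-device.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

comma = re.compile(", *")   # a comma followed by any number of spaces

parts = comma.split("a, b,c,  d")        # ["a", "b", "c", "d"]

# With a limit of 1, only the first separator is used,
# so the result has at most two elements.
limited = comma.split("a, b,c,  d", 1)   # ["a", "b,c,  d"]
```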

Match objects
-------------

Match objects as returned by `match()` and `search()` methods, and passed
to the replacement function in `sub()`.

.. method:: match.group([index])

   Return the matching (sub)string. *index* is 0 for the entire match, and
   1 and above for each capturing group. Only numeric groups are supported.

.. method:: match.groups()

   Return a tuple containing all the substrings of the groups of the match.

   Note: availability of this method depends on `MicroPython port`.
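
A small sketch of reading groups from a match. Because ``groups()`` may be
missing on some ports, the example falls back to individual ``group()``
calls. The pattern and input are illustrative assumptions, and the CPython
``re`` fallback is only there so the snippet runs off-device.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

m = re.match("([a-z]+)-([0-9]+)", "abc-123")

whole = m.group(0)    # entire match: "abc-123"
name = m.group(1)     # first capturing group: "abc"
num = m.group(2)      # second capturing group: "123"

# groups() is port-dependent, so guard against it being absent.
try:
    both = m.groups()
except AttributeError:
    both = (m.group(1), m.group(2))
# both is ("abc", "123") either way
```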

.. method:: match.start([index])
            match.end([index])

   Return the index in the original string of the start or end of the
   substring that was matched.  When *index* is omitted the positions
   refer to the entire match; otherwise they refer to the selected group.

   Note: availability of these methods depends on `MicroPython port`.

.. method:: match.span([index])

   Return the 2-tuple ``(match.start(index), match.end(index))``.

   Note: availability of this method depends on `MicroPython port`.
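
The position methods can be sketched as follows, assuming a port that
provides ``start()``, ``end()`` and ``span()``. The ``hasattr`` guard, the
pattern and the input are illustrative assumptions, and the CPython ``re``
fallback is only there so the snippet runs off-device.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption: only so the sketch runs anywhere)

m = re.search("w=([0-9]+)", "x w=42")

positions = None
if hasattr(m, "span"):           # these methods are port-dependent
    positions = {
        "whole": m.span(),       # (2, 6): covers "w=42"
        "group1": m.span(1),     # (4, 6): covers "42"
        "start": m.start(1),     # 4
        "end": m.end(1),         # 6
    }
```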