4 files changed, 203 insertions, 80 deletions
diff --git a/docs/reference/constrained.rst b/docs/reference/constrained.rst
index 7c1b6a3eb..14286aa26 100644
--- a/docs/reference/constrained.rst
+++ b/docs/reference/constrained.rst
@@ -119,10 +119,10 @@ symbols that have already been defined, e.g. ``1 << BIT``.
 
 Where there is a substantial volume of constant data and the platform supports
 execution from Flash, RAM may be saved as follows. The data should be located in
-Python modules and frozen as bytecode. The data must be defined as ``bytes``
-objects. The compiler 'knows' that ``bytes`` objects are immutable and ensures
+Python modules and frozen as bytecode. The data must be defined as `bytes`
+objects. The compiler 'knows' that `bytes` objects are immutable and ensures
 that the objects remain in flash memory rather than being copied to RAM. The
-``ustruct`` module can assist in converting between ``bytes`` types and other
+`ustruct` module can assist in converting between `bytes` types and other
 Python built-in types.
 
 When considering the implications of frozen bytecode, note that in Python
@@ -185,7 +185,7 @@ a file it will save RAM if this is done in a piecemeal fashion. Rather than
 creating a large string object, create a substring and feed it to the stream
 before dealing with the next.
 
-The best way to create dynamic strings is by means of the string ``format``
+The best way to create dynamic strings is by means of the string `format`
 method:
 
 .. code::
@@ -226,26 +226,26 @@ function ``foo()``:
     foo(b'\1\2\xff')
 
 In the first call a tuple of integers is created in RAM. The second efficiently
-creates a ``bytes`` object consuming the minimum amount of RAM. If the module
-were frozen as bytecode, the ``bytes`` object would reside in flash.
+creates a `bytes` object consuming the minimum amount of RAM. If the module
+were frozen as bytecode, the `bytes` object would reside in flash.
 
 **Strings Versus Bytes**
 
 Python3 introduced Unicode support. This introduced a distinction between a
 string and an array of bytes. MicroPython ensures that Unicode strings take no
 additional space so long as all characters in the string are ASCII (i.e. have
-a value < 126). If values in the full 8-bit range are required ``bytes`` and
-``bytearray`` objects can be used to ensure that no additional space will be
-required. Note that most string methods (e.g. ``strip()``) apply also to ``bytes``
+a value < 128). If values in the full 8-bit range are required `bytes` and
+`bytearray` objects can be used to ensure that no additional space will be
+required. Note that most string methods (e.g. :meth:`str.strip()`) apply also to `bytes`
 instances so the process of eliminating Unicode can be painless.
 
 .. code::
 
-    s = 'the quick brown fox'  # A string instance
-    b = b'the quick brown fox'  # a bytes instance
+    s = 'the quick brown fox'   # A string instance
+    b = b'the quick brown fox'  # A bytes instance
 
-Where it is necessary to convert between strings and bytes the string ``encode``
-and the bytes ``decode`` methods can be used. Note that both strings and bytes
+Where it is necessary to convert between strings and bytes the :meth:`str.encode`
+and the :meth:`bytes.decode` methods can be used. Note that both strings and bytes
 are immutable. Any operation which takes as input such an object and produces
 another implies at least one RAM allocation to produce the result. In the
 second line below a new bytes object is allocated. This would also occur if ``foo``
@@ -258,10 +258,10 @@ were a string.
 
 **Runtime compiler execution**
 
-The Python keywords ``eval`` and ``exec`` invoke the compiler at runtime, which
-requires significant amounts of RAM. Note that the ``pickle`` library employs
-``exec``. It may be more RAM efficient to use the ``json`` library for object
-serialisation.
+The Python functions `eval` and `exec` invoke the compiler at runtime, which
+requires significant amounts of RAM. Note that the `pickle` library from
+`micropython-lib` employs `exec`. It may be more RAM efficient to use the
+`ujson` library for object serialisation.
 
 **Storing strings in flash**
 
@@ -300,7 +300,7 @@ from a fixed size pool known as the heap. When the object goes out of scope (in
 other words becomes inaccessible to code) the redundant object is known as
 "garbage". A process known as "garbage collection" (GC) reclaims that memory,
 returning it to the free heap. This process runs automatically, however it can
-be invoked directly by issuing ``gc.collect()``.
+be invoked directly by issuing `gc.collect()`.
 
 The discourse on this is somewhat involved. For a 'quick fix' issue the
 following periodically:
@@ -332,7 +332,7 @@ Reporting
 ~~~~~~~~~
 
 A number of library functions are available to report on memory allocation and
-to control GC. These are to be found in the ``gc`` and ``micropython`` modules.
+to control GC. These are to be found in the `gc` and `micropython` modules.
 The following example may be pasted at the REPL (``ctrl e`` to enter paste mode,
 ``ctrl d`` to run it).
 
@@ -357,17 +357,17 @@ The following example may be pasted at the REPL (``ctrl e`` to enter paste mode,
 
 Methods employed above:
 
-* ``gc.collect()`` Force a garbage collection. See footnote.
-* ``micropython.mem_info()`` Print a summary of RAM utilisation.
-* ``gc.mem_free()`` Return the free heap size in bytes.
-* ``gc.mem_alloc()`` Return the number of bytes currently allocated.
+* `gc.collect()` Force a garbage collection. See footnote.
+* `micropython.mem_info()` Print a summary of RAM utilisation.
+* `gc.mem_free()` Return the free heap size in bytes.
+* `gc.mem_alloc()` Return the number of bytes currently allocated.
 * ``micropython.mem_info(1)`` Print a table of heap utilisation (detailed below).
 
 The numbers produced are dependent on the platform, but it can be seen that
 declaring the function uses a small amount of RAM in the form of bytecode
 emitted by the compiler (the RAM used by the compiler has been reclaimed).
 Running the function uses over 10KiB, but on return ``a`` is garbage because it
-is out of scope and cannot be referenced. The final ``gc.collect()`` recovers
+is out of scope and cannot be referenced. The final `gc.collect()` recovers
 that memory.
 
 The final output produced by ``micropython.mem_info(1)`` will vary in detail but
@@ -394,7 +394,7 @@ line of the heap dump represents 0x400 bytes or 1KiB of RAM.
 Control of Garbage Collection
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-A GC can be demanded at any time by issuing ``gc.collect()``. It is advantageous
+A GC can be demanded at any time by issuing `gc.collect()`. It is advantageous
 to do this at intervals, firstly to pre-empt fragmentation and secondly for
 performance. A GC can take several milliseconds but is quicker when there is
 little work to do (about 1ms on the Pyboard). An explicit call can minimise that
@@ -417,7 +417,7 @@ occupied.
 In general modules should instantiate data objects at runtime using constructors
 or other initialisation functions. The reason is that if this occurs on
 initialisation the compiler may be starved of RAM when subsequent modules are
-imported. If modules do instantiate data on import then ``gc.collect()`` issued
+imported. If modules do instantiate data on import then `gc.collect()` issued
 after the import will ameliorate the problem.
 
 String Operations
@@ -444,13 +444,13 @@ RAM usage and speed.
 
 Where variables are required whose size is neither a byte nor a machine word
 there are standard libraries which can assist in storing these efficiently and
-in performing conversions. See the ``array``, ``ustruct`` and ``uctypes``
+in performing conversions. See the `array`, `ustruct` and `uctypes`
 modules.
 
 Footnote: gc.collect() return value
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-On Unix and Windows platforms the ``gc.collect()`` method returns an integer
+On Unix and Windows platforms the `gc.collect()` method returns an integer
 which signifies the number of distinct memory regions that were reclaimed in the
 collection (more precisely, the number of heads that were turned into frees). For
 efficiency reasons bare metal ports do not return this value.
diff --git a/docs/reference/glossary.rst b/docs/reference/glossary.rst
new file mode 100644
index 000000000..98979afa9
--- /dev/null
+++ b/docs/reference/glossary.rst
@@ -0,0 +1,105 @@
+Glossary
+========
+
+.. glossary::
+
+    baremetal
+        A system without a (full-fledged) OS, like an :term:`MCU`. When
+        running on a baremetal system, MicroPython effectively becomes
+        its user-facing OS with a command interpreter (REPL).
+
+    board
+        A PCB (printed circuit board). Oftentimes, the term is used to denote
+        a particular model of an :term:`MCU` system. Sometimes, it is used to
+        refer to a :term:`MicroPython port` to a particular board (and then
+        may also refer to "boardless" ports like the
+        :term:`Unix port <MicroPython Unix port>`).
+
+    CPython
+        CPython is the reference implementation of the Python programming
+        language, and the most well-known one, which most people
+        run. It is however one of many implementations (among which are
+        Jython, IronPython, PyPy, and many more, including MicroPython).
+        As there is no formal specification of the Python language, only
+        the CPython documentation, it is not always easy to draw a line
+        between Python the language and CPython, its particular
+        implementation. This however leaves more freedom for other
+        implementations. For example, MicroPython does a lot of things
+        differently than CPython, while still aspiring to be a Python
+        language implementation.
+
+    GPIO
+        General-purpose input/output. The simplest means to control
+        electrical signals. With GPIO, a user can configure a hardware
+        signal pin to be either an input or an output, and set or get
+        its digital signal value (logical "0" or "1"). MicroPython
+        abstracts GPIO access using the :class:`machine.Pin` and
+        :class:`machine.Signal` classes.
+
+    GPIO port
+        A group of :term:`GPIO` pins, usually based on hardware
+        properties of these pins (e.g. controllable by the same
+        register).
+
+    MCU
+        Microcontroller. Microcontrollers usually have far fewer resources
+        than a full-fledged computing system, but are smaller, cheaper and
+        require much less power. MicroPython is designed to be small and
+        optimized enough to run on an average modern microcontroller.
+
+    micropython-lib
+        MicroPython is (usually) distributed as a single executable/binary
+        file with just a few built-in modules. There is no extensive standard
+        library comparable with that of :term:`CPython`. Instead, there is a
+        related, but separate project
+        `micropython-lib <https://github.com/micropython/micropython-lib>`_
+        which provides implementations for many modules from CPython's
+        standard library. However, a large subset of these modules requires a
+        POSIX-like environment (Linux, MacOS; Windows may be partially
+        supported), and thus would work or make sense only with the MicroPython
+        Unix port. Some subset of the modules is however usable on baremetal
+        ports too.
+
+        Unlike the monolithic :term:`CPython` stdlib, micropython-lib modules
+        are intended to be installed individually - either by manual
+        copying or using :term:`upip`.
+
+    MicroPython port
+        MicroPython supports different :term:`boards <board>`, RTOSes,
+        and OSes, and can be relatively easily adapted to new systems.
+        MicroPython with support for a particular system is called a
+        "port" to that system. Different ports may have widely different
+        functionality. This documentation is intended to be a reference
+        of the generic APIs available across different ports ("MicroPython
+        core"). Note that some ports may still omit some APIs described
+        here (e.g. due to resource constraints). Any such differences,
+        and port-specific extensions beyond MicroPython core functionality,
+        would be described in the separate port-specific documentation.
+
+    MicroPython Unix port
+        The Unix port is one of the major :term:`MicroPython ports <MicroPython port>`.
+        It is intended to run on POSIX-compatible operating systems, like
+        Linux, MacOS, FreeBSD, Solaris, etc. It also serves as the basis
+        of the Windows port. The importance of the Unix port lies in the fact
+        that while there are many different :term:`boards <board>`, so that
+        two random users are unlikely to have the same board, almost all modern
+        OSes have some level of POSIX compatibility, so the Unix port serves
+        as a kind of "common ground" to which any user can have access.
+        The Unix port is therefore used for initial prototyping, different kinds
+        of testing, development of machine-independent features, etc.
+        All users of MicroPython, even those who are interested only
+        in running MicroPython on :term:`MCU` systems, are recommended
+        to be familiar with the Unix (or Windows) port, as it is an important
+        productivity helper and a part of the normal MicroPython workflow.
+
+    port
+        Either :term:`MicroPython port` or :term:`GPIO port`. If not clear
+        from context, it's recommended to use a full specification like one
+        of the above.
+
+    upip
+        (Literally, "micro pip"). A package manager for MicroPython, inspired
+        by :term:`CPython`'s pip, but much smaller and with reduced functionality.
+        upip runs both on the :term:`Unix port <MicroPython Unix port>` and on
+        :term:`baremetal` ports (those which offer filesystem and networking
+        support).
diff --git a/docs/reference/index.rst b/docs/reference/index.rst
index 7a85fc5cf..4d822d6fa 100644
--- a/docs/reference/index.rst
+++ b/docs/reference/index.rst
@@ -1,17 +1,25 @@
 The MicroPython language
 ========================
 
-MicroPython aims to implement the Python 3.4 standard, and most of
-the features of MicroPython are identical to those described by the
-documentation at
-`docs.python.org <https://docs.python.org/3.4/reference/index.html>`_.
+MicroPython aims to implement the Python 3.4 standard (with selected
+features from later versions) with respect to language syntax, and most
+of the features of MicroPython are identical to those described by the
+"Language Reference" documentation at
+`docs.python.org <https://docs.python.org/3/reference/index.html>`_.
 
-Differences to standard Python as well as additional features of
-MicroPython are described in the sections here.
+The MicroPython standard library is described in the
+:ref:`corresponding chapter <micropython_lib>`. The :ref:`cpython_diffs`
+chapter describes differences between MicroPython and CPython (which
+mostly concern the standard library and types, but also some language-level
+features).
+
+This chapter describes features and peculiarities of the MicroPython
+implementation and the best practices for using them.
 
 .. toctree::
    :maxdepth: 1
 
+   glossary.rst
    repl.rst
    isr_rules.rst
    speed_python.rst
diff --git a/docs/reference/speed_python.rst b/docs/reference/speed_python.rst
index 8efba4702..279a1bbcd 100644
--- a/docs/reference/speed_python.rst
+++ b/docs/reference/speed_python.rst
@@ -1,9 +1,11 @@
-Maximising Python Speed
-=======================
+Maximising MicroPython Speed
+============================
+
+.. contents::
 
 This tutorial describes ways of improving the performance of MicroPython code.
 Optimisations involving other languages are covered elsewhere, namely the use
-of modules written in C and the MicroPython inline ARM Thumb-2 assembler.
+of modules written in C and the MicroPython inline assembler.
 
 The process of developing high performance code comprises the following stages
 which should be performed in the order listed.
@@ -17,6 +19,7 @@ Optimisation steps:
 * Improve the efficiency of the Python code.
 * Use the native code emitter.
 * Use the viper code emitter.
+* Use hardware-specific optimisations.
 
 Designing for speed
 -------------------
@@ -50,7 +53,7 @@ once only and not permitted to grow in size. This implies that the object persis
 for the duration of its use: typically it will be instantiated in a class constructor
 and used in various methods.
 
-This is covered in further detail :ref:`Controlling garbage collection <gc>` below.
+This is covered in further detail :ref:`Controlling garbage collection <controlling_gc>` below.
 
 Buffers
 ~~~~~~~
@@ -60,8 +63,8 @@ used for communication with a device. A typical driver will create the buffer in
 constructor and use it in its I/O methods which will be called repeatedly.
 
 The MicroPython libraries typically provide support for pre-allocated buffers. For
-example, objects which support stream interface (e.g., file or UART) provide ``read()``
-method which allocate new buffer for read data, but also a ``readinto()`` method
+example, objects which support the stream interface (e.g., file or UART) provide a `read()`
+method which allocates a new buffer for the data read, but also a `readinto()` method
 to read data into an existing buffer.
 
 Floating Point
@@ -79,14 +82,14 @@ Arrays
 ~~~~~~
 
 Consider the use of the various types of array classes as an alternative to lists.
-The ``array`` module supports various element types with 8-bit elements supported
-by Python's built in ``bytes`` and ``bytearray`` classes. These data structures all store
+The `array` module supports various element types with 8-bit elements supported
+by Python's built in `bytes` and `bytearray` classes. These data structures all store
 elements in contiguous memory locations. Once again to avoid memory allocation in critical
 code these should be pre-allocated and passed as arguments or as bound objects.
 
-When passing slices of objects such as ``bytearray`` instances, Python creates
+When passing slices of objects such as `bytearray` instances, Python creates
 a copy which involves allocation of the size proportional to the size of slice.
-This can be alleviated using a ``memoryview`` object. ``memoryview`` itself
+This can be alleviated using a `memoryview` object. `memoryview` itself
 is allocated on heap, but is a small, fixed-size object, regardless of the size
 of slice it points too.
 
@@ -97,7 +100,7 @@ of slice it points too.
     mv = memoryview(ba)    # small object is allocated
     func(mv[30:2000])      # a pointer to memory is passed
 
-A ``memoryview`` can only be applied to objects supporting the buffer protocol - this
+A `memoryview` can only be applied to objects supporting the buffer protocol - this
 includes arrays but not lists. Small caveat is that while memoryview object is live,
 it also keeps alive the original buffer object. So, a memoryview isn't a universal
 panacea. For instance, in the example above, if you are done with 10K buffer and
@@ -105,11 +108,11 @@ just need those bytes 30:2000 from it, it may be better to make a slice, and let
 the 10K buffer go (be ready for garbage collection), instead of making a
 long-living memoryview and keeping 10K blocked for GC.
 
-Nonetheless, ``memoryview`` is indispensable for advanced preallocated buffer
-management. ``.readinto()`` method discussed above puts data at the beginning
+Nonetheless, `memoryview` is indispensable for advanced preallocated buffer
+management. The `readinto()` method discussed above puts data at the beginning
 of buffer and fills in entire buffer. What if you need to put data in the
 middle of existing buffer? Just create a memoryview into the needed section
-of buffer and pass it to ``.readinto()``.
+of buffer and pass it to `readinto()`.
 
 Identifying the slowest section of code
 ---------------------------------------
@@ -118,8 +121,7 @@ This is a process known as profiling and is covered in textbooks and
 (for standard Python) supported by various software tools. For the type of
 smaller embedded application likely to be running on MicroPython platforms
 the slowest function or method can usually be established by judicious use
-of the timing ``ticks`` group of functions documented
-`here <http://docs.micropython.org/en/latest/pyboard/library/time.html>`_.
+of the timing ``ticks`` group of functions documented in `utime`.
 Code execution time can be measured in ms, us, or CPU cycles.
 
 The following enables any function or method to be timed by adding an
@@ -130,9 +132,9 @@ The following enables any function or method to be timed by adding an
     def timed_function(f, *args, **kwargs):
         myname = str(f).split(' ')[1]
         def new_func(*args, **kwargs):
-            t = time.ticks_us()
+            t = utime.ticks_us()
             result = f(*args, **kwargs)
-            delta = time.ticks_diff(time.ticks_us(), t)
+            delta = utime.ticks_diff(utime.ticks_us(), t)
             print('Function {} Time = {:6.3f}ms'.format(myname, delta/1000))
             return result
         return new_func
@@ -170,7 +172,7 @@ by caching the object in a local variable:
 This avoids the need repeatedly to look up ``self.ba`` and ``obj_display.framebuffer``
 in the body of the method ``bar()``.
 
-.. _gc:
+.. _controlling_gc:
 
 Controlling garbage collection
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -182,7 +184,7 @@ process known as garbage collection reclaims the memory used by these redundant
 objects and the allocation is then tried again - a process which can take several
 milliseconds.
 
-There are benefits in pre-empting this by periodically issuing ``gc.collect()``.
+There may be benefits in pre-empting this by periodically issuing `gc.collect()`.
 Firstly doing a collection before it is actually required is quicker - typically on the
 order of 1ms if done frequently. Secondly you can determine the point in code
 where this time is used rather than have a longer delay occur at random points,
@@ -190,34 +192,11 @@ possibly in a speed critical section. Finally performing collections regularly
 can reduce fragmentation in the heap. Severe fragmentation can lead to
 non-recoverable allocation failures.
 
-Accessing hardware directly
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-This comes into the category of more advanced programming and involves some knowledge
-of the target MCU. Consider the example of toggling an output pin on the Pyboard. The 
-standard approach would be to write
-
-.. code:: python
-
-    mypin.value(mypin.value() ^ 1) # mypin was instantiated as an output pin
-
-This involves the overhead of two calls to the ``Pin`` instance's ``value()``
-method. This overhead can be eliminated by performing a read/write to the relevant bit
-of the chip's GPIO port output data register (odr). To facilitate this the ``stm``
-module provides a set of constants providing the addresses of the relevant registers.
-A fast toggle of pin ``P4`` (CPU pin ``A14``) - corresponding to the green LED -
-can be performed as follows:
-
-.. code:: python
-
-    BIT14 = const(1 << 14)
-    stm.mem16[stm.GPIOA + stm.GPIO_ODR] ^= BIT14
-
 The Native code emitter
 -----------------------
 
-This causes the MicroPython compiler to emit ARM native opcodes rather than
-bytecode. It covers the bulk of the Python language so most functions will require
+This causes the MicroPython compiler to emit native CPU opcodes rather than
+bytecode. It covers the bulk of the MicroPython functionality, so most functions will require
 no adaptation (but see below). It is invoked by means of a function decorator:
 
 .. code:: python
@@ -276,7 +255,7 @@ Viper provides pointer types to assist the optimiser. These comprise
 * ``ptr32`` Points to a 32 bit machine word.
 
 The concept of a pointer may be unfamiliar to Python programmers. It has similarities
-to a Python ``memoryview`` object in that it provides direct access to data stored in memory.
+to a Python `memoryview` object in that it provides direct access to data stored in memory.
 Items are accessed using subscript notation, but slices are not supported: a pointer can return
 a single item only. Its purpose is to provide fast random access to data stored in contiguous
 memory locations - such as data stored in objects which support the buffer protocol, and
@@ -330,3 +309,34 @@ The following example illustrates the use of a ``ptr16`` cast to toggle pin X1 `
 A detailed technical description of the three code emitters may be found
 on Kickstarter here `Note 1 <https://www.kickstarter.com/projects/214379695/micro-python-python-for-microcontrollers/posts/664832>`_
 and here `Note 2 <https://www.kickstarter.com/projects/214379695/micro-python-python-for-microcontrollers/posts/665145>`_
+
+Accessing hardware directly
+---------------------------
+
+.. note::
+
+    Code examples in this section are given for the Pyboard. The techniques
+    described may, however, be applied to other MicroPython ports too.
+
+This comes into the category of more advanced programming and involves some knowledge
+of the target MCU. Consider the example of toggling an output pin on the Pyboard. The
+standard approach would be to write
+
+.. code:: python
+
+    mypin.value(mypin.value() ^ 1) # mypin was instantiated as an output pin
+
+This involves the overhead of two calls to the `Pin` instance's :meth:`~machine.Pin.value()`
+method. This overhead can be eliminated by performing a read/write to the relevant bit
+of the chip's GPIO port output data register (odr). To facilitate this the ``stm``
+module provides a set of constants providing the addresses of the relevant registers.
+A fast toggle of pin ``P4`` (CPU pin ``A14``) - corresponding to the green LED -
+can be performed as follows:
+
+.. code:: python
+
+    import machine
+    import stm
+
+    BIT14 = const(1 << 14)
+    machine.mem16[stm.GPIOA + stm.GPIO_ODR] ^= BIT14