<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/android/binder_alloc.c, branch v6.9.5</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.9.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.9.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-03-07T22:22:32Z</updated>
<entry>
<title>binder: remove redundant variable page_addr</title>
<updated>2024-03-07T22:22:32Z</updated>
<author>
<name>Colin Ian King</name>
<email>colin.i.king@intel.com</email>
</author>
<published>2024-03-07T22:15:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=367b3560e10bbae3660d8ba4d0a7cc92170d8398'/>
<id>urn:sha1:367b3560e10bbae3660d8ba4d0a7cc92170d8398</id>
<content type='text'>
Variable page_addr is being assigned a value that is never read. The
variable is redundant and can be removed.

Cleans up clang scan build warning:
warning: Value stored to 'page_addr' is never read [deadcode.DeadStores]
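
A minimal user-space sketch of the pattern the analyzer flags (illustration
only, not the driver code; all names here are made up): the value written to
page_addr is never read afterwards, so both the store and the variable can go.

  #include &lt;stdio.h&gt;

  /* Hypothetical example of a deadcode.DeadStores finding: the value
   * written to page_addr is never read before the function returns. */
  static void free_buffer_pages(unsigned long user_data)
  {
          unsigned long page_addr;

          page_addr = user_data &amp; ~0xfffUL;       /* dead store */
          printf("pages freed\n");
  }

  int main(void)
  {
          free_buffer_pages(0x7f00dead0000UL);
          return 0;
  }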

Signed-off-by: Colin Ian King &lt;colin.i.king@intel.com&gt;
Fixes: 162c79731448 ("binder: avoid user addresses in debug logs")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202312060851.cudv98wG-lkp@intel.com/
Acked-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Link: https://lore.kernel.org/r/20240307221505.101431-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'char-misc-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc</title>
<updated>2024-01-18T00:47:17Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-18T00:47:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=296455ade1fdcf5f8f8c033201633b60946c589a'/>
<id>urn:sha1:296455ade1fdcf5f8f8c033201633b60946c589a</id>
<content type='text'>
Pull char/misc and other driver updates from Greg KH:
 "Here is the big set of char/misc and other driver subsystem changes
  for 6.8-rc1.

  Other than lots of binder driver changes (as you can see by the merge
  conflicts) included in here are:

   - lots of iio driver updates and additions

   - spmi driver updates

   - eeprom driver updates

   - firmware driver updates

   - ocxl driver updates

   - mhi driver updates

   - w1 driver updates

   - nvmem driver updates

   - coresight driver updates

   - platform driver remove callback api changes

   - tags.sh script updates

   - bus_type constant marking cleanups

   - lots of other small driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (341 commits)
  android: removed duplicate linux/errno
  uio: Fix use-after-free in uio_open
  drivers: soc: xilinx: add check for platform
  firmware: xilinx: Export function to use in other module
  scripts/tags.sh: remove find_sources
  scripts/tags.sh: use -n to test archinclude
  scripts/tags.sh: add local annotation
  scripts/tags.sh: use more portable -path instead of -wholename
  scripts/tags.sh: Update comment (addition of gtags)
  firmware: zynqmp: Convert to platform remove callback returning void
  firmware: turris-mox-rwtm: Convert to platform remove callback returning void
  firmware: stratix10-svc: Convert to platform remove callback returning void
  firmware: stratix10-rsu: Convert to platform remove callback returning void
  firmware: raspberrypi: Convert to platform remove callback returning void
  firmware: qemu_fw_cfg: Convert to platform remove callback returning void
  firmware: mtk-adsp-ipc: Convert to platform remove callback returning void
  firmware: imx-dsp: Convert to platform remove callback returning void
  firmware: coreboot_table: Convert to platform remove callback returning void
  firmware: arm_scpi: Convert to platform remove callback returning void
  firmware: arm_scmi: Convert to platform remove callback returning void
  ...
</content>
</entry>
<entry>
<title>list_lru: allow explicit memcg and NUMA node selection</title>
<updated>2023-12-12T18:57:01Z</updated>
<author>
<name>Nhat Pham</name>
<email>nphamcs@gmail.com</email>
</author>
<published>2023-11-30T19:40:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0a97c01cd20bb96359d8c9dedad92a061ed34e0b'/>
<id>urn:sha1:0a97c01cd20bb96359d8c9dedad92a061ed34e0b</id>
<content type='text'>
Patch series "workload-specific and memory pressure-driven zswap
writeback", v8.

There are currently several issues with zswap writeback:

1. There is only a single global LRU for zswap, making it impossible to
   perform workload-specific shrinking - a memcg under memory pressure
   cannot determine which pages in the pool it owns, and often ends up
   writing pages from other memcgs. This issue has been previously
   observed in practice and mitigated by simply disabling
   memcg-initiated shrinking:

   https://lore.kernel.org/all/20230530232435.3097106-1-nphamcs@gmail.com/T/#u

   But this solution leaves a lot to be desired, as we still do not
   have an avenue for a memcg to free up its own memory locked up in
   the zswap pool.

2. We only shrink the zswap pool when the user-defined limit is hit.
   This means that if we set the limit too high, cold data that are
   unlikely to be used again will reside in the pool, wasting precious
   memory. It is hard to predict how much zswap space will be needed
   ahead of time, as this depends on the workload (specifically, on
   factors such as memory access patterns and compressibility of the
   memory pages).

This patch series solves these issues by separating the global zswap LRU
into per-memcg and per-NUMA LRUs, and performing workload-specific (i.e.
memcg- and NUMA-aware) zswap writeback under memory pressure.  The new
shrinker does not have any parameter that must be tuned by the user, and
can be opted in or out on a per-memcg basis.

As a proof of concept, we ran the following synthetic benchmark: build the
Linux kernel in a memory-limited cgroup, and allocate some cold data in
tmpfs to see if the shrinker could write them out and improve the overall
performance.  Depending on the amount of cold data generated, we observe
a 14% to 35% reduction in kernel CPU time used in the kernel builds.


This patch (of 6):

The interface of list_lru is based on the assumption that the list node
and the data it represents are allocated on the correct node/memcg.
While this assumption is valid for existing slab object LRUs such as
dentries and inodes, it is undocumented, and rather inflexible for
certain potential list_lru users (such as the upcoming zswap shrinker and
the THP shrinker).  It has caused us a lot of issues during our
development.

This patch changes the list_lru interface so that the caller must
explicitly specify the NUMA node and memcg when adding and removing
objects.  The old list_lru_add() and list_lru_del() are renamed to
list_lru_add_obj() and list_lru_del_obj(), respectively.

It also extends the list_lru API with a new function, list_lru_putback,
which undoes a previous list_lru_isolate call.  Unlike list_lru_add, it
does not increment the LRU node count (as list_lru_isolate does not
decrement the node count).  list_lru_putback also allows for explicit
memcg and NUMA node selection.
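
For illustration, a rough self-contained sketch of the calling convention
described above (mock types and stub bodies; the real declarations live in
include/linux/list_lru.h and may differ in detail):

  #include &lt;stdbool.h&gt;
  #include &lt;stdio.h&gt;

  /* Mock types standing in for the kernel ones, only to show the shape
   * of the calls. */
  struct list_head { struct list_head *next, *prev; };
  struct list_lru { int dummy; };
  struct mem_cgroup { int dummy; };

  /* Old behaviour under its new name: nid/memcg derived from the item. */
  static bool list_lru_add_obj(struct list_lru *lru, struct list_head *item)
  {
          (void)lru; (void)item;
          return true;
  }

  /* New interface: the caller passes the NUMA node and memcg explicitly. */
  static bool list_lru_add(struct list_lru *lru, struct list_head *item,
                           int nid, struct mem_cgroup *memcg)
  {
          (void)lru; (void)item; (void)nid; (void)memcg;
          return true;
  }

  int main(void)
  {
          struct list_lru lru = { 0 };
          struct list_head node = { &amp;node, &amp;node };
          struct mem_cgroup memcg = { 0 };

          list_lru_add_obj(&amp;lru, &amp;node);          /* slab objects: as before */
          list_lru_add(&amp;lru, &amp;node, 0, &amp;memcg);   /* e.g. zswap: explicit nid/memcg */
          puts("ok");
          return 0;
  }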

Link: https://lkml.kernel.org/r/20231130194023.4102148-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20231130194023.4102148-2-nphamcs@gmail.com
Signed-off-by: Nhat Pham &lt;nphamcs@gmail.com&gt;
Suggested-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Tested-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Domenico Cerasuolo &lt;cerasuolodomenico@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Seth Jennings &lt;sjenning@redhat.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Vitaly Wool &lt;vitaly.wool@konsulko.com&gt;
Cc: Yosry Ahmed &lt;yosryahmed@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>binder: switch alloc-&gt;mutex to spinlock_t</title>
<updated>2023-12-05T00:23:41Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7710e2cca32e7f3958480e8bd44f50e29d0c2509'/>
<id>urn:sha1:7710e2cca32e7f3958480e8bd44f50e29d0c2509</id>
<content type='text'>
The alloc-&gt;mutex is a highly contended lock that causes performance
issues on Android devices. When a low-priority task is given this lock
and it sleeps, it becomes difficult for the task to wake up and complete
its work. This delays other tasks that are also waiting on the mutex.

The problem gets worse when there is memory pressure in the system,
because this increases the contention on the alloc-&gt;mutex while the
shrinker reclaims binder pages.

Switching to a spinlock helps to keep the waiters running and avoids the
overhead of waking up tasks. This significantly improves the transaction
latency when the problematic scenario occurs.
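
As a rough user-space analogy only (pthread locks instead of kernel locks;
kernel spinlocks additionally forbid sleeping while held), the conversion
amounts to guarding the short critical section with a spinning lock so that
contending tasks keep running instead of being put to sleep:

  #include &lt;pthread.h&gt;
  #include &lt;stdio.h&gt;

  static pthread_spinlock_t alloc_lock;   /* previously a sleeping mutex */
  static long allocated;

  static void *binder_client(void *arg)
  {
          (void)arg;
          for (int i = 0; i &lt; 100000; i++) {
                  pthread_spin_lock(&amp;alloc_lock);
                  allocated++;             /* short, non-sleeping work only */
                  pthread_spin_unlock(&amp;alloc_lock);
          }
          return NULL;
  }

  int main(void)
  {
          pthread_t a, b;

          pthread_spin_init(&amp;alloc_lock, PTHREAD_PROCESS_PRIVATE);
          pthread_create(&amp;a, NULL, binder_client, NULL);
          pthread_create(&amp;b, NULL, binder_client, NULL);
          pthread_join(a, NULL);
          pthread_join(b, NULL);
          printf("allocated = %ld\n", allocated);
          pthread_spin_destroy(&amp;alloc_lock);
          return 0;
  }

(Build with -pthread. The numbers below come from the actual binder stress
test, not from this toy.)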

The performance impact of this patchset was measured by stress-testing
the binder alloc contention. In this test, several clients of different
priorities send thousands of transactions of different sizes to a single
server. In parallel, pages get reclaimed using the shrinker's debugfs.

The test was run on a Pixel 8, a Pixel 6 and a QEMU machine. The results
were similar on all three devices:

after:
  | sched  | prio | average | max       | min     |
  |--------+------+---------+-----------+---------|
  | fifo   |   99 | 0.135ms |   1.197ms | 0.022ms |
  | fifo   |   01 | 0.136ms |   5.232ms | 0.018ms |
  | other  |  -20 | 0.180ms |   7.403ms | 0.019ms |
  | other  |   19 | 0.241ms |  58.094ms | 0.018ms |

before:
  | sched  | prio | average | max       | min     |
  |--------+------+---------+-----------+---------|
  | fifo   |   99 | 0.350ms | 248.730ms | 0.020ms |
  | fifo   |   01 | 0.357ms | 248.817ms | 0.024ms |
  | other  |  -20 | 0.399ms | 249.906ms | 0.020ms |
  | other  |   19 | 0.477ms | 297.756ms | 0.022ms |

The key metrics above are the average and max latencies (wall time).
These improvements should roughly translate to p95-p99 latencies on real
workloads. The response time is up to 200x faster in these scenarios and
there is no penalty in the regular path.

Note that it is only possible to convert this lock after a series of
changes made by previous patches. These mainly include refactoring the
sections that might sleep and changing the locking order with the
mmap_lock, among others.

Reviewed-by: Alice Ryhl &lt;aliceryhl@google.com&gt;
Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-29-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>binder: reverse locking order in shrinker callback</title>
<updated>2023-12-05T00:23:41Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e50f4e6cc9bfaca655d3b6a3506d27cf2caa1d40'/>
<id>urn:sha1:e50f4e6cc9bfaca655d3b6a3506d27cf2caa1d40</id>
<content type='text'>
The locking order currently requires the alloc-&gt;mutex to be acquired
first followed by the mmap lock. However, the alloc-&gt;mutex is converted
into a spinlock in subsequent commits so the order needs to be reversed
to avoid nesting the sleeping mmap lock under the spinlock.

The shrinker's callback binder_alloc_free_page() is the only place that
needs to be reordered since other functions have been refactored and no
longer nest these locks.

Some minor cosmetic changes are also included in this patch.
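
A hedged user-space sketch of the resulting order (pthread locks as stand-ins,
made-up names): the sleeping lock is always taken first, with a trylock so the
reclaim-like path can simply back off, and the spinning lock is only ever
nested inside it:

  #include &lt;pthread.h&gt;
  #include &lt;stdbool.h&gt;
  #include &lt;stdio.h&gt;

  /* mmap_lock stand-in (may sleep) and alloc-&gt;lock stand-in (spins). */
  static pthread_mutex_t mmap_lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_spinlock_t alloc_lock;

  static bool shrink_one_page(void)
  {
          if (pthread_mutex_trylock(&amp;mmap_lock))
                  return false;                   /* contended: skip, don't sleep */
          pthread_spin_lock(&amp;alloc_lock);         /* spinlock nested inside */
          /* ... unmap the page and give it back ... */
          pthread_spin_unlock(&amp;alloc_lock);
          pthread_mutex_unlock(&amp;mmap_lock);
          return true;
  }

  int main(void)
  {
          pthread_spin_init(&amp;alloc_lock, PTHREAD_PROCESS_PRIVATE);
          printf("page freed: %s\n", shrink_one_page() ? "yes" : "no");
          pthread_spin_destroy(&amp;alloc_lock);
          return 0;
  }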

Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-28-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>binder: avoid user addresses in debug logs</title>
<updated>2023-12-05T00:23:40Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=162c79731448a5a052e93af7753df579dfe0bf7a'/>
<id>urn:sha1:162c79731448a5a052e93af7753df579dfe0bf7a</id>
<content type='text'>
Prefer logging vma offsets instead of addresses or simply drop the debug
log altogether if not useful. Note this covers the instances affected by
the switch to store addresses as unsigned long. However, there are other
sections in the driver that could do the same.
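
A minimal sketch of the idea (user-space, made-up names): report where the
buffer sits inside its vma rather than the raw user address.

  #include &lt;stdio.h&gt;

  static void log_buffer(unsigned long vma_start, unsigned long buffer_addr,
                         unsigned long size)
  {
          /* before: printing buffer_addr would leak a user-space address */
          printf("buffer at offset %lu, size %lu\n",
                 buffer_addr - vma_start, size);
  }

  int main(void)
  {
          log_buffer(0x7f1200000000UL, 0x7f1200003000UL, 128);
          return 0;
  }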

Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Reviewed-by: Alice Ryhl &lt;aliceryhl@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-27-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>binder: refactor binder_delete_free_buffer()</title>
<updated>2023-12-05T00:23:40Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f07b83a48e944c8a1cc1e9f6703fae5e34df2ba4'/>
<id>urn:sha1:f07b83a48e944c8a1cc1e9f6703fae5e34df2ba4</id>
<content type='text'>
Skip the freelist call immediately when it is not needed, instead of
continuing with the pointless checks. Also, drop the debug logs that we
don't really need.

Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Reviewed-by: Alice Ryhl &lt;aliceryhl@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-26-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>binder: collapse print_binder_buffer() into caller</title>
<updated>2023-12-05T00:23:40Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8e905217c4543af9cf1754809846157a7dbbb261'/>
<id>urn:sha1:8e905217c4543af9cf1754809846157a7dbbb261</id>
<content type='text'>
The code in print_binder_buffer() is quite small so it can be collapsed
into its single caller binder_alloc_print_allocated().

No functional change in this patch.

Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Reviewed-by: Alice Ryhl &lt;aliceryhl@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-25-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>binder: document the final page calculation</title>
<updated>2023-12-05T00:23:40Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=67dcc880780569ec40391cae4d8299adc1e7a44e'/>
<id>urn:sha1:67dcc880780569ec40391cae4d8299adc1e7a44e</id>
<content type='text'>
The code to determine the page range for binder_lru_freelist_del() is
quite obscure. It leverages the buffer_size calculated before doing an
oversized buffer split. This is used to figure out if the last page is
being shared with another active buffer. If so, the page gets trimmed
out of the range as it has been previously removed from the freelist.

This would be equivalent to getting the start page of the next in-use
buffer explicitly. However, the code for this is much larger as we can
see in the binder_free_buf_locked() routine. Instead, let's settle on
documenting the tricky step and using better names for now.

I believe an ideal solution would be to count the binder_page-&gt;users to
determine when a page should be added or removed from the freelist.
However, this is a much bigger change than what I'm willing to risk at
this time.
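
For illustration, a simplified self-contained sketch of the calculation being
documented (made-up offsets and names; the real code works against the buffer
list): the pages spanned by the new buffer are pulled off the freelist, except
that the final page is dropped from the range when the next in-use buffer
starts on that same page, since that page was taken off the freelist earlier.

  #include &lt;stdio.h&gt;

  #define PAGE_SIZE          4096UL
  #define PAGE_ALIGN_DOWN(x) ((x) &amp; ~(PAGE_SIZE - 1))
  #define PAGE_ALIGN(x)      (((x) + PAGE_SIZE - 1) &amp; ~(PAGE_SIZE - 1))

  int main(void)
  {
          unsigned long buf_start  = 0x1100;  /* offset of the new buffer */
          unsigned long buf_size   = 0x2000;  /* size of the new buffer */
          unsigned long next_inuse = 0x3200;  /* offset of the next in-use buffer */

          unsigned long start_page = PAGE_ALIGN_DOWN(buf_start);
          unsigned long end_page   = PAGE_ALIGN(buf_start + buf_size);

          /* The last page is shared with an in-use buffer, so it was
           * already removed from the freelist: trim it from this range. */
          if (PAGE_ALIGN_DOWN(next_inuse) == end_page - PAGE_SIZE)
                  end_page -= PAGE_SIZE;

          printf("freelist range: [0x%lx, 0x%lx)\n", start_page, end_page);
          return 0;
  }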

Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-24-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>binder: rename lru shrinker utilities</title>
<updated>2023-12-05T00:23:40Z</updated>
<author>
<name>Carlos Llamas</name>
<email>cmllamas@google.com</email>
</author>
<published>2023-12-01T17:21:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ea9cdbf0c7273b55e251b2ed8f85794cfadab5d5'/>
<id>urn:sha1:ea9cdbf0c7273b55e251b2ed8f85794cfadab5d5</id>
<content type='text'>
Now that the page allocation step is done separately we should rename
the binder_free_page_range() and binder_allocate_page_range() functions
to provide a more accurate description of what they do. Let's borrow the
freelist concept used in other parts of the kernel for this.

No functional change here.

Signed-off-by: Carlos Llamas &lt;cmllamas@google.com&gt;
Reviewed-by: Alice Ryhl &lt;aliceryhl@google.com&gt;
Link: https://lore.kernel.org/r/20231201172212.1813387-23-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
