| author | Lorenzo Stoakes <lorenzo.stoakes@oracle.com> | 2026-01-23 20:12:17 +0000 |
|---|---|---|
| committer | Andrew Morton <akpm@linux-foundation.org> | 2026-01-31 14:22:51 -0800 |
| commit | e28e575af956c4c3089b443e87be91a6ff7af355 (patch) | |
| tree | 97543bad533eb4b6a6578a108dd81656f65e711b /include | |
| parent | 28f590f35da8435f75e2aee51431c6c1b8d91f54 (diff) | |
mm/vma: introduce helper struct + thread through exclusive lock fns
It is confusing to have __vma_start_exclude_readers() return 0, 1 or an
error (but only when waiting for readers in TASK_KILLABLE state), and
having the return value be stored in a stack variable called 'locked'
is further confusion.
More generally, we are doing a lot of rather finicky things during the
acquisition of a state in which readers are excluded and moving out of
this state, including tracking whether we are detached or not or
whether an error occurred.
We are implementing logic in __vma_start_exclude_readers() that
effectively acts as if 'if one caller calls us do X, if another then do
Y', which is very confusing from a control flow perspective.
Introducing the shared helper object state helps us avoid this, as we
can now handle the 'an error arose but we're detached' condition
correctly in both callers - a warning if not detaching, and treating
the situation as if no error arose in the case of a VMA detaching.
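As a rough illustration of the paragraph above, here is a minimal userspace sketch of how a shared state object could let each caller decide what 'an error arose but we're detached' means for it. The struct name, its fields, and both helper functions are hypothetical and are not taken from the kernel source; only the two behaviours (warn when not detaching, ignore the error when detaching) come from the description above.

```c
#include <stdbool.h>

/* Hypothetical model of a shared state object; all names are assumptions. */
struct excl_state {
	bool detaching;	/* input: the caller is detaching the VMA */
	bool exclusive;	/* output: readers are currently excluded */
	bool detached;	/* output: the VMA ended up detached */
	int error;	/* output: 0 or a negative errno-style code */
};

/*
 * Write-lock style caller (not detaching): an error on a VMA that turned
 * out to be detached is unexpected, so flag a warning and pass the error on.
 */
static int finish_write_lock(const struct excl_state *s, bool *warned)
{
	*warned = s->error != 0 && s->detached;
	return s->error;
}

/*
 * Detach style caller: if the VMA is detached anyway, the goal was reached,
 * so treat the situation as if no error arose.
 */
static int finish_detach(const struct excl_state *s)
{
	return s->detached ? 0 : s->error;
}
```

The point of the sketch is that the shared function only records what happened, while each caller interprets the recorded state itself instead of the shared function branching on who called it.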
This also acts to help document what's going on and allows us to add
some more logical debug asserts.
Also update vma_mark_detached() to add a guard clause for the likely
'already detached' state (given we hold the mmap write lock), and add a
comment about ephemeral VMA read lock reference count increments to
clarify why we are entering/exiting an exclusive locked state here.
Finally, separate vma_mark_detached() into its fast-path component and
make it inline, then place the slow path for excluding readers in
mmap_lock.c.
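The fast-path/slow-path split described above can be modelled in userspace roughly as follows. This is a sketch using C11 atomics, not the kernel implementation: `struct model_vma`, `model_mark_detached()` and `model_exclude_readers_for_detach()` are hypothetical stand-ins, and the slow path here simply pretends that the remaining readers have dropped their references.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a VMA; only the reference count is modelled. */
struct model_vma {
	atomic_int refcnt;	/* 1 == attached with no readers, 0 == detached */
	int slow_path_calls;	/* instrumentation for this example only */
};

/*
 * Slow path (assumed to live out of line): the real code would exclude
 * readers and wait for them. This model just pretends they all left.
 */
static void model_exclude_readers_for_detach(struct model_vma *vma)
{
	vma->slow_path_calls++;
	atomic_store(&vma->refcnt, 0);
}

/*
 * Fast path: drop our own reference. If the previous value was 1, the count
 * is now zero and the VMA is detached, so there is nothing more to do.
 */
static inline void model_mark_detached(struct model_vma *vma)
{
	if (atomic_fetch_sub(&vma->refcnt, 1) == 1)
		return;

	model_exclude_readers_for_detach(vma);
}
```

Keeping the common case to a single atomic decrement in an inline function means callers only pay for a function call when a transient reader reference is actually observed.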
No functional change intended.
[akpm@linux-foundation.org: fix function naming in comments, add comment per Vlastimil per Lorenzo]
Link: https://lkml.kernel.org/r/7d3084d596c84da10dd374130a5055deba6439c0.1769198904.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'include')
| -rw-r--r-- | include/linux/mm_types.h | 14 |
| -rw-r--r-- | include/linux/mmap_lock.h | 23 |
2 files changed, 29 insertions, 8 deletions
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3e608d22cab0..8731606d8d36 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1011,15 +1011,15 @@ struct vm_area_struct {
 	 * decrementing it again.
 	 *
 	 * VM_REFCNT_EXCLUDE_READERS_FLAG - Detached, pending
-	 * __vma_exit_locked() completion which will decrement the reference
-	 * count to zero. IMPORTANT - at this stage no further readers can
-	 * increment the reference count. It can only be reduced.
+	 * __vma_end_exclude_readers() completion which will decrement the
+	 * reference count to zero. IMPORTANT - at this stage no further readers
+	 * can increment the reference count. It can only be reduced.
 	 *
 	 * VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - A thread is either write-locking
-	 * an attached VMA and has yet to invoke __vma_exit_locked(), OR a
-	 * thread is detaching a VMA and is waiting on a single spurious reader
-	 * in order to decrement the reference count. IMPORTANT - as above, no
-	 * further readers can increment the reference count.
+	 * an attached VMA and has yet to invoke __vma_end_exclude_readers(),
+	 * OR a thread is detaching a VMA and is waiting on a single spurious
+	 * reader in order to decrement the reference count. IMPORTANT - as
+	 * above, no further readers can increment the reference count.
 	 *
 	 * > VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - A thread is either
 	 * write-locking or detaching a VMA is waiting on readers to
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index d6df6aad3e24..678f90080fa6 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -358,7 +358,28 @@ static inline void vma_mark_attached(struct vm_area_struct *vma)
 	refcount_set_release(&vma->vm_refcnt, 1);
 }
 
-void vma_mark_detached(struct vm_area_struct *vma);
+void __vma_exclude_readers_for_detach(struct vm_area_struct *vma);
+
+static inline void vma_mark_detached(struct vm_area_struct *vma)
+{
+	vma_assert_write_locked(vma);
+	vma_assert_attached(vma);
+
+	/*
+	 * The VMA still being attached (refcnt > 0) - is unlikely, because the
+	 * vma has been already write-locked and readers can increment vm_refcnt
+	 * only temporarily before they check vm_lock_seq, realize the vma is
+	 * locked and drop back the vm_refcnt. That is a narrow window for
+	 * observing a raised vm_refcnt.
+	 *
+	 * See the comment describing the vm_area_struct->vm_refcnt field for
+	 * details of possible refcnt values.
+	 */
+	if (likely(!__vma_refcount_put_return(vma)))
+		return;
+
+	__vma_exclude_readers_for_detach(vma);
+}
 
 struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 					  unsigned long address);
