From 1c27b644c0fdbc61e113b8faee14baeb8df32486 Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney"
Date: Thu, 18 Jan 2018 19:58:55 -0800
Subject: Automate memory-barriers.txt; provide Linux-kernel memory model

There is some reason to believe that Documentation/memory-barriers.txt could use some help, and a major purpose of this patch is to provide that help in the form of a design-time tool that can produce all valid executions of a small fragment of concurrent Linux-kernel code, which is called a "litmus test".  This tool's functionality is roughly similar to a full state-space search.  Please note that this is a design-time tool, not useful for regression testing.  However, we hope that the underlying Linux-kernel memory model will be incorporated into other tools capable of analyzing large bodies of code for regression-testing purposes.

The main tool is herd7, together with the linux-kernel.bell, linux-kernel.cat, linux-kernel.cfg, linux-kernel.def, and lock.cat files added by this patch.  The herd7 executable takes the other files as input, and all of these files collectively define the Linux-kernel memory model.  A brief description of each of these other files is provided in the README file.  Although this tool does have its limitations, which are documented in the README file, it does improve on the version reported on in the LWN series (https://lwn.net/Articles/718628/ and https://lwn.net/Articles/720550/) by supporting locking and arithmetic, including a much wider variety of read-modify-write atomic operations.  Please note that herd7 is not part of this submission, but is freely available from http://diy.inria.fr/sources/index.html (and via "git" at https://github.com/herd/herdtools7).

A second tool is klitmus7, which converts litmus tests to loadable kernel modules for direct testing.  As with herd7, the klitmus7 code is freely available from http://diy.inria.fr/sources/index.html (and via "git" at https://github.com/herd/herdtools7).

Of course, litmus tests are not always the best way to fully understand a memory model, so this patch also includes Documentation/explanation.txt, which describes the memory model in detail.  In addition, Documentation/recipes.txt provides example known-good and known-bad use cases for those who prefer working by example.

This patch also includes a few sample litmus tests, and a great many more litmus tests are available at https://github.com/paulmckrcu/litmus.

This patch was the result of a most excellent collaboration founded by Jade Alglave and also including Alan Stern, Andrea Parri, and Luc Maranget.  For more details on the history of this collaboration, please refer to the Linux-kernel memory model presentations at 2016 LinuxCon EU, 2016 Kernel Summit, 2016 Linux Plumbers Conference, 2017 linux.conf.au, or the 2017 Linux Plumbers Conference microconference.  However, one aspect of the history does bear repeating due to weak copyright tracking earlier in this project, which extends back to early 2015.  This weakness came to light in late 2017 after an LKMM presentation by Paul in which an audience member noted the similarity of some LKMM code to code in early published papers.  This prompted a copyright review.

From Alan Stern:

  To say that the model was mine is not entirely accurate.  Pieces of it (especially the Scpv and Atomic axioms) were taken directly from Jade's models.  And of course the Happens-before and Propagation relations and axioms were heavily based on Jade and Luc's work, even though they weren't identical to the earlier versions.
  Only the RCU portion was completely original. . . .

  One can make a much better case that I wrote the bulk of lock.cat.  However, it was inspired by Luc's earlier version (and still shares some elements in common), and of course it benefited from feedback and testing from all members of our group.

The model prior to Alan's was Luc Maranget's.  From Luc:

  I totally agree on Alan Stern's account of the linux kernel model genesis.  I thank him for his acknowledgments of my participation to previous model drafts.  I'd like to complete Alan Stern's statement: any bell cat code I have written has its roots in discussions with Jade Alglave and Paul McKenney.  Moreover I have borrowed cat and bell code written by Jade Alglave freely.

This copyright review therefore resulted in late adds to the copyright statements of several files.

Discussion of v1 has raised several issues, which we do not believe should block acceptance given that this level of change will be ongoing, just as it has been with memory-barriers.txt:

o	Under what conditions should ordering provided by pure locking be seen by CPUs not holding the relevant lock(s)?  In particular, should the message-passing pattern be forbidden?

o	Should examples involving C11 release sequences be forbidden?  Note that C11 is still a moving target for this issue: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0735r0.html

o	Some details of the handling of internal dependencies for atomic read-modify-write operations are still subject to debate.

o	Changes recently accepted into mainline greatly reduce the need to handle DEC Alpha as a special case.  These changes add an smp_read_barrier_depends() to READ_ONCE(), thus causing Alpha to respect ordering of dependent reads.  If these changes stick, the memory model can be simplified accordingly.

o	Will changes be required to accommodate RISC-V?

Differences from v1: (http://lkml.kernel.org/r/20171113184031.GA26302@linux.vnet.ibm.com)

o	Add SPDX notations to .bell and .cat files, replacing textual license statements.

o	Add reference to upcoming ASPLOS paper to .bell and .cat files.

o	Updated identifier names in .bell and .cat files to match those used in the ASPLOS paper.

o	Updates to READMEs and other documentation based on review feedback.

o	Added a memory-ordering cheatsheet.

o	Update sigs to new Co-Developed-by and add acks and reviewed-bys.

o	Simplify rules detecting nested RCU read-side critical sections.

o	Update copyright statements as noted above.

Co-Developed-by: Alan Stern
Co-Developed-by: Andrea Parri
Co-Developed-by: Jade Alglave
Co-Developed-by: Luc Maranget
Co-Developed-by: "Paul E. McKenney"
Signed-off-by: Alan Stern
Signed-off-by: Andrea Parri
Signed-off-by: Jade Alglave
Signed-off-by: Luc Maranget
Signed-off-by: "Paul E. McKenney"
Reviewed-by: Boqun Feng
Acked-by: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Nicholas Piggin
Acked-by: David Howells
Acked-by: "Reshetova, Elena"
Acked-by: Michal Hocko
Acked-by: Akira Yokosawa
Cc:
---
 tools/memory-model/linux-kernel.def | 108 ++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100644 tools/memory-model/linux-kernel.def

(limited to 'tools/memory-model/linux-kernel.def')

diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
new file mode 100644
index 000000000000..a397387f77cc
--- /dev/null
+++ b/tools/memory-model/linux-kernel.def
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// An earlier version of this file appears in the companion webpage for
+// "Frightening small children and disconcerting grown-ups: Concurrency
+// in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
+// which is to appear in ASPLOS 2018.
+
+// ONCE
+READ_ONCE(X) __load{once}(X)
+WRITE_ONCE(X,V) { __store{once}(X,V); }
+
+// Release Acquire and friends
+smp_store_release(X,V) { __store{release}(*X,V); }
+smp_load_acquire(X) __load{acquire}(*X)
+rcu_assign_pointer(X,V) { __store{release}(X,V); }
+lockless_dereference(X) __load{lderef}(X)
+rcu_dereference(X) __load{deref}(X)
+
+// Fences
+smp_mb() { __fence{mb} ; }
+smp_rmb() { __fence{rmb} ; }
+smp_wmb() { __fence{wmb} ; }
+smp_read_barrier_depends() { __fence{rb_dep}; }
+smp_mb__before_atomic() { __fence{before_atomic} ; }
+smp_mb__after_atomic() { __fence{after_atomic} ; }
+smp_mb__after_spinlock() { __fence{after_spinlock} ; }
+
+// Exchange
+xchg(X,V) __xchg{mb}(X,V)
+xchg_relaxed(X,V) __xchg{once}(X,V)
+xchg_release(X,V) __xchg{release}(X,V)
+xchg_acquire(X,V) __xchg{acquire}(X,V)
+cmpxchg(X,V,W) __cmpxchg{mb}(X,V,W)
+cmpxchg_relaxed(X,V,W) __cmpxchg{once}(X,V,W)
+cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
+cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
+
+// Spinlocks
+spin_lock(X) { __lock(X) ; }
+spin_unlock(X) { __unlock(X) ; }
+spin_trylock(X) __trylock(X)
+
+// RCU
+rcu_read_lock() { __fence{rcu-lock}; }
+rcu_read_unlock() { __fence{rcu-unlock};}
+synchronize_rcu() { __fence{sync-rcu}; }
+synchronize_rcu_expedited() { __fence{sync-rcu}; }
+
+// Atomic
+atomic_read(X) READ_ONCE(*X)
+atomic_set(X,V) { WRITE_ONCE(*X,V) ; }
+atomic_read_acquire(X) smp_load_acquire(X)
+atomic_set_release(X,V) { smp_store_release(X,V); }
+
+atomic_add(V,X) { __atomic_op(X,+,V) ; }
+atomic_sub(V,X) { __atomic_op(X,-,V) ; }
+atomic_inc(X) { __atomic_op(X,+,1) ; }
+atomic_dec(X) { __atomic_op(X,-,1) ; }
+
+atomic_add_return(V,X) __atomic_op_return{mb}(X,+,V)
+atomic_add_return_relaxed(V,X) __atomic_op_return{once}(X,+,V)
+atomic_add_return_acquire(V,X) __atomic_op_return{acquire}(X,+,V)
+atomic_add_return_release(V,X) __atomic_op_return{release}(X,+,V)
+atomic_fetch_add(V,X) __atomic_fetch_op{mb}(X,+,V)
+atomic_fetch_add_relaxed(V,X) __atomic_fetch_op{once}(X,+,V)
+atomic_fetch_add_acquire(V,X) __atomic_fetch_op{acquire}(X,+,V)
+atomic_fetch_add_release(V,X) __atomic_fetch_op{release}(X,+,V)
+
+atomic_inc_return(X) __atomic_op_return{mb}(X,+,1)
+atomic_inc_return_relaxed(X) __atomic_op_return{once}(X,+,1)
+atomic_inc_return_acquire(X) __atomic_op_return{acquire}(X,+,1)
+atomic_inc_return_release(X) __atomic_op_return{release}(X,+,1)
+atomic_fetch_inc(X) __atomic_fetch_op{mb}(X,+,1)
+atomic_fetch_inc_relaxed(X) __atomic_fetch_op{once}(X,+,1)
+atomic_fetch_inc_acquire(X) __atomic_fetch_op{acquire}(X,+,1)
+atomic_fetch_inc_release(X) __atomic_fetch_op{release}(X,+,1)
+
+atomic_sub_return(V,X) __atomic_op_return{mb}(X,-,V)
+atomic_sub_return_relaxed(V,X) __atomic_op_return{once}(X,-,V)
+atomic_sub_return_acquire(V,X) __atomic_op_return{acquire}(X,-,V)
+atomic_sub_return_release(V,X) __atomic_op_return{release}(X,-,V)
+atomic_fetch_sub(V,X) __atomic_fetch_op{mb}(X,-,V)
+atomic_fetch_sub_relaxed(V,X) __atomic_fetch_op{once}(X,-,V)
+atomic_fetch_sub_acquire(V,X) __atomic_fetch_op{acquire}(X,-,V)
+atomic_fetch_sub_release(V,X) __atomic_fetch_op{release}(X,-,V)
+
+atomic_dec_return(X) __atomic_op_return{mb}(X,-,1)
+atomic_dec_return_relaxed(X) __atomic_op_return{once}(X,-,1)
+atomic_dec_return_acquire(X) __atomic_op_return{acquire}(X,-,1)
+atomic_dec_return_release(X) __atomic_op_return{release}(X,-,1)
+atomic_fetch_dec(X) __atomic_fetch_op{mb}(X,-,1)
+atomic_fetch_dec_relaxed(X) __atomic_fetch_op{once}(X,-,1)
+atomic_fetch_dec_acquire(X) __atomic_fetch_op{acquire}(X,-,1)
+atomic_fetch_dec_release(X) __atomic_fetch_op{release}(X,-,1)
+
+atomic_xchg(X,V) __xchg{mb}(X,V)
+atomic_xchg_relaxed(X,V) __xchg{once}(X,V)
+atomic_xchg_release(X,V) __xchg{release}(X,V)
+atomic_xchg_acquire(X,V) __xchg{acquire}(X,V)
+atomic_cmpxchg(X,V,W) __cmpxchg{mb}(X,V,W)
+atomic_cmpxchg_relaxed(X,V,W) __cmpxchg{once}(X,V,W)
+atomic_cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
+atomic_cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
+
+atomic_sub_and_test(V,X) __atomic_op_return{mb}(X,-,V) == 0
+atomic_dec_and_test(X) __atomic_op_return{mb}(X,-,1) == 0
+atomic_inc_and_test(X) __atomic_op_return{mb}(X,+,1) == 0
+atomic_add_negative(V,X) __atomic_op_return{mb}(X,+,V) < 0
-- cgit v1.2.3

From cac79a39f200ef73ae7fc8a429ce2859ebb118d9 Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney"
Date: Tue, 20 Feb 2018 15:25:11 -0800
Subject: tools/memory-model: Convert underscores to hyphens

Typical cat-language code uses hyphens for word separators in identifiers, but several LKMM identifiers use underscores instead.  This commit therefore converts underscores to hyphens in the .bell- and .cat-file identifiers corresponding to smp_mb__before_atomic(), smp_mb__after_atomic(), and smp_mb__after_spinlock().

Signed-off-by: Paul E. McKenney
Acked-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: stern@rowland.harvard.edu
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-11-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar
---
 tools/memory-model/linux-kernel.bell | 6 +++---
 tools/memory-model/linux-kernel.cat  | 6 +++---
 tools/memory-model/linux-kernel.def  | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

(limited to 'tools/memory-model/linux-kernel.def')

diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
index b984bbda01a5..18885ad15de9 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -28,9 +28,9 @@ enum Barriers = 'wmb (*smp_wmb*) ||
 	'rcu-lock (*rcu_read_lock*) ||
 	'rcu-unlock (*rcu_read_unlock*) ||
 	'sync-rcu (*synchronize_rcu*) ||
-	'before_atomic (*smp_mb__before_atomic*) ||
-	'after_atomic (*smp_mb__after_atomic*) ||
-	'after_spinlock (*smp_mb__after_spinlock*)
+	'before-atomic (*smp_mb__before_atomic*) ||
+	'after-atomic (*smp_mb__after_atomic*) ||
+	'after-spinlock (*smp_mb__after_spinlock*)
 instructions F[Barriers]
 
 (* Compute matching pairs of nested Rcu-lock and Rcu-unlock *)
diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
index babe2b3b0bb3..f0d27f813ec2 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -29,9 +29,9 @@ let rb-dep = [R] ; fencerel(Rb_dep) ; [R]
 let rmb = [R \ Noreturn] ; fencerel(Rmb) ; [R \ Noreturn]
 let wmb = [W] ; fencerel(Wmb) ; [W]
 let mb = ([M] ; fencerel(Mb) ; [M]) |
-	([M] ; fencerel(Before_atomic) ; [RMW] ; po? ; [M]) |
-	([M] ; po? ; [RMW] ; fencerel(After_atomic) ; [M]) |
-	([M] ; po? ; [LKW] ; fencerel(After_spinlock) ; [M])
+	([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
+	([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
+	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M])
 let gp = po ; [Sync-rcu] ; po?
 
 let strong-fence = mb | gp
diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index a397387f77cc..f5a1eb04cb64 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -21,9 +21,9 @@ smp_mb() { __fence{mb} ; }
 smp_rmb() { __fence{rmb} ; }
 smp_wmb() { __fence{wmb} ; }
 smp_read_barrier_depends() { __fence{rb_dep}; }
-smp_mb__before_atomic() { __fence{before_atomic} ; }
-smp_mb__after_atomic() { __fence{after_atomic} ; }
-smp_mb__after_spinlock() { __fence{after_spinlock} ; }
+smp_mb__before_atomic() { __fence{before-atomic} ; }
+smp_mb__after_atomic() { __fence{after-atomic} ; }
+smp_mb__after_spinlock() { __fence{after-spinlock} ; }
 
 // Exchange
 xchg(X,V) __xchg{mb}(X,V)
-- cgit v1.2.3

From bf28ae5627442355dbb8d99238da4fb95c2dd4ec Mon Sep 17 00:00:00 2001
From: Alan Stern
Date: Tue, 20 Feb 2018 15:25:12 -0800
Subject: tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference

Since commit 76ebbe78f739 ("locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()") was merged for the 4.15 kernel, it has not been necessary to use smp_read_barrier_depends().  Similarly, commit 59ecbbe7b31c ("locking/barriers: Kill lockless_dereference()") removed lockless_dereference() from the kernel.
Since these primitives are no longer part of the kernel, they do not belong in the Linux Kernel Memory Consistency Model.  This patch removes them, along with the internal rb-dep relation, and updates the relevant documentation.

Signed-off-by: Alan Stern
Signed-off-by: Paul E. McKenney
Acked-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-12-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar
---
 tools/memory-model/Documentation/cheatsheet.txt  |  3 +-
 tools/memory-model/Documentation/explanation.txt | 81 +++++++++++++-----------
 tools/memory-model/linux-kernel.bell             |  1 -
 tools/memory-model/linux-kernel.cat              |  7 +-
 tools/memory-model/linux-kernel.def              |  2 -
 5 files changed, 46 insertions(+), 48 deletions(-)

(limited to 'tools/memory-model/linux-kernel.def')

diff --git a/tools/memory-model/Documentation/cheatsheet.txt b/tools/memory-model/Documentation/cheatsheet.txt
index 1917712bce99..04e458acd6d4 100644
--- a/tools/memory-model/Documentation/cheatsheet.txt
+++ b/tools/memory-model/Documentation/cheatsheet.txt
@@ -6,8 +6,7 @@
 Store, e.g., WRITE_ONCE() Y Y
 Load, e.g., READ_ONCE() Y Y Y
 Unsuccessful RMW operation Y Y Y
-smp_read_barrier_depends() Y Y Y
-*_dereference() Y Y Y Y
+rcu_dereference() Y Y Y Y
 Successful *_acquire() R Y Y Y Y Y Y
 Successful *_release() C Y Y Y W Y
 smp_rmb() Y R Y Y R
diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
index 867e0ea69b6d..dae8b8cb2ad3 100644
--- a/tools/memory-model/Documentation/explanation.txt
+++ b/tools/memory-model/Documentation/explanation.txt
@@ -1,5 +1,5 @@
-Explanation of the Linux-Kernel Memory Model
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Explanation of the Linux-Kernel Memory Consistency Model
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 :Author: Alan Stern
 :Created: October 2017
@@ -35,25 +35,24 @@ Explanation of the Linux-Kernel Memory Model
 INTRODUCTION
 ------------
 
-The Linux-kernel memory model (LKMM) is rather complex and obscure.
-This is particularly evident if you read through the linux-kernel.bell
-and linux-kernel.cat files that make up the formal version of the
-memory model; they are extremely terse and their meanings are far from
-clear.
+The Linux-kernel memory consistency model (LKMM) is rather complex and
+obscure.  This is particularly evident if you read through the
+linux-kernel.bell and linux-kernel.cat files that make up the formal
+version of the model; they are extremely terse and their meanings are
+far from clear.
 
 This document describes the ideas underlying the LKMM.  It is meant
-for people who want to understand how the memory model was designed.
-It does not go into the details of the code in the .bell and .cat
-files; rather, it explains in English what the code expresses
-symbolically.
+for people who want to understand how the model was designed.  It does
+not go into the details of the code in the .bell and .cat files;
+rather, it explains in English what the code expresses symbolically.
 
 Sections 2 (BACKGROUND) through 5 (ORDERING AND CYCLES) are aimed
-toward beginners; they explain what memory models are and the basic
-notions shared by all such models.  People already familiar with these
-concepts can skim or skip over them.  Sections 6 (EVENTS) through 12
-(THE FROM_READS RELATION) describe the fundamental relations used in
-many memory models.  Starting in Section 13 (AN OPERATIONAL MODEL),
-the workings of the LKMM itself are covered.
+toward beginners; they explain what memory consistency models are and
+the basic notions shared by all such models.  People already familiar
+with these concepts can skim or skip over them.  Sections 6 (EVENTS)
+through 12 (THE FROM_READS RELATION) describe the fundamental
+relations used in many models.  Starting in Section 13 (AN OPERATIONAL
+MODEL), the workings of the LKMM itself are covered.
 
 Warning: The code examples in this document are not written in the
 proper format for litmus tests.  They don't include a header line, the
@@ -827,8 +826,8 @@ A-cumulative; they only affect the propagation of stores that are
 executed on C before the fence (i.e., those which precede the fence in
 program order).
 
-smp_read_barrier_depends(), rcu_read_lock(), rcu_read_unlock(), and
-synchronize_rcu() fences have other properties which we discuss later.
+read_lock(), rcu_read_unlock(), and synchronize_rcu() fences have
+other properties which we discuss later.
 
 
 PROPAGATION ORDER RELATION: cumul-fence
@@ -988,8 +987,8 @@ Another possibility, not mentioned earlier but discussed in the next
 section, is:
 
 	X and Y are both loads, X ->addr Y (i.e., there is an address
-	dependency from X to Y), and an smp_read_barrier_depends()
-	fence occurs between them.
+	dependency from X to Y), and X is a READ_ONCE() or an atomic
+	access.
 
 Dependencies can also cause instructions to be executed in program
 order.  This is uncontroversial when the second instruction is a
@@ -1015,9 +1014,9 @@ After all, a CPU cannot ask the memory subsystem to load a value from
 a particular location before it knows what that location is.  However,
 the split-cache design used by Alpha can cause it to behave in a way
 that looks as if the loads were executed out of order (see the next
-section for more details).  For this reason, the LKMM does not include
-address dependencies between read events in the ppo relation unless an
-smp_read_barrier_depends() fence is present.
+section for more details).  The kernel includes a workaround for this
+problem when the loads come from READ_ONCE(), and therefore the LKMM
+includes address dependencies to loads in the ppo relation.
 
 On the other hand, dependencies can indirectly affect the ordering of
 two loads.  This happens when there is a dependency from a load to a
@@ -1114,11 +1113,12 @@ code such as the following:
 	int *r1;
 	int r2;
 
-	r1 = READ_ONCE(ptr);
+	r1 = ptr;
 	r2 = READ_ONCE(*r1);
 }
 
-can malfunction on Alpha systems.  It is quite possible that r1 = &x
+can malfunction on Alpha systems (notice that P1 uses an ordinary load
+to read ptr instead of READ_ONCE()).  It is quite possible that r1 = &x
 and r2 = 0 at the end, in spite of the address dependency.
 
 At first glance this doesn't seem to make sense.  We know that the
@@ -1141,11 +1141,15 @@ This could not have happened if the local cache had processed the
 incoming stores in FIFO order.  In constrast, other architectures
 maintain at least the appearance of FIFO order.
 
-In practice, this difficulty is solved by inserting an
-smp_read_barrier_depends() fence between P1's two loads.  The effect
-of this fence is to cause the CPU not to execute any po-later
-instructions until after the local cache has finished processing all
-the stores it has already received.  Thus, if the code was changed to:
+In practice, this difficulty is solved by inserting a special fence
+between P1's two loads when the kernel is compiled for the Alpha
+architecture.  In fact, as of version 4.15, the kernel automatically
+adds this fence (called smp_read_barrier_depends() and defined as
+nothing at all on non-Alpha builds) after every READ_ONCE() and atomic
+load.  The effect of the fence is to cause the CPU not to execute any
+po-later instructions until after the local cache has finished
+processing all the stores it has already received.  Thus, if the code
+was changed to:
 
 P1()
 {
@@ -1153,13 +1157,15 @@ the stores it has already received.  Thus, if the code was changed to:
 	int r2;
 
 	r1 = READ_ONCE(ptr);
-	smp_read_barrier_depends();
 	r2 = READ_ONCE(*r1);
 }
 
 then we would never get r1 = &x and r2 = 0.  By the time P1 executed
 its second load, the x = 1 store would already be fully processed by
-the local cache and available for satisfying the read request.
+the local cache and available for satisfying the read request.  Thus
+we have yet another reason why shared data should always be read with
+READ_ONCE() or another synchronization primitive rather than accessed
+directly.
 
 The LKMM requires that smp_rmb(), acquire fences, and strong fences
 share this property with smp_read_barrier_depends(): They do not allow
@@ -1751,11 +1757,10 @@ no further involvement from the CPU.  Since the CPU doesn't ever read
 the value of x, there is nothing for the smp_rmb() fence to act on.
 
 The LKMM defines a few extra synchronization operations in terms of
-things we have already covered.  In particular, rcu_dereference() and
-lockless_dereference() are both treated as a READ_ONCE() followed by
-smp_read_barrier_depends() -- which also happens to be how they are
-defined in include/linux/rcupdate.h and include/linux/compiler.h,
-respectively.
+things we have already covered.  In particular, rcu_dereference() is
+treated as READ_ONCE() and rcu_assign_pointer() is treated as
+smp_store_release() -- which is basically how the Linux kernel treats
+them.
 
 There are a few oddball fences which need special treatment:
 smp_mb__before_atomic(), smp_mb__after_atomic(), and
diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
index 18885ad15de9..432c7cf71b23 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -24,7 +24,6 @@ instructions RMW[{'once,'acquire,'release}]
 enum Barriers = 'wmb (*smp_wmb*) ||
 	'rmb (*smp_rmb*) ||
 	'mb (*smp_mb*) ||
-	'rb_dep (*smp_read_barrier_depends*) ||
 	'rcu-lock (*rcu_read_lock*) ||
 	'rcu-unlock (*rcu_read_unlock*) ||
 	'sync-rcu (*synchronize_rcu*) ||
diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
index f0d27f813ec2..df97db03b6c2 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -25,7 +25,6 @@ include "lock.cat"
 (*******************)
 (* Fences *)
 
-let rb-dep = [R] ; fencerel(Rb_dep) ; [R]
 let rmb = [R \ Noreturn] ; fencerel(Rmb) ; [R \ Noreturn]
 let wmb = [W] ; fencerel(Wmb) ; [W]
 let mb = ([M] ; fencerel(Mb) ; [M]) |
@@ -61,11 +60,9 @@ let dep = addr | data
 let rwdep = (dep | ctrl) ; [W]
 let overwrite = co | fr
 let to-w = rwdep | (overwrite & int)
-let rrdep = addr | (dep ; rfi)
-let strong-rrdep = rrdep+ & rb-dep
-let to-r = strong-rrdep | rfi-rel-acq
+let to-r = addr | (dep ; rfi) | rfi-rel-acq
 let fence = strong-fence | wmb | po-rel | rmb | acq-po
-let ppo = rrdep* ; (to-r | to-w | fence)
+let ppo = to-r | to-w | fence
 
 (* Propagation: Ordering from release operations and strong fences. *)
 let A-cumul(r) = rfe? ; r
diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index f5a1eb04cb64..5dfb9c7f3462 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -13,14 +13,12 @@ WRITE_ONCE(X,V) { __store{once}(X,V); }
 smp_store_release(X,V) { __store{release}(*X,V); }
 smp_load_acquire(X) __load{acquire}(*X)
 rcu_assign_pointer(X,V) { __store{release}(X,V); }
-lockless_dereference(X) __load{lderef}(X)
 rcu_dereference(X) __load{deref}(X)
 
 // Fences
 smp_mb() { __fence{mb} ; }
 smp_rmb() { __fence{rmb} ; }
 smp_wmb() { __fence{wmb} ; }
-smp_read_barrier_depends() { __fence{rb_dep}; }
 smp_mb__before_atomic() { __fence{before-atomic} ; }
 smp_mb__after_atomic() { __fence{after-atomic} ; }
 smp_mb__after_spinlock() { __fence{after-spinlock} ; }
-- cgit v1.2.3

From bd5c0ba2cd78a4c116726ead84f8f37dc92d043e Mon Sep 17 00:00:00 2001
From: Alan Stern
Date: Wed, 7 Mar 2018 09:27:40 -0800
Subject: tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference()

Commit: bf28ae562744 ("tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference") was merged too early, while it was still in RFC form.  This patch adds in the missing pieces.

Akira pointed out some typos in the original patch, and he noted that cheatsheet.txt should indicate that READ_ONCE() now implies an address dependency.  Andrea suggested documenting the relationship between unsuccessful RMW operations and address dependencies.

Andrea pointed out that the macro for rcu_dereference() in linux.def should now use the "once" annotation instead of "deref".  He also suggested that the comments should mention commit: 5a8897cc7631 ("locking/atomics/alpha: Add smp_read_barrier_depends() to _release()/_relaxed() atomics") ... as an important precursor, and he contributed commit: cb13b424e986 ("locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()") which is another prerequisite.
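As a rough illustration of the address-dependency guarantee being documented here (a minimal sketch in the style of the explanation.txt examples; the pointer name is illustrative only), a reader that fetches a pointer with READ_ONCE() and then dereferences it no longer needs any separate barrier:

	int *r1;
	int r2;

	r1 = READ_ONCE(ptr);	/* also provides the old smp_read_barrier_depends() */
	r2 = READ_ONCE(*r1);	/* ordered after the load of ptr by the address dependency */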
Suggested-by: Akira Yokosawa
Suggested-by: Andrea Parri
Signed-off-by: Alan Stern
[ Fixed read_read_lock() typo reported by Akira. ]
Signed-off-by: Paul E. McKenney
Acked-by: Andrea Parri
Acked-by: Akira Yokosawa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: npiggin@gmail.com
Cc: will.deacon@arm.com
Fixes: bf28ae562744 ("tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference")
Link: http://lkml.kernel.org/r/1520443660-16858-4-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar
---
 tools/memory-model/Documentation/cheatsheet.txt  | 6 +++---
 tools/memory-model/Documentation/explanation.txt | 4 ++--
 tools/memory-model/linux-kernel.def              | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

(limited to 'tools/memory-model/linux-kernel.def')

diff --git a/tools/memory-model/Documentation/cheatsheet.txt b/tools/memory-model/Documentation/cheatsheet.txt
index 04e458acd6d4..956b1ae4aafb 100644
--- a/tools/memory-model/Documentation/cheatsheet.txt
+++ b/tools/memory-model/Documentation/cheatsheet.txt
@@ -1,11 +1,11 @@
 Prior Operation Subsequent Operation
 --------------- ---------------------------
 C Self R W RWM Self R W DR DW RMW SV
- __ ---- - - --- ---- - - -- -- --- --
+ -- ---- - - --- ---- - - -- -- --- --
 Store, e.g., WRITE_ONCE() Y Y
-Load, e.g., READ_ONCE() Y Y Y
-Unsuccessful RMW operation Y Y Y
+Load, e.g., READ_ONCE() Y Y Y Y
+Unsuccessful RMW operation Y Y Y Y
 rcu_dereference() Y Y Y Y
 Successful *_acquire() R Y Y Y Y Y Y
 Successful *_release() C Y Y Y W Y
 smp_rmb() Y R Y Y R
diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
index dae8b8cb2ad3..a727c82bd434 100644
--- a/tools/memory-model/Documentation/explanation.txt
+++ b/tools/memory-model/Documentation/explanation.txt
@@ -826,7 +826,7 @@ A-cumulative; they only affect the propagation of stores that are
 executed on C before the fence (i.e., those which precede the fence in
 program order).
 
-read_lock(), rcu_read_unlock(), and synchronize_rcu() fences have
+rcu_read_lock(), rcu_read_unlock(), and synchronize_rcu() fences have
 other properties which we discuss later.
 
 
@@ -1138,7 +1138,7 @@ final effect is that even though the two loads really are executed in
 program order, it appears that they aren't.
 
 This could not have happened if the local cache had processed the
-incoming stores in FIFO order.  In constrast, other architectures
+incoming stores in FIFO order.  By contrast, other architectures
 maintain at least the appearance of FIFO order.
 
 In practice, this difficulty is solved by inserting a special fence
diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index 5dfb9c7f3462..397e4e67e8c8 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -13,7 +13,7 @@ WRITE_ONCE(X,V) { __store{once}(X,V); }
 smp_store_release(X,V) { __store{release}(*X,V); }
 smp_load_acquire(X) __load{acquire}(*X)
 rcu_assign_pointer(X,V) { __store{release}(X,V); }
-rcu_dereference(X) __load{deref}(X)
+rcu_dereference(X) __load{once}(X)
 
 // Fences
 smp_mb() { __fence{mb} ; }
-- cgit v1.2.3
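To make the litmus-test format used throughout this series concrete, here is a minimal message-passing sketch in the form herd7 consumes together with the linux-kernel.* files above (the test name and the exact command line are illustrative assumptions, not taken from the patches themselves):

C MP+pooncerelease+poacquireonce

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);		/* store the data */
	smp_store_release(y, 1);	/* then publish the flag with release semantics */
}

P1(int *x, int *y)
{
	int r0;
	int r1;

	r0 = smp_load_acquire(y);	/* read the flag with acquire semantics */
	r1 = READ_ONCE(*x);		/* then read the data */
}

exists (1:r0=1 /\ 1:r1=0)

Running something like "herd7 -conf linux-kernel.cfg MP+pooncerelease+poacquireonce.litmus" would then be expected to report that the "exists" clause is never satisfied, that is, the release/acquire pair forbids the message-passing anomaly.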