<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/tipc/node.c, branch v4.19.135</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.135</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.135'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-07-29T08:16:47Z</updated>
<entry>
<title>tipc: clean up skb list lock handling on send path</title>
<updated>2020-07-29T08:16:47Z</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2019-08-15T14:42:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=675f4bbcd68330d2d0c96e1f716886727ad7dd1a'/>
<id>urn:sha1:675f4bbcd68330d2d0c96e1f716886727ad7dd1a</id>
<content type='text'>
commit e654f9f53b45fde3fcc8051830b212c7a8f36148 upstream.

The policy for handling the skb list locks on the send and receive paths
is simple.

- On the send path we never need to grab the lock on the 'xmitq' list
  when the destination is an exernal node.

- On the receive path we always need to grab the lock on the 'inputq'
  list, irrespective of source node.

However, when transmitting node local messages those will eventually
end up on the receive path of a local socket, meaning that the argument
'xmitq' in tipc_node_xmit() will become the 'ínputq' argument in  the
function tipc_sk_rcv(). This has been handled by always initializing
the spinlock of the 'xmitq' list at message creation, just in case it
may end up on the receive path later, and despite knowing that the lock
in most cases never will be used.

This approach is inaccurate and confusing, and has also concealed the
fact that the stated 'no lock grabbing' policy for the send path is
violated in some cases.

We now clean up this by never initializing the lock at message creation,
instead doing this at the moment we find that the message actually will
enter the receive path. At the same time we fix the four locations
where we incorrectly access the spinlock on the send/error path.

This patch also reverts commit d12cffe9329f ("tipc: ensure head-&gt;lock
is initialised") which has now become redundant.

CC: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Chris Packham &lt;chris.packham@alliedtelesis.co.nz&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Reviewed-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[acj: backport v4.19 -stable
- adjust context
- skipped the hunk modifying non-existent function tipc_mcast_send_sync
- additional comment ]
Signed-off-by: Aviraj CJ &lt;acj@cisco.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>tipc: tipc clang warning</title>
<updated>2020-01-27T13:50:33Z</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2019-03-22T14:03:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e317b36324b0184ca3063675a3b4f18a0c206c76'/>
<id>urn:sha1:e317b36324b0184ca3063675a3b4f18a0c206c76</id>
<content type='text'>
[ Upstream commit 737889efe9713a0f20a75fd0de952841d9275e6b ]

When checking the code with clang -Wsometimes-uninitialized we get the
following warning:

if (!tipc_link_is_establishing(l)) {
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/node.c:847:46: note: uninitialized use occurs here
      tipc_bearer_xmit(n-&gt;net, bearer_id, &amp;xmitq, maddr);

net/tipc/node.c:831:2: note: remove the 'if' if its condition is always
true
if (!tipc_link_is_establishing(l)) {
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/node.c:821:31: note: initialize the variable 'maddr' to silence
this warning
struct tipc_media_addr *maddr;

We fix this by initializing 'maddr' to NULL. For the matter of clarity,
we also test if 'xmitq' is non-empty before we use it and 'maddr'
further down in the  function. It will never happen that 'xmitq' is non-
empty at the same time as 'maddr' is NULL, so this is a sufficient test.

Fixes: 598411d70f85 ("tipc: make resetting of links non-atomic")
Reported-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tipc: fix node keep alive interval calculation</title>
<updated>2019-02-12T18:47:06Z</updated>
<author>
<name>Hoang Le</name>
<email>hoang.h.le@dektech.com.au</email>
</author>
<published>2018-12-06T02:00:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1de47c06b3a3bf48d9b55199bda5c5112829db9f'/>
<id>urn:sha1:1de47c06b3a3bf48d9b55199bda5c5112829db9f</id>
<content type='text'>
[ Upstream commit f5d6c3e5a359c0507800e7ac68d565c21de9b5a1 ]

When setting LINK tolerance, node timer interval will be calculated
base on the LINK with lowest tolerance.

But when calculated, the old node timer interval only updated if current
setting value (tolerance/4) less than old ones regardless of number of
links as well as links' lowest tolerance value.

This caused to two cases missing if tolerance changed as following:
Case 1:
1.1/ There is one link (L1) available in the system
1.2/ Set L1's tolerance from 1500ms =&gt; lower (i.e 500ms)
1.3/ Then, fallback to default (1500ms) or higher (i.e 2000ms)

Expected:
    node timer interval is 1500/4=375ms after 1.3

Result:
node timer interval will not being updated after changing tolerance at 1.3
since its value 1500/4=375ms is not less than 500/4=125ms at 1.2.

Case 2:
2.1/ There are two links (L1, L2) available in the system
2.2/ L1 and L2 tolerance value are 2000ms as initial
2.3/ Set L2's tolerance from 2000ms =&gt; lower 1500ms
2.4/ Disable link L2 (bring down its bearer)

Expected:
    node timer interval is 2000ms/4=500ms after 2.4

Result:
node timer interval will not being updated after disabling L2 since
its value 2000ms/4=500ms is still not less than 1500/4=375ms at 2.3
although L2 is already not available in the system.

To fix this, we start the node interval calculation by initializing it to
a value larger than any conceivable calculated value. This way, the link
with the lowest tolerance will always determine the calculated value.

Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Hoang Le &lt;hoang.h.le@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tipc: fix lockdep warning during node delete</title>
<updated>2018-12-05T18:32:00Z</updated>
<author>
<name>Jon Maloy</name>
<email>donmalo99@gmail.com</email>
</author>
<published>2018-11-26T17:26:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4e3fbd7433d7bbeca4db70ae9ecae9bfa688e2a4'/>
<id>urn:sha1:4e3fbd7433d7bbeca4db70ae9ecae9bfa688e2a4</id>
<content type='text'>
[ Upstream commit ec835f891232d7763dea9da0358f31e24ca6dfb7 ]

We see the following lockdep warning:

[ 2284.078521] ======================================================
[ 2284.078604] WARNING: possible circular locking dependency detected
[ 2284.078604] 4.19.0+ #42 Tainted: G            E
[ 2284.078604] ------------------------------------------------------
[ 2284.078604] rmmod/254 is trying to acquire lock:
[ 2284.078604] 00000000acd94e28 ((&amp;n-&gt;timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
[ 2284.078604]
[ 2284.078604] but task is already holding lock:
[ 2284.078604] 00000000f997afc0 (&amp;(&amp;tn-&gt;node_list_lock)-&gt;rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
[ 2284.078604]
[ 2284.078604] which lock already depends on the new lock.
[ 2284.078604]
[ 2284.078604]
[ 2284.078604] the existing dependency chain (in reverse order) is:
[ 2284.078604]
[ 2284.078604] -&gt; #1 (&amp;(&amp;tn-&gt;node_list_lock)-&gt;rlock){+.-.}:
[ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
[ 2284.078604]        call_timer_fn+0xa1/0x280
[ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
[ 2284.078604]        __do_softirq+0xfc/0x413
[ 2284.078604]        irq_exit+0xb5/0xc0
[ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
[ 2284.078604]        apic_timer_interrupt+0xf/0x20
[ 2284.078604]        default_idle+0x1c/0x140
[ 2284.078604]        do_idle+0x1bc/0x280
[ 2284.078604]        cpu_startup_entry+0x19/0x20
[ 2284.078604]        start_secondary+0x187/0x1c0
[ 2284.078604]        secondary_startup_64+0xa4/0xb0
[ 2284.078604]
[ 2284.078604] -&gt; #0 ((&amp;n-&gt;timer)#2){+.-.}:
[ 2284.078604]        del_timer_sync+0x34/0xa0
[ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
[ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
[ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
[ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
[ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
[ 2284.078604]        unregister_pernet_operations+0x87/0xd0
[ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
[ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
[ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
[ 2284.078604]        do_syscall_64+0x66/0x460
[ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 2284.078604]
[ 2284.078604] other info that might help us debug this:
[ 2284.078604]
[ 2284.078604]  Possible unsafe locking scenario:
[ 2284.078604]
[ 2284.078604]        CPU0                    CPU1
[ 2284.078604]        ----                    ----
[ 2284.078604]   lock(&amp;(&amp;tn-&gt;node_list_lock)-&gt;rlock);
[ 2284.078604]                                lock((&amp;n-&gt;timer)#2);
[ 2284.078604]                                lock(&amp;(&amp;tn-&gt;node_list_lock)-&gt;rlock);
[ 2284.078604]   lock((&amp;n-&gt;timer)#2);
[ 2284.078604]
[ 2284.078604]  *** DEADLOCK ***
[ 2284.078604]
[ 2284.078604] 3 locks held by rmmod/254:
[ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
[ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
[ 2284.078604]  #2: 00000000f997afc0 (&amp;(&amp;tn-&gt;node_list_lock)-&gt;rlock){+.-.}, at: tipc_node_stop+0xac/0x19
[...}

The reason is that the node timer handler sometimes needs to delete a
node which has been disconnected for too long. To do this, it grabs
the lock 'node_list_lock', which may at the same time be held by the
generic node cleanup function, tipc_node_stop(), during module removal.
Since the latter is calling del_timer_sync() inside the same lock, we
have a potential deadlock.

We fix this letting the timer cleanup function use spin_trylock()
instead of just spin_lock(), and when it fails to grab the lock it
just returns so that the timer handler can terminate its execution.
This is safe to do, since tipc_node_stop() anyway is about to
delete both the timer and the node instance.

Fixes: 6a939f365bdb ("tipc: Auto removal of peer down node instance")
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tipc: ignore STATE_MSG on wrong link session</title>
<updated>2018-10-02T05:35:30Z</updated>
<author>
<name>LUU Duc Canh</name>
<email>canh.d.luu@dektech.com.au</email>
</author>
<published>2018-09-26T20:28:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d949cfedbcbab4e91590576cbace2671924ad69c'/>
<id>urn:sha1:d949cfedbcbab4e91590576cbace2671924ad69c</id>
<content type='text'>
The initial session number when a link is created is based on a random
value, taken from struct tipc_net-&gt;random. It is then incremented for
each link reset to avoid mixing protocol messages from different link
sessions.

However, when a bearer is reset all its links are deleted, and will
later be re-created using the same random value as the first time.
This means that if the link never went down between creation and
deletion we will still sometimes have two subsequent sessions with
the same session number. In virtual environments with potentially
long transmission times this has turned out to be a real problem.

We now fix this by randomizing the session number each time a link
is created.

With a session number size of 16 bits this gives a risk of session
collision of 1/64k. To reduce this further, we also introduce a sanity
check on the very first STATE message arriving at a link. If this has
an acknowledge value differing from 0, which is logically impossible,
we ignore the message. The final risk for session collision is hence
reduced to 1/4G, which should be sufficient.

Signed-off-by: LUU Duc Canh &lt;canh.d.luu@dektech.com.au&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix failover problem</title>
<updated>2018-09-29T18:45:14Z</updated>
<author>
<name>LUU Duc Canh</name>
<email>canh.d.luu@dektech.com.au</email>
</author>
<published>2018-09-26T19:00:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c140eb166d681f66bd7e99fb121357db1a503e7f'/>
<id>urn:sha1:c140eb166d681f66bd7e99fb121357db1a503e7f</id>
<content type='text'>
We see the following scenario:
1) Link endpoint B on node 1 discovers that its peer endpoint is gone.
   Since there is a second working link, failover procedure is started.
2) Link endpoint A on node 1 sends a FAILOVER message to peer endpoint
   A on node 2. The node item 1-&gt;2 goes to state FAILINGOVER.
3) Linke endpoint A/2 receives the failover, and is supposed to take
   down its parallell link endpoint B/2, while producing a FAILOVER
   message to send back to A/1.
4) However, B/2 has already been deleted, so no FAILOVER message can
   created.
5) Node 1-&gt;2 remains in state FAILINGOVER forever, refusing to receive
   any messages that can bring B/1 up again. We are left with a non-
   redundant link between node 1 and 2.

We fix this with letting endpoint A/2 build a dummy FAILOVER message
to send to back to A/1, so that the situation can be resolved.

Signed-off-by: LUU Duc Canh &lt;canh.d.luu@dektech.com.au&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux</title>
<updated>2018-07-21T04:17:12Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-07-20T21:45:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c4c5551df136a7c4edd7c2f433d9a296b39826a2'/>
<id>urn:sha1:c4c5551df136a7c4edd7c2f433d9a296b39826a2</id>
<content type='text'>
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: make link capability update thread safe</title>
<updated>2018-07-20T19:36:13Z</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2018-07-18T17:50:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=40999f11ce677ce3c5d0e8f5f76c40192a26b479'/>
<id>urn:sha1:40999f11ce677ce3c5d0e8f5f76c40192a26b479</id>
<content type='text'>
The commit referred to below introduced an update of the link
capabilities field that is not safe. Given the recently added
feature to remove idle node and link items after 5 minutes, there
is a small risk that the update will happen at the very moment the
targeted link is being removed. To avoid this we have to perform
the update inside the node item's write lock protection.

Fixes: 9012de508956 ("tipc: add sequence number check for link STATE messages")
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: check session number before accepting link protocol messages</title>
<updated>2018-07-12T06:06:14Z</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2018-07-09T23:07:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7ea817f4e8322fa27fb860d15025bf72f68b179f'/>
<id>urn:sha1:7ea817f4e8322fa27fb860d15025bf72f68b179f</id>
<content type='text'>
In some virtual environments we observe a significant higher number of
packet reordering and delays than we have been used to traditionally.

This makes it necessary with stricter checks on incoming link protocol
messages' session number, which until now only has been validated for
RESET messages.

Since the other two message types, ACTIVATE and STATE messages also
carry this number, it is easy to extend the validation check to those
messages.

We also introduce a flag indicating if a link has a valid peer session
number or not. This eliminates the mixing of 32- and 16-bit arithmethics
we are currently using to achieve this.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: add sequence number check for link STATE messages</title>
<updated>2018-07-12T06:06:14Z</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2018-07-09T23:07:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9012de5089560136b849b920ad038b96160ed8f6'/>
<id>urn:sha1:9012de5089560136b849b920ad038b96160ed8f6</id>
<content type='text'>
Some switch infrastructures produce huge amounts of packet duplicates.
This becomes a problem if those messages are STATE/NACK protocol
messages, causing unnecessary retransmissions of already accepted
packets.

We now introduce a unique sequence number per STATE protocol message
so that duplicates can be identified and ignored. This will also be
useful when tracing such cases, and to avert replay attacks when TIPC
is encrypted.

For compatibility reasons we have to introduce a new capability flag
TIPC_LINK_PROTO_SEQNO to handle this new feature.

Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
