<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/hyperv.h, branch v5.9.14</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.9.14</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.9.14'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-06-20T09:16:19Z</updated>
<entry>
<title>Drivers: hv: vmbus: Remove the lock field from the vmbus_channel struct</title>
<updated>2020-06-20T09:16:19Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-06-17T16:46:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=775f43facfe89af605d77a9669aa311b5b95cd07'/>
<id>urn:sha1:775f43facfe89af605d77a9669aa311b5b95cd07</id>
<content type='text'>
The spinlock is (now) *not used to protect test-and-set accesses
to attributes of the structure or sc_list operations.

There is, AFAICT, a distinct lack of {WRITE,READ}_ONCE()s in the
handling of channel-&gt;state, but the changes below do not seem to
make things "worse".  ;-)

Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200617164642.37393-9-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Remove the numa_node field from the vmbus_channel struct</title>
<updated>2020-06-19T15:38:15Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-06-17T16:46:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=458d090fbad59d1f849ee15e78d0471784d428b6'/>
<id>urn:sha1:458d090fbad59d1f849ee15e78d0471784d428b6</id>
<content type='text'>
The field is read only in numa_node_show() and it is already stored twice
(after a call to cpu_to_node()) in target_cpu_store() and init_vp_index();
there is no need to "cache" its value in the channel data structure.

Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200617164642.37393-3-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Remove the target_vp field from the vmbus_channel struct</title>
<updated>2020-06-19T15:38:10Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-06-17T16:46:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5bf74682118b3003c8f9b0b0ec596e473fc6eb82'/>
<id>urn:sha1:5bf74682118b3003c8f9b0b0ec596e473fc6eb82</id>
<content type='text'>
The field is read only in __vmbus_open() and it is already stored twice
(after a call to hv_cpu_number_to_vp_number()) in target_cpu_store() and
init_vp_index(); there is no need to "cache" its value in the channel
data structure.

Suggested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200617164642.37393-2-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Resolve more races involving init_vp_index()</title>
<updated>2020-05-23T09:07:00Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-05-22T17:19:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=afaa33da08abd10be8978781d7c99a9e67d2bbff'/>
<id>urn:sha1:afaa33da08abd10be8978781d7c99a9e67d2bbff</id>
<content type='text'>
init_vp_index() uses the (per-node) hv_numa_map[] masks to record the
CPUs allocated for channel interrupts at a given time, and distribute
the performance-critical channels across the available CPUs: in part.,
the mask of "candidate" target CPUs in a given NUMA node, for a newly
offered channel, is determined by XOR-ing the node's CPU mask and the
node's hv_numa_map.  This operation/mechanism assumes that no offline
CPUs is set in the hv_numa_map mask, an assumption that does not hold
since such mask is currently not updated when a channel is removed or
assigned to a different CPU.

To address the issues described above, this adds hooks in the channel
removal path (hv_process_channel_removal()) and in target_cpu_store()
in order to clear, resp. to update, the hv_numa_map[] masks as needed.
This also adds a (missed) update of the masks in init_vp_index() (cf.,
e.g., the memory-allocation failure path in this function).

Like in the case of init_vp_index(), such hooks require to determine
if the given channel is performance critical.  init_vp_index() does
this by parsing the channel's offer, it can not rely on the device
data structure (device_obj) to retrieve such information because the
device data structure has not been allocated/linked with the channel
by the time that init_vp_index() executes.  A similar situation may
hold in hv_is_alloced_cpu() (defined below); the adopted approach is
to "cache" the device type of the channel, as computed by parsing the
channel's offer, in the channel structure itself.

Fixes: 7527810573436f ("Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type")
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/20200522171901.204127-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>vmbus: Replace zero-length array with flexible-array</title>
<updated>2020-05-20T09:13:59Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-05-07T18:53:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=db5871e85533f06ffe1795c1cb9b9153860d1e4f'/>
<id>urn:sha1:db5871e85533f06ffe1795c1cb9b9153860d1e4f</id>
<content type='text'>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Link: https://lore.kernel.org/r/20200507185323.GA14416@embeddedor
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned</title>
<updated>2020-05-20T09:13:19Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-04-06T00:15:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7769e18c201aa88eade5556faf9da7f2bc15bb8a'/>
<id>urn:sha1:7769e18c201aa88eade5556faf9da7f2bc15bb8a</id>
<content type='text'>
For each storvsc_device, storvsc keeps track of the channel target CPUs
associated to the device (alloced_cpus) and it uses this information to
fill a "cache" (stor_chns) mapping CPU-&gt;channel according to a certain
heuristic.  Update the alloced_cpus mask and the stor_chns array when a
channel of the storvsc device is re-assigned to a different CPU.

Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@linux.ibm.com&gt;
Cc: "Martin K. Petersen" &lt;martin.petersen@oracle.com&gt;
Cc: &lt;linux-scsi@vger.kernel.org&gt;
Link: https://lore.kernel.org/r/20200406001514.19876-12-parri.andrea@gmail.com
Reviewed-by; Long Li &lt;longli@microsoft.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
[ wei: fix a small issue reported by kbuild test robot &lt;lkp@intel.com&gt; ]
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type</title>
<updated>2020-04-23T13:17:12Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-04-06T00:15:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7527810573436f00e582d3d5ef2eb3c027c98d7d'/>
<id>urn:sha1:7527810573436f00e582d3d5ef2eb3c027c98d7d</id>
<content type='text'>
VMBus version 4.1 and later support the CHANNELMSG_MODIFYCHANNEL(22)
message type which can be used to request Hyper-V to change the vCPU
that a channel will interrupt.

Introduce the CHANNELMSG_MODIFYCHANNEL message type, and define the
vmbus_send_modifychannel() function to send CHANNELMSG_MODIFYCHANNEL
requests to the host via a hypercall.  The function is then used to
define a sysfs "store" operation, which allows to change the (v)CPU
the channel will interrupt by using the sysfs interface.  The feature
can be used for load balancing or other purposes.

One interesting catch here is that Hyper-V can *not* currently ACK
CHANNELMSG_MODIFYCHANNEL messages with the promise that (after the ACK
is sent) the channel won't send any more interrupts to the "old" CPU.

The peculiarity of the CHANNELMSG_MODIFYCHANNEL messages is problematic
if the user want to take a CPU offline, since we don't want to take a
CPU offline (and, potentially, "lose" channel interrupts on such CPU)
if the host is still processing a CHANNELMSG_MODIFYCHANNEL message
associated to that CPU.

It is worth mentioning, however, that we have been unable to observe
the above mentioned "race": in all our tests, CHANNELMSG_MODIFYCHANNEL
requests appeared *as if* they were processed synchronously by the host.

Suggested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200406001514.19876-11-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
[ wei: fix conflict in channel_mgmt.c ]
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Remove the unused HV_LOCALIZED channel affinity logic</title>
<updated>2020-04-23T13:17:12Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-04-06T00:15:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8ef4c4abbbcdcd9d4bc0fd9454df03e6dac24b73'/>
<id>urn:sha1:8ef4c4abbbcdcd9d4bc0fd9454df03e6dac24b73</id>
<content type='text'>
The logic is unused since commit 509879bdb30b8 ("Drivers: hv: Introduce
a policy for controlling channel affinity").

This logic assumes that a channel target_cpu doesn't change during the
lifetime of a channel, but this assumption is incompatible with the new
functionality that allows changing the vCPU a channel will interrupt.

Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200406001514.19876-9-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal</title>
<updated>2020-04-23T13:17:12Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-04-06T00:15:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9403b66e6161130ebae7e55a97491c84c1ad6f9f'/>
<id>urn:sha1:9403b66e6161130ebae7e55a97491c84c1ad6f9f</id>
<content type='text'>
Since vmbus_chan_sched() dereferences the ring buffer pointer, we have
to make sure that the ring buffer data structures don't get freed while
such dereferencing is happening.  Current code does this by sending an
IPI to the CPU that is allowed to access that ring buffer from interrupt
level, cf., vmbus_reset_channel_cb().  But with the new functionality
to allow changing the CPU that a channel will interrupt, we can't be
sure what CPU will be running the vmbus_chan_sched() function for a
particular channel, so the current IPI mechanism is infeasible.

Instead synchronize vmbus_chan_sched() and vmbus_reset_channel_cb() by
using the (newly introduced) per-channel spin lock "sched_lock".  Move
the test for onchannel_callback being NULL before the "switch" control
statement in vmbus_chan_sched(), in order to not access the ring buffer
if the vmbus_reset_channel_cb() has been completed on the channel.

Suggested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200406001514.19876-7-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels</title>
<updated>2020-04-23T13:17:11Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2020-04-06T00:15:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8b6a877c060ed6b86878fe66c7c6493a6054cf23'/>
<id>urn:sha1:8b6a877c060ed6b86878fe66c7c6493a6054cf23</id>
<content type='text'>
When Hyper-V sends an interrupt to the guest, the guest has to figure
out which channel the interrupt is associated with.  Hyper-V sets a bit
in a memory page that is shared with the guest, indicating a particular
"relid" that the interrupt is associated with.  The current Linux code
then uses a set of per-CPU linked lists to map a given "relid" to a
pointer to a channel structure.

This design introduces a synchronization problem if the CPU that Hyper-V
will interrupt for a certain channel is changed.  If the interrupt comes
on the "old CPU" and the channel was already moved to the per-CPU list
of the "new CPU", then the relid -&gt; channel mapping will fail and the
interrupt is dropped.  Similarly, if the interrupt comes on the new CPU
but the channel was not moved to the per-CPU list of the new CPU, then
the mapping will fail and the interrupt is dropped.

Relids are integers ranging from 0 to 2047.  The mapping from relids to
channel structures can be done by setting up an array with 2048 entries,
each entry being a pointer to a channel structure (hence total size ~16K
bytes, which is not a problem).  The array is global, so there are no
per-CPU linked lists to update.  The array can be searched and updated
by loading from/storing to the array at the specified index.  With no
per-CPU data structures, the above mentioned synchronization problem is
avoided and the relid2channel() function gets simpler.

Suggested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Link: https://lore.kernel.org/r/20200406001514.19876-4-parri.andrea@gmail.com
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
</feed>
