<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/virt, branch v5.15.27</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.27</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.27'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-03-08T18:12:34Z</updated>
<entry>
<title>KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU</title>
<updated>2022-03-08T18:12:34Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2021-10-09T02:11:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=32b758d12c24d7fc385e359d6a54ebbb29960485'/>
<id>urn:sha1:32b758d12c24d7fc385e359d6a54ebbb29960485</id>
<content type='text'>
[ Upstream commit 6f390916c4fb359507d9ac4bf1b28a4f8abee5c0 ]

Wrap s390's halt_poll_max_steal with READ_ONCE and snapshot the result of
kvm_arch_no_poll() in kvm_vcpu_block() to avoid a mostly-theoretical,
largely benign bug on s390 where the result of kvm_arch_no_poll() could
change due to userspace modifying halt_poll_max_steal while the vCPU is
blocking.  The bug is largely benign as it will either cause KVM to skip
updating halt-polling times (no_poll toggles false=&gt;true) or to update
halt-polling times with a slightly flawed block_ns.

Note, READ_ONCE is unnecessary in the current code, add it in case the
arch hook is ever inlined, and to provide a hint that userspace can
change the param at will.

Fixes: 8b905d28ee17 ("KVM: s390: provide kvm_arch_no_poll function")
Reviewed-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Message-Id: &lt;20211009021236.4122790-4-seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>KVM: eventfd: Fix false positive RCU usage warning</title>
<updated>2022-02-16T11:56:16Z</updated>
<author>
<name>Hou Wenlong</name>
<email>houwenlong93@linux.alibaba.com</email>
</author>
<published>2022-01-27T06:54:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d44af3ad2a2148582839b556d0ff5b5e3787dfc0'/>
<id>urn:sha1:d44af3ad2a2148582839b556d0ff5b5e3787dfc0</id>
<content type='text'>
[ Upstream commit 6a0c61703e3a5d67845a4b275e1d9d7bc1b5aad7 ]

Fix the following false positive warning:
 =============================
 WARNING: suspicious RCU usage
 5.16.0-rc4+ #57 Not tainted
 -----------------------------
 arch/x86/kvm/../../../virt/kvm/eventfd.c:484 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 3 locks held by fc_vcpu 0/330:
  #0: ffff8884835fc0b0 (&amp;vcpu-&gt;mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x88/0x6f0 [kvm]
  #1: ffffc90004c0bb68 (&amp;kvm-&gt;srcu){....}-{0:0}, at: vcpu_enter_guest+0x600/0x1860 [kvm]
  #2: ffffc90004c0c1d0 (&amp;kvm-&gt;irq_srcu){....}-{0:0}, at: kvm_notify_acked_irq+0x36/0x180 [kvm]

 stack backtrace:
 CPU: 26 PID: 330 Comm: fc_vcpu 0 Not tainted 5.16.0-rc4+
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
 Call Trace:
  &lt;TASK&gt;
  dump_stack_lvl+0x44/0x57
  kvm_notify_acked_gsi+0x6b/0x70 [kvm]
  kvm_notify_acked_irq+0x8d/0x180 [kvm]
  kvm_ioapic_update_eoi+0x92/0x240 [kvm]
  kvm_apic_set_eoi_accelerated+0x2a/0xe0 [kvm]
  handle_apic_eoi_induced+0x3d/0x60 [kvm_intel]
  vmx_handle_exit+0x19c/0x6a0 [kvm_intel]
  vcpu_enter_guest+0x66e/0x1860 [kvm]
  kvm_arch_vcpu_ioctl_run+0x438/0x7f0 [kvm]
  kvm_vcpu_ioctl+0x38a/0x6f0 [kvm]
  __x64_sys_ioctl+0x89/0xc0
  do_syscall_64+0x3a/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Since kvm_unregister_irq_ack_notifier() does synchronize_srcu(&amp;kvm-&gt;irq_srcu),
kvm-&gt;irq_ack_notifier_list is protected by kvm-&gt;irq_srcu. In fact,
kvm-&gt;irq_srcu SRCU read lock is held in kvm_notify_acked_irq(), making it
a false positive warning. So use hlist_for_each_entry_srcu() instead of
hlist_for_each_entry_rcu().

Reviewed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Hou Wenlong &lt;houwenlong93@linux.alibaba.com&gt;
Message-Id: &lt;f98bac4f5052bad2c26df9ad50f7019e40434512.1643265976.git.houwenlong.hwl@antgroup.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Revert "KVM: SVM: avoid infinite loop on NPF from bad address"</title>
<updated>2022-02-01T16:27:01Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2022-01-20T01:07:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b17cb93dda1dcdc9737c8b334e850c18dc6b0b7d'/>
<id>urn:sha1:b17cb93dda1dcdc9737c8b334e850c18dc6b0b7d</id>
<content type='text'>
commit 31c25585695abdf03d6160aa6d829e855b256329 upstream.

Revert a completely broken check on an "invalid" RIP in SVM's workaround
for the DecodeAssists SMAP errata.  kvm_vcpu_gfn_to_memslot() obviously
expects a gfn, i.e. operates in the guest physical address space, whereas
RIP is a virtual (not even linear) address.  The "fix" worked for the
problematic KVM selftest because the test identity mapped RIP.

Fully revert the hack instead of trying to translate RIP to a GPA, as the
non-SEV case is now handled earlier, and KVM cannot access guest page
tables to translate RIP.

This reverts commit e72436bc3a5206f95bb384e741154166ddb3202e.

Fixes: e72436bc3a52 ("KVM: SVM: avoid infinite loop on NPF from bad address")
Reported-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Reviewed-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Message-Id: &lt;20220120010719.711476-3-seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: downgrade two BUG_ONs to WARN_ON_ONCE</title>
<updated>2021-12-22T08:32:34Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2021-11-22T23:24:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=54bf0b0d3536bf1651c66f60f621532e8d6c50af'/>
<id>urn:sha1:54bf0b0d3536bf1651c66f60f621532e8d6c50af</id>
<content type='text'>
[ Upstream commit 5f25e71e311478f9bb0a8ef49e7d8b95316491d7 ]

This is not an unrecoverable situation.  Users of kvm_read_guest_offset_cached
and kvm_write_guest_offset_cached must expect the read/write to fail, and
therefore it is possible to just return early with an error value.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>KVM: Ensure local memslot copies operate on up-to-date arch-specific data</title>
<updated>2021-12-08T08:04:43Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2021-11-04T00:25:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cbe4fcf37150a409ab1bfc59be0cb3408ab22236'/>
<id>urn:sha1:cbe4fcf37150a409ab1bfc59be0cb3408ab22236</id>
<content type='text'>
commit bda44d844758c70c8dc1478e6fc9c25efa90c5a7 upstream.

When modifying memslots, snapshot the "old" memslot and copy it to the
"new" memslot's arch data after (re)acquiring slots_arch_lock.  x86 can
change a memslot's arch data while memslot updates are in-progress so
long as it holds slots_arch_lock, thus snapshotting a memslot without
holding the lock can result in the consumption of stale data.

Fixes: b10a038e84d1 ("KVM: mmu: Add slots_arch_lock for memslot arch fields")
Cc: stable@vger.kernel.org
Cc: Ben Gardon &lt;bgardon@google.com&gt;
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Message-Id: &lt;20211104002531.1176691-2-seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: Disallow user memslot with size that exceeds "unsigned long"</title>
<updated>2021-12-08T08:04:43Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2021-11-04T00:25:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0827b8db5c7fd3ecbe00e725f5295eda8e7aa9a7'/>
<id>urn:sha1:0827b8db5c7fd3ecbe00e725f5295eda8e7aa9a7</id>
<content type='text'>
commit 6b285a5587506bae084cf9a3ed5aa491d623b91b upstream.

Reject userspace memslots whose size exceeds the storage capacity of an
"unsigned long".  KVM's uAPI takes the size as u64 to support large slots
on 64-bit hosts, but does not account for the size being truncated on
32-bit hosts in various flows.  The access_ok() check on the userspace
virtual address in particular casts the size to "unsigned long" and will
check the wrong number of bytes.

KVM doesn't actually support slots whose size doesn't fit in an "unsigned
long", e.g. KVM's internal kvm_memory_slot.npages is an "unsigned long",
not a "u64", and misc arch specific code follows that behavior.

Fixes: fa3d315a4ce2 ("KVM: Validate userspace_addr of memslot when registered")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Reviewed-by: Maciej S. Szmigiero &lt;maciej.szmigiero@oracle.com&gt;
Message-Id: &lt;20211104002531.1176691-3-seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: Remove tlbs_dirty</title>
<updated>2021-09-23T15:01:12Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@linux.alibaba.com</email>
</author>
<published>2021-09-18T00:56:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6bc6db000295332bae2c1e8815d7450b72923d23'/>
<id>urn:sha1:6bc6db000295332bae2c1e8815d7450b72923d23</id>
<content type='text'>
There is no user of tlbs_dirty.

Signed-off-by: Lai Jiangshan &lt;laijs@linux.alibaba.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Message-Id: &lt;20210918005636.3675-4-jiangshanlai@gmail.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs</title>
<updated>2021-09-22T14:33:15Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2021-08-27T09:25:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0bbc2ca8515f9cdf11df84ccb63dc7c44bc3d8f4'/>
<id>urn:sha1:0bbc2ca8515f9cdf11df84ccb63dc7c44bc3d8f4</id>
<content type='text'>
Check for a NULL cpumask_var_t when kicking multiple vCPUs via
cpumask_available(), which performs a !NULL check if and only if cpumasks
are configured to be allocated off-stack.  This is a meaningless
optimization, e.g. avoids a TEST+Jcc and TEST+CMOV on x86, but more
importantly helps document that the NULL check is necessary even though
all callers pass in a local variable.

No functional change intended.

Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Message-Id: &lt;20210827092516.1027264-3-vkuznets@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: Clean up benign vcpu-&gt;cpu data races when kicking vCPUs</title>
<updated>2021-09-22T14:33:15Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2021-08-27T09:25:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=85b640450ddcfa09cf72771b69a9c3daf0ddc772'/>
<id>urn:sha1:85b640450ddcfa09cf72771b69a9c3daf0ddc772</id>
<content type='text'>
Fix a benign data race reported by syzbot+KCSAN[*] by ensuring vcpu-&gt;cpu
is read exactly once, and by ensuring the vCPU is booted from guest mode
if kvm_arch_vcpu_should_kick() returns true.  Fix a similar race in
kvm_make_vcpus_request_mask() by ensuring the vCPU is interrupted if
kvm_request_needs_ipi() returns true.

Reading vcpu-&gt;cpu before vcpu-&gt;mode (via kvm_arch_vcpu_should_kick() or
kvm_request_needs_ipi()) means the target vCPU could get migrated (change
vcpu-&gt;cpu) and enter !OUTSIDE_GUEST_MODE between reading vcpu-&gt;cpud and
reading vcpu-&gt;mode.  If that happens, the kick/IPI will be sent to the
old pCPU, not the new pCPU that is now running the vCPU or reading SPTEs.

Although failing to kick the vCPU is not exactly ideal, practically
speaking it cannot cause a functional issue unless there is also a bug in
the caller, and any such bug would exist regardless of kvm_vcpu_kick()'s
behavior.

The purpose of sending an IPI is purely to get a vCPU into the host (or
out of reading SPTEs) so that the vCPU can recognize a change in state,
e.g. a KVM_REQ_* request.  If vCPU's handling of the state change is
required for correctness, KVM must ensure either the vCPU sees the change
before entering the guest, or that the sender sees the vCPU as running in
guest mode.  All architectures handle this by (a) sending the request
before calling kvm_vcpu_kick() and (b) checking for requests _after_
setting vcpu-&gt;mode.

x86's READING_SHADOW_PAGE_TABLES has similar requirements; KVM needs to
ensure it kicks and waits for vCPUs that started reading SPTEs _before_
MMU changes were finalized, but any vCPU that starts reading after MMU
changes were finalized will see the new state and can continue on
uninterrupted.

For uses of kvm_vcpu_kick() that are not paired with a KVM_REQ_*, e.g.
x86's kvm_arch_sync_dirty_log(), the order of the kick must not be relied
upon for functional correctness, e.g. in the dirty log case, userspace
cannot assume it has a 100% complete log if vCPUs are still running.

All that said, eliminate the benign race since the cost of doing so is an
"extra" atomic cmpxchg() in the case where the target vCPU is loaded by
the current pCPU or is not loaded at all.  I.e. the kick will be skipped
due to kvm_vcpu_exiting_guest_mode() seeing a compatible vcpu-&gt;mode as
opposed to the kick being skipped because of the cpu checks.

Keep the "cpu != me" checks even though they appear useless/impossible at
first glance.  x86 processes guest IPI writes in a fast path that runs in
IN_GUEST_MODE, i.e. can call kvm_vcpu_kick() from IN_GUEST_MODE.  And
calling kvm_vm_bugged()-&gt;kvm_make_vcpus_request_mask() from IN_GUEST or
READING_SHADOW_PAGE_TABLES is perfectly reasonable.

Note, a race with the cpu_online() check in kvm_vcpu_kick() likely
persists, e.g. the vCPU could exit guest mode and get offlined between
the cpu_online() check and the sending of smp_send_reschedule().  But,
the online check appears to exist only to avoid a WARN in x86's
native_smp_send_reschedule() that fires if the target CPU is not online.
The reschedule WARN exists because CPU offlining takes the CPU out of the
scheduling pool, i.e. the WARN is intended to detect the case where the
kernel attempts to schedule a task on an offline CPU.  The actual sending
of the IPI is a non-issue as at worst it will simpy be dropped on the
floor.  In other words, KVM's usurping of the reschedule IPI could
theoretically trigger a WARN if the stars align, but there will be no
loss of functionality.

[*] https://syzkaller.appspot.com/bug?extid=cd4154e502f43f10808a

Cc: Venkatesh Srinivas &lt;venkateshs@google.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Fixes: 97222cc83163 ("KVM: Emulate local APIC in kernel")
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Message-Id: &lt;20210827092516.1027264-2-vkuznets@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: do not shrink halt_poll_ns below grow_start</title>
<updated>2021-09-22T14:33:10Z</updated>
<author>
<name>Sergey Senozhatsky</name>
<email>senozhatsky@chromium.org</email>
</author>
<published>2021-09-02T03:11:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ae232ea460888dc5a8b37e840c553b02521fbf18'/>
<id>urn:sha1:ae232ea460888dc5a8b37e840c553b02521fbf18</id>
<content type='text'>
grow_halt_poll_ns() ignores values between 0 and
halt_poll_ns_grow_start (10000 by default). However,
when we shrink halt_poll_ns we may fall way below
halt_poll_ns_grow_start and endup with halt_poll_ns
values that don't make a lot of sense: like 1 or 9,
or 19.

VCPU1 trace (halt_poll_ns_shrink equals 2):

VCPU1 grow 10000
VCPU1 shrink 5000
VCPU1 shrink 2500
VCPU1 shrink 1250
VCPU1 shrink 625
VCPU1 shrink 312
VCPU1 shrink 156
VCPU1 shrink 78
VCPU1 shrink 39
VCPU1 shrink 19
VCPU1 shrink 9
VCPU1 shrink 4

Mirror what grow_halt_poll_ns() does and set halt_poll_ns
to 0 as soon as new shrink-ed halt_poll_ns value falls
below halt_poll_ns_grow_start.

Signed-off-by: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Message-Id: &lt;20210902031100.252080-1-senozhatsky@chromium.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
</feed>
