<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/kvm_host.h, branch v4.14.136</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.136</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.136'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-05-16T17:42:22Z</updated>
<entry>
<title>KVM: fix spectrev1 gadgets</title>
<updated>2019-05-16T17:42:22Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2019-04-11T09:16:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=559e2696d2f47a3575e9550f101a7e59e30b1b38'/>
<id>urn:sha1:559e2696d2f47a3575e9550f101a7e59e30b1b38</id>
<content type='text'>
[ Upstream commit 1d487e9bf8ba66a7174c56a0029c54b1eca8f99c ]

These were found with smatch, and then generalized when applicable.
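
For illustration, the usual spectre-v1 hardening pattern clamps a
guest-controlled index with array_index_nospec() before the array
access. A simplified sketch modeled on kvm_get_vcpu() (not
necessarily the exact hunks of this backport):

    #include &lt;linux/nospec.h&gt;

    static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
    {
            int num_vcpus = atomic_read(&amp;kvm-&gt;online_vcpus);

            /*
             * Clamp i under speculation, so a mispredicted bounds
             * check cannot be used to leak kvm-&gt;vcpus[] contents.
             */
            i = array_index_nospec(i, num_vcpus);
            smp_rmb();
            return kvm-&gt;vcpus[i];
    }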

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>KVM: Call kvm_arch_memslots_updated() before updating memslots</title>
<updated>2019-03-23T13:35:31Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-02-05T20:54:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=89dce6e457a14aa53fc0a83ec8f4206748a5c87a'/>
<id>urn:sha1:89dce6e457a14aa53fc0a83ec8f4206748a5c87a</id>
<content type='text'>
commit 152482580a1b0accb60676063a1ac57b2d12daf6 upstream.

kvm_arch_memslots_updated() is at this point in time an x86-specific
hook for handling MMIO generation wraparound.  x86 stashes 19 bits of
the memslots generation number in its MMIO sptes in order to avoid
full page fault walks for repeat faults on emulated MMIO addresses.
Because only 19 bits are used, wrapping the MMIO generation number is
possible, if unlikely.  kvm_arch_memslots_updated() alerts x86 that
the generation has changed so that it can invalidate all MMIO sptes in
case the effective MMIO generation has wrapped so as to avoid using a
stale spte, e.g. a (very) old spte that was created with generation==0.

Given that the purpose of kvm_arch_memslots_updated() is to prevent
consuming stale entries, it needs to be called before the new generation
is propagated to memslots.  Invalidating the MMIO sptes after updating
memslots means that there is a window where a vCPU could dereference
the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO
spte that was created with (pre-wrap) generation==0.
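
A simplified sketch of the resulting ordering in install_new_memslots()
(the 4.14 backport may differ in detail):

    static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
                    int as_id, struct kvm_memslots *slots)
    {
            struct kvm_memslots *old_memslots = __kvm_memslots(kvm, as_id);
            u64 gen;

            /* Odd generation: MMIO spte caching is disabled while the
             * update is in progress. */
            slots-&gt;generation = old_memslots-&gt;generation + 1;

            rcu_assign_pointer(kvm-&gt;memslots[as_id], slots);
            synchronize_srcu_expedited(&amp;kvm-&gt;srcu);

            /* Back to an even generation... */
            gen = slots-&gt;generation + 1;

            /* ...but zap stale MMIO sptes *before* the final generation
             * is visible through slots-&gt;generation. */
            kvm_arch_memslots_updated(kvm, gen);

            slots-&gt;generation = gen;
            return old_memslots;
    }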

Fixes: e59dbe09f8e6 ("KVM: Introduce kvm_arch_memslots_updated()")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kvm: Change offset in kvm_write_guest_offset_cached to unsigned</title>
<updated>2019-02-12T18:46:07Z</updated>
<author>
<name>Jim Mattson</name>
<email>jmattson@google.com</email>
</author>
<published>2018-12-14T22:34:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f70123c6d3accc024445eca6b9ee01c1fe2b80d8'/>
<id>urn:sha1:f70123c6d3accc024445eca6b9ee01c1fe2b80d8</id>
<content type='text'>
[ Upstream commit 7a86dab8cf2f0fdf508f3555dddfc236623bff60 ]

Since the offset is added directly to the hva from the
gfn_to_hva_cache, a negative offset could result in an out of bounds
write. The existing BUG_ON only checks for addresses beyond the end of
the gfn_to_hva_cache, not for addresses before the start of the
gfn_to_hva_cache.

Note that all current call sites have non-negative offsets.
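
Roughly, the signature and check after this change (a simplified
sketch; the uncached fallback path is omitted):

    int kvm_write_guest_offset_cached(struct kvm *kvm,
                                      struct gfn_to_hva_cache *ghc,
                                      void *data, unsigned int offset,
                                      unsigned long len)
    {
            int r;

            /* Guards only the end of the cache. With a signed int
             * offset, a negative value sign-extends and the unsigned
             * sum wraps back below ghc-&gt;len, so the check passes and
             * the write lands before the start of the mapping; an
             * unsigned offset rules that out. */
            BUG_ON(len + offset &gt; ghc-&gt;len);

            if (kvm_is_error_hva(ghc-&gt;hva))
                    return -EFAULT;

            r = __copy_to_user((void __user *)ghc-&gt;hva + offset, data, len);
            if (r)
                    return -EFAULT;
            return 0;
    }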

Fixes: 4ec6e8636256 ("kvm: Introduce kvm_write_guest_offset_cached()")
Reported-by: Cfir Cohen &lt;cfir@google.com&gt;
Signed-off-by: Jim Mattson &lt;jmattson@google.com&gt;
Reviewed-by: Cfir Cohen &lt;cfir@google.com&gt;
Reviewed-by: Peter Shier &lt;pshier@google.com&gt;
Reviewed-by: Krish Sadhukhan &lt;krish.sadhukhan@oracle.com&gt;
Reviewed-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86,kvm: move qemu/guest FPU switching out to vcpu_run</title>
<updated>2019-01-16T21:07:09Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@redhat.com</email>
</author>
<published>2017-11-14T21:54:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4124a4cff344abbf8187775eb643d9827830e715'/>
<id>urn:sha1:4124a4cff344abbf8187775eb643d9827830e715</id>
<content type='text'>
commit f775b13eedee2f7f3c6fdd4e90fb79090ce5d339 upstream.

Currently, every time a VCPU is scheduled out, the host kernel will
first save the guest FPU/xstate context, then load the qemu userspace
FPU context, only to then immediately save the qemu userspace FPU
context back to memory. When scheduling in a VCPU, the same extraneous
FPU loads and saves are done.

This could be avoided by moving from a model where the guest FPU is
loaded and stored with preemption disabled, to a model where the
qemu userspace FPU is swapped out for the guest FPU context for
the duration of the KVM_RUN ioctl.

This is done under the VCPU mutex, which is also taken when other
tasks inspect the VCPU FPU context, so the code should already be
safe for this change. That should come as no surprise, given that
s390 already has this optimization.

This can fix a bug where KVM calls get_user_pages while owning the
FPU, and the file system ends up requesting the FPU again:

    [258270.527947]  __warn+0xcb/0xf0
    [258270.527948]  warn_slowpath_null+0x1d/0x20
    [258270.527951]  kernel_fpu_disable+0x3f/0x50
    [258270.527953]  __kernel_fpu_begin+0x49/0x100
    [258270.527955]  kernel_fpu_begin+0xe/0x10
    [258270.527958]  crc32c_pcl_intel_update+0x84/0xb0
    [258270.527961]  crypto_shash_update+0x3f/0x110
    [258270.527968]  crc32c+0x63/0x8a [libcrc32c]
    [258270.527975]  dm_bm_checksum+0x1b/0x20 [dm_persistent_data]
    [258270.527978]  node_prepare_for_write+0x44/0x70 [dm_persistent_data]
    [258270.527985]  dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data]
    [258270.527988]  submit_io+0x170/0x1b0 [dm_bufio]
    [258270.527992]  __write_dirty_buffer+0x89/0x90 [dm_bufio]
    [258270.527994]  __make_buffer_clean+0x4f/0x80 [dm_bufio]
    [258270.527996]  __try_evict_buffer+0x42/0x60 [dm_bufio]
    [258270.527998]  dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio]
    [258270.528002]  shrink_slab.part.40+0x1f5/0x420
    [258270.528004]  shrink_node+0x22c/0x320
    [258270.528006]  do_try_to_free_pages+0xf5/0x330
    [258270.528008]  try_to_free_pages+0xe9/0x190
    [258270.528009]  __alloc_pages_slowpath+0x40f/0xba0
    [258270.528011]  __alloc_pages_nodemask+0x209/0x260
    [258270.528014]  alloc_pages_vma+0x1f1/0x250
    [258270.528017]  do_huge_pmd_anonymous_page+0x123/0x660
    [258270.528021]  handle_mm_fault+0xfd3/0x1330
    [258270.528025]  __get_user_pages+0x113/0x640
    [258270.528027]  get_user_pages+0x4f/0x60
    [258270.528063]  __gfn_to_pfn_memslot+0x120/0x3f0 [kvm]
    [258270.528108]  try_async_pf+0x66/0x230 [kvm]
    [258270.528135]  tdp_page_fault+0x130/0x280 [kvm]
    [258270.528149]  kvm_mmu_page_fault+0x60/0x120 [kvm]
    [258270.528158]  handle_ept_violation+0x91/0x170 [kvm_intel]
    [258270.528162]  vmx_handle_exit+0x1ca/0x1400 [kvm_intel]

No performance changes were detected in quick ping-pong tests on
my 4 socket system, which is expected since an FPU+xstate load is
on the order of 0.1us, while ping-ponging between CPUs is on the
order of 20us, and somewhat noisy.
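
In the new model the swap happens once per KVM_RUN instead of on every
context switch. A simplified sketch, with helper and field names as in
the upstream x86 code (the 4.14 backport may differ in detail):

    static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
    {
            preempt_disable();
            /* Stash qemu's userspace FPU state, load the guest's. */
            copy_fpregs_to_fpstate(&amp;vcpu-&gt;arch.user_fpu);
            copy_kernel_to_fpregs(&amp;vcpu-&gt;arch.guest_fpu.state);
            preempt_enable();
    }

    static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
    {
            preempt_disable();
            copy_fpregs_to_fpstate(&amp;vcpu-&gt;arch.guest_fpu);
            copy_kernel_to_fpregs(&amp;vcpu-&gt;arch.user_fpu.state);
            preempt_enable();
    }

    int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
    {
            int r;

            kvm_load_guest_fpu(vcpu);   /* once, for the whole ioctl... */
            r = vcpu_run(vcpu);
            kvm_put_guest_fpu(vcpu);    /* ...not on every reschedule */
            return r;
    }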

Cc: stable@vger.kernel.org
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Suggested-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
[Fixed a bug where reset_vcpu called put_fpu without a preceding
 load_fpu, which happened inside the KVM_CREATE_VCPU ioctl. - Radim]
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs</title>
<updated>2018-06-20T19:02:57Z</updated>
<author>
<name>Wanpeng Li</name>
<email>wanpengli@tencent.com</email>
</author>
<published>2018-04-27T00:55:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ce44a4d5db630728a5a20cf833afeafe1dce88f6'/>
<id>urn:sha1:ce44a4d5db630728a5a20cf833afeafe1dce88f6</id>
<content type='text'>
[ Upstream commit ddc9cfb79c1096a0855839631c091aa7e9602052 ]

Our virtual machines make use of device assignment by configuring
12 NVMe disks for high I/O performance. Each NVMe device has 129
MSI-X Table entries:
Capabilities: [50] MSI-X: Enable+ Count=129 Masked-
        Vector table: BAR=0 offset=00002000
The Windows virtual machines fail to boot, because they map every
MSI-X table entry that the NVMe hardware reports to the bus into the
MSI routing table, and 12 * 129 = 1548 entries exceed the old limit
of 1024. This patch extends MAX_IRQ_ROUTES to 4096 for all archs; in
the future it might be extended again if needed.
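
The resulting definition in include/linux/kvm_host.h, roughly:

    #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
    /* 12 NVMe disks * 129 MSI-X entries = 1548 &gt; the old limit of 1024. */
    #define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
    #endif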

Reviewed-by: Cornelia Huck &lt;cohuck@redhat.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Cc: Cornelia Huck &lt;cohuck@redhat.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Wanpeng Li &lt;wanpengli@tencent.com&gt;
Signed-off-by: Tonny Lu &lt;tonnylu@tencent.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds</title>
<updated>2018-05-30T05:52:02Z</updated>
<author>
<name>Sebastian Ott</name>
<email>sebott@linux.vnet.ibm.com</email>
</author>
<published>2018-02-22T12:05:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5f95541a0db52dd4edf2dcca3d3e3ba54b97d3cf'/>
<id>urn:sha1:5f95541a0db52dd4edf2dcca3d3e3ba54b97d3cf</id>
<content type='text'>
[ Upstream commit 076467490b8176eb96eddc548a14d4135c7b5852 ]

Move the kvm_arch_irq_routing_update() prototype outside of
ifdef CONFIG_HAVE_KVM_EVENTFD guards to fix the following sparse warning:

arch/s390/kvm/../../../virt/kvm/irqchip.c:171:28: warning: symbol 'kvm_arch_irq_routing_update' was not declared. Should it be static?
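
A sketch of the include/linux/kvm_host.h layout after the move:

    /* Visible regardless of CONFIG_HAVE_KVM_EVENTFD, so the definition
     * in virt/kvm/irqchip.c always has a prototype in scope: */
    void kvm_arch_irq_routing_update(struct kvm *kvm);

    #ifdef CONFIG_HAVE_KVM_EVENTFD
    /* eventfd-specific declarations stay guarded, e.g.: */
    void kvm_irq_routing_update(struct kvm *kvm);
    #endif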

Signed-off-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: Let KVM_SET_SIGNAL_MASK work as advertised</title>
<updated>2018-02-03T16:39:06Z</updated>
<author>
<name>Jan H. Schönherr</name>
<email>jschoenh@amazon.de</email>
</author>
<published>2017-11-24T21:39:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=40ba283e2602d319d00d4f6539b5113eb8d25d24'/>
<id>urn:sha1:40ba283e2602d319d00d4f6539b5113eb8d25d24</id>
<content type='text'>
[ Upstream commit 20b7035c66bacc909ae3ffe92c1a1ea7db99fe4f ]

KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".

This, however, is only true when the calling task has a signal handler
registered for the signal. If not, signal evaluation is short-circuited
for SIG_IGN and SIG_DFL, and the signal is either ignored without
KVM_RUN returning, or the whole process is terminated.

Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.
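
The helpers introduced upstream look roughly like this; the key point
is saving the original mask in current-&gt;real_blocked, as
do_sigtimedwait() does, because signals blocked there are never
short-circuited as ignored:

    void kvm_sigset_activate(struct kvm_vcpu *vcpu)
    {
            if (!vcpu-&gt;sigset_active)
                    return;

            /* Switch to the mask set via KVM_SET_SIGNAL_MASK, saving
             * the original mask in real_blocked: a signal unblocked
             * only here makes KVM_RUN return -EINTR, but is neither
             * delivered nor discarded as SIG_IGN/SIG_DFL until the
             * original mask is restored. */
            sigprocmask(SIG_SETMASK, &amp;vcpu-&gt;sigset, &amp;current-&gt;real_blocked);
    }

    void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
    {
            if (!vcpu-&gt;sigset_active)
                    return;

            sigprocmask(SIG_SETMASK, &amp;current-&gt;real_blocked, NULL);
            sigemptyset(&amp;current-&gt;real_blocked);
    }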

Signed-off-by: Jan H. Schönherr &lt;jschoenh@amazon.de&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: add spinlock optimization framework</title>
<updated>2017-08-08T08:57:43Z</updated>
<author>
<name>Longpeng(Mike)</name>
<email>longpeng2@huawei.com</email>
</author>
<published>2017-08-08T04:05:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=199b5763d329b43c88f6ad539db8a6c6b42f8edb'/>
<id>urn:sha1:199b5763d329b43c88f6ad539db8a6c6b42f8edb</id>
<content type='text'>
If a vcpu exits because it is spinning on a lock taken in user mode,
the lock holder may have been preempted in either user mode or kernel
mode. (Note that not all architectures trap spin loops in user mode;
only AMD x86 and ARM/ARM64 currently do.)

But if a vcpu exits while spinning in kernel mode, the holder must
have been preempted in kernel mode, so we should prefer a vcpu that
is itself in kernel mode as the more likely candidate for the lock
holder.

This introduces kvm_arch_vcpu_in_kernel() to decide whether the
vcpu is in kernel-mode when it's preempted.  kvm_vcpu_on_spin's
new argument says the same of the spinning VCPU.
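
A sketch of the interface and of the candidate filter inside
kvm_vcpu_on_spin() (simplified):

    /* Arch hook: was this vcpu executing in kernel mode when it was
     * preempted? The x86 implementation, for example, checks for CPL 0. */
    bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);

    /* The new argument reports the mode of the spinning vcpu. */
    void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode);

    /* ...inside kvm_vcpu_on_spin(), while picking a vcpu to yield to,
     * skip vcpus preempted in user mode: they cannot hold a lock that
     * was taken in kernel mode. */
    if (yield_to_kernel_mode &amp;&amp; !kvm_arch_vcpu_in_kernel(vcpu))
            continue;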

Signed-off-by: Longpeng(Mike) &lt;longpeng2@huawei.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: nVMX: get rid of nested_get_page()</title>
<updated>2017-08-07T13:27:00Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2017-08-03T16:11:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5e2f30b756a37bd80c5b0471d0e10d769ab2eb9a'/>
<id>urn:sha1:5e2f30b756a37bd80c5b0471d0e10d769ab2eb9a</id>
<content type='text'>
nested_get_page() is a confusing name: all we want is a page from G1,
and this is not even specific to nesting.

Let's introduce kvm_vcpu_gpa_to_page() so we don't get too lengthy
lines.
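
The new helper in include/linux/kvm_host.h is a thin wrapper, roughly:

    static inline struct page *kvm_vcpu_gpa_to_page(struct kvm_vcpu *vcpu,
                                                    gpa_t gpa)
    {
            return kvm_vcpu_gfn_to_page(vcpu, gpa_to_gfn(gpa));
    }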

Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
[Squash pasto fix from Wanpeng Li. - Paolo]
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: avoid using rcu_dereference_protected</title>
<updated>2017-08-02T20:41:02Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2017-08-02T15:55:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3898da947bbaf9e7fd5816e825978d360028bba2'/>
<id>urn:sha1:3898da947bbaf9e7fd5816e825978d360028bba2</id>
<content type='text'>
During teardown, accesses to memslots and buses use
rcu_dereference_protected with an always-true condition, because
these accesses are done outside the usual mutexes.  This is safe
because the last reference is gone and there cannot be any concurrent
modifications, but an always-true rcu_dereference_protected is ugly
and unobvious.

Instead, check the refcount in kvm_get_bus and __kvm_memslots.
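
A sketch of the resulting accessor for buses (the memslots case is
analogous):

    static inline struct kvm_io_bus *kvm_get_bus(struct kvm *kvm,
                                                 enum kvm_bus idx)
    {
            /* Holding slots_lock, or being the teardown path (the last
             * reference is gone), both make the access safe. */
            return srcu_dereference_check(kvm-&gt;buses[idx], &amp;kvm-&gt;srcu,
                                          lockdep_is_held(&amp;kvm-&gt;slots_lock) ||
                                          !refcount_read(&amp;kvm-&gt;users_count));
    }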

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
</content>
</entry>
</feed>
