<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/kvm_host.h, branch v5.4.60</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.60</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.60'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-06-17T14:40:26Z</updated>
<entry>
<title>KVM: x86: Fix APIC page invalidation race</title>
<updated>2020-06-17T14:40:26Z</updated>
<author>
<name>Eiichi Tsukata</name>
<email>eiichi.tsukata@nutanix.com</email>
</author>
<published>2020-06-06T04:26:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cb810f75e98ab8d3d7565133b5a475a9edb01e23'/>
<id>urn:sha1:cb810f75e98ab8d3d7565133b5a475a9edb01e23</id>
<content type='text'>
commit e649b3f0188f8fd34dd0dde8d43fd3312b902fb2 upstream.

Commit b1394e745b94 ("KVM: x86: fix APIC page invalidation") tried
to fix inappropriate APIC page invalidation by re-introducing arch
specific kvm_arch_mmu_notifier_invalidate_range() and calling it from
kvm_mmu_notifier_invalidate_range_start. However, the patch left a
possible race where the VMCS APIC address cache is updated *before*
it is unmapped:

  (Invalidator) kvm_mmu_notifier_invalidate_range_start()
  (Invalidator) kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD)
  (KVM VCPU) vcpu_enter_guest()
  (KVM VCPU) kvm_vcpu_reload_apic_access_page()
  (Invalidator) actually unmap page

Because of the above race, there can be a mismatch between the
host physical address stored in the APIC_ACCESS_PAGE VMCS field and
the host physical address stored in the EPT entry for the APIC GPA
(0xfee0000).  When this happens, the processor will not trap APIC
accesses, and will instead show the raw contents of the APIC-access page.
Because Windows OS periodically checks for unexpected modifications to
the LAPIC register, this will show up as a BSOD crash with BugCheck
CRITICAL_STRUCTURE_CORRUPTION (109) we are currently seeing in
https://bugzilla.redhat.com/show_bug.cgi?id=1751017.

The root cause of the issue is that kvm_arch_mmu_notifier_invalidate_range()
cannot guarantee that no additional references are taken to the pages in
the range before kvm_mmu_notifier_invalidate_range_end().  Fortunately,
this case is supported by the MMU notifier API, as documented in
include/linux/mmu_notifier.h:

	 * If the subsystem
         * can't guarantee that no additional references are taken to
         * the pages in the range, it has to implement the
         * invalidate_range() notifier to remove any references taken
         * after invalidate_range_start().

The fix therefore is to reload the APIC-access page field in the VMCS
from kvm_mmu_notifier_invalidate_range() instead of ..._range_start().

Cc: stable@vger.kernel.org
Fixes: b1394e745b94 ("KVM: x86: fix APIC page invalidation")
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=197951
Signed-off-by: Eiichi Tsukata &lt;eiichi.tsukata@nutanix.com&gt;
Message-Id: &lt;20200606042627.61070-1-eiichi.tsukata@nutanix.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: Check validity of resolved slot when searching memslots</title>
<updated>2020-04-29T14:33:16Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2020-04-08T06:40:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=878127ac8b7013d7d7fad9d160654da5b6195902'/>
<id>urn:sha1:878127ac8b7013d7d7fad9d160654da5b6195902</id>
<content type='text'>
commit b6467ab142b708dd076f6186ca274f14af379c72 upstream.

Check that the resolved slot (somewhat confusingly named 'start') is a
valid/allocated slot before doing the final comparison to see if the
specified gfn resides in the associated slot.  The resolved slot can be
invalid if the binary search loop terminated because the search index
was incremented beyond the number of used slots.

This bug has existed since the binary search algorithm was introduced,
but went unnoticed because KVM statically allocated memory for the max
number of slots, i.e. the access would only be truly out-of-bounds if
all possible slots were allocated and the specified gfn was less than
the base of the lowest memslot.  Commit 36947254e5f98 ("KVM: Dynamically
size memslot array based on number of used slots") eliminated the "all
possible slots allocated" condition and made the bug embarrasingly easy
to hit.

Fixes: 9c1a5d38780e6 ("kvm: optimize GFN to memslot lookup with large slots amount")
Reported-by: syzbot+d889b59b2bb87d4047a2@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Message-Id: &lt;20200408064059.8957-2-sean.j.christopherson@intel.com&gt;
Reviewed-by: Cornelia Huck &lt;cohuck@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: Use vcpu-specific gva-&gt;hva translation when querying host page size</title>
<updated>2020-02-11T12:35:54Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2020-01-08T20:24:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7426ddf01f1639c19a79d93c5a354b463a563d29'/>
<id>urn:sha1:7426ddf01f1639c19a79d93c5a354b463a563d29</id>
<content type='text'>
[ Upstream commit f9b84e19221efc5f493156ee0329df3142085f28 ]

Use kvm_vcpu_gfn_to_hva() when retrieving the host page size so that the
correct set of memslots is used when handling x86 page faults in SMM.

Fixes: 54bf36aac520 ("KVM: x86: use vcpu-specific functions to read/write/translate GFNs")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM</title>
<updated>2020-02-11T12:35:53Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-12-06T23:57:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8a1cd01bee30bd1033a452035f66be127728d4fd'/>
<id>urn:sha1:8a1cd01bee30bd1033a452035f66be127728d4fd</id>
<content type='text'>
[ Upstream commit 736c291c9f36b07f8889c61764c28edce20e715d ]

Convert a plethora of parameters and variables in the MMU and page fault
flows from type gva_t to gpa_t to properly handle TDP on 32-bit KVM.

Thanks to PSE and PAE paging, 32-bit kernels can access 64-bit physical
addresses.  When TDP is enabled, the fault address is a guest physical
address and thus can be a 64-bit value, even when both KVM and its guest
are using 32-bit virtual addressing, e.g. VMX's VMCS.GUEST_PHYSICAL is a
64-bit field, not a natural width field.

Using a gva_t for the fault address means KVM will incorrectly drop the
upper 32-bits of the GPA.  Ditto for gva_to_gpa() when it is used to
translate L2 GPAs to L1 GPAs.

Opportunistically rename variables and parameters to better reflect the
dual address modes, e.g. use "cr2_or_gpa" for fault addresses and plain
"addr" instead of "vaddr" when the address may be either a GVA or an L2
GPA.  Similarly, use "gpa" in the nonpaging_page_fault() flows to avoid
a confusing "gpa_t gva" declaration; this also sets the stage for a
future patch to combing nonpaging_page_fault() and tdp_page_fault() with
minimal churn.

Sprinkle in a few comments to document flows where an address is known
to be a GVA and thus can be safely truncated to a 32-bit value.  Add
WARNs in kvm_handle_page_fault() and FNAME(gva_to_gpa_nested)() to help
document such cases and detect bugs.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86/kvm: Cache gfn to pfn translation</title>
<updated>2020-02-11T12:35:40Z</updated>
<author>
<name>Boris Ostrovsky</name>
<email>boris.ostrovsky@oracle.com</email>
</author>
<published>2019-12-05T01:30:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7c1a6c67ff36532f1b0b339e3aae7701a2c0b1e'/>
<id>urn:sha1:f7c1a6c67ff36532f1b0b339e3aae7701a2c0b1e</id>
<content type='text'>
commit 917248144db5d7320655dbb41d3af0b8a0f3d589 upstream.

__kvm_map_gfn()'s call to gfn_to_pfn_memslot() is
* relatively expensive
* in certain cases (such as when done from atomic context) cannot be called

Stashing gfn-to-pfn mapping should help with both cases.

This is part of CVE-2019-3016.

Signed-off-by: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Reviewed-by: Joao Martins &lt;joao.m.martins@oracle.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>x86/kvm: Introduce kvm_(un)map_gfn()</title>
<updated>2020-02-11T12:35:40Z</updated>
<author>
<name>Boris Ostrovsky</name>
<email>boris.ostrovsky@oracle.com</email>
</author>
<published>2019-11-12T16:35:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a3db2949904b81ae53a840d99f71021f02a01fd3'/>
<id>urn:sha1:a3db2949904b81ae53a840d99f71021f02a01fd3</id>
<content type='text'>
commit 1eff70a9abd46f175defafd29bc17ad456f398a7 upstream.

kvm_vcpu_(un)map operates on gfns from any current address space.
In certain cases we want to make sure we are not mapping SMRAM
and for that we can use kvm_(un)map_gfn() that we are introducing
in this patch.

This is part of CVE-2019-3016.

Signed-off-by: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Reviewed-by: Joao Martins &lt;joao.m.martins@oracle.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2019-11-12T21:19:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-12T21:19:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8c5bd25bf42effd194d4b0b43895c42b374e620b'/>
<id>urn:sha1:8c5bd25bf42effd194d4b0b43895c42b374e620b</id>
<content type='text'>
Pull kvm fixes from Paolo Bonzini:
 "Fix unwinding of KVM_CREATE_VM failure, VT-d posted interrupts,
  DAX/ZONE_DEVICE, and module unload/reload"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
  KVM: VMX: Introduce pi_is_pir_empty() helper
  KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
  KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
  KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON
  KVM: X86: Fix initialization of MSR lists
  KVM: fix placement of refcount initialization
  KVM: Fix NULL-ptr deref after kvm_create_vm fails
</content>
</entry>
<entry>
<title>KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved</title>
<updated>2019-11-12T09:17:42Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-11-11T22:12:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a78986aae9b2988f8493f9f65a587ee433e83bc3'/>
<id>urn:sha1:a78986aae9b2988f8493f9f65a587ee433e83bc3</id>
<content type='text'>
Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and
instead manually handle ZONE_DEVICE on a case-by-case basis.  For things
like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal
pages, e.g. put pages grabbed via gup().  But for flows such as setting
A/D bits or shifting refcounts for transparent huge pages, KVM needs to
to avoid processing ZONE_DEVICE pages as the flows in question lack the
underlying machinery for proper handling of ZONE_DEVICE pages.

This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup()
when running a KVM guest backed with /dev/dax memory, as KVM straight up
doesn't put any references to ZONE_DEVICE pages acquired by gup().

Note, Dan Williams proposed an alternative solution of doing put_page()
on ZONE_DEVICE pages immediately after gup() in order to simplify the
auditing needed to ensure is_zone_device_page() is called if and only if
the backing device is pinned (via gup()).  But that approach would break
kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til
unmap() when accessing guest memory, unlike KVM's secondary MMU, which
coordinates with mmu_notifier invalidations to avoid creating stale
page references, i.e. doesn't rely on pages being pinned.

[*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.pl

Reported-by: Adam Borowski &lt;kilobyte@angband.pl&gt;
Analyzed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: stable@vger.kernel.org
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: Add helper function for creating VM worker threads</title>
<updated>2019-11-04T11:22:02Z</updated>
<author>
<name>Junaid Shahid</name>
<email>junaids@google.com</email>
</author>
<published>2019-11-04T11:22:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c57c80467f90e5504c8df9ad3555d2c78800bf94'/>
<id>urn:sha1:c57c80467f90e5504c8df9ad3555d2c78800bf94</id>
<content type='text'>
Add a function to create a kernel thread associated with a given VM. In
particular, it ensures that the worker thread inherits the priority and
cgroups of the calling thread.

Signed-off-by: Junaid Shahid &lt;junaids@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>kvm: x86, powerpc: do not allow clearing largepages debugfs entry</title>
<updated>2019-09-30T16:52:00Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2019-09-30T16:48:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=833b45de69a6016c4b0cebe6765d526a31a81580'/>
<id>urn:sha1:833b45de69a6016c4b0cebe6765d526a31a81580</id>
<content type='text'>
The largepages debugfs entry is incremented/decremented as shadow
pages are created or destroyed.  Clearing it will result in an
underflow, which is harmless to KVM but ugly (and could be
misinterpreted by tools that use debugfs information), so make
this particular statistic read-only.

Cc: kvm-ppc@vger.kernel.org
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
</feed>
