<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/Documentation/virtual, branch v4.14.263</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.263</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.263'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-09-03T07:56:26Z</updated>
<entry>
<title>KVM: X86: MMU: Use the correct inherited permissions to get shadow page</title>
<updated>2021-09-03T07:56:26Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@linux.alibaba.com</email>
</author>
<published>2021-06-03T05:24:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cea9e8ee3b8059bd2b36d68f1f428d165e5d13ce'/>
<id>urn:sha1:cea9e8ee3b8059bd2b36d68f1f428d165e5d13ce</id>
<content type='text'>
commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream.

When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions.  Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different.  KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries.  Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.

For example, here is a shared pagetable:

   pgd[]   pud[]        pmd[]            virtual address pointers
                     /-&gt;pmd1(u--)-&gt;pte1(uw-)-&gt;page1 &lt;- ptr1 (u--)
        /-&gt;pud1(uw-)---&gt;pmd2(uw-)-&gt;pte2(uw-)-&gt;page2 &lt;- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \-&gt;pud2(u--)---&gt;pmd1(u--)-&gt;pte1(uw-)-&gt;page1 &lt;- ptr3 (u--)
                     \-&gt;pmd2(uw-)-&gt;pte2(uw-)-&gt;page2 &lt;- ptr4 (u--)

  pud1 and pud2 point to the same pmd table, so:
  - ptr1 and ptr3 points to the same page.
  - ptr2 and ptr4 points to the same page.

(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)

- First, the guest reads from ptr1 first and KVM prepares a shadow
  page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
  "u--" comes from the effective permissions of pgd, pud1 and
  pmd1, which are stored in pt-&gt;access.  "u--" is used also to get
  the pagetable for pud1, instead of "uw-".

- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
  The hypervisor set up a shadow page for ptr2 with pt-&gt;access is "uw-"
  even though the pud1 pmd (because of the incorrect argument to
  kvm_mmu_get_page in the previous step) has role.access="u--".

- Then the guest reads from ptr3.  The hypervisor reuses pud1's
  shadow pmd for pud2, because both use "u--" for their permissions.
  Thus, the shadow pmd already includes entries for both pmd1 and pmd2.

- At last, the guest writes to ptr4.  This causes no vmexit or pagefault,
  because pud1's shadow page structures included an "uw-" page even though
  its role.access was "u--".

Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.

In order to fix the problem, we change pt-&gt;access to be an array, and
any access in it will not include permissions ANDed from child ptes.

The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.

The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit.  So there is no fixes tag here.

Signed-off-by: Lai Jiangshan &lt;laijs@linux.alibaba.com&gt;
Message-Id: &lt;20210603052455.21023-1-jiangshanlai@gmail.com&gt;
Cc: stable@vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
[OP: - apply arch/x86/kvm/mmu/* changes to arch/x86/kvm
     - apply documentation changes to Documentation/virtual/kvm/mmu.txt
     - add vcpu parameter to gpte_access() call]
Signed-off-by: Ovidiu Panait &lt;ovidiu.panait@windriver.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit</title>
<updated>2020-06-20T08:25:09Z</updated>
<author>
<name>Jon Doron</name>
<email>arilou@gmail.com</email>
</author>
<published>2020-04-24T11:37:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=86fde86b9307d33769a8a171ba5c9ae3c021a375'/>
<id>urn:sha1:86fde86b9307d33769a8a171ba5c9ae3c021a375</id>
<content type='text'>
[ Upstream commit f7d31e65368aeef973fab788aa22c4f1d5a6af66 ]

The problem the patch is trying to address is the fact that 'struct
kvm_hyperv_exit' has different layout on when compiling in 32 and 64 bit
modes.

In 64-bit mode the default alignment boundary is 64 bits thus
forcing extra gaps after 'type' and 'msr' but in 32-bit mode the
boundary is at 32 bits thus no extra gaps.

This is an issue as even when the kernel is 64 bit, the userspace using
the interface can be both 32 and 64 bit but the same 32 bit userspace has
to work with 32 bit kernel.

The issue is fixed by forcing the 64 bit layout, this leads to ABI
change for 32 bit builds and while we are obviously breaking '32 bit
userspace with 32 bit kernel' case, we're fixing the '32 bit userspace
with 64 bit kernel' one.

As the interface has no (known) users and 32 bit KVM is rather baroque
nowadays, this seems like a reasonable decision.

Reviewed-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Signed-off-by: Jon Doron &lt;arilou@gmail.com&gt;
Message-Id: &lt;20200424113746.3473563-2-arilou@gmail.com&gt;
Reviewed-by: Roman Kagan &lt;rvkagan@yandex-team.ru&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>kvm: Convert kvm_lock to a mutex</title>
<updated>2019-11-12T18:19:05Z</updated>
<author>
<name>Junaid Shahid</name>
<email>junaids@google.com</email>
</author>
<published>2019-01-04T01:14:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=05fe997e30d439e3ff0c7e7e46499e9d41d98ba7'/>
<id>urn:sha1:05fe997e30d439e3ff0c7e7e46499e9d41d98ba7</id>
<content type='text'>
commit 0d9ce162cf46c99628cc5da9510b959c7976735b upstream.

It doesn't seem as if there is any particular need for kvm_lock to be a
spinlock, so convert the lock to a mutex so that sleepable functions (in
particular cond_resched()) can be called while holding it.

Signed-off-by: Junaid Shahid &lt;junaids@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: Reject device ioctls from processes other than the VM's creator</title>
<updated>2019-04-03T04:25:20Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-02-15T20:48:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9badc8549fa75b3f1b5d74eea9d22c60a525b242'/>
<id>urn:sha1:9badc8549fa75b3f1b5d74eea9d22c60a525b242</id>
<content type='text'>
commit ddba91801aeb5c160b660caed1800eb3aef403f8 upstream.

KVM's API requires thats ioctls must be issued from the same process
that created the VM.  In other words, userspace can play games with a
VM's file descriptors, e.g. fork(), SCM_RIGHTS, etc..., but only the
creator can do anything useful.  Explicitly reject device ioctls that
are issued by a process other than the VM's creator, and update KVM's
API documentation to extend its requirements to device ioctls.

Fixes: 852b6d57dc7f ("kvm: add device control API")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: x86: Add a framework for supporting MSR-based features</title>
<updated>2018-08-15T16:12:59Z</updated>
<author>
<name>Tom Lendacky</name>
<email>thomas.lendacky@amd.com</email>
</author>
<published>2018-02-21T19:39:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f0660d587efed81cfe3d734d77e3fb03fce0b15b'/>
<id>urn:sha1:f0660d587efed81cfe3d734d77e3fb03fce0b15b</id>
<content type='text'>
commit 801e459a6f3a63af9d447e6249088c76ae16efc4 upstream

Provide a new KVM capability that allows bits within MSRs to be recognized
as features.  Two new ioctls are added to the /dev/kvm ioctl routine to
retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
callback is used to determine support for the listed MSR-based features.

Signed-off-by: Tom Lendacky &lt;thomas.lendacky@amd.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
[Tweaked documentation. - Radim]
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>arm/arm64: KVM: Add PSCI version selection API</title>
<updated>2018-05-01T19:58:27Z</updated>
<author>
<name>Marc Zyngier</name>
<email>marc.zyngier@arm.com</email>
</author>
<published>2018-01-21T16:42:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e5a290c4ff77c9fb3fcb1dee7cfb356969daeee2'/>
<id>urn:sha1:e5a290c4ff77c9fb3fcb1dee7cfb356969daeee2</id>
<content type='text'>
commit 85bd0ba1ff9875798fad94218b627ea9f768f3c3 upstream.

Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1
or 1.0 to a guest, defaulting to the latest version of the PSCI
implementation that is compatible with the requested version. This is
no different from doing a firmware upgrade on KVM.

But in order to give a chance to hypothetical badly implemented guests
that would have a fit by discovering something other than PSCI 0.2,
let's provide a new API that allows userspace to pick one particular
version of the API.

This is implemented as a new class of "firmware" registers, where
we expose the PSCI version. This allows the PSCI version to be
save/restored as part of a guest migration, and also set to
any supported version if the guest requires it.

Cc: stable@vger.kernel.org #4.16
Reviewed-by: Christoffer Dall &lt;cdall@kernel.org&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: PPC: Book3S HV: Enable migration of decrementer register</title>
<updated>2018-04-26T09:02:04Z</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@ozlabs.org</email>
</author>
<published>2018-01-12T09:55:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ddf09f2a0896bb29f1574d0ef983f42eb8592346'/>
<id>urn:sha1:ddf09f2a0896bb29f1574d0ef983f42eb8592346</id>
<content type='text'>
[ Upstream commit 5855564c8ab2d9cefca7b2933bd19818eb795e40 ]

This adds a register identifier for use with the one_reg interface
to allow the decrementer expiry time to be read and written by
userspace.  The decrementer expiry time is in guest timebase units
and is equal to the sum of the decrementer and the guest timebase.
(The expiry time is used rather than the decrementer value itself
because the expiry time is not constantly changing, though the
decrementer value is, while the guest vcpu is not running.)

Without this, a guest vcpu migrated to a new host will see its
decrementer set to some random value.  On POWER8 and earlier, the
decrementer is 32 bits wide and counts down at 512MHz, so the
guest vcpu will potentially see no decrementer interrupts for up
to about 4 seconds, which will lead to a stall.  With POWER9, the
decrementer is now 56 bits side, so the stall can be much longer
(up to 2.23 years) and more noticeable.

To help work around the problem in cases where userspace has not been
updated to migrate the decrementer expiry time, we now set the
default decrementer expiry at vcpu creation time to the current time
rather than the maximum possible value.  This should mean an
immediate decrementer interrupt when a migrated vcpu starts
running.  In cases where the decrementer is 32 bits wide and more
than 4 seconds elapse between the creation of the vcpu and when it
first runs, the decrementer would have wrapped around to positive
values and there may still be a stall - but this is no worse than
the current situation.  In the large-decrementer case, we are sure
to get an immediate decrementer interrupt (assuming the time from
vcpu creation to first run is less than 2.23 years) and we thus
avoid a very long stall.

Signed-off-by: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: x86: fix backward migration with async_PF</title>
<updated>2018-03-11T15:23:23Z</updated>
<author>
<name>Radim Krčmář</name>
<email>rkrcmar@redhat.com</email>
</author>
<published>2018-02-01T21:16:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dc6fb79de47daa1cb38054af318ff6725eb473cc'/>
<id>urn:sha1:dc6fb79de47daa1cb38054af318ff6725eb473cc</id>
<content type='text'>
commit fe2a3027e74e40a3ece3a4c1e4e51403090a907a upstream.

Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
bit when enabling async_PF, but this bit is reserved on old hypervisors,
which results in a failure upon migration.

To avoid breaking different cases, we are checking for CPUID feature bit
before enabling the feature and nothing else.

Fixes: 52a5c155cf79 ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Wanpeng Li &lt;wanpengli@tencent.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
[jwang: port to 4.14]
Signed-off-by: Jack Wang &lt;jinpu.wang@profitbricks.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'kvm-arm-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm</title>
<updated>2017-09-07T16:22:04Z</updated>
<author>
<name>Radim Krčmář</name>
<email>rkrcmar@redhat.com</email>
</author>
<published>2017-09-07T16:22:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=082d3900a446283a6ec15917a1682db2cdf17621'/>
<id>urn:sha1:082d3900a446283a6ec15917a1682db2cdf17621</id>
<content type='text'>
KVM/ARM Changes for v4.14

Two minor cleanups and improvements, a fix for decoding external abort
types from guests, and added support for migrating the active priority
of interrupts when running a GICv2 guest on a GICv3 host.
</content>
</entry>
<entry>
<title>KVM: arm/arm64: Support uaccess of GICC_APRn</title>
<updated>2017-09-05T15:33:39Z</updated>
<author>
<name>Christoffer Dall</name>
<email>cdall@linaro.org</email>
</author>
<published>2017-08-31T20:24:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b87e7a8bfb5098129836757608b3cbbdc11245a'/>
<id>urn:sha1:9b87e7a8bfb5098129836757608b3cbbdc11245a</id>
<content type='text'>
When migrating guests around we need to know the active priorities to
ensure functional virtual interrupt prioritization by the GIC.

This commit clarifies the API and how active priorities of interrupts in
different groups are represented, and implements the accessor functions
for the uaccess register range.

We live with a slight layering violation in accessing GICv3 data
structures from vgic-mmio-v2.c, because anything else just adds too much
complexity for us to deal with (it's not like there's a benefit
elsewhere in the code of an intermediate representation as is the case
with the VMCR).  We accept this, because while doing v3 processing from
a file named something-v2.c can look strange at first, this really is
specific to dealing with the user space interface for something that
looks like a GICv2.

Reviewed-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Christoffer Dall &lt;cdall@linaro.org&gt;
</content>
</entry>
</feed>
