<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/virt, branch v3.14.38</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.14.38</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.14.38'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-11-14T16:59:54Z</updated>
<entry>
<title>kvm: fix excessive pages un-pinning in kvm_iommu_map error path.</title>
<updated>2014-11-14T16:59:54Z</updated>
<author>
<name>Quentin Casasnovas</name>
<email>quentin.casasnovas@oracle.com</email>
</author>
<published>2014-10-17T20:55:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8c373cfce6904feccca7ccf2a61e236db56dedf4'/>
<id>urn:sha1:8c373cfce6904feccca7ccf2a61e236db56dedf4</id>
<content type='text'>
commit 3d32e4dbe71374a6780eaf51d719d76f9a9bf22f upstream.

The third parameter of kvm_unpin_pages() when called from
kvm_iommu_map_pages() is wrong, it should be the number of pages to un-pin
and not the page size.

This error was facilitated with an inconsistent API: kvm_pin_pages() takes
a size, but kvn_unpin_pages() takes a number of pages, so fix the problem
by matching the two.

This was introduced by commit 350b8bd ("kvm: iommu: fix the third parameter
of kvm_iommu_put_pages (CVE-2014-3601)"), which fixes the lack of
un-pinning for pages intended to be un-pinned (i.e. memory leak) but
unfortunately potentially aggravated the number of pages we un-pin that
should have stayed pinned. As far as I understand though, the same
practical mitigations apply.

This issue was found during review of Red Hat 6.6 patches to prepare
Ksplice rebootless updates.

Thanks to Vegard for his time on a late Friday evening to help me in
understanding this code.

Fixes: 350b8bd ("kvm: iommu: fix the third parameter of... (CVE-2014-3601)")
Signed-off-by: Quentin Casasnovas &lt;quentin.casasnovas@oracle.com&gt;
Signed-off-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Signed-off-by: Jamie Iles &lt;jamie.iles@oracle.com&gt;
Reviewed-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kvm: don't take vcpu mutex for obviously invalid vcpu ioctls</title>
<updated>2014-10-30T16:38:19Z</updated>
<author>
<name>David Matlack</name>
<email>dmatlack@google.com</email>
</author>
<published>2014-09-19T23:03:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dff6e8cd03a5c2709ef0933e8b48cbf8b28aee4a'/>
<id>urn:sha1:dff6e8cd03a5c2709ef0933e8b48cbf8b28aee4a</id>
<content type='text'>
commit 2ea75be3219571d0ec009ce20d9971e54af96e09 upstream.

vcpu ioctls can hang the calling thread if issued while a vcpu is running.
However, invalid ioctls can happen when userspace tries to probe the kind
of file descriptors (e.g. isatty() calls ioctl(TCGETS)); in that case,
we know the ioctl is going to be rejected as invalid anyway and we can
fail before trying to take the vcpu mutex.

This patch does not change functionality, it just makes invalid ioctls
fail faster.

Signed-off-by: David Matlack &lt;dmatlack@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kvm: fix potentially corrupt mmio cache</title>
<updated>2014-10-30T16:38:19Z</updated>
<author>
<name>David Matlack</name>
<email>dmatlack@google.com</email>
</author>
<published>2014-08-18T22:46:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55eb96ec53e7b041f10ffcc7b1442a31d89d4488'/>
<id>urn:sha1:55eb96ec53e7b041f10ffcc7b1442a31d89d4488</id>
<content type='text'>
commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream.

vcpu exits and memslot mutations can run concurrently as long as the
vcpu does not aquire the slots mutex. Thus it is theoretically possible
for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after
synchronize_srcu_expedited(), vcpus can safely cache memslot generation
without maintaining a single rcu_dereference through an entire vm exit.
And much of the x86/kvm code does not maintain a single rcu_dereference
of the current memslots during each exit.

We can prevent the following case:

   vcpu (CPU 0)                             | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&amp;kvm-&gt;srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&amp;kvm-&gt;srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&amp;kvm-&gt;srcu)             |
...                                         |
   &lt;action based on cache occurs even       |
    though the caching decision was based   |
    on the old memslots&gt;                    |
...                                         |
   &lt;action *continues* to occur until next  |
    memslot generation change, which may    |
    be never&gt;                               |
                                            |

By incrementing the generation after synchronizing with kvm-&gt;srcu readers,
we ensure that the generation retrieved in (6) will become invalid soon
after (8).

Keeping the existing increment is not strictly necessary, but we
do keep it and just move it for consistency from update_memslots to
install_new_memslots.  It invalidates old cached MMIOs immediately,
instead of having to wait for the end of synchronize_srcu_expedited,
which makes the code more clearly correct in case CPU 1 is preempted
right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the
low bit of the generation is zero when reconstructing a generation number
out of an SPTE.  This effectively disables MMIO caching in SPTEs during
the call to synchronize_srcu_expedited.  Using the low bit this way is
somewhat like a seqcount---where the protected thing is a cache, and
instead of retrying we can simply punt if we observe the low bit to be 1.

Signed-off-by: David Matlack &lt;dmatlack@google.com&gt;
Reviewed-by: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Reviewed-by: David Matlack &lt;dmatlack@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)</title>
<updated>2014-09-05T23:34:15Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-08-19T11:14:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=42a1927a7a1d9e9992a7d1cd43a797e461019e01'/>
<id>urn:sha1:42a1927a7a1d9e9992a7d1cd43a797e461019e01</id>
<content type='text'>
commit 350b8bdd689cd2ab2c67c8a86a0be86cfa0751a7 upstream.

The third parameter of kvm_iommu_put_pages is wrong,
It should be 'gfn - slot-&gt;base_gfn'.

By making gfn very large, malicious guest or userspace can cause kvm to
go to this error path, and subsequently to pass a huge value as size.
Alternatively if gfn is small, then pages would be pinned but never
unpinned, causing host memory leak and local DOS.

Passing a reasonable but large value could be the most dangerous case,
because it would unpin a page that should have stayed pinned, and thus
allow the device to DMA into arbitrary memory.  However, this cannot
happen because of the condition that can trigger the error:

- out of memory (where you can't allocate even a single page)
  should not be possible for the attacker to trigger

- when exceeding the iommu's address space, guest pages after gfn
  will also exceed the iommu's address space, and inside
  kvm_iommu_put_pages() the iommu_iova_to_phys() will fail.  The
  page thus would not be unpinned at all.

Reported-by: Jack Morgenstein &lt;jackm@mellanox.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table</title>
<updated>2014-09-05T23:34:15Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2014-07-30T16:07:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a5d5291d5ba8bea645321277b5e5786a7ef7a76b'/>
<id>urn:sha1:a5d5291d5ba8bea645321277b5e5786a7ef7a76b</id>
<content type='text'>
commit 0f6c0a740b7d3e1f3697395922d674000f83d060 upstream.

Currently, the EOI exit bitmap (used for APICv) does not include
interrupts that are masked.  However, this can cause a bug that manifests
as an interrupt storm inside the guest.  Alex Williamson reported the
bug and is the one who really debugged this; I only wrote the patch. :)

The scenario involves a multi-function PCI device with OHCI and EHCI
USB functions and an audio function, all assigned to the guest, where
both USB functions use legacy INTx interrupts.

As soon as the guest boots, interrupts for these devices turn into an
interrupt storm in the guest; the host does not see the interrupt storm.
Basically the EOI path does not work, and the guest continues to see the
interrupt over and over, even after it attempts to mask it at the APIC.
The bug is only visible with older kernels (RHEL6.5, based on 2.6.32
with not many changes in the area of APIC/IOAPIC handling).

Alex then tried forcing bit 59 (corresponding to the USB functions' IRQ)
on in the eoi_exit_bitmap and TMR, and things then work.  What happens
is that VFIO asserts IRQ11, then KVM recomputes the EOI exit bitmap.
It does not have set bit 59 because the RTE was masked, so the IOAPIC
never sees the EOI and the interrupt continues to fire in the guest.

My guess was that the guest is masking the interrupt in the redirection
table in the interrupt routine, i.e. while the interrupt is set in a
LAPIC's ISR, The simplest fix is to ignore the masking state, we would
rather have an unnecessary exit rather than a missed IRQ ACK and anyway
IOAPIC interrupts are not as performance-sensitive as for example MSIs.
Alex tested this patch and it fixed his bug.

[Thanks to Alex for his precise description of the problem
 and initial debugging effort.  A lot of the text above is
 based on emails exchanged with him.]

Reported-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Tested-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: ioapic: fix assignment of ioapic-&gt;rtc_status.pending_eoi (CVE-2014-0155)</title>
<updated>2014-05-13T11:32:48Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2014-03-28T19:41:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f8944acc97ceebf902e5b26b900aefef987ab4be'/>
<id>urn:sha1:f8944acc97ceebf902e5b26b900aefef987ab4be</id>
<content type='text'>
commit 5678de3f15010b9022ee45673f33bcfc71d47b60 upstream.

QE reported that they got the BUG_ON in ioapic_service to trigger.
I cannot reproduce it, but there are two reasons why this could happen.

The less likely but also easiest one, is when kvm_irq_delivery_to_apic
does not deliver to any APIC and returns -1.

Because irqe.shorthand == 0, the kvm_for_each_vcpu loop in that
function is never reached.  However, you can target the similar loop in
kvm_irq_delivery_to_apic_fast; just program a zero logical destination
address into the IOAPIC, or an out-of-range physical destination address.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: async_pf: mm-&gt;mm_users can not pin apf-&gt;mm</title>
<updated>2014-05-13T11:32:47Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-04-21T13:26:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9e66614b8c36b8d96bf7b271a3eacada1e4c589d'/>
<id>urn:sha1:9e66614b8c36b8d96bf7b271a3eacada1e4c589d</id>
<content type='text'>
commit 41c22f626254b9dc0376928cae009e73d1b6a49a upstream.

get_user_pages(mm) is simply wrong if mm-&gt;mm_users == 0 and exit_mmap/etc
was already called (or is in progress), mm-&gt;mm_count can only pin mm-&gt;pgd
and mm_struct itself.

Change kvm_setup_async_pf/async_pf_execute to inc/dec mm-&gt;mm_users.

kvm_create_vm/kvm_destroy_vm play with -&gt;mm_count too but this case looks
fine at first glance, it seems that this -&gt;mm is only used to verify that
current-&gt;mm == kvm-&gt;mm.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>KVM: ARM: vgic: Fix sgi dispatch problem</title>
<updated>2014-05-13T11:32:47Z</updated>
<author>
<name>Haibin Wang</name>
<email>wanghaibin.wang@huawei.com</email>
</author>
<published>2014-04-10T12:14:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2264ca6c71682e1c0d72a9f21e178204b1e621e4'/>
<id>urn:sha1:2264ca6c71682e1c0d72a9f21e178204b1e621e4</id>
<content type='text'>
commit 91021a6c8ffdc55804dab5acdfc7de4f278b9ac3 upstream.

When dispatch SGI(mode == 0), that is the vcpu of VM should send
sgi to the cpu which the target_cpus list.
So, there must add the "break" to branch of case 0.

Signed-off-by: Haibin Wang &lt;wanghaibin.wang@huawei.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>arm64: KVM: Add VGIC device control for arm64</title>
<updated>2014-02-14T10:09:49Z</updated>
<author>
<name>Christoffer Dall</name>
<email>christoffer.dall@linaro.org</email>
</author>
<published>2014-02-02T21:41:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2a2f3e269c75edf916de5967079069aeb6a601cb'/>
<id>urn:sha1:2a2f3e269c75edf916de5967079069aeb6a601cb</id>
<content type='text'>
This fixes the build breakage introduced by
c07a0191ef2de1f9510f12d1f88e3b0b5cd8d66f and adds support for the device
control API and save/restore of the VGIC state for ARMv8.

The defines were simply missing from the arm64 header files and
uaccess.h must be implicitly imported from somewhere else on arm.

Signed-off-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: return an error code in kvm_vm_ioctl_register_coalesced_mmio()</title>
<updated>2014-01-30T10:56:09Z</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2014-01-29T13:16:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aac5c4226e7136c331ed384c25d5560204da10a0'/>
<id>urn:sha1:aac5c4226e7136c331ed384c25d5560204da10a0</id>
<content type='text'>
If kvm_io_bus_register_dev() fails then it returns success but it should
return an error code.

I also did a little cleanup like removing an impossible NULL test.

Cc: stable@vger.kernel.org
Fixes: 2b3c246a682c ('KVM: Make coalesced mmio use a device per zone')
Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
</feed>
