<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux, branch v3.3.4</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.3.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.3.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-04-27T17:17:06Z</updated>
<entry>
<title>tcp: avoid order-1 allocations on wifi and tx path</title>
<updated>2012-04-27T17:17:06Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-25T03:01:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0df29c4a2857ed43ff23689423144aaf4f4101bc'/>
<id>urn:sha1:0df29c4a2857ed43ff23689423144aaf4f4101bc</id>
<content type='text'>
[ This combines upstream commit
  a21d45726acacc963d8baddf74607d9b74e2b723 and the follow-on bug fix
  commit a21d45726acacc963d8baddf74607d9b74e2b723 ]

Marc Merlin reported many order-1 allocations failures in TX path on its
wireless setup, that dont make any sense with MTU=1500 network, and non
SG capable hardware.

After investigation, it turns out TCP uses sk_stream_alloc_skb() and
used as a convention skb_tailroom(skb) to know how many bytes of data
payload could be put in this skb (for non SG capable devices)

Note : these skb used kmalloc-4096 (MTU=1500 + MAX_HEADER +
sizeof(struct skb_shared_info) being above 2048)

Later, mac80211 layer need to add some bytes at the tail of skb
(IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and since no more tailroom is
available has to call pskb_expand_head() and request order-1
allocations.

This patch changes sk_stream_alloc_skb() so that only
sk-&gt;sk_prot-&gt;max_header bytes of headroom are reserved, and use a new
skb field, avail_size to hold the data payload limit.

This way, order-0 allocations done by TCP stack can leave more than 2 KB
of tailroom and no more allocation is performed in mac80211 layer (or
any layer needing some tailroom)

avail_size is unioned with mark/dropcount, since mark will be set later
in IP stack for output packets. Therefore, skb size is unchanged.

Reported-by: Marc MERLIN &lt;marc@merlins.org&gt;
Tested-by: Marc MERLIN &lt;marc@merlins.org&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: allow splice() to build full TSO packets</title>
<updated>2012-04-27T17:16:54Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-25T02:12:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b2b96b30f3d835fa17b54a454e37ba98ee21a67a'/>
<id>urn:sha1:b2b96b30f3d835fa17b54a454e37ba98ee21a67a</id>
<content type='text'>
[ This combines upstream commit
  2f53384424251c06038ae612e56231b96ab610ee and the follow-on bug fix
  commit 35f9c09fe9c72eb8ca2b8e89a593e1c151f28fc2 ]

vmsplice()/splice(pipe, socket) call do_tcp_sendpages() one page at a
time, adding at most 4096 bytes to an skb. (assuming PAGE_SIZE=4096)

The call to tcp_push() at the end of do_tcp_sendpages() forces an
immediate xmit when pipe is not already filled, and tso_fragment() try
to split these skb to MSS multiples.

4096 bytes are usually split in a skb with 2 MSS, and a remaining
sub-mss skb (assuming MTU=1500)

This makes slow start suboptimal because many small frames are sent to
qdisc/driver layers instead of big ones (constrained by cwnd and packets
in flight of course)

In fact, applications using sendmsg() (adding an additional memory copy)
instead of vmsplice()/splice()/sendfile() are a bit faster because of
this anomaly, especially if serving small files in environments with
large initial [c]wnd.

Call tcp_push() only if MSG_MORE is not set in the flags parameter.

This bit is automatically provided by splice() internals but for the
last page, or on all pages if user specified SPLICE_F_MORE splice()
flag.

In some workloads, this can reduce number of sent logical packets by an
order of magnitude, making zero-copy TCP actually faster than
one-copy :)

Reported-by: Tom Herbert &lt;therbert@google.com&gt;
Cc: Nandita Dukkipati &lt;nanditad@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: H.K. Jerry Chu &lt;hkchu@google.com&gt;
Cc: Maciej Żenczykowski &lt;maze@google.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Cc: Ilpo Järvinen &lt;ilpo.jarvinen@helsinki.fi&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: fix /proc/net/dev regression</title>
<updated>2012-04-27T17:16:53Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-02T22:33:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3f97bf1dddac33d09627fd941d0ee3a2f57898d7'/>
<id>urn:sha1:3f97bf1dddac33d09627fd941d0ee3a2f57898d7</id>
<content type='text'>
[ Upstream commit 2def16ae6b0c77571200f18ba4be049b03d75579 ]

Commit f04565ddf52 (dev: use name hash for dev_seq_ops) added a second
regression, as some devices are missing from /proc/net/dev if many
devices are defined.

When seq_file buffer is filled, the last -&gt;next/show() method is
canceled (pos value is reverted to value prior -&gt;next() call)

Problem is after above commit, we dont restart the lookup at right
position in -&gt;start() method.

Fix this by removing the internal 'pos' pointer added in commit, since
we need to use the 'loff_t *pos' provided by seq_file layer.

This also reverts commit 5cac98dd0 (net: Fix corruption
in /proc/*/net/dev_mcast), since its not needed anymore.

Reported-by: Ben Greear &lt;greearb@candelatech.com&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Mihai Maruseac &lt;mmaruseac@ixiacom.com&gt;
Tested-by:  Ben Greear &lt;greearb@candelatech.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: unmap pages from the iommu when slots are removed</title>
<updated>2012-04-27T17:16:52Z</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2012-04-11T15:51:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=09ca8e1173bcb12e2a449698c9ae3b86a8a10195'/>
<id>urn:sha1:09ca8e1173bcb12e2a449698c9ae3b86a8a10195</id>
<content type='text'>
commit 32f6daad4651a748a58a3ab6da0611862175722f upstream.

We've been adding new mappings, but not destroying old mappings.
This can lead to a page leak as pages are pinned using
get_user_pages, but only unpinned with put_page if they still
exist in the memslots list on vm shutdown.  A memslot that is
destroyed while an iommu domain is enabled for the guest will
therefore result in an elevated page reference count that is
never cleared.

Additionally, without this fix, the iommu is only programmed
with the first translation for a gpa.  This can result in
peer-to-peer errors if a mapping is destroyed and replaced by a
new mapping at the same gpa as the iommu will still be pointing
to the original, pinned memory address.

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>serial/8250_pci: add a "force background timer" flag and use it for the "kt" serial port</title>
<updated>2012-04-22T22:38:57Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2012-04-06T18:49:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=db3c56d76ac8c7eee6c2652e056f326c699e8ce9'/>
<id>urn:sha1:db3c56d76ac8c7eee6c2652e056f326c699e8ce9</id>
<content type='text'>
commit bc02d15a3452fdf9276e8fb89c5e504a88df888a upstream.

Workaround dropped notifications in the iir register.  Register reads
coincident with new interrupt notifications sometimes result in this
device clearing the interrupt event without reporting it in the read
data.

The serial core already has a heuristic for determining when a device
has an untrustworthy iir register.  In this case when we apriori know
that the iir is faulty use a flag (UPF_BUG_THRE) to bypass the test and
force usage of the background timer.

Acked-by: Alan Cox &lt;alan@linux.intel.com&gt;
Reported-by: Nhan H Mai &lt;nhan.h.mai@intel.com&gt;
Reported-by: Sudhakar Mamillapalli &lt;sudhakar@fb.com&gt;
Tested-by: Nhan H Mai &lt;nhan.h.mai@intel.com&gt;
Tested-by: Sudhakar Mamillapalli &lt;sudhakar@fb.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Revert "serial/8250_pci: setup-quirk workaround for the kt serial controller"</title>
<updated>2012-04-22T22:38:57Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2012-04-06T18:49:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=450fb07050462a1b3839a7b8d301c34cc6882461'/>
<id>urn:sha1:450fb07050462a1b3839a7b8d301c34cc6882461</id>
<content type='text'>
commit 49b532f96fda23663f8be35593d1c1372c0f91e0 upstream.

This reverts commit 448ac154c957c4580531fa0c8f2045816fe2f0e7.

The semantic of UPF_IIR_ONCE is only guaranteed to workaround the race
condition in the kt serial's iir register if the only source of
interrupts is THRE (fifo-empty) events.  An modem status event at the
wrong time can again cause an iir read to drop the 'empty' status
leading to a hang.  So, revert this in preparation for using the
existing "I don't trust my iir register" workaround in the 8250 core
(UART_BUG_THRE).

Acked-by: Alan Cox &lt;alan@linux.intel.com&gt;
Cc: Sudhakar Mamillapalli &lt;sudhakar@fb.com&gt;
Reported-by: Nhan H Mai &lt;nhan.h.mai@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched/x86: Fix overflow in cyc2ns_offset</title>
<updated>2012-04-13T16:13:56Z</updated>
<author>
<name>Salman Qazi</name>
<email>sqazi@google.com</email>
</author>
<published>2012-03-10T00:41:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b41960c84ce5045cf5556678223bbe93a94497d'/>
<id>urn:sha1:1b41960c84ce5045cf5556678223bbe93a94497d</id>
<content type='text'>
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 upstream.

When a machine boots up, the TSC generally gets reset.  However,
when kexec is used to boot into a kernel, the TSC value would be
carried over from the previous kernel.  The computation of
cycns_offset in set_cyc2ns_scale is prone to an overflow, if the
machine has been up more than 208 days prior to the kexec.  The
overflow happens when we multiply *scale, even though there is
enough room to store the final answer.

We fix this issue by decomposing tsc_now into the quotient and
remainder of division by CYC2NS_SCALE_FACTOR and then performing
the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns().

Signed-off-by: Salman Qazi &lt;sqazi@google.com&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: john stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>CIFS: Fix VFS lock usage for oplocked files</title>
<updated>2012-04-13T16:13:54Z</updated>
<author>
<name>Pavel Shilovsky</name>
<email>piastry@etersoft.ru</email>
</author>
<published>2012-03-28T17:56:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49c8ca7ade694f65de205adc08c23a710244b6b0'/>
<id>urn:sha1:49c8ca7ade694f65de205adc08c23a710244b6b0</id>
<content type='text'>
commit 66189be74ff5f9f3fd6444315b85be210d07cef2 upstream.

We can deadlock if we have a write oplock and two processes
use the same file handle. In this case the first process can't
unlock its lock if the second process blocked on the lock in the
same time.

Fix it by using posix_lock_file rather than posix_lock_file_wait
under cinode-&gt;lock_mutex. If we request a blocking lock and
posix_lock_file indicates that there is another lock that prevents
us, wait untill that lock is released and restart our call.

Acked-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Pavel Shilovsky &lt;piastry@etersoft.ru&gt;
Signed-off-by: Steve French &lt;sfrench@us.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>x86,kgdb: Fix DEBUG_RODATA limitation using text_poke()</title>
<updated>2012-04-13T16:13:54Z</updated>
<author>
<name>Jason Wessel</name>
<email>jason.wessel@windriver.com</email>
</author>
<published>2012-03-23T14:35:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c9f568f1a99794aa7b749828c41fb7cdf55fe435'/>
<id>urn:sha1:c9f568f1a99794aa7b749828c41fb7cdf55fe435</id>
<content type='text'>
commit 3751d3e85cf693e10e2c47c03c8caa65e171099b upstream.

There has long been a limitation using software breakpoints with a
kernel compiled with CONFIG_DEBUG_RODATA going back to 2.6.26. For
this particular patch, it will apply cleanly and has been tested all
the way back to 2.6.36.

The kprobes code uses the text_poke() function which accommodates
writing a breakpoint into a read-only page.  The x86 kgdb code can
solve the problem similarly by overriding the default breakpoint
set/remove routines and using text_poke() directly.

The x86 kgdb code will first attempt to use the traditional
probe_kernel_write(), and next try using a the text_poke() function.
The break point install method is tracked such that the correct break
point removal routine will get called later on.

Cc: x86@kernel.org
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Inspried-by: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Signed-off-by: Jason Wessel &lt;jason.wessel@windriver.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kgdb,debug_core: pass the breakpoint struct instead of address and memory</title>
<updated>2012-04-13T16:13:54Z</updated>
<author>
<name>Jason Wessel</name>
<email>jason.wessel@windriver.com</email>
</author>
<published>2012-03-21T15:17:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=28cd59a8278065cc77e8b630b184b64d643da4ec'/>
<id>urn:sha1:28cd59a8278065cc77e8b630b184b64d643da4ec</id>
<content type='text'>
commit 98b54aa1a2241b59372468bd1e9c2d207bdba54b upstream.

There is extra state information that needs to be exposed in the
kgdb_bpt structure for tracking how a breakpoint was installed.  The
debug_core only uses the the probe_kernel_write() to install
breakpoints, but this is not enough for all the archs.  Some arch such
as x86 need to use text_poke() in order to install a breakpoint into a
read only page.

Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and
kgdb_arch_remove_breakpoint() allows other archs to set the type
variable which indicates how the breakpoint was installed.

Signed-off-by: Jason Wessel &lt;jason.wessel@windriver.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
