| Age | Commit message (Collapse) | Author |
|
commit 9ce119f318ba1a07c29149301f1544b6c4bea52a upstream.
A line discipline which does not define a receive_buf() method can
can cause a GPF if data is ever received [1]. Oddly, this was known
to the author of n_tracesink in 2011, but never fixed.
[1] GPF report
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
PGD 3752d067 PUD 37a7b067 PMD 0
Oops: 0010 [#1] SMP KASAN
Modules linked in:
CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events_unbound flush_to_ldisc
task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000
RIP: 0010:[<0000000000000000>] [< (null)>] (null)
RSP: 0018:ffff88006db67b50 EFLAGS: 00010246
RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102
RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388
RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0
R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000
R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8
FS: 0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0
Stack:
ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000
ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90
ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078
Call Trace:
[<ffffffff8127cf91>] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
[<ffffffff8127df14>] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
[<ffffffff8128faaf>] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
[<ffffffff852a7c2f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
Code: Bad RIP value.
RIP [< (null)>] (null)
RSP <ffff88006db67b50>
CR2: 0000000000000000
---[ end trace a587f8947e54d6ea ]---
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 64325a3be08d364a62ee8f84b2cf86934bc2544a upstream.
The root of problem is carelessly zeroing pointer(in function __tty_buffer_flush()),
when another thread can use it. It can be cause of "NULL pointer dereference".
Main idea of the patch, this is never free last (struct tty_buffer) in the active buffer.
Only flush the data for ldisc(buf->head->read = buf->head->commit).
At that moment driver can collect(write) data in buffer without conflict.
It is repeat behavior of flush_to_ldisc(), only without feeding data to ldisc.
Signed-off-by: Ilya Zykov <ilya@ilyx.ru>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
The flush_to_ldisc() work entry has special logic to notice when it has
seen the original tail of the data queue, and it avoids continuing the
flush if it sees that _original_ tail rather than the current tail.
This logic can trigger in case somebody is constantly adding new data to
the tty while the flushing is active - and the intent is to avoid
excessive CPU usage while flushing the tty, especially as we used to do
this from a softirq context which made it non-preemptible.
However, since we no longer re-arm the work-queue from within itself
(because that causes other trouble: see commit a5660b41af6a "tty: fix
endless work loop when the buffer fills up"), this just leads to
possible hung tty's (most easily seen in SMP and with a test-program
that floods a pty with data - nobody seems to have reported this for any
real-life situation yet).
And since the workqueue isn't done from timers and softirq's any more,
it's doubtful whether the CPU useage issue is really relevant any more.
So just remove the logic entirely, and see if anybody ever notices.
Alternatively, we might want to re-introduce the "re-arm the work" for
just this case, but then we'd have to re-introduce the delayed work
model or some explicit timer, which really doesn't seem worth it for
this.
Reported-and-tested-by: Guillaume Chazarain <guichaz@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit b1c43f82c5aa265442f82dba31ce985ebb7aa71c.
It was broken in so many ways, and results in random odd pty issues.
It re-introduced the buggy schedule_work() in flush_to_ldisc() that can
cause endless work-loops (see commit a5660b41af6a: "tty: fix endless
work loop when the buffer fills up").
It also used an "unsigned int" return value fo the ->receive_buf()
function, but then made multiple functions return a negative error code,
and didn't actually check for the error in the caller.
And it didn't actually work at all. BenH bisected down odd tty behavior
to it:
"It looks like the patch is causing some major malfunctions of the X
server for me, possibly related to PTYs. For example, cat'ing a
large file in a gnome terminal hangs the kernel for -minutes- in a
loop of what looks like flush_to_ldisc/workqueue code, (some ftrace
data in the quoted bits further down).
...
Some more data: It -looks- like what happens is that the
flush_to_ldisc work queue entry constantly re-queues itself (because
the PTY is full ?) and the workqueue thread will basically loop
forver calling it without ever scheduling, thus starving the consumer
process that could have emptied the PTY."
which is pretty much exactly the problem we fixed in a5660b41af6a.
Milton Miller pointed out the 'unsigned int' issue.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Milton Miller <miltonm@bga.com>
Cc: Stefan Bigler <stefan.bigler@keymile.com>
Cc: Toby Gray <toby.gray@realvnc.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
it makes it simpler to keep track of the amount of
bytes received and simplifies how flush_to_ldisc counts
the remaining bytes. It also fixes a bug of lost bytes
on n_tty when flushing too many bytes via the USB
serial gadget driver.
Tested-by: Stefan Bigler <stefan.bigler@keymile.com>
Tested-by: Toby Gray <toby.gray@realvnc.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Commit f23eb2b2b285 ('tty: stop using "delayed_work" in the tty layer')
ended up causing hung machines on UP with no preemption, because the
work routine to flip the buffer data to the ldisc would endlessly re-arm
itself if the destination buffer had filled up.
With the delayed work, that only caused a timer-driving polling of the
tty state every timer tick, but without the delay we just ended up with
basically a busy loop instead.
Stop the insane polling, and instead make the code that opens up the
receive room re-schedule the buffer flip work. That's what we should
have been doing anyway.
This same "poll for tty room" issue is almost certainly also the cause
of excessive kworker activity when idle reported by Dave Jones, who also
reported "flush_to_ldisc executing 2500 times a second" back in Nov 2010:
http://lkml.org/lkml/2010/11/30/592
which is that silly flushing done every timer tick. Wasting both power
and CPU for no good reason.
Reported-and-tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Reported-and-tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: Greg KH <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Using delayed-work for tty flip buffers ends up causing us to wait for
the next tick to complete some actions. That's usually not all that
noticeable, but for certain latency-critical workloads it ends up being
totally unacceptable.
As an extreme case of this, passing a token back-and-forth over a pty
will take two ticks per iteration, so even just a thousand iterations
will take 8 seconds assuming a common 250Hz configuration.
Avoiding the whole delayed work issue brings that ping-pong test-case
down to 0.009s on my machine.
In more practical terms, this latency has been a performance problem for
things like dive computer simulators (simulating the serial interface
using the ptys) and for other environments (Alan mentions a CP/M emulator).
Reported-by: Jef Driesen <jefdriesen@telenet.be>
Acked-by: Greg KH <gregkh@suse.de>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There's a small window inside the flush_to_ldisc function,
where the tty is unlocked and calling ldisc's receive_buf
function. If in this window new buffer is added to the tty,
the processing might never leave the flush_to_ldisc function.
This scenario will hog the cpu, causing other tty processing
starving, and making it impossible to interface the computer
via tty.
I was able to exploit this via pty interface by sending only
control characters to the master input, causing the flush_to_ldisc
to be scheduled, but never actually generate any output.
To reproduce, please run multiple instances of following code.
- SNIP
#define _XOPEN_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char **argv)
{
int i, slave, master = getpt();
char buf[8192];
sprintf(buf, "%s", ptsname(master));
grantpt(master);
unlockpt(master);
slave = open(buf, O_RDWR);
if (slave < 0) {
perror("open slave failed");
return 1;
}
for(i = 0; i < sizeof(buf); i++)
buf[i] = rand() % 32;
while(1) {
write(master, buf, sizeof(buf));
}
return 0;
}
- SNIP
The attached patch (based on -next tree) fixes this by checking on the
tty buffer tail. Once it's reached, the current work is rescheduled
and another could run.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: stable <stable@kernel.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
The tty code should be in its own subdirectory and not in the char
driver with all of the cruft that is currently there.
Based on work done by Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|