<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/Documentation/sysctl/kernel.txt, branch v4.4.153</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.153</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.153'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-07-05T12:37:16Z</updated>
<entry>
<title>sysctl: enable strict writes</title>
<updated>2017-07-05T12:37:16Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2016-01-20T23:00:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2449a71eb98204fc54ff55ddd6825cc5141ce176'/>
<id>urn:sha1:2449a71eb98204fc54ff55ddd6825cc5141ce176</id>
<content type='text'>
commit 41662f5cc55335807d39404371cfcbb1909304c4 upstream.

SYSCTL_WRITES_WARN was added in commit f4aacea2f5d1 ("sysctl: allow for
strict write position handling"), and released in v3.16 in August of
2014.  Since then I can find only 1 instance of non-zero offset
writing[1], and it was fixed immediately in CRIU[2].  As such, it
appears safe to flip this to the strict state now.

[1] https://www.google.com/search?q="when%20file%20position%20was%20not%200"
[2] http://lists.openvz.org/pipermail/criu/2015-April/019819.html

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup</title>
<updated>2015-11-06T03:34:48Z</updated>
<author>
<name>Jiri Kosina</name>
<email>jkosina@suse.cz</email>
</author>
<published>2015-11-06T02:44:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55537871ef666b4153fd1ef8782e4a13fee142cc'/>
<id>urn:sha1:55537871ef666b4153fd1ef8782e4a13fee142cc</id>
<content type='text'>
In many cases of hardlockup reports, it's actually not possible to know
why it triggered, because the CPU that got stuck is usually waiting on a
resource (with IRQs disabled) in posession of some other CPU is holding.

IOW, we are often looking at the stacktrace of the victim and not the
actual offender.

Introduce sysctl / cmdline parameter that makes it possible to have
hardlockup detector perform all-CPU backtrace.

Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Reviewed-by: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>coredump: use from_kuid/kgid when formatting corename</title>
<updated>2015-06-26T00:00:43Z</updated>
<author>
<name>Nicolas Iooss</name>
<email>nicolas.iooss_linux@m4x.org</email>
</author>
<published>2015-06-25T22:03:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5202efe544c279be152f44f2821010ff7b2d7e5b'/>
<id>urn:sha1:5202efe544c279be152f44f2821010ff7b2d7e5b</id>
<content type='text'>
When adding __printf attribute to cn_printf, gcc reports some issues:

  fs/coredump.c:213:5: warning: format '%d' expects argument of type
  'int', but argument 3 has type 'kuid_t' [-Wformat=]
       err = cn_printf(cn, "%d", cred-&gt;uid);
       ^
  fs/coredump.c:217:5: warning: format '%d' expects argument of type
  'int', but argument 3 has type 'kgid_t' [-Wformat=]
       err = cn_printf(cn, "%d", cred-&gt;gid);
       ^

These warnings come from the fact that the value of uid/gid needs to be
extracted from the kuid_t/kgid_t structure before being used as an
integer.  More precisely, cred-&gt;uid and cred-&gt;gid need to be converted to
either user-namespace uid/gid or to init_user_ns uid/gid.

Use init_user_ns in order not to break existing ABI, and document this in
Documentation/sysctl/kernel.txt.

While at it, format uid and gid values with %u instead of %d because
uid_t/__kernel_uid32_t and gid_t/__kernel_gid32_t are unsigned int.

Signed-off-by: Nicolas Iooss &lt;nicolas.iooss_linux@m4x.org&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: add watchdog_cpumask sysctl to assist nohz</title>
<updated>2015-06-25T00:49:40Z</updated>
<author>
<name>Chris Metcalf</name>
<email>cmetcalf@ezchip.com</email>
</author>
<published>2015-06-24T23:55:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe4ba3c34352b7e8068b7f18eb233444aed17011'/>
<id>urn:sha1:fe4ba3c34352b7e8068b7f18eb233444aed17011</id>
<content type='text'>
Change the default behavior of watchdog so it only runs on the
housekeeping cores when nohz_full is enabled at build and boot time.
Allow modifying the set of cores the watchdog is currently running on
with a new kernel.watchdog_cpumask sysctl.

In the current system, the watchdog subsystem runs a periodic timer that
schedules the watchdog kthread to run.  However, nohz_full cores are
designed to allow userspace application code running on those cores to
have 100% access to the CPU.  So the watchdog system prevents the
nohz_full application code from being able to run the way it wants to,
thus the motivation to suppress the watchdog on nohz_full cores, which
this patchset provides by default.

However, if we disable the watchdog globally, then the housekeeping
cores can't benefit from the watchdog functionality.  So we allow
disabling it only on some cores.  See Documentation/lockup-watchdogs.txt
for more information.

[jhubbard@nvidia.com: fix a watchdog crash in some configurations]
Signed-off-by: Chris Metcalf &lt;cmetcalf@ezchip.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Doc/sysctl/kernel.txt: document threads-max</title>
<updated>2015-04-17T13:04:07Z</updated>
<author>
<name>Heinrich Schuchardt</name>
<email>xypron.glpk@gmx.de</email>
</author>
<published>2015-04-16T19:47:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0ec62afeb143a34ce78143cf442f879ef68382f7'/>
<id>urn:sha1:0ec62afeb143a34ce78143cf442f879ef68382f7</id>
<content type='text'>
File /proc/sys/kernel/threads-max controls the maximum number of threads
that can be created using fork().

[akpm@linux-foundation.org: fix typo, per Guenter]
Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: enable the new user interface of the watchdog mechanism</title>
<updated>2015-04-14T23:48:59Z</updated>
<author>
<name>Ulrich Obergfell</name>
<email>uobergfe@redhat.com</email>
</author>
<published>2015-04-14T22:44:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=195daf665a6299de98a4da3843fed2dd9de19d3a'/>
<id>urn:sha1:195daf665a6299de98a4da3843fed2dd9de19d3a</id>
<content type='text'>
With the current user interface of the watchdog mechanism it is only
possible to disable or enable both lockup detectors at the same time.
This series introduces new kernel parameters and changes the semantics of
some existing kernel parameters, so that the hard lockup detector and the
soft lockup detector can be disabled or enabled individually.  With this
series applied, the user interface is as follows.

- parameters in /proc/sys/kernel

  . soft_watchdog
    This is a new parameter to control and examine the run state of
    the soft lockup detector.

  . nmi_watchdog
    The semantics of this parameter have changed. It can now be used
    to control and examine the run state of the hard lockup detector.

  . watchdog
    This parameter is still available to control the run state of both
    lockup detectors at the same time. If this parameter is examined,
    it shows the logical OR of soft_watchdog and nmi_watchdog.

  . watchdog_thresh
    The semantics of this parameter are not affected by the patch.

- kernel command line parameters

  . nosoftlockup
    The semantics of this parameter have changed. It can now be used
    to disable the soft lockup detector at boot time.

  . nmi_watchdog=0 or nmi_watchdog=1
    Disable or enable the hard lockup detector at boot time. The patch
    introduces '=1' as a new option.

  . nowatchdog
    The semantics of this parameter are not affected by the patch. It
    is still available to disable both lockup detectors at boot time.

Also, remove the proc_dowatchdog() function which is no longer needed.

[dzickus@redhat.com: wrote changelog]
[dzickus@redhat.com: update documentation for kernel params and sysctl]
Signed-off-by: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Signed-off-by: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>livepatch: kernel: add TAINT_LIVEPATCH</title>
<updated>2014-12-22T14:40:48Z</updated>
<author>
<name>Seth Jennings</name>
<email>sjenning@redhat.com</email>
</author>
<published>2014-12-16T17:58:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c5f4546593e9911800f0926c1090959b58bc5c93'/>
<id>urn:sha1:c5f4546593e9911800f0926c1090959b58bc5c93</id>
<content type='text'>
This adds a new taint flag to indicate when the kernel or a kernel
module has been live patched.  This will provide a clean indication in
bug reports that live patching was used.

Additionally, if the crash occurs in a live patched function, the live
patch module will appear beside the patched function in the backtrace.

Signed-off-by: Seth Jennings &lt;sjenning@redhat.com&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Reviewed-by: Petr Mladek &lt;pmladek@suse.cz&gt;
Reviewed-by: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
</entry>
<entry>
<title>ipc/msg: increase MSGMNI, remove scaling</title>
<updated>2014-12-13T20:42:52Z</updated>
<author>
<name>Manfred Spraul</name>
<email>manfred@colorfullife.com</email>
</author>
<published>2014-12-13T00:58:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0050ee059f7fc86b1df2527aaa14ed5dc72f9973'/>
<id>urn:sha1:0050ee059f7fc86b1df2527aaa14ed5dc72f9973</id>
<content type='text'>
SysV can be abused to allocate locked kernel memory.  For most systems, a
small limit doesn't make sense, see the discussion with regards to SHMMAX.

Therefore: increase MSGMNI to the maximum supported.

And: If we ignore the risk of locking too much memory, then an automatic
scaling of MSGMNI doesn't make sense.  Therefore the logic can be removed.

The code preserves auto_msgmni to avoid breaking any user space applications
that expect that the value exists.

Notes:
1) If an administrator must limit the memory allocations, then he can set
MSGMNI as necessary.

Or he can disable sysv entirely (as e.g. done by Android).

2) MSGMAX and MSGMNB are intentionally not increased, as these values are used
to control latency vs. throughput:
If MSGMNB is large, then msgsnd() just returns and more messages can be queued
before a task switch to a task that calls msgrcv() is forced.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Rafael Aquini &lt;aquini@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel: add panic_on_warn</title>
<updated>2014-12-11T01:41:10Z</updated>
<author>
<name>Prarit Bhargava</name>
<email>prarit@redhat.com</email>
</author>
<published>2014-12-10T23:45:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9e3961a0979817c612b10b2da4f3045ec9faa779'/>
<id>urn:sha1:9e3961a0979817c612b10b2da4f3045ec9faa779</id>
<content type='text'>
There have been several times where I have had to rebuild a kernel to
cause a panic when hitting a WARN() in the code in order to get a crash
dump from a system.  Sometimes this is easy to do, other times (such as
in the case of a remote admin) it is not trivial to send new images to
the user.

A much easier method would be a switch to change the WARN() over to a
panic.  This makes debugging easier in that I can now test the actual
image the WARN() was seen on and I do not have to engage in remote
debugging.

This patch adds a panic_on_warn kernel parameter and
/proc/sys/kernel/panic_on_warn calls panic() in the
warn_slowpath_common() path.  The function will still print out the
location of the warning.

An example of the panic_on_warn output:

The first line below is from the WARN_ON() to output the WARN_ON()'s
location.  After that the panic() output is displayed.

    WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 30 PID: 11698 Comm: insmod Tainted: G        W  OE  3.17.0+ #57
    Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
     0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
     0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
     ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
    Call Trace:
     [&lt;ffffffff81665190&gt;] dump_stack+0x46/0x58
     [&lt;ffffffff8165e2ec&gt;] panic+0xd0/0x204
     [&lt;ffffffffa038e05f&gt;] ? init_dummy+0x1f/0x30 [dummy_module]
     [&lt;ffffffff81076b90&gt;] warn_slowpath_common+0xd0/0xd0
     [&lt;ffffffffa038e040&gt;] ? dummy_greetings+0x40/0x40 [dummy_module]
     [&lt;ffffffff81076c8a&gt;] warn_slowpath_null+0x1a/0x20
     [&lt;ffffffffa038e05f&gt;] init_dummy+0x1f/0x30 [dummy_module]
     [&lt;ffffffff81002144&gt;] do_one_initcall+0xd4/0x210
     [&lt;ffffffff811b52c2&gt;] ? __vunmap+0xc2/0x110
     [&lt;ffffffff810f8889&gt;] load_module+0x16a9/0x1b30
     [&lt;ffffffff810f3d30&gt;] ? store_uevent+0x70/0x70
     [&lt;ffffffff810f49b9&gt;] ? copy_module_from_fd.isra.44+0x129/0x180
     [&lt;ffffffff810f8ec6&gt;] SyS_finit_module+0xa6/0xd0
     [&lt;ffffffff8166cf29&gt;] system_call_fastpath+0x12/0x17

Successfully tested by me.

hpa said: There is another very valid use for this: many operators would
rather a machine shuts down than being potentially compromised either
functionally or security-wise.

Signed-off-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Acked-by: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Fabian Frederick &lt;fabf@skynet.be&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>coredump: add %i/%I in core_pattern to report the tid of the crashed thread</title>
<updated>2014-10-14T00:18:21Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-10-13T22:53:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b03023ecbdb76c1dec86b41ed80b123c22783220'/>
<id>urn:sha1:b03023ecbdb76c1dec86b41ed80b123c22783220</id>
<content type='text'>
format_corename() can only pass the leader's pid to the core handler,
but there is no simple way to figure out which thread originated the
coredump.

As Jan explains, this also means that there is no simple way to create
the backtrace of the crashed process:

As programs are mostly compiled with implicit gcc -fomit-frame-pointer
one needs program's .eh_frame section (equivalently PT_GNU_EH_FRAME
segment) or .debug_frame section.  .debug_frame usually is present only
in separate debug info files usually not even installed on the system.
While .eh_frame is a part of the executable/library (and it is even
always mapped for C++ exceptions unwinding) it no longer has to be
present anywhere on the disk as the program could be upgraded in the
meantime and the running instance has its executable file already
unlinked from disk.

One possibility is to echo 0x3f &gt;/proc/*/coredump_filter and dump all
the file-backed memory including the executable's .eh_frame section.
But that can create huge core files, for example even due to mmapped
data files.

Other possibility would be to read .eh_frame from /proc/PID/mem at the
core_pattern handler time of the core dump.  For the backtrace one needs
to read the register state first which can be done from core_pattern
handler:

    ptrace(PTRACE_SEIZE, tid, 0, PTRACE_O_TRACEEXIT)
    close(0);    // close pipe fd to resume the sleeping dumper
    waitpid();   // should report EXIT
    PTRACE_GETREGS or other requests

The remaining problem is how to get the 'tid' value of the crashed
thread.  It could be read from the first NT_PRSTATUS note of the core
file but that makes the core_pattern handler complicated.

Unfortunately %t is already used so this patch uses %i/%I.

Automatic Bug Reporting Tool (https://github.com/abrt/abrt/wiki/overview)
is experimenting with this.  It is using the elfutils
(https://fedorahosted.org/elfutils/) unwinder for generating the
backtraces.  Apart from not needing matching executables as mentioned
above, another advantage is that we can get the backtrace without saving
the core (which might be quite large) to disk.

[mmilata@redhat.com: final paragraph of changelog]
Signed-off-by: Jan Kratochvil &lt;jan.kratochvil@redhat.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Jan Kratochvil &lt;jan.kratochvil@redhat.com&gt;
Cc: Mark Wielaard &lt;mjw@redhat.com&gt;
Cc: Martin Milata &lt;mmilata@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
