<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/security/commoncap.c, branch v4.1.37</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.1.37</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.1.37'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-04-12T02:07:35Z</updated>
<entry>
<title>ptrace: use fsuid, fsgid, effective creds for fs access checks</title>
<updated>2016-04-12T02:07:35Z</updated>
<author>
<name>Jann Horn</name>
<email>jann@thejh.net</email>
</author>
<published>2016-01-20T23:00:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ab88ce5feca4204ecf4e7ef6c6693ff67edc2169'/>
<id>urn:sha1:ab88ce5feca4204ecf4e7ef6c6693ff67edc2169</id>
<content type='text'>
[ Upstream commit caaee6234d05a58c5b4d05e7bf766131b810a657 ]

By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -&gt; /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: "Serge E. Hallyn" &lt;serge.hallyn@ubuntu.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>VFS: security/: d_backing_inode() annotations</title>
<updated>2015-04-15T19:06:56Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2015-03-17T22:26:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c6f493d631c4d40cea5c36055f9148f547b200af'/>
<id>urn:sha1:c6f493d631c4d40cea5c36055f9148f547b200af</id>
<content type='text'>
most of the -&gt;d_inode uses there refer to the same inode IO would
go to, i.e. d_backing_inode()

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>file-&gt;f_path.dentry is pinned down for as long as the file is open...</title>
<updated>2015-01-26T04:16:27Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-12-28T14:27:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f4a4a8b1252a08b60cde66a6622bbca4a7f4af2e'/>
<id>urn:sha1:f4a4a8b1252a08b60cde66a6622bbca4a7f4af2e</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>kill f_dentry uses</title>
<updated>2014-11-19T18:01:25Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-10-31T05:22:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b583043e99bc6d91e98fae32bd9eff6a5958240a'/>
<id>urn:sha1:b583043e99bc6d91e98fae32bd9eff6a5958240a</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>CAPABILITIES: remove undefined caps from all processes</title>
<updated>2014-07-24T11:53:47Z</updated>
<author>
<name>Eric Paris</name>
<email>eparis@redhat.com</email>
</author>
<published>2014-07-23T19:36:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7d8b6c63751cfbbe5eef81a48c22978b3407a3ad'/>
<id>urn:sha1:7d8b6c63751cfbbe5eef81a48c22978b3407a3ad</id>
<content type='text'>
This is effectively a revert of 7b9a7ec565505699f503b4fcf61500dceb36e744
plus fixing it a different way...

We found, when trying to run an application from an application which
had dropped privs that the kernel does security checks on undefined
capability bits.  This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.

Consider a root application which drops all capabilities from ALL 4
capability sets.  We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.

The BSET gets cleared differently.  Instead it is cleared one bit at a
time.  The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read.  So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.

So the 'parent' will look something like:
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	ffffffc000000000

All of this 'should' be fine.  Given that these are undefined bits that
aren't supposed to have anything to do with permissions.  But they do...

So lets now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel).  We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
you capapabilities from all 4 sets.  If that root task calls execve()
the child task will pick up all caps not blocked by the bset.  The bset
however does not block bits higher than CAP_LAST_CAP.  So now the child
task has bits in eff which are not in the parent.  These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.

The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits!  So now we set durring commit creds that
the child is not dumpable.  Given it is 'more priv' than its parent.  It
also means the parent cannot ptrace the child and other stupidity.

The solution here:
1) stop hiding capability bits in status
	This makes debugging easier!

2) stop giving any task undefined capability bits.  it's simple, it you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
	This fixes the cap_issubset() tests and resulting fallout (which
	made the init task in a docker container untraceable among other
	things)

3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
	This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.

4) mask out undefined bit when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
	This lets 'setcap all+pe /bin/bash; /bin/bash' run

Signed-off-by: Eric Paris &lt;eparis@redhat.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Andrew Vagin &lt;avagin@openvz.org&gt;
Cc: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Serge E. Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Steve Grubb &lt;sgrubb@redhat.com&gt;
Cc: Dan Walsh &lt;dwalsh@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: James Morris &lt;james.l.morris@oracle.com&gt;
</content>
</entry>
<entry>
<title>commoncap: don't alloc the credential unless needed in cap_task_prctl</title>
<updated>2014-07-24T11:12:30Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2014-07-22T12:20:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6d6f3328422a3bc56b0d8dd026a5de845d2abfa7'/>
<id>urn:sha1:6d6f3328422a3bc56b0d8dd026a5de845d2abfa7</id>
<content type='text'>
In function cap_task_prctl(), we would allocate a credential
unconditionally and then check if we support the requested function.
If not we would release this credential with abort_creds() by using
RCU method. But on some archs such as powerpc, the sys_prctl is heavily
used to get/set the floating point exception mode. So the unnecessary
allocating/releasing of credential not only introduce runtime overhead
but also do cause OOM due to the RCU implementation.

This patch removes abort_creds() from cap_task_prctl() by calling
prepare_creds() only when we need to modify it.

Reported-by: Kevin Hao &lt;haokexin@gmail.com&gt;
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Reviewed-by: Paul Moore &lt;paul@paul-moore.com&gt;
Acked-by: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: James Morris &lt;james.l.morris@oracle.com&gt;
</content>
</entry>
<entry>
<title>capabilities: allow nice if we are privileged</title>
<updated>2013-08-31T06:44:09Z</updated>
<author>
<name>Serge Hallyn</name>
<email>serge.hallyn@ubuntu.com</email>
</author>
<published>2013-07-23T18:18:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f54fb863c6bbcbafdfc332b4a4260abb5a002137'/>
<id>urn:sha1:f54fb863c6bbcbafdfc332b4a4260abb5a002137</id>
<content type='text'>
We allow task A to change B's nice level if it has a supserset of
B's privileges, or of it has CAP_SYS_NICE.  Also allow it if A has
CAP_SYS_NICE with respect to B - meaning it is root in the same
namespace, or it created B's namespace.

Signed-off-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>userns: Allow PR_CAPBSET_DROP in a user namespace.</title>
<updated>2013-08-31T00:30:39Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2013-07-02T17:04:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=160da84dbb39443fdade7151bc63a88f8e953077'/>
<id>urn:sha1:160da84dbb39443fdade7151bc63a88f8e953077</id>
<content type='text'>
As the capabilites and capability bounding set are per user namespace
properties it is safe to allow changing them with just CAP_SETPCAP
permission in the user namespace.

Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Tested-by: Richard Weinberger &lt;richard@nod.at&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>kill f_vfsmnt</title>
<updated>2013-02-26T07:46:10Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2013-01-24T07:21:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=182be684784334598eee1d90274e7f7aa0063616'/>
<id>urn:sha1:182be684784334598eee1d90274e7f7aa0063616</id>
<content type='text'>
very few users left...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Fix cap_capable to only allow owners in the parent user namespace to have caps.</title>
<updated>2012-12-14T21:50:32Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2012-12-14T02:06:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=520d9eabce18edfef76a60b7b839d54facafe1f9'/>
<id>urn:sha1:520d9eabce18edfef76a60b7b839d54facafe1f9</id>
<content type='text'>
Andy Lutomirski pointed out that the current behavior of allowing the
owner of a user namespace to have all caps when that owner is not in a
parent user namespace is wrong.  Add a test to ensure the owner of a user
namespace is in the parent of the user namespace to fix this bug.

Thankfully this bug did not apply to the initial user namespace, keeping
the mischief that can be caused by this bug quite small.

This is bug was introduced in v3.5 by commit 783291e6900
"Simplify the user_namespace by making userns-&gt;creator a kuid."
But did not matter until the permisions required to create
a user namespace were relaxed allowing a user namespace to be created
inside of a user namespace.

The bug made it possible for the owner of a user namespace to be
present in a child user namespace.  Since the owner of a user nameapce
is granted all capabilities it became possible for users in a
grandchild user namespace to have all privilges over their parent user
namspace.

Reorder the checks in cap_capable.  This should make the common case
faster and make it clear that nothing magic happens in the initial
user namespace.  The reordering is safe because cred-&gt;user_ns
can only be in targ_ns or targ_ns-&gt;parent but not both.

Add a comment a the top of the loop to make the logic of
the code clear.

Add a distinct variable ns that changes as we walk up
the user namespace hierarchy to make it clear which variable
is changing.

Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
</feed>
