<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/ipv4/fib_rules.c, branch leds/master</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=leds%2Fmaster</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=leds%2Fmaster'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-06-24T09:15:54Z</updated>
<entry>
<title>net: ipv4 sysctl option to ignore routes when nexthop link is down</title>
<updated>2015-06-24T09:15:54Z</updated>
<author>
<name>Andy Gospodarek</name>
<email>gospo@cumulusnetworks.com</email>
</author>
<published>2015-06-23T17:45:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0eeb075fad736fb92620af995c47c204bbb5e829'/>
<id>urn:sha1:0eeb075fad736fb92620af995c47c204bbb5e829</id>
<content type='text'>
This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.

net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, will report to userspace that a route is
dead and will no longer resolve to this nexthop when performing a fib
lookup.  This will signal to userspace that the route will not be
selected.  The signalling of a RTNH_F_DEAD is only passed to userspace
if the sysctl is enabled and link is down.  This was done as without it
the netlink listeners would have no idea whether or not a nexthop would
be selected.   The kernel only sets RTNH_F_DEAD internally if the
interface has IFF_UP cleared.

With the new sysctl set, the following behavior can be observed
(interface p8p1 is link-down):

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
90.0.0.1 via 70.0.0.2 dev p7p1  src 70.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache &lt;local&gt;
80.0.0.2 via 10.0.5.2 dev p9p1  src 10.0.5.15
    cache

While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down.  Now interface p8p1 is linked-up:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
192.168.56.0/24 dev p2p1  proto kernel  scope link  src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1  src 80.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache &lt;local&gt;
80.0.0.2 dev p8p1  src 80.0.0.1
    cache

and the output changes to what one would expect.

If the sysctl is not set, the following output would be expected when
p8p1 is down:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2

Since the dead flag does not appear, there should be no expectation that
the kernel would skip using this route due to link being down.

v2: Split kernel changes into 2 patches, this actually makes a
behavioral change if the sysctl is set.  Also took suggestion from Alex
to simplify code by only checking sysctl during fib lookup and
suggestion from Scott to add a per-interface sysctl.

v3: Code clean-ups to make it more readable and efficient as well as a
reverse path check fix.

v4: Drop binary sysctl

v5: Whitespace fixups from Dave

v6: Style changes from Dave and checkpatch suggestions

v7: One more checkpatch fixup

Signed-off-by: Andy Gospodarek &lt;gospo@cumulusnetworks.com&gt;
Signed-off-by: Dinesh Dutt &lt;ddutt@cumulusnetworks.com&gt;
Acked-by: Scott Feldman &lt;sfeldma@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: coding style: comparison for equality with NULL</title>
<updated>2015-04-03T16:11:15Z</updated>
<author>
<name>Ian Morris</name>
<email>ipm@chirality.org.uk</email>
</author>
<published>2015-04-03T08:17:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=51456b2914a34d16b1255b7c55d5cbf6a681d306'/>
<id>urn:sha1:51456b2914a34d16b1255b7c55d5cbf6a681d306</id>
<content type='text'>
The ipv4 code uses a mixture of coding styles. In some instances check
for NULL pointer is done as x == NULL and sometimes as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris &lt;ipm@chirality.org.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: implement nla_get_in_addr and nla_get_in6_addr</title>
<updated>2015-03-31T17:58:35Z</updated>
<author>
<name>Jiri Benc</name>
<email>jbenc@redhat.com</email>
</author>
<published>2015-03-29T14:59:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=67b61f6c130a05b2cd4c3dfded49a751ff42c534'/>
<id>urn:sha1:67b61f6c130a05b2cd4c3dfded49a751ff42c534</id>
<content type='text'>
Those are counterparts to nla_put_in_addr and nla_put_in6_addr.

Signed-off-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: implement nla_put_in_addr and nla_put_in6_addr</title>
<updated>2015-03-31T17:58:35Z</updated>
<author>
<name>Jiri Benc</name>
<email>jbenc@redhat.com</email>
</author>
<published>2015-03-29T14:59:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=930345ea630405aa6e6f42efcb149c3f360a6b67'/>
<id>urn:sha1:930345ea630405aa6e6f42efcb149c3f360a6b67</id>
<content type='text'>
IP addresses are often stored in netlink attributes. Add generic functions
to do that.

For nla_put_in_addr, it would be nicer to pass struct in_addr but this is
not used universally throughout the kernel, in way too many places __be32 is
used to store IPv4 address.

Signed-off-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: FIB Local/MAIN table collapse</title>
<updated>2015-03-11T20:22:14Z</updated>
<author>
<name>Alexander Duyck</name>
<email>alexander.h.duyck@redhat.com</email>
</author>
<published>2015-03-06T21:47:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0ddcf43d5d4a03ded1ee3f6b3b72a0cbed4e90b1'/>
<id>urn:sha1:0ddcf43d5d4a03ded1ee3f6b3b72a0cbed4e90b1</id>
<content type='text'>
This patch is meant to collapse local and main into one by converting
tb_data from an array to a pointer.  Doing this allows us to point the
local table into the main while maintaining the same variables in the
table.

As such the tb_data was converted from an array to a pointer, and a new
array called data is added in order to still provide an object for tb_data
to point to.

In order to track the origin of the fib aliases a tb_id value was added in
a hole that existed on 64b systems.  Using this we can also reverse the
merge in the event that custom FIB rules are enabled.

With this patch I am seeing an improvement of 20ns to 30ns for routing
lookups as long as custom rules are not enabled, with custom rules enabled
we fall back to split tables and the original behavior.

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>switchdev: don't support custom ip rules, for now</title>
<updated>2015-03-06T05:24:58Z</updated>
<author>
<name>Scott Feldman</name>
<email>sfeldma@gmail.com</email>
</author>
<published>2015-03-06T05:21:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=104616e74e0b464d449fdd2ee2f547d2fad71610'/>
<id>urn:sha1:104616e74e0b464d449fdd2ee2f547d2fad71610</id>
<content type='text'>
Keep switchdev FIB offload model simple for now and don't allow custom ip
rules.

Signed-off-by: Scott Feldman &lt;sfeldma@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>fib_trie: Push rcu_read_lock/unlock to callers</title>
<updated>2014-12-31T23:25:54Z</updated>
<author>
<name>Alexander Duyck</name>
<email>alexander.h.duyck@redhat.com</email>
</author>
<published>2014-12-31T18:56:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=345e9b54268ae065520a7252c182d22ef4591718'/>
<id>urn:sha1:345e9b54268ae065520a7252c182d22ef4591718</id>
<content type='text'>
This change is to start cleaning up some of the rcu_read_lock/unlock
handling.  I realized while reviewing the code there are several spots that
I don't believe are being handled correctly or are masking warnings by
locally calling rcu_read_lock/unlock instead of calling them at the correct
level.

A common example is a call to fib_get_table followed by fib_table_lookup.
The rcu_read_lock/unlock ought to wrap both but there are several spots where
they were not wrapped.

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Fix incorrect error code when adding an unreachable route</title>
<updated>2014-11-16T19:11:45Z</updated>
<author>
<name>Panu Matilainen</name>
<email>pmatilai@redhat.com</email>
</author>
<published>2014-11-14T11:14:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49dd18ba4615eaa72f15c9087dea1c2ab4744cf5'/>
<id>urn:sha1:49dd18ba4615eaa72f15c9087dea1c2ab4744cf5</id>
<content type='text'>
Trying to add an unreachable route incorrectly returns -ESRCH if
if custom FIB rules are present:

[root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4
RTNETLINK answers: Network is unreachable
[root@localhost ~]# ip rule add to 55.66.77.88 table 200
[root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4
RTNETLINK answers: No such process
[root@localhost ~]#

Commit 83886b6b636173b206f475929e58fac75c6f2446 ("[NET]: Change "not found"
return value for rule lookup") changed fib_rules_lookup()
to use -ESRCH as a "not found" code internally, but for user space it
should be translated into -ENETUNREACH. Handle the translation centrally in
ipv4-specific fib_lookup(), leaving the DECnet case alone.

On a related note, commit b7a71b51ee37d919e4098cd961d59a883fd272d8
("ipv4: removed redundant conditional") removed a similar translation from
ip_route_input_slow() prematurely AIUI.

Fixes: b7a71b51ee37 ("ipv4: removed redundant conditional")
Signed-off-by: Panu Matilainen &lt;pmatilai@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: fix NULL pointer Oops in fib(6)_rule_suppress</title>
<updated>2013-12-10T22:54:23Z</updated>
<author>
<name>Stefan Tomanek</name>
<email>stefan.tomanek@wertarbyte.de</email>
</author>
<published>2013-12-10T22:21:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=673498b8ed4c4d4b7221c5309d891c5eac2b7528'/>
<id>urn:sha1:673498b8ed4c4d4b7221c5309d891c5eac2b7528</id>
<content type='text'>
This changes ensures that the routing entry investigated by the suppress
function actually does point to a device struct before following that pointer,
fixing a possible kernel oops situation when verifying the interface group
associated with a routing table entry.

According to Daniel Golle, this Oops can be triggered by a user process trying
to establish an outgoing IPv6 connection while having no real IPv6 connectivity
set up (only autoassigned link-local addresses).

Fixes: 6ef94cfafba15 ("fib_rules: add route suppression based on ifgroup")

Reported-by: Daniel Golle &lt;daniel.golle@gmail.com&gt;
Tested-by: Daniel Golle &lt;daniel.golle@gmail.com&gt;
Signed-off-by: Stefan Tomanek &lt;stefan.tomanek@wertarbyte.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>fib_rules: fix suppressor names and default values</title>
<updated>2013-08-03T17:40:23Z</updated>
<author>
<name>Stefan Tomanek</name>
<email>stefan.tomanek@wertarbyte.de</email>
</author>
<published>2013-08-03T12:14:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=73f5698e77219bfc3ea1903759fe8e20ab5b285e'/>
<id>urn:sha1:73f5698e77219bfc3ea1903759fe8e20ab5b285e</id>
<content type='text'>
This change brings the suppressor attribute names into line; it also changes
the data types to provide a more consistent interface.

While -1 indicates that the suppressor is not enabled, values &gt;= 0 for
suppress_prefixlen or suppress_ifgroup  reject routing decisions violating the
constraint.

This changes the previously presented behaviour of suppress_prefixlen, where a
prefix length _less_ than the attribute value was rejected. After this change,
a prefix length less than *or* equal to the value is considered a violation of
the rule constraint.

It also changes the default values for default and newly added rules (disabling
any suppression for those).

Signed-off-by: Stefan Tomanek &lt;stefan.tomanek@wertarbyte.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
