<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/net/dsa.h, branch v5.7</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.7</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.7'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-03-30T18:44:00Z</updated>
<entry>
<title>net: dsa: add port policers</title>
<updated>2020-03-30T18:44:00Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2020-03-29T11:51:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=342971766c177ee5ad42dadd9809d581c5a65a67'/>
<id>urn:sha1:342971766c177ee5ad42dadd9809d581c5a65a67</id>
<content type='text'>
The approach taken to pass the port policer methods on to drivers is
pragmatic. It is similar to the port mirroring implementation (in that
the DSA core does all of the filter block interaction and only passes
simple operations for the driver to implement) and dissimilar to how
flow-based policers are going to be implemented (where the driver has
full control over the flow_cls_offload data structure).

Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: implement auto-normalization of MTU for bridge hardware datapath</title>
<updated>2020-03-27T23:07:25Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2020-03-27T19:55:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bff33f7e2ae2e805a4b0af597b58422185c68900'/>
<id>urn:sha1:bff33f7e2ae2e805a4b0af597b58422185c68900</id>
<content type='text'>
Many switches don't have an explicit knob for configuring the MTU
(maximum transmission unit per interface).  Instead, they do the
length-based packet admission checks on the ingress interface, for
reasons that are easy to understand (why would you accept a packet in
the queuing subsystem if you know you're going to drop it anyway).

So it is actually the MRU that these switches permit configuring.

In Linux there only exists the IFLA_MTU netlink attribute and the
associated dev_set_mtu function. The comments like to play blind and say
that it's changing the "maximum transfer unit", which is to say that
there isn't any directionality in the meaning of the MTU word. So that
is the interpretation that this patch is giving to things: MTU == MRU.

When 2 interfaces having different MTUs are bridged, the bridge driver
MTU auto-adjustment logic kicks in: what br_mtu_auto_adjust() does is it
adjusts the MTU of the bridge net device itself (and not that of the
slave net devices) to the minimum value of all slave interfaces, in
order for forwarded packets to not exceed the MTU regardless of the
interface they are received and send on.

The idea behind this behavior, and why the slave MTUs are not adjusted,
is that normal termination from Linux over the L2 forwarding domain
should happen over the bridge net device, which _is_ properly limited by
the minimum MTU. And termination over individual slave devices is
possible even if those are bridged. But that is not "forwarding", so
there's no reason to do normalization there, since only a single
interface sees that packet.

The problem with those switches that can only control the MRU is with
the offloaded data path, where a packet received on an interface with
MRU 9000 would still be forwarded to an interface with MRU 1500. And the
br_mtu_auto_adjust() function does not really help, since the MTU
configured on the bridge net device is ignored.

In order to enforce the de-facto MTU == MRU rule for these switches, we
need to do MTU normalization, which means: in order for no packet larger
than the MTU configured on this port to be sent, then we need to limit
the MRU on all ports that this packet could possibly come from. AKA
since we are configuring the MRU via MTU, it means that all ports within
a bridge forwarding domain should have the same MTU.

And that is exactly what this patch is trying to do.

&gt;From an implementation perspective, we try to follow the intent of the
user, otherwise there is a risk that we might livelock them (they try to
change the MTU on an already-bridged interface, but we just keep
changing it back in an attempt to keep the MTU normalized). So the MTU
that the bridge is normalized to is either:

 - The most recently changed one:

   ip link set dev swp0 master br0
   ip link set dev swp1 master br0
   ip link set dev swp0 mtu 1400

   This sequence will make swp1 inherit MTU 1400 from swp0.

 - The one of the most recently added interface to the bridge:

   ip link set dev swp0 master br0
   ip link set dev swp1 mtu 1400
   ip link set dev swp1 master br0

   The above sequence will make swp0 inherit MTU 1400 as well.

Suggested-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: configure the MTU for switch ports</title>
<updated>2020-03-27T23:07:24Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2020-03-27T19:55:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bfcb813203e619a8960a819bf533ad2a108d8105'/>
<id>urn:sha1:bfcb813203e619a8960a819bf533ad2a108d8105</id>
<content type='text'>
It is useful be able to configure port policers on a switch to accept
frames of various sizes:

- Increase the MTU for better throughput from the default of 1500 if it
  is known that there is no 10/100 Mbps device in the network.
- Decrease the MTU to limit the latency of high-priority frames under
  congestion, or work around various network segments that add extra
  headers to packets which can't be fragmented.

For DSA slave ports, this is mostly a pass-through callback, called
through the regular ndo ops and at probe time (to ensure consistency
across all supported switches).

The CPU port is called with an MTU equal to the largest configured MTU
of the slave ports. The assumption is that the user might want to
sustain a bidirectional conversation with a partner over any switch
port.

The DSA master is configured the same as the CPU port, plus the tagger
overhead. Since the MTU is by definition L2 payload (sans Ethernet
header), it is up to each individual driver to figure out if it needs to
do anything special for its frame tags on the CPU port (it shouldn't
except in special cases). So the MTU does not contain the tagger
overhead on the CPU port.
However the MTU of the DSA master, minus the tagger overhead, is used as
a proxy for the MTU of the CPU port, which does not have a net device.
This is to avoid uselessly calling the .change_mtu function on the CPU
port when nothing should change.

So it is safe to assume that the DSA master and the CPU port MTUs are
apart by exactly the tagger's overhead in bytes.

Some changes were made around dsa_master_set_mtu(), function which was
now removed, for 2 reasons:
  - dev_set_mtu() already calls dev_validate_mtu(), so it's redundant to
    do the same thing in DSA
  - __dev_set_mtu() returns 0 if ops-&gt;ndo_change_mtu is an absent method
That is to say, there's no need for this function in DSA, we can safely
call dev_set_mtu() directly, take the rtnl lock when necessary, and just
propagate whatever errors get reported (since the user probably wants to
be informed).

Some inspiration (mainly in the MTU DSA notifier) was taken from a
vaguely similar patch from Murali and Florian, who are credited as
co-developers down below.

Co-developed-by: Murali Krishna Policharla &lt;murali.policharla@broadcom.com&gt;
Signed-off-by: Murali Krishna Policharla &lt;murali.policharla@broadcom.com&gt;
Co-developed-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: Add bypass operations for the flower classifier-action filter</title>
<updated>2020-03-04T02:57:49Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2020-02-29T14:31:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ed11bb1f9657e95e8b79a9a49211986bde96005a'/>
<id>urn:sha1:ed11bb1f9657e95e8b79a9a49211986bde96005a</id>
<content type='text'>
Due to the immense variety of classification keys and actions available
for tc-flower, as well as due to potentially very different DSA switch
capabilities, it doesn't make a lot of sense for the DSA mid layer to
even attempt to interpret these. So just pass them on to the underlying
switch driver.

DSA implements just the standard boilerplate for binding and unbinding
flow blocks to ports, since nobody wants to deal with that.

Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Reviewed-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: propagate resolved link config via mac_link_up()</title>
<updated>2020-02-27T20:02:14Z</updated>
<author>
<name>Russell King</name>
<email>rmk+kernel@armlinux.org.uk</email>
</author>
<published>2020-02-26T10:23:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5b502a7b2992008a1fd5962ba032771b03c4e840'/>
<id>urn:sha1:5b502a7b2992008a1fd5962ba032771b03c4e840</id>
<content type='text'>
Propagate the resolved link configuration down via DSA's
phylink_mac_link_up() operation to allow split PCS/MAC to work.

Tested-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@armlinux.org.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: Get information about stacked DSA protocol</title>
<updated>2020-01-09T00:01:13Z</updated>
<author>
<name>Florian Fainelli</name>
<email>f.fainelli@gmail.com</email>
</author>
<published>2020-01-08T05:06:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d776482ecc689bdd68627985ac4cb5a6f325953'/>
<id>urn:sha1:4d776482ecc689bdd68627985ac4cb5a6f325953</id>
<content type='text'>
It is possible to stack multiple DSA switches in a way that they are not
part of the tree (disjoint) but the DSA master of a switch is a DSA
slave of another. When that happens switch drivers may have to know this
is the case so as to determine whether their tagging protocol has a
remove chance of working.

This is useful for specific switch drivers such as b53 where devices
have been known to be stacked in the wild without the Broadcom tag
protocol supporting that feature. This allows b53 to continue supporting
those devices by forcing the disabling of Broadcom tags on the outermost
switches if necessary.

The get_tag_protocol() function is therefore updated to gain an
additional enum dsa_tag_protocol argument which denotes the current
tagging protocol used by the DSA master we are attached to, else
DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.

Signed-off-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: Pass pcs_poll flag from driver to PHYLINK</title>
<updated>2020-01-06T07:22:32Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2020-01-06T01:34:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=787cac3f5a650fd3184a41c5a27a2fe9ded833aa'/>
<id>urn:sha1:787cac3f5a650fd3184a41c5a27a2fe9ded833aa</id>
<content type='text'>
The DSA drivers that implement .phylink_mac_link_state should normally
register an interrupt for the PCS, from which they should call
phylink_mac_change(). However not all switches implement this, and those
who don't should set this flag in dsa_switch in the .setup callback, so
that PHYLINK will poll for a few ms until the in-band AN link timer
expires and the PCS state settles.

Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: Make deferred_xmit private to sja1105</title>
<updated>2020-01-05T23:13:13Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>olteanv@gmail.com</email>
</author>
<published>2020-01-04T00:37:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a68578c20a9667463ee3000402b21644ea62d753'/>
<id>urn:sha1:a68578c20a9667463ee3000402b21644ea62d753</id>
<content type='text'>
There are 3 things that are wrong with the DSA deferred xmit mechanism:

1. Its introduction has made the DSA hotpath ever so slightly more
   inefficient for everybody, since DSA_SKB_CB(skb)-&gt;deferred_xmit needs
   to be initialized to false for every transmitted frame, in order to
   figure out whether the driver requested deferral or not (a very rare
   occasion, rare even for the only driver that does use this mechanism:
   sja1105). That was necessary to avoid kfree_skb from freeing the skb.

2. Because L2 PTP is a link-local protocol like STP, it requires
   management routes and deferred xmit with this switch. But as opposed
   to STP, the deferred work mechanism needs to schedule the packet
   rather quickly for the TX timstamp to be collected in time and sent
   to user space. But there is no provision for controlling the
   scheduling priority of this deferred xmit workqueue. Too bad this is
   a rather specific requirement for a feature that nobody else uses
   (more below).

3. Perhaps most importantly, it makes the DSA core adhere a bit too
   much to the NXP company-wide policy "Innovate Where It Doesn't
   Matter". The sja1105 is probably the only DSA switch that requires
   some frames sent from the CPU to be routed to the slave port via an
   out-of-band configuration (register write) rather than in-band (DSA
   tag). And there are indeed very good reasons to not want to do that:
   if that out-of-band register is at the other end of a slow bus such
   as SPI, then you limit that Ethernet flow's throughput to effectively
   the throughput of the SPI bus. So hardware vendors should definitely
   not be encouraged to design this way. We do _not_ want more
   widespread use of this mechanism.

Luckily we have a solution for each of the 3 issues:

For 1, we can just remove that variable in the skb-&gt;cb and counteract
the effect of kfree_skb with skb_get, much to the same effect. The
advantage, of course, being that anybody who doesn't use deferred xmit
doesn't need to do any extra operation in the hotpath.

For 2, we can create a kernel thread for each port's deferred xmit work.
If the user switch ports are named swp0, swp1, swp2, the kernel threads
will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
character length limit on kernel thread names). With this, the user can
change the scheduling priority with chrt $(pidof swp2_xmit).

For 3, we can actually move the entire implementation to the sja1105
driver.

So this patch deletes the generic implementation from the DSA core and
adds a new one, more adequate to the requirements of PTP TX
timestamping, in sja1105_main.c.

Suggested-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: Vladimir Oltean &lt;olteanv@gmail.com&gt;
Reviewed-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: add support for Atheros AR9331 TAG format</title>
<updated>2019-12-21T01:05:47Z</updated>
<author>
<name>Oleksij Rempel</name>
<email>o.rempel@pengutronix.de</email>
</author>
<published>2019-12-18T08:02:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=48fda74f0a9377ce2145ac5f06b6db0274880826'/>
<id>urn:sha1:48fda74f0a9377ce2145ac5f06b6db0274880826</id>
<content type='text'>
Add support for tag format used in Atheros AR9331 built-in switch.

Reviewed-by: Vivien Didelot &lt;vivien.didelot@gmail.com&gt;
Reviewed-by: Andrew Lunn &lt;andrew@lunn.ch&gt;
Signed-off-by: Oleksij Rempel &lt;o.rempel@pengutronix.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dsa: ocelot: add tagger for Ocelot/Felix switches</title>
<updated>2019-11-15T20:32:16Z</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2019-11-14T15:03:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8dce89aa5f3274e7c26132433840f63d129406bb'/>
<id>urn:sha1:8dce89aa5f3274e7c26132433840f63d129406bb</id>
<content type='text'>
While it is entirely possible that this tagger format is in fact more
generic than just these 2 switch families, I don't have that knowledge.
The Seville switch in NXP T1040 has a similar frame format, but there
are enough differences (e.g. DEST field starts at bit 57 instead of 56)
that calling this file tag_vitesse.c is a bit of a stretch at the
moment. The frame format has been listed in a comment so that people who
add support for further Vitesse switches can rework this tagger while
keeping compatibility with Felix.

The "ocelot" name was chosen instead of "felix" because even the Ocelot
switch can act as a DSA device when it is used in NPI mode, and the Felix
tagger format is almost identical. Currently it is only used for the
Felix switch embedded in the NXP LS1028A chip.

The ABI for this tagger should be considered "not stable" at the moment.
The DSA tag is always placed before the Ethernet header and therefore,
we are using the long prefix for RX tags to avoid putting the DSA master
port in promiscuous mode. Once there will be an API in DSA for drivers
to request DSA masters to be in promiscuous mode unconditionally, we
will switch to the "no prefix" extraction frame header, which will save
16 padding bytes for each RX frame.

Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Reviewed-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
