<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/uapi/linux/pkt_sched.h, branch v5.3.11</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.3.11</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.3.11'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-07-16T21:19:19Z</updated>
<entry>
<title>fix: taprio: Change type of txtime-delay parameter to u32</title>
<updated>2019-07-16T21:19:19Z</updated>
<author>
<name>Vedang Patel</name>
<email>vedang.patel@intel.com</email>
</author>
<published>2019-07-16T19:52:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a5b647007e9d794956dbed9339a3354a9fc4d5c3'/>
<id>urn:sha1:a5b647007e9d794956dbed9339a3354a9fc4d5c3</id>
<content type='text'>
During the review of the iproute2 patches for txtime-assist mode, it was
pointed out that it does not make sense for the txtime-delay parameter to
be negative. So, change the type of the parameter from s32 to u32.

Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode")
Reported-by: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: Vedang Patel &lt;vedang.patel@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>pkt_sched: Include const.h</title>
<updated>2019-07-09T21:47:45Z</updated>
<author>
<name>David Ahern</name>
<email>dsahern@gmail.com</email>
</author>
<published>2019-07-09T21:45:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fbc697796e358d1ed8ed25758b19bdb3a1f8e9f9'/>
<id>urn:sha1:fbc697796e358d1ed8ed25758b19bdb3a1f8e9f9</id>
<content type='text'>
Commit 9903c8dc7342 changed TC_ETF defines to use _BITUL instead of BIT
but did not add the dependecy on linux/const.h. As a consequence,
importing the uapi headers into iproute2 causes builds to fail. Add
the dependency.

Fixes: 9903c8dc7342 ("etf: Don't use BIT() in UAPI headers.")
Cc: Vedang Patel &lt;vedang.patel@intel.com&gt;
Signed-off-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>taprio: Add support for txtime-assist mode</title>
<updated>2019-06-28T21:45:34Z</updated>
<author>
<name>Vedang Patel</name>
<email>vedang.patel@intel.com</email>
</author>
<published>2019-06-25T22:07:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4cfd5779bd6efe8c76b4494aec63a063be0d2ff2'/>
<id>urn:sha1:4cfd5779bd6efe8c76b4494aec63a063be0d2ff2</id>
<content type='text'>
Currently, we are seeing non-critical packets being transmitted outside of
their timeslice. We can confirm that the packets are being dequeued at the
right time. So, the delay is induced in the hardware side.  The most likely
reason is the hardware queues are starving the lower priority queues.

In order to improve the performance of taprio, we will be making use of the
txtime feature provided by the ETF qdisc. For all the packets which do not
have the SO_TXTIME option set, taprio will set the transmit timestamp (set
in skb-&gt;tstamp) in this mode. TAPrio Qdisc will ensure that the transmit
time for the packet is set to when the gate is open. If SO_TXTIME is set,
the TAPrio qdisc will validate whether the timestamp (in skb-&gt;tstamp)
occurs when the gate corresponding to skb's traffic class is open.

Following two parameters added to support this mode:
- flags: used to enable txtime-assist mode. Will also be used to enable
  other modes (like hardware offloading) later.
- txtime-delay: This indicates the minimum time it will take for the packet
  to hit the wire. This is useful in determining whether we can transmit
the packet in the remaining time if the gate corresponding to the packet is
currently open.

An example configuration for enabling txtime-assist:

tc qdisc replace dev eth0 parent root handle 100 taprio \\
      num_tc 3 \\
      map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \\
      queues 1@0 1@0 1@0 \\
      base-time 1558653424279842568 \\
      sched-entry S 01 300000 \\
      sched-entry S 02 300000 \\
      sched-entry S 04 400000 \\
      flags 0x1 \\
      txtime-delay 40000 \\
      clockid CLOCK_TAI

tc qdisc replace dev $IFACE parent 100:1 etf skip_sock_check \\
      offload delta 200000 clockid CLOCK_TAI

Note that all the traffic classes are mapped to the same queue.  This is
only possible in taprio when txtime-assist is enabled. Also, note that the
ETF Qdisc is enabled with offload mode set.

In this mode, if the packet's traffic class is open and the complete packet
can be transmitted, taprio will try to transmit the packet immediately.
This will be done by setting skb-&gt;tstamp to current_time + the time delta
indicated in the txtime-delay parameter. This parameter indicates the time
taken (in software) for packet to reach the network adapter.

If the packet cannot be transmitted in the current interval or if the
packet's traffic is not currently transmitting, the skb-&gt;tstamp is set to
the next available timestamp value. This is tracked in the next_launchtime
parameter in the struct sched_entry.

The behaviour w.r.t admin and oper schedules is not changed from what is
present in software mode.

The transmit time is already known in advance. So, we do not need the HR
timers to advance the schedule and wakeup the dequeue side of taprio.  So,
HR timer won't be run when this mode is enabled.

Signed-off-by: Vedang Patel &lt;vedang.patel@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>etf: Add skip_sock_check</title>
<updated>2019-06-28T21:45:33Z</updated>
<author>
<name>Vedang Patel</name>
<email>vedang.patel@intel.com</email>
</author>
<published>2019-06-25T22:07:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d14d2b20680f02fa739c2cbbb59e3629e487f359'/>
<id>urn:sha1:d14d2b20680f02fa739c2cbbb59e3629e487f359</id>
<content type='text'>
Currently, etf expects a socket with SO_TXTIME option set for each packet
it encounters. So, it will drop all other packets. But, in the future
commits we are planning to add functionality where tstamp value will be set
by another qdisc. Also, some packets which are generated from within the
kernel (e.g. ICMP packets) do not have any socket associated with them.

So, this commit adds support for skip_sock_check. When this option is set,
etf will skip checking for a socket and other associated options for all
skbs.

Signed-off-by: Vedang Patel &lt;vedang.patel@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>etf: Don't use BIT() in UAPI headers.</title>
<updated>2019-06-28T21:45:33Z</updated>
<author>
<name>Vedang Patel</name>
<email>vedang.patel@intel.com</email>
</author>
<published>2019-06-25T22:07:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9903c8dc734265689d5770ff28c84a7228fe5890'/>
<id>urn:sha1:9903c8dc734265689d5770ff28c84a7228fe5890</id>
<content type='text'>
The BIT() macro isn't exported as part of the UAPI interface. So, the
compile-test to ensure they are self contained fails. So, use _BITUL()
instead.

Signed-off-by: Vedang Patel &lt;vedang.patel@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>taprio: Add support for cycle-time-extension</title>
<updated>2019-05-01T15:58:51Z</updated>
<author>
<name>Vinicius Costa Gomes</name>
<email>vinicius.gomes@intel.com</email>
</author>
<published>2019-04-29T22:48:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c25031e993440debdd530278ce2171ce477df029'/>
<id>urn:sha1:c25031e993440debdd530278ce2171ce477df029</id>
<content type='text'>
IEEE 802.1Q-2018 defines the concept of a cycle-time-extension, so the
last entry of a schedule before the start of a new schedule can be
extended, so "too-short" entries can be avoided.

Signed-off-by: Vinicius Costa Gomes &lt;vinicius.gomes@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>taprio: Add support for setting the cycle-time manually</title>
<updated>2019-05-01T15:58:51Z</updated>
<author>
<name>Vinicius Costa Gomes</name>
<email>vinicius.gomes@intel.com</email>
</author>
<published>2019-04-29T22:48:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6ca6a6654225f3cd001304d33429c817e0c0b85f'/>
<id>urn:sha1:6ca6a6654225f3cd001304d33429c817e0c0b85f</id>
<content type='text'>
IEEE 802.1Q-2018 defines that a the cycle-time of a schedule may be
overridden, so the schedule is truncated to a determined "width".

Signed-off-by: Vinicius Costa Gomes &lt;vinicius.gomes@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>taprio: Add support adding an admin schedule</title>
<updated>2019-05-01T15:58:51Z</updated>
<author>
<name>Vinicius Costa Gomes</name>
<email>vinicius.gomes@intel.com</email>
</author>
<published>2019-04-29T22:48:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a3d43c0d56f1b94e74963a2fbadfb70126d92213'/>
<id>urn:sha1:a3d43c0d56f1b94e74963a2fbadfb70126d92213</id>
<content type='text'>
The IEEE 802.1Q-2018 defines two "types" of schedules, the "Oper" (from
operational?) and "Admin" ones. Up until now, 'taprio' only had
support for the "Oper" one, added when the qdisc is created. This adds
support for the "Admin" one, which allows the .change() operation to
be supported.

Just for clarification, some quick (and dirty) definitions, the "Oper"
schedule is the currently (as in this instant) running one, and it's
read-only. The "Admin" one is the one that the system configurator has
installed, it can be changed, and it will be "promoted" to "Oper" when
it's 'base-time' is reached.

The idea behing this patch is that calling something like the below,
(after taprio is already configured with an initial schedule):

$ tc qdisc change taprio dev IFACE parent root 	     \
     	   base-time X 	     	   	       	     \
     	   sched-entry &lt;CMD&gt; &lt;GATES&gt; &lt;INTERVAL&gt;	     \
	   ...

Will cause a new admin schedule to be created and programmed to be
"promoted" to "Oper" at instant X. If an "Admin" schedule already
exists, it will be overwritten with the new parameters.

Up until now, there was some code that was added to ease the support
of changing a single entry of a schedule, but was ultimately unused.
Now, that we have support for "change" with more well thought
semantics, updating a single entry seems to be less useful.

So we remove what is in practice dead code, and return a "not
supported" error if the user tries to use it. If changing a single
entry would make the user's life easier we may ressurrect this idea,
but at this point, removing it simplifies the code.

For now, only the schedule specific bits are allowed to be added for a
new schedule, that means that 'clockid', 'num_tc', 'map' and 'queues'
cannot be modified.

Example:

$ tc qdisc change dev IFACE parent root handle 100 taprio \
      base-time $BASE_TIME \
      sched-entry S 00 500000 \
      sched-entry S 0f 500000 \
      clockid CLOCK_TAI

The only change in the netlink API introduced by this change is the
introduction of an "admin" type in the response to a dump request,
that type allows userspace to separate the "oper" schedule from the
"admin" schedule. If userspace doesn't support the "admin" type, it
will only display the "oper" schedule.

Signed-off-by: Vinicius Costa Gomes &lt;vinicius.gomes@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sch_cake: Permit use of connmarks as tin classifiers</title>
<updated>2019-03-04T04:14:28Z</updated>
<author>
<name>Kevin Darbyshire-Bryant</name>
<email>ldir@darbyshire-bryant.me.uk</email>
</author>
<published>2019-03-01T15:04:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0b5c7efdfc6e389ec6840579fe90bdb6f42b08dc'/>
<id>urn:sha1:0b5c7efdfc6e389ec6840579fe90bdb6f42b08dc</id>
<content type='text'>
Add flag 'FWMARK' to enable use of firewall connmarks as tin selector.
The connmark (skbuff-&gt;mark) needs to be in the range 1-&gt;tin_cnt ie.
for diffserv3 the mark needs to be 1-&gt;3.

Background

Typically CAKE uses DSCP as the basis for tin selection.  DSCP values
are relatively easily changed as part of the egress path, usually with
iptables &amp; the mangle table, ingress is more challenging.  CAKE is often
used on the WAN interface of a residential gateway where passthrough of
DSCP from the ISP is either missing or set to unhelpful values thus use
of ingress DSCP values for tin selection isn't helpful in that
environment.

An approach to solving the ingress tin selection problem is to use
CAKE's understanding of tc filters.  Naive tc filters could match on
source/destination port numbers and force tin selection that way, but
multiple filters don't scale particularly well as each filter must be
traversed whether it matches or not. e.g. a simple example to map 3
firewall marks to tins:

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR protocol all handle 0x01 fw action skbedit priority ${MAJOR}1
tc filter add dev $DEV parent $MAJOR protocol all handle 0x02 fw action skbedit priority ${MAJOR}2
tc filter add dev $DEV parent $MAJOR protocol all handle 0x03 fw action skbedit priority ${MAJOR}3

Another option is to use eBPF cls_act with tc filters e.g.

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR bpf da obj my-bpf-fwmark-to-class.o

This has the disadvantages of a) needing someone to write &amp; maintain
the bpf program, b) a bpf toolchain to compile it and c) needing to
hardcode the major number in the bpf program so it matches the cake
instance (or forcing the cake instance to a particular major number)
since the major number cannot be passed to the bpf program via tc
command line.

As already hinted at by the previous examples, it would be helpful
to associate tins with something that survives the Internet path and
ideally allows tin selection on both egress and ingress.  Netfilter's
conntrack permits setting an identifying mark on a connection which
can also be restored to an ingress packet with tc action connmark e.g.

tc filter add dev eth0 parent ffff: protocol all prio 10 u32 \
	match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev ifb1

Since tc's connmark action has restored any connmark into skb-&gt;mark,
any of the previous solutions are based upon it and in one form or
another copy that mark to the skb-&gt;priority field where again CAKE
picks this up.

This change cuts out at least one of the (less intuitive &amp;
non-scalable) middlemen and permit direct access to skb-&gt;mark.

Signed-off-by: Kevin Darbyshire-Bryant &lt;ldir@darbyshire-bryant.me.uk&gt;
Signed-off-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: pie: add more cases to auto-tune alpha and beta</title>
<updated>2019-02-25T22:21:03Z</updated>
<author>
<name>Mohit P. Tahiliani</name>
<email>tahiliani@nitk.edu.in</email>
</author>
<published>2019-02-25T19:09:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3f7ae5f3dc5295ac17d6521130ed8a8f8a723fbf'/>
<id>urn:sha1:3f7ae5f3dc5295ac17d6521130ed8a8f8a723fbf</id>
<content type='text'>
The current implementation scales the local alpha and beta
variables in the calculate_probability function by the same
amount for all values of drop probability below 1%.

RFC 8033 suggests using additional cases for auto-tuning
alpha and beta when the drop probability is less than 1%.

In order to add more auto-tuning cases, MAX_PROB must be
scaled by u64 instead of u32 to prevent underflow when
scaling the local alpha and beta variables in the
calculate_probability function.

Signed-off-by: Mohit P. Tahiliani &lt;tahiliani@nitk.edu.in&gt;
Signed-off-by: Dhaval Khandla &lt;dhavaljkhandla26@gmail.com&gt;
Signed-off-by: Hrishikesh Hiraskar &lt;hrishihiraskar@gmail.com&gt;
Signed-off-by: Manish Kumar B &lt;bmanish15597@gmail.com&gt;
Signed-off-by: Sachin D. Patil &lt;sdp.sachin@gmail.com&gt;
Signed-off-by: Leslie Monis &lt;lesliemonis@gmail.com&gt;
Acked-by: Dave Taht &lt;dave.taht@gmail.com&gt;
Acked-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
