diff options
Diffstat (limited to 'Documentation')
88 files changed, 2282 insertions, 742 deletions
diff --git a/Documentation/ABI/testing/sysfs-block-bcache b/Documentation/ABI/testing/sysfs-block-bcache index 9e4bbc5d51fd..9344a657ca70 100644 --- a/Documentation/ABI/testing/sysfs-block-bcache +++ b/Documentation/ABI/testing/sysfs-block-bcache @@ -106,13 +106,6 @@ Description: will be discarded from the cache. Should not be turned off with writeback caching enabled. -What: /sys/block/<disk>/bcache/discard -Date: November 2010 -Contact: Kent Overstreet <kent.overstreet@gmail.com> -Description: - For a cache, a boolean allowing discard/TRIM to be turned off - or back on if the device supports it. - What: /sys/block/<disk>/bcache/bucket_size Date: November 2010 Contact: Kent Overstreet <kent.overstreet@gmail.com> diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst index 6fdb495ac466..f71f349553e4 100644 --- a/Documentation/admin-guide/bcache.rst +++ b/Documentation/admin-guide/bcache.rst @@ -17,8 +17,7 @@ The latest bcache kernel code can be found from mainline Linux kernel: It's designed around the performance characteristics of SSDs - it only allocates in erase block sized buckets, and it uses a hybrid btree/log to track cached extents (which can be anywhere from a single sector to the bucket size). It's -designed to avoid random writes at all costs; it fills up an erase block -sequentially, then issues a discard before reusing it. +designed to avoid random writes at all costs. Both writethrough and writeback caching are supported. Writeback defaults to off, but can be switched on and off arbitrarily at runtime. Bcache goes to @@ -618,19 +617,11 @@ bucket_size cache_replacement_policy One of either lru, fifo or random. -discard - Boolean; if on a discard/TRIM will be issued to each bucket before it is - reused. Defaults to off, since SATA TRIM is an unqueued command (and thus - slow). - freelist_percent Size of the freelist as a percentage of nbuckets. Can be written to to increase the number of buckets kept on the freelist, which lets you artificially reduce the size of the cache at runtime. Mostly for testing - purposes (i.e. testing how different size caches affect your hit rate), but - since buckets are discarded when they move on to the freelist will also make - the SSD's garbage collection easier by effectively giving it more reserved - space. + purposes (i.e. testing how different size caches affect your hit rate). io_errors Number of errors that have occurred, decayed by io_error_halflife. diff --git a/Documentation/admin-guide/blockdev/zoned_loop.rst b/Documentation/admin-guide/blockdev/zoned_loop.rst index 64dcfde7450a..806adde664db 100644 --- a/Documentation/admin-guide/blockdev/zoned_loop.rst +++ b/Documentation/admin-guide/blockdev/zoned_loop.rst @@ -68,30 +68,43 @@ The options available for the add command can be listed by reading the In more details, the options that can be used with the "add" command are as follows. -================ =========================================================== -id Device number (the X in /dev/zloopX). - Default: automatically assigned. -capacity_mb Device total capacity in MiB. This is always rounded up to - the nearest higher multiple of the zone size. - Default: 16384 MiB (16 GiB). -zone_size_mb Device zone size in MiB. Default: 256 MiB. -zone_capacity_mb Device zone capacity (must always be equal to or lower than - the zone size. Default: zone size. -conv_zones Total number of conventioanl zones starting from sector 0. - Default: 8. -base_dir Path to the base directory where to create the directory - containing the zone files of the device. - Default=/var/local/zloop. - The device directory containing the zone files is always - named with the device ID. E.g. the default zone file - directory for /dev/zloop0 is /var/local/zloop/0. -nr_queues Number of I/O queues of the zoned block device. This value is - always capped by the number of online CPUs - Default: 1 -queue_depth Maximum I/O queue depth per I/O queue. - Default: 64 -buffered_io Do buffered IOs instead of direct IOs (default: false) -================ =========================================================== +=================== ========================================================= +id Device number (the X in /dev/zloopX). + Default: automatically assigned. +capacity_mb Device total capacity in MiB. This is always rounded up + to the nearest higher multiple of the zone size. + Default: 16384 MiB (16 GiB). +zone_size_mb Device zone size in MiB. Default: 256 MiB. +zone_capacity_mb Device zone capacity (must always be equal to or lower + than the zone size. Default: zone size. +conv_zones Total number of conventioanl zones starting from + sector 0 + Default: 8 +base_dir Path to the base directory where to create the directory + containing the zone files of the device. + Default=/var/local/zloop. + The device directory containing the zone files is always + named with the device ID. E.g. the default zone file + directory for /dev/zloop0 is /var/local/zloop/0. +nr_queues Number of I/O queues of the zoned block device. This + value is always capped by the number of online CPUs + Default: 1 +queue_depth Maximum I/O queue depth per I/O queue. + Default: 64 +buffered_io Do buffered IOs instead of direct IOs (default: false) +zone_append Enable or disable a zloop device native zone append + support. + Default: 1 (enabled). + If native zone append support is disabled, the block layer + will emulate this operation using regular write + operations. +ordered_zone_append Enable zloop mitigation of zone append reordering. + Default: disabled. + This is useful for testing file systems file data mapping + (extents), as when enabled, this can significantly reduce + the number of data extents needed to for a file data + mapping. +=================== ========================================================= 3) Deleting a Zoned Device -------------------------- diff --git a/Documentation/admin-guide/md.rst b/Documentation/admin-guide/md.rst index deed823eab01..dc7eab191caa 100644 --- a/Documentation/admin-guide/md.rst +++ b/Documentation/admin-guide/md.rst @@ -238,6 +238,16 @@ All md devices contain: the number of devices in a raid4/5/6, or to support external metadata formats which mandate such clipping. + logical_block_size + Configure the array's logical block size in bytes. This attribute + is only supported for 1.x meta. Write the value before starting + array. The final array LBS uses the maximum between this + configuration and LBS of all combined devices. Note that + LBS cannot exceed PAGE_SIZE before RAID supports folio. + WARNING: Arrays created on new kernel cannot be assembled at old + kernel due to padding check, Set module parameter 'check_new_feature' + to false to bypass, but data loss may occur. + reshape_position This is either ``none`` or a sector number within the devices of the array where ``reshape`` is up to. If this is set, the three diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst index 2ef50828aff1..369a738a6819 100644 --- a/Documentation/admin-guide/sysctl/net.rst +++ b/Documentation/admin-guide/sysctl/net.rst @@ -212,6 +212,14 @@ mem_pcpu_rsv Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU. +bypass_prot_mem +--------------- + +Skip charging socket buffers to the global per-protocol memory +accounting controlled by net.ipv4.tcp_mem, net.ipv4.udp_mem, etc. + +Default: 0 (off) + rmem_default ------------ @@ -347,9 +355,9 @@ skb_defer_max ------------- Max size (in skbs) of the per-cpu list of skbs being freed -by the cpu which allocated them. Used by TCP stack so far. +by the cpu which allocated them. -Default: 64 +Default: 128 optmem_max ---------- @@ -406,6 +414,23 @@ to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt). If set to 1 (default), hash rethink is performed on listening socket. If set to 0, hash rethink is not performed. +txq_reselection_ms +------------------ + +Controls how often (in ms) a busy connected flow can select another tx queue. + +A resection is desirable when/if user thread has migrated and XPS +would select a different queue. Same can occur without XPS +if the flow hash has changed. + +But switching txq can introduce reorders, especially if the +old queue is under high pressure. Modern TCP stacks deal +well with reorders if they happen not too often. + +To disable this feature, set the value to 0. + +Default : 1000 + gro_normal_batch ---------------- diff --git a/Documentation/devicetree/bindings/net/airoha,en7581-eth.yaml b/Documentation/devicetree/bindings/net/airoha,en7581-eth.yaml index 6d22131ac2f9..fbe2ddcdd909 100644 --- a/Documentation/devicetree/bindings/net/airoha,en7581-eth.yaml +++ b/Documentation/devicetree/bindings/net/airoha,en7581-eth.yaml @@ -17,6 +17,7 @@ properties: compatible: enum: - airoha,en7581-eth + - airoha,an7583-eth reg: items: @@ -44,6 +45,7 @@ properties: - description: PDMA irq resets: + minItems: 7 maxItems: 8 reset-names: @@ -54,8 +56,9 @@ properties: - const: xsi-mac - const: hsi0-mac - const: hsi1-mac - - const: hsi-mac + - enum: [ hsi-mac, xfp-mac ] - const: xfp-mac + minItems: 7 memory-region: items: @@ -81,6 +84,36 @@ properties: interface to implement hardware flow offloading programming Packet Processor Engine (PPE) flow table. +allOf: + - $ref: ethernet-controller.yaml# + - if: + properties: + compatible: + contains: + enum: + - airoha,en7581-eth + then: + properties: + resets: + minItems: 8 + + reset-names: + minItems: 8 + + - if: + properties: + compatible: + contains: + enum: + - airoha,an7583-eth + then: + properties: + resets: + maxItems: 7 + + reset-names: + maxItems: 7 + patternProperties: "^ethernet@[1-4]$": type: object diff --git a/Documentation/devicetree/bindings/net/airoha,en7581-npu.yaml b/Documentation/devicetree/bindings/net/airoha,en7581-npu.yaml index c7644e6586d3..59c57f58116b 100644 --- a/Documentation/devicetree/bindings/net/airoha,en7581-npu.yaml +++ b/Documentation/devicetree/bindings/net/airoha,en7581-npu.yaml @@ -18,6 +18,7 @@ properties: compatible: enum: - airoha,en7581-npu + - airoha,an7583-npu reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/amd,xgbe-seattle-v1a.yaml b/Documentation/devicetree/bindings/net/amd,xgbe-seattle-v1a.yaml new file mode 100644 index 000000000000..006add8b6410 --- /dev/null +++ b/Documentation/devicetree/bindings/net/amd,xgbe-seattle-v1a.yaml @@ -0,0 +1,147 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/amd,xgbe-seattle-v1a.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: AMD XGBE Seattle v1a + +maintainers: + - Shyam Sundar S K <Shyam-sundar.S-k@amd.com> + +allOf: + - $ref: /schemas/net/ethernet-controller.yaml# + +properties: + compatible: + const: amd,xgbe-seattle-v1a + + reg: + items: + - description: MAC registers + - description: PCS registers + - description: SerDes Rx/Tx registers + - description: SerDes integration registers (1/2) + - description: SerDes integration registers (2/2) + + interrupts: + description: Device interrupts. The first entry is the general device + interrupt. If amd,per-channel-interrupt is specified, each DMA channel + interrupt must be specified. The last entry is the PCS auto-negotiation + interrupt. + minItems: 2 + maxItems: 6 + + clocks: + items: + - description: DMA clock for the device + - description: PTP clock for the device + + clock-names: + items: + - const: dma_clk + - const: ptp_clk + + iommus: + maxItems: 1 + + phy-mode: true + + dma-coherent: true + + amd,per-channel-interrupt: + description: Indicates that Rx and Tx complete will generate a unique + interrupt for each DMA channel. + type: boolean + + amd,speed-set: + description: > + Speed capabilities of the device. + 0 = 1GbE and 10GbE + 1 = 2.5GbE and 10GbE + $ref: /schemas/types.yaml#/definitions/uint32 + enum: [0, 1] + + amd,serdes-blwc: + description: Baseline wandering correction enablement for each speed. + $ref: /schemas/types.yaml#/definitions/uint32-array + minItems: 3 + maxItems: 3 + items: + enum: [0, 1] + + amd,serdes-cdr-rate: + description: CDR rate speed selection for each speed. + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - description: CDR rate for 1GbE + - description: CDR rate for 2.5GbE + - description: CDR rate for 10GbE + + amd,serdes-pq-skew: + description: PQ data sampling skew for each speed. + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - description: PQ skew for 1GbE + - description: PQ skew for 2.5GbE + - description: PQ skew for 10GbE + + amd,serdes-tx-amp: + description: TX amplitude boost for each speed. + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - description: TX amplitude for 1GbE + - description: TX amplitude for 2.5GbE + - description: TX amplitude for 10GbE + + amd,serdes-dfe-tap-config: + description: DFE taps available to run for each speed. + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - description: DFE taps available for 1GbE + - description: DFE taps available for 2.5GbE + - description: DFE taps available for 10GbE + + amd,serdes-dfe-tap-enable: + description: DFE taps to enable for each speed. + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - description: DFE taps to enable for 1GbE + - description: DFE taps to enable for 2.5GbE + - description: DFE taps to enable for 10GbE + +required: + - compatible + - reg + - interrupts + - clocks + - clock-names + - phy-mode + +unevaluatedProperties: false + +examples: + - | + ethernet@e0700000 { + compatible = "amd,xgbe-seattle-v1a"; + reg = <0xe0700000 0x80000>, + <0xe0780000 0x80000>, + <0xe1240800 0x00400>, + <0xe1250000 0x00060>, + <0xe1250080 0x00004>; + interrupts = <0 325 4>, + <0 326 1>, <0 327 1>, <0 328 1>, <0 329 1>, + <0 323 4>; + amd,per-channel-interrupt; + clocks = <&xgbe_dma_clk>, <&xgbe_ptp_clk>; + clock-names = "dma_clk", "ptp_clk"; + phy-mode = "xgmii"; + mac-address = [ 02 a1 a2 a3 a4 a5 ]; + amd,speed-set = <0>; + amd,serdes-blwc = <1>, <1>, <0>; + amd,serdes-cdr-rate = <2>, <2>, <7>; + amd,serdes-pq-skew = <10>, <10>, <30>; + amd,serdes-tx-amp = <15>, <15>, <10>; + amd,serdes-dfe-tap-config = <3>, <3>, <1>; + amd,serdes-dfe-tap-enable = <0>, <0>, <127>; + }; diff --git a/Documentation/devicetree/bindings/net/amd-xgbe.txt b/Documentation/devicetree/bindings/net/amd-xgbe.txt deleted file mode 100644 index 9c27dfcd1133..000000000000 --- a/Documentation/devicetree/bindings/net/amd-xgbe.txt +++ /dev/null @@ -1,76 +0,0 @@ -* AMD 10GbE driver (amd-xgbe) - -Required properties: -- compatible: Should be "amd,xgbe-seattle-v1a" -- reg: Address and length of the register sets for the device - - MAC registers - - PCS registers - - SerDes Rx/Tx registers - - SerDes integration registers (1/2) - - SerDes integration registers (2/2) -- interrupts: Should contain the amd-xgbe interrupt(s). The first interrupt - listed is required and is the general device interrupt. If the optional - amd,per-channel-interrupt property is specified, then one additional - interrupt for each DMA channel supported by the device should be specified. - The last interrupt listed should be the PCS auto-negotiation interrupt. -- clocks: - - DMA clock for the amd-xgbe device (used for calculating the - correct Rx interrupt watchdog timer value on a DMA channel - for coalescing) - - PTP clock for the amd-xgbe device -- clock-names: Should be the names of the clocks - - "dma_clk" for the DMA clock - - "ptp_clk" for the PTP clock -- phy-mode: See ethernet.txt file in the same directory - -Optional properties: -- dma-coherent: Present if dma operations are coherent -- amd,per-channel-interrupt: Indicates that Rx and Tx complete will generate - a unique interrupt for each DMA channel - this requires an additional - interrupt be configured for each DMA channel -- amd,speed-set: Speed capabilities of the device - 0 - 1GbE and 10GbE (default) - 1 - 2.5GbE and 10GbE - -The MAC address will be determined using the optional properties defined in -ethernet.txt. - -The following optional properties are represented by an array with each -value corresponding to a particular speed. The first array value represents -the setting for the 1GbE speed, the second value for the 2.5GbE speed and -the third value for the 10GbE speed. All three values are required if the -property is used. -- amd,serdes-blwc: Baseline wandering correction enablement - 0 - Off - 1 - On -- amd,serdes-cdr-rate: CDR rate speed selection -- amd,serdes-pq-skew: PQ (data sampling) skew -- amd,serdes-tx-amp: TX amplitude boost -- amd,serdes-dfe-tap-config: DFE taps available to run -- amd,serdes-dfe-tap-enable: DFE taps to enable - -Example: - xgbe@e0700000 { - compatible = "amd,xgbe-seattle-v1a"; - reg = <0 0xe0700000 0 0x80000>, - <0 0xe0780000 0 0x80000>, - <0 0xe1240800 0 0x00400>, - <0 0xe1250000 0 0x00060>, - <0 0xe1250080 0 0x00004>; - interrupt-parent = <&gic>; - interrupts = <0 325 4>, - <0 326 1>, <0 327 1>, <0 328 1>, <0 329 1>, - <0 323 4>; - amd,per-channel-interrupt; - clocks = <&xgbe_dma_clk>, <&xgbe_ptp_clk>; - clock-names = "dma_clk", "ptp_clk"; - phy-mode = "xgmii"; - mac-address = [ 02 a1 a2 a3 a4 a5 ]; - amd,speed-set = <0>; - amd,serdes-blwc = <1>, <1>, <0>; - amd,serdes-cdr-rate = <2>, <2>, <7>; - amd,serdes-pq-skew = <10>, <10>, <30>; - amd,serdes-tx-amp = <15>, <15>, <10>; - amd,serdes-dfe-tap-config = <3>, <3>, <1>; - amd,serdes-dfe-tap-enable = <0>, <0>, <127>; - }; diff --git a/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml b/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml index d6ef468495c5..a105dc07ed12 100644 --- a/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml +++ b/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml @@ -19,7 +19,12 @@ allOf: properties: compatible: - const: aspeed,ast2600-mdio + oneOf: + - const: aspeed,ast2600-mdio + - items: + - enum: + - aspeed,ast2700-mdio + - const: aspeed,ast2600-mdio reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/bluetooth/marvell,sd8897-bt.yaml b/Documentation/devicetree/bindings/net/bluetooth/marvell,sd8897-bt.yaml new file mode 100644 index 000000000000..a307c64cfa4d --- /dev/null +++ b/Documentation/devicetree/bindings/net/bluetooth/marvell,sd8897-bt.yaml @@ -0,0 +1,79 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/bluetooth/marvell,sd8897-bt.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Marvell 8897/8997 (sd8897/sd8997) bluetooth devices (SDIO) + +maintainers: + - Ariel D'Alessandro <ariel.dalessandro@collabora.com> + +allOf: + - $ref: /schemas/net/bluetooth/bluetooth-controller.yaml# + +properties: + compatible: + enum: + - marvell,sd8897-bt + - marvell,sd8997-bt + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + marvell,cal-data: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: + Calibration data downloaded to the device during initialization. + maxItems: 28 + + marvell,wakeup-pin: + $ref: /schemas/types.yaml#/definitions/uint16 + description: + Wakeup pin number of the bluetooth chip. Used by firmware to wakeup host + system. + + marvell,wakeup-gap-ms: + $ref: /schemas/types.yaml#/definitions/uint16 + description: + Wakeup latency of the host platform. Required by the chip sleep feature. + +required: + - compatible + - reg + - interrupts + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + mmc { + vmmc-supply = <&wlan_en_reg>; + bus-width = <4>; + cap-power-off-card; + keep-power-in-suspend; + + #address-cells = <1>; + #size-cells = <0>; + + bluetooth@2 { + compatible = "marvell,sd8897-bt"; + reg = <2>; + interrupt-parent = <&pio>; + interrupts = <119 IRQ_TYPE_LEVEL_LOW>; + + marvell,cal-data = /bits/ 8 < + 0x37 0x01 0x1c 0x00 0xff 0xff 0xff 0xff 0x01 0x7f 0x04 0x02 + 0x00 0x00 0xba 0xce 0xc0 0xc6 0x2d 0x00 0x00 0x00 0x00 0x00 + 0x00 0x00 0xf0 0x00>; + marvell,wakeup-pin = /bits/ 16 <0x0d>; + marvell,wakeup-gap-ms = /bits/ 16 <0x64>; + }; + }; + +... diff --git a/Documentation/devicetree/bindings/net/btusb.txt b/Documentation/devicetree/bindings/net/btusb.txt index f546b1f7dd6d..a68022a57c51 100644 --- a/Documentation/devicetree/bindings/net/btusb.txt +++ b/Documentation/devicetree/bindings/net/btusb.txt @@ -14,7 +14,7 @@ Required properties: Also, vendors that use btusb may have device additional properties, e.g: -Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt +Documentation/devicetree/bindings/net/bluetooth/marvell,sd8897-bt.yaml Optional properties: diff --git a/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml b/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml index 61ef60d8f1c7..2c9d37975bed 100644 --- a/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml +++ b/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml @@ -109,6 +109,26 @@ properties: maximum: 32 minItems: 1 + pinctrl-0: + description: Default pinctrl state + + pinctrl-1: + description: Can be "sleep" or "wakeup" pinctrl state + + pinctrl-2: + description: Can be "sleep" or "wakeup" pinctrl state + + pinctrl-names: + description: + When present should contain at least "default" describing the default pin + states. Other states are "sleep" which describes the pinstate when + sleeping and "wakeup" describing the pins if wakeup is enabled. + minItems: 1 + items: + - const: default + - enum: [ sleep, wakeup ] + - const: wakeup + power-domains: description: Power domain provider node and an args specifier containing @@ -125,6 +145,11 @@ properties: minItems: 1 maxItems: 2 + wakeup-source: + $ref: /schemas/types.yaml#/definitions/phandle-array + description: + List of phandles to system idle states in which mcan can wakeup the system. + required: - compatible - reg diff --git a/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml b/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml index c155c9c6db39..2d13638ebc6a 100644 --- a/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml +++ b/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml @@ -49,6 +49,11 @@ properties: Must be half or less of "clocks" frequency. maximum: 20000000 + gpio-controller: true + + "#gpio-cells": + const: 2 + required: - compatible - reg diff --git a/Documentation/devicetree/bindings/net/can/microchip,mpfs-can.yaml b/Documentation/devicetree/bindings/net/can/microchip,mpfs-can.yaml index 1219c5cb601f..519a11fbe972 100644 --- a/Documentation/devicetree/bindings/net/can/microchip,mpfs-can.yaml +++ b/Documentation/devicetree/bindings/net/can/microchip,mpfs-can.yaml @@ -32,11 +32,15 @@ properties: - description: AHB peripheral clock - description: CAN bus clock + resets: + maxItems: 1 + required: - compatible - reg - interrupts - clocks + - resets additionalProperties: false @@ -46,6 +50,7 @@ examples: compatible = "microchip,mpfs-can"; reg = <0x2010c000 0x1000>; clocks = <&clkcfg 17>, <&clkcfg 37>; + resets = <&clkcfg 17>; interrupt-parent = <&plic>; interrupts = <56>; }; diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml index 1029786a855c..cb14c35ba996 100644 --- a/Documentation/devicetree/bindings/net/cdns,macb.yaml +++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml @@ -38,7 +38,10 @@ properties: - cdns,sam9x60-macb # Microchip sam9x60 SoC - microchip,mpfs-macb # Microchip PolarFire SoC - const: cdns,macb # Generic - + - items: + - const: microchip,pic64gx-macb # Microchip PIC64GX SoC + - const: microchip,mpfs-macb # Microchip PolarFire SoC + - const: cdns,macb # Generic - items: - enum: - atmel,sama5d3-macb # 10/100Mbit IP on Atmel sama5d3 SoCs @@ -47,18 +50,19 @@ properties: - const: cdns,macb # Generic - enum: - - atmel,sama5d29-gem # GEM XL IP (10/100) on Atmel sama5d29 SoCs - atmel,sama5d2-gem # GEM IP (10/100) on Atmel sama5d2 SoCs + - atmel,sama5d29-gem # GEM XL IP (10/100) on Atmel sama5d29 SoCs - atmel,sama5d3-gem # Gigabit IP on Atmel sama5d3 SoCs - atmel,sama5d4-gem # GEM IP (10/100) on Atmel sama5d4 SoCs + - cdns,emac # Generic + - cdns,gem # Generic + - cdns,macb # Generic - cdns,np4-macb # NP4 SoC devices - microchip,sama7g5-emac # Microchip SAMA7G5 ethernet interface - microchip,sama7g5-gem # Microchip SAMA7G5 gigabit ethernet interface + - mobileye,eyeq5-gem # Mobileye EyeQ5 SoCs - raspberrypi,rp1-gem # Raspberry Pi RP1 gigabit ethernet interface - sifive,fu540-c000-gem # SiFive FU540-C000 SoC - - cdns,emac # Generic - - cdns,gem # Generic - - cdns,macb # Generic - items: - enum: @@ -183,6 +187,15 @@ allOf: reg: maxItems: 1 + - if: + properties: + compatible: + contains: + const: mobileye,eyeq5-gem + then: + required: + - phys + unevaluatedProperties: false examples: diff --git a/Documentation/devicetree/bindings/net/dsa/lantiq,gswip.yaml b/Documentation/devicetree/bindings/net/dsa/lantiq,gswip.yaml index f3154b19af78..205b683849a5 100644 --- a/Documentation/devicetree/bindings/net/dsa/lantiq,gswip.yaml +++ b/Documentation/devicetree/bindings/net/dsa/lantiq,gswip.yaml @@ -4,10 +4,14 @@ $id: http://devicetree.org/schemas/net/dsa/lantiq,gswip.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: Lantiq GSWIP Ethernet switches +title: Lantiq GSWIP and MaxLinear GSW1xx Ethernet switches -allOf: - - $ref: dsa.yaml#/$defs/ethernet-ports +description: + Lantiq GSWIP and MaxLinear GSW1xx switches share the same hardware IP. + Lantiq switches are embedded in SoCs and accessed via memory-mapped I/O, + while MaxLinear switches are standalone ICs connected via MDIO. + +$ref: dsa.yaml# maintainers: - Hauke Mehrtens <hauke@hauke-m.de> @@ -18,9 +22,14 @@ properties: - lantiq,xrx200-gswip - lantiq,xrx300-gswip - lantiq,xrx330-gswip + - maxlinear,gsw120 + - maxlinear,gsw125 + - maxlinear,gsw140 + - maxlinear,gsw141 + - maxlinear,gsw145 reg: - minItems: 3 + minItems: 1 maxItems: 3 reg-names: @@ -37,9 +46,6 @@ properties: compatible: const: lantiq,xrx200-mdio - required: - - compatible - gphy-fw: type: object properties: @@ -91,10 +97,63 @@ properties: additionalProperties: false +patternProperties: + "^(ethernet-)?ports$": + type: object + patternProperties: + "^(ethernet-)?port@[0-6]$": + $ref: dsa-port.yaml# + unevaluatedProperties: false + + properties: + maxlinear,rmii-refclk-out: + type: boolean + description: + Configure the RMII reference clock to be a clock output + rather than an input. Only applicable for RMII mode. + tx-internal-delay-ps: + enum: [0, 500, 1000, 1500, 2000, 2500, 3000, 3500] + description: + RGMII TX Clock Delay defined in pico seconds. + The delay lines adjust the MII clock vs. data timing. + If this property is not present the delay is determined by + the interface mode. + rx-internal-delay-ps: + enum: [0, 500, 1000, 1500, 2000, 2500, 3000, 3500] + description: + RGMII RX Clock Delay defined in pico seconds. + The delay lines adjust the MII clock vs. data timing. + If this property is not present the delay is determined by + the interface mode. + required: - compatible - reg +allOf: + - if: + properties: + compatible: + contains: + enum: + - lantiq,xrx200-gswip + - lantiq,xrx300-gswip + - lantiq,xrx330-gswip + then: + properties: + reg: + minItems: 3 + maxItems: 3 + mdio: + required: + - compatible + else: + properties: + reg: + maxItems: 1 + reg-names: false + gphy-fw: false + unevaluatedProperties: false examples: @@ -113,8 +172,10 @@ examples: port@0 { reg = <0>; label = "lan3"; - phy-mode = "rgmii"; + phy-mode = "rgmii-id"; phy-handle = <&phy0>; + tx-internal-delay-ps = <2000>; + rx-internal-delay-ps = <2000>; }; port@1 { @@ -200,3 +261,90 @@ examples: }; }; }; + + - | + #include <dt-bindings/leds/common.h> + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + switch@1f { + compatible = "maxlinear,gsw125"; + reg = <0x1f>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + label = "lan0"; + phy-handle = <&switchphy0>; + phy-mode = "internal"; + }; + + port@1 { + reg = <1>; + label = "lan1"; + phy-handle = <&switchphy1>; + phy-mode = "internal"; + }; + + port@4 { + reg = <4>; + label = "wan"; + phy-mode = "1000base-x"; + managed = "in-band-status"; + }; + + port@5 { + reg = <5>; + phy-mode = "rgmii-id"; + tx-internal-delay-ps = <2000>; + rx-internal-delay-ps = <2000>; + ethernet = <ð0>; + + fixed-link { + speed = <1000>; + full-duplex; + }; + }; + }; + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + switchphy0: switchphy@0 { + reg = <0>; + + leds { + #address-cells = <1>; + #size-cells = <0>; + + led@0 { + reg = <0>; + color = <LED_COLOR_ID_GREEN>; + function = LED_FUNCTION_LAN; + }; + }; + }; + + switchphy1: switchphy@1 { + reg = <1>; + + leds { + #address-cells = <1>; + #size-cells = <0>; + + led@0 { + reg = <0>; + color = <LED_COLOR_ID_GREEN>; + function = LED_FUNCTION_LAN; + }; + }; + }; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/net/dsa/motorcomm,yt921x.yaml b/Documentation/devicetree/bindings/net/dsa/motorcomm,yt921x.yaml new file mode 100644 index 000000000000..33a6552e46fc --- /dev/null +++ b/Documentation/devicetree/bindings/net/dsa/motorcomm,yt921x.yaml @@ -0,0 +1,167 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/dsa/motorcomm,yt921x.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Motorcomm YT921x Ethernet switch family + +maintainers: + - David Yang <mmyangfl@gmail.com> + +description: | + The Motorcomm YT921x series is a family of Ethernet switches with up to 8 + internal GbE PHYs and up to 2 GMACs, including: + + - YT9215S / YT9215RB / YT9215SC: 5 GbE PHYs (Port 0-4) + 2 GMACs (Port 8-9) + - YT9213NB: 2 GbE PHYs (Port 1/3) + 1 GMAC (Port 9) + - YT9214NB: 2 GbE PHYs (Port 1/3) + 2 GMACs (Port 8-9) + - YT9218N: 8 GbE PHYs (Port 0-7) + - YT9218MB: 8 GbE PHYs (Port 0-7) + 2 GMACs (Port 8-9) + + Any port can be used as the CPU port. + +properties: + compatible: + const: motorcomm,yt9215 + + reg: + enum: [0x0, 0x1d] + + reset-gpios: + maxItems: 1 + + mdio: + $ref: /schemas/net/mdio.yaml# + unevaluatedProperties: false + description: + Internal MDIO bus for the internal GbE PHYs. PHY 0-7 are used for Port + 0-7 respectively. + + mdio-external: + $ref: /schemas/net/mdio.yaml# + unevaluatedProperties: false + description: + External MDIO bus to access external components. External PHYs for GMACs + (Port 8-9) are expected to be connected to the external MDIO bus in + vendor's reference design, but that is not a hard limitation from the + chip. + +required: + - compatible + - reg + +allOf: + - $ref: dsa.yaml#/$defs/ethernet-ports + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + switch@1d { + compatible = "motorcomm,yt9215"; + /* default 0x1d, alternate 0x0 */ + reg = <0x1d>; + reset-gpios = <&tlmm 39 GPIO_ACTIVE_LOW>; + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + sw_phy0: phy@0 { + reg = <0x0>; + }; + + sw_phy1: phy@1 { + reg = <0x1>; + }; + + sw_phy2: phy@2 { + reg = <0x2>; + }; + + sw_phy3: phy@3 { + reg = <0x3>; + }; + + sw_phy4: phy@4 { + reg = <0x4>; + }; + }; + + mdio-external { + #address-cells = <1>; + #size-cells = <0>; + + phy1: phy@b { + reg = <0xb>; + }; + }; + + ethernet-ports { + #address-cells = <1>; + #size-cells = <0>; + + ethernet-port@0 { + reg = <0>; + label = "lan1"; + phy-mode = "internal"; + phy-handle = <&sw_phy0>; + }; + + ethernet-port@1 { + reg = <1>; + label = "lan2"; + phy-mode = "internal"; + phy-handle = <&sw_phy1>; + }; + + ethernet-port@2 { + reg = <2>; + label = "lan3"; + phy-mode = "internal"; + phy-handle = <&sw_phy2>; + }; + + ethernet-port@3 { + reg = <3>; + label = "lan4"; + phy-mode = "internal"; + phy-handle = <&sw_phy3>; + }; + + ethernet-port@4 { + reg = <4>; + label = "lan5"; + phy-mode = "internal"; + phy-handle = <&sw_phy4>; + }; + + /* CPU port */ + ethernet-port@8 { + reg = <8>; + phy-mode = "2500base-x"; + ethernet = <ð0>; + + fixed-link { + speed = <2500>; + full-duplex; + }; + }; + + /* if external phy is connected to a MAC */ + ethernet-port@9 { + reg = <9>; + label = "wan"; + phy-mode = "rgmii-id"; + phy-handle = <&phy1>; + }; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml b/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml index e9dd914b0734..607b7fe8d28e 100644 --- a/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml +++ b/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml @@ -41,6 +41,9 @@ properties: therefore discouraged. maxItems: 1 + clocks: + maxItems: 1 + spi-cpha: true spi-cpol: true diff --git a/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml b/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml new file mode 100644 index 000000000000..91e8cd1db67b --- /dev/null +++ b/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml @@ -0,0 +1,129 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/eswin,eic7700-eth.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Eswin EIC7700 SOC Eth Controller + +maintainers: + - Shuang Liang <liangshuang@eswincomputing.com> + - Zhi Li <lizhi2@eswincomputing.com> + - Shangjuan Wei <weishangjuan@eswincomputing.com> + +description: + Platform glue layer implementation for STMMAC Ethernet driver. + +select: + properties: + compatible: + contains: + enum: + - eswin,eic7700-qos-eth + required: + - compatible + +allOf: + - $ref: snps,dwmac.yaml# + +properties: + compatible: + items: + - const: eswin,eic7700-qos-eth + - const: snps,dwmac-5.20 + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + interrupt-names: + const: macirq + + clocks: + items: + - description: AXI clock + - description: Configuration clock + - description: GMAC main clock + - description: Tx clock + + clock-names: + items: + - const: axi + - const: cfg + - const: stmmaceth + - const: tx + + resets: + maxItems: 1 + + reset-names: + items: + - const: stmmaceth + + rx-internal-delay-ps: + enum: [0, 200, 600, 1200, 1600, 1800, 2000, 2200, 2400] + + tx-internal-delay-ps: + enum: [0, 200, 600, 1200, 1600, 1800, 2000, 2200, 2400] + + eswin,hsp-sp-csr: + description: + HSP CSR is to control and get status of different high-speed peripherals + (such as Ethernet, USB, SATA, etc.) via register, which can tune + board-level's parameters of PHY, etc. + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + - items: + - description: Phandle to HSP(High-Speed Peripheral) device + - description: Offset of phy control register for internal + or external clock selection + - description: Offset of AXI clock controller Low-Power request + register + - description: Offset of register controlling TX/RX clock delay + +required: + - compatible + - reg + - clocks + - clock-names + - interrupts + - interrupt-names + - phy-mode + - resets + - reset-names + - rx-internal-delay-ps + - tx-internal-delay-ps + - eswin,hsp-sp-csr + +unevaluatedProperties: false + +examples: + - | + ethernet@50400000 { + compatible = "eswin,eic7700-qos-eth", "snps,dwmac-5.20"; + reg = <0x50400000 0x10000>; + clocks = <&d0_clock 186>, <&d0_clock 171>, <&d0_clock 40>, + <&d0_clock 193>; + clock-names = "axi", "cfg", "stmmaceth", "tx"; + interrupt-parent = <&plic>; + interrupts = <61>; + interrupt-names = "macirq"; + phy-mode = "rgmii-id"; + phy-handle = <&phy0>; + resets = <&reset 95>; + reset-names = "stmmaceth"; + rx-internal-delay-ps = <200>; + tx-internal-delay-ps = <200>; + eswin,hsp-sp-csr = <&hsp_sp_csr 0x100 0x108 0x118>; + snps,axi-config = <&stmmac_axi_setup>; + snps,aal; + snps,fixed-burst; + snps,tso; + stmmac_axi_setup: stmmac-axi-config { + snps,blen = <0 0 0 0 16 8 4>; + snps,rd_osr_lmt = <2>; + snps,wr_osr_lmt = <2>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml index 2ec2d9fda7e3..bb4c49fc5fd8 100644 --- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml @@ -35,9 +35,13 @@ properties: description: PHYs that implement IEEE802.3 clause 45 - pattern: "^ethernet-phy-id[a-f0-9]{4}\\.[a-f0-9]{4}$" description: - If the PHY reports an incorrect ID (or none at all) then the - compatible list may contain an entry with the correct PHY ID - in the above form. + PHYs contain identification registers. These will be read to + identify the PHY. If the PHY reports an incorrect ID, or the + PHY requires a specific initialization sequence (like a + particular order of clocks, resets, power supplies), in + order to be able to read the ID registers, then the + compatible list must contain an entry with the correct PHY + ID in the above form. The first group of digits is the 16 bit Phy Identifier 1 register, this is the chip vendor OUI bits 3:18. The second group of digits is the Phy Identifier 2 register, diff --git a/Documentation/devicetree/bindings/net/fsl,enetc.yaml b/Documentation/devicetree/bindings/net/fsl,enetc.yaml index ca70f0050171..aac20ab72ace 100644 --- a/Documentation/devicetree/bindings/net/fsl,enetc.yaml +++ b/Documentation/devicetree/bindings/net/fsl,enetc.yaml @@ -27,6 +27,7 @@ properties: - const: fsl,enetc - enum: - pci1131,e101 + - pci1131,e110 reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt b/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt deleted file mode 100644 index 957e5e5c2927..000000000000 --- a/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt +++ /dev/null @@ -1,83 +0,0 @@ -Marvell 8897/8997 (sd8897/sd8997) bluetooth devices (SDIO or USB based) ------- -The 8997 devices supports multiple interfaces. When used on SDIO interfaces, -the btmrvl driver is used and when used on USB interface, the btusb driver is -used. - -Required properties: - - - compatible : should be one of the following: - * "marvell,sd8897-bt" (for SDIO) - * "marvell,sd8997-bt" (for SDIO) - * "usb1286,204e" (for USB) - -Optional properties: - - - marvell,cal-data: Calibration data downloaded to the device during - initialization. This is an array of 28 values(u8). - This is only applicable to SDIO devices. - - - marvell,wakeup-pin: It represents wakeup pin number of the bluetooth chip. - firmware will use the pin to wakeup host system (u16). - - marvell,wakeup-gap-ms: wakeup gap represents wakeup latency of the host - platform. The value will be configured to firmware. This - is needed to work chip's sleep feature as expected (u16). - - interrupt-names: Used only for USB based devices (See below) - - interrupts : specifies the interrupt pin number to the cpu. For SDIO, the - driver will use the first interrupt specified in the interrupt - array. For USB based devices, the driver will use the interrupt - named "wakeup" from the interrupt-names and interrupt arrays. - The driver will request an irq based on this interrupt number. - During system suspend, the irq will be enabled so that the - bluetooth chip can wakeup host platform under certain - conditions. During system resume, the irq will be disabled - to make sure unnecessary interrupt is not received. - -Example: - -IRQ pin 119 is used as system wakeup source interrupt. -wakeup pin 13 and gap 100ms are configured so that firmware can wakeup host -using this device side pin and wakeup latency. - -Example for SDIO device follows (calibration data is also available in -below example). - -&mmc3 { - vmmc-supply = <&wlan_en_reg>; - bus-width = <4>; - cap-power-off-card; - keep-power-in-suspend; - - #address-cells = <1>; - #size-cells = <0>; - btmrvl: bluetooth@2 { - compatible = "marvell,sd8897-bt"; - reg = <2>; - interrupt-parent = <&pio>; - interrupts = <119 IRQ_TYPE_LEVEL_LOW>; - - marvell,cal-data = /bits/ 8 < - 0x37 0x01 0x1c 0x00 0xff 0xff 0xff 0xff 0x01 0x7f 0x04 0x02 - 0x00 0x00 0xba 0xce 0xc0 0xc6 0x2d 0x00 0x00 0x00 0x00 0x00 - 0x00 0x00 0xf0 0x00>; - marvell,wakeup-pin = /bits/ 16 <0x0d>; - marvell,wakeup-gap-ms = /bits/ 16 <0x64>; - }; -}; - -Example for USB device: - -&usb_host1_ohci { - #address-cells = <1>; - #size-cells = <0>; - - mvl_bt1: bt@1 { - compatible = "usb1286,204e"; - reg = <1>; - interrupt-parent = <&gpio0>; - interrupt-names = "wakeup"; - interrupts = <119 IRQ_TYPE_LEVEL_LOW>; - marvell,wakeup-pin = /bits/ 16 <0x0d>; - marvell,wakeup-gap-ms = /bits/ 16 <0x64>; - }; -}; diff --git a/Documentation/devicetree/bindings/net/mediatek,net.yaml b/Documentation/devicetree/bindings/net/mediatek,net.yaml index b45f67f92e80..cc346946291a 100644 --- a/Documentation/devicetree/bindings/net/mediatek,net.yaml +++ b/Documentation/devicetree/bindings/net/mediatek,net.yaml @@ -112,7 +112,7 @@ properties: mediatek,wed: $ref: /schemas/types.yaml#/definitions/phandle-array - minItems: 2 + minItems: 1 maxItems: 2 items: maxItems: 1 @@ -249,6 +249,9 @@ allOf: minItems: 1 maxItems: 1 + mediatek,wed: + minItems: 2 + mediatek,wed-pcie: false else: properties: @@ -338,12 +341,13 @@ allOf: - const: netsys0 - const: netsys1 - mediatek,infracfg: false - mediatek,sgmiisys: minItems: 2 maxItems: 2 + mediatek,wed: + maxItems: 1 + - if: properties: compatible: @@ -385,6 +389,9 @@ allOf: minItems: 2 maxItems: 2 + mediatek,wed: + minItems: 2 + - if: properties: compatible: @@ -429,6 +436,19 @@ allOf: - const: xgp2 - const: xgp3 + mediatek,wed: + minItems: 2 + + - if: + properties: + compatible: + contains: + const: ralink,rt5350-eth + then: + properties: + mediatek,wed: + minItems: 2 + patternProperties: "^mac@[0-2]$": type: object diff --git a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt deleted file mode 100644 index 0a3647fe331b..000000000000 --- a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt +++ /dev/null @@ -1,73 +0,0 @@ -* Microsemi - vsc8531 Giga bit ethernet phy - -Optional properties: -- vsc8531,vddmac : The vddmac in mV. Allowed values is listed - in the first row of Table 1 (below). - This property is only used in combination - with the 'edge-slowdown' property. - Default value is 3300. -- vsc8531,edge-slowdown : % the edge should be slowed down relative to - the fastest possible edge time. - Edge rate sets the drive strength of the MAC - interface output signals. Changing the - drive strength will affect the edge rate of - the output signal. The goal of this setting - is to help reduce electrical emission (EMI) - by being able to reprogram drive strength - and in effect slow down the edge rate if - desired. - To adjust the edge-slowdown, the 'vddmac' - must be specified. Table 1 lists the - supported edge-slowdown values for a given - 'vddmac'. - Default value is 0%. - Ref: Table:1 - Edge rate change (below). -- vsc8531,led-[N]-mode : LED mode. Specify how the LED[N] should behave. - N depends on the number of LEDs supported by a - PHY. - Allowed values are defined in - "include/dt-bindings/net/mscc-phy-vsc8531.h". - Default values are VSC8531_LINK_1000_ACTIVITY (1), - VSC8531_LINK_100_ACTIVITY (2), - VSC8531_LINK_ACTIVITY (0) and - VSC8531_DUPLEX_COLLISION (8). -- load-save-gpios : GPIO used for the load/save operation of the PTP - hardware clock (PHC). - - -Table: 1 - Edge rate change -----------------------------------------------------------------| -| Edge Rate Change (VDDMAC) | -| | -| 3300 mV 2500 mV 1800 mV 1500 mV | -|---------------------------------------------------------------| -| 0% 0% 0% 0% | -| (Fastest) (recommended) (recommended) | -|---------------------------------------------------------------| -| 2% 3% 5% 6% | -|---------------------------------------------------------------| -| 4% 6% 9% 14% | -|---------------------------------------------------------------| -| 7% 10% 16% 21% | -|(recommended) (recommended) | -|---------------------------------------------------------------| -| 10% 14% 23% 29% | -|---------------------------------------------------------------| -| 17% 23% 35% 42% | -|---------------------------------------------------------------| -| 29% 37% 52% 58% | -|---------------------------------------------------------------| -| 53% 63% 76% 77% | -| (slowest) | -|---------------------------------------------------------------| - -Example: - - vsc8531_0: ethernet-phy@0 { - compatible = "ethernet-phy-id0007.0570"; - vsc8531,vddmac = <3300>; - vsc8531,edge-slowdown = <7>; - vsc8531,led-0-mode = <VSC8531_LINK_1000_ACTIVITY>; - vsc8531,led-1-mode = <VSC8531_LINK_100_ACTIVITY>; - load-save-gpios = <&gpio 10 GPIO_ACTIVE_HIGH>; - }; diff --git a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.yaml b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.yaml new file mode 100644 index 000000000000..0afbd0ff126f --- /dev/null +++ b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.yaml @@ -0,0 +1,131 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/mscc-phy-vsc8531.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Microsemi VSC8531 Gigabit Ethernet PHY + +maintainers: + - Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> + +description: + The VSC8531 is a Gigabit Ethernet PHY with configurable MAC interface + drive strength and LED modes. + +allOf: + - $ref: ethernet-phy.yaml# + +select: + properties: + compatible: + contains: + enum: + - ethernet-phy-id0007.0570 # VSC8531 + - ethernet-phy-id0007.0772 # VSC8541 + required: + - compatible + +properties: + compatible: + items: + - enum: + - ethernet-phy-id0007.0570 # VSC8531 + - ethernet-phy-id0007.0772 # VSC8541 + - const: ethernet-phy-ieee802.3-c22 + + vsc8531,vddmac: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + The VDDMAC voltage in millivolts. This property is used in combination + with the edge-slowdown property to control the drive strength of the + MAC interface output signals. + enum: [3300, 2500, 1800, 1500] + default: 3300 + + vsc8531,edge-slowdown: + $ref: /schemas/types.yaml#/definitions/uint32 + description: > + Percentage by which the edge rate should be slowed down relative to + the fastest possible edge time. This setting helps reduce electromagnetic + interference (EMI) by adjusting the drive strength of the MAC interface + output signals. Valid values depend on the vddmac voltage setting + according to the edge rate change table in the datasheet. + + - When vsc8531,vddmac = 3300 mV: allowed values are 0, 2, 4, 7, 10, 17, 29, and 53. + (Recommended: 7) + - When vsc8531,vddmac = 2500 mV: allowed values are 0, 3, 6, 10, 14, 23, 37, and 63. + (Recommended: 10) + - When vsc8531,vddmac = 1800 mV: allowed values are 0, 5, 9, 16, 23, 35, 52, and 76. + (Recommended: 0) + - When vsc8531,vddmac = 1500 mV: allowed values are 0, 6, 14, 21, 29, 42, 58, and 77. + (Recommended: 0) + enum: [0, 2, 3, 4, 5, 6, 7, 9, 10, 14, 16, 17, 21, 23, 29, 35, 37, 42, 52, 53, 58, 63, 76, 77] + default: 0 + + vsc8531,led-0-mode: + $ref: /schemas/types.yaml#/definitions/uint32 + description: LED[0] behavior mode. See include/dt-bindings/net/mscc-phy-vsc8531.h + for available modes. + minimum: 0 + maximum: 15 + default: 1 + + vsc8531,led-1-mode: + $ref: /schemas/types.yaml#/definitions/uint32 + description: LED[1] behavior mode. See include/dt-bindings/net/mscc-phy-vsc8531.h + for available modes. + minimum: 0 + maximum: 15 + default: 2 + + vsc8531,led-2-mode: + $ref: /schemas/types.yaml#/definitions/uint32 + description: LED[2] behavior mode. See include/dt-bindings/net/mscc-phy-vsc8531.h + for available modes. + minimum: 0 + maximum: 15 + default: 0 + + vsc8531,led-3-mode: + $ref: /schemas/types.yaml#/definitions/uint32 + description: LED[3] behavior mode. See include/dt-bindings/net/mscc-phy-vsc8531.h + for available modes. + minimum: 0 + maximum: 15 + default: 8 + + load-save-gpios: + description: GPIO phandle used for the load/save operation of the PTP hardware + clock (PHC). + maxItems: 1 + +dependencies: + vsc8531,edge-slowdown: + - vsc8531,vddmac + +required: + - compatible + - reg + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + #include <dt-bindings/net/mscc-phy-vsc8531.h> + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + ethernet-phy@0 { + compatible = "ethernet-phy-id0007.0772", "ethernet-phy-ieee802.3-c22"; + reg = <0>; + vsc8531,vddmac = <3300>; + vsc8531,edge-slowdown = <7>; + vsc8531,led-0-mode = <VSC8531_LINK_1000_ACTIVITY>; + vsc8531,led-1-mode = <VSC8531_LINK_100_ACTIVITY>; + load-save-gpios = <&gpio 10 GPIO_ACTIVE_HIGH>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/nxp,netc-blk-ctrl.yaml b/Documentation/devicetree/bindings/net/nxp,netc-blk-ctrl.yaml index 97389fd5dbbf..deea4fd73d76 100644 --- a/Documentation/devicetree/bindings/net/nxp,netc-blk-ctrl.yaml +++ b/Documentation/devicetree/bindings/net/nxp,netc-blk-ctrl.yaml @@ -21,6 +21,7 @@ maintainers: properties: compatible: enum: + - nxp,imx94-netc-blk-ctrl - nxp,imx95-netc-blk-ctrl reg: diff --git a/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml b/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml index bb1ee3398655..0b3803f647b7 100644 --- a/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml +++ b/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml @@ -16,6 +16,7 @@ properties: compatible: enum: - ti,tps23881 + - ti,tps23881b reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml index e7ee0d9efed8..423959cb928d 100644 --- a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml @@ -73,6 +73,14 @@ properties: dma-coherent: true + interconnects: + maxItems: 2 + + interconnect-names: + items: + - const: cpu-mac + - const: mac-mem + phys: true phy-names: diff --git a/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml b/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml index 0ac7c4b47d6b..d17112527dab 100644 --- a/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml @@ -24,6 +24,7 @@ select: - rockchip,rk3366-gmac - rockchip,rk3368-gmac - rockchip,rk3399-gmac + - rockchip,rk3506-gmac - rockchip,rk3528-gmac - rockchip,rk3568-gmac - rockchip,rk3576-gmac @@ -50,6 +51,7 @@ properties: - rockchip,rv1108-gmac - items: - enum: + - rockchip,rk3506-gmac - rockchip,rk3528-gmac - rockchip,rk3568-gmac - rockchip,rk3576-gmac @@ -148,6 +150,7 @@ allOf: compatible: contains: enum: + - rockchip,rk3506-gmac - rockchip,rk3528-gmac then: properties: diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 658c004e6a5c..dd3c72e8363e 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -86,10 +86,14 @@ properties: - rockchip,rk3328-gmac - rockchip,rk3366-gmac - rockchip,rk3368-gmac + - rockchip,rk3399-gmac + - rockchip,rk3506-gmac + - rockchip,rk3528-gmac + - rockchip,rk3568-gmac - rockchip,rk3576-gmac - rockchip,rk3588-gmac - - rockchip,rk3399-gmac - rockchip,rv1108-gmac + - rockchip,rv1126-gmac - snps,dwmac - snps,dwmac-3.40a - snps,dwmac-3.50a diff --git a/Documentation/devicetree/bindings/net/sophgo,sg2044-dwmac.yaml b/Documentation/devicetree/bindings/net/sophgo,sg2044-dwmac.yaml index ce21979a2d9a..e8d3814db0e9 100644 --- a/Documentation/devicetree/bindings/net/sophgo,sg2044-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/sophgo,sg2044-dwmac.yaml @@ -70,6 +70,25 @@ required: allOf: - $ref: snps,dwmac.yaml# + - if: + properties: + compatible: + contains: + const: sophgo,sg2042-dwmac + then: + properties: + phy-mode: + enum: + - rgmii-rxid + - rgmii-id + else: + properties: + phy-mode: + enum: + - rgmii + - rgmii-rxid + - rgmii-txid + - rgmii-id unevaluatedProperties: false diff --git a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml index eabceb849537..ae6b97cdc44b 100644 --- a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml +++ b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml @@ -151,6 +151,12 @@ properties: - ETSI - JP + country: + $ref: /schemas/types.yaml#/definitions/string + pattern: '^[A-Z]{2}$' + description: + ISO 3166-1 alpha-2 country code for power limits + patternProperties: "^txpower-[256]g$": type: object @@ -210,6 +216,66 @@ properties: minItems: 13 maxItems: 13 + paths-cck: + $ref: /schemas/types.yaml#/definitions/uint8-array + minItems: 4 + maxItems: 4 + description: + 4 half-dBm backoff values (1 - 4 antennas, single spacial + stream) + + paths-ofdm: + $ref: /schemas/types.yaml#/definitions/uint8-array + minItems: 4 + maxItems: 4 + description: + 4 half-dBm backoff values (1 - 4 antennas, single spacial + stream) + + paths-ofdm-bf: + $ref: /schemas/types.yaml#/definitions/uint8-array + minItems: 4 + maxItems: 4 + description: + 4 half-dBm backoff values for beamforming + (1 - 4 antennas, single spacial stream) + + paths-ru: + $ref: /schemas/types.yaml#/definitions/uint8-matrix + description: + Sets of half-dBm backoff values for 802.11ax rates for + 1T1ss (aka 1 transmitting antenna with 1 spacial stream), + 2T1ss, 3T1ss, 4T1ss, 2T2ss, 3T2ss, 4T2ss, 3T3ss, 4T3ss + and 4T4ss. + Each set starts with the number of channel bandwidth or + resource unit settings for which the rate set applies, + followed by 10 power limit values. The order of the + channel resource unit settings is RU26, RU52, RU106, + RU242/SU20, RU484/SU40, RU996/SU80 and RU2x996/SU160. + minItems: 1 + maxItems: 7 + items: + minItems: 11 + maxItems: 11 + + paths-ru-bf: + $ref: /schemas/types.yaml#/definitions/uint8-matrix + description: + Sets of half-dBm backoff (beamforming) values for 802.11ax + rates for 1T1ss (aka 1 transmitting antenna with 1 spacial + stream), 2T1ss, 3T1ss, 4T1ss, 2T2ss, 3T2ss, 4T2ss, 3T3ss, + 4T3ss and 4T4ss. + Each set starts with the number of channel bandwidth or + resource unit settings for which the rate set applies, + followed by 10 power limit values. The order of the + channel resource unit settings is RU26, RU52, RU106, + RU242/SU20, RU484/SU40, RU996/SU80 and RU2x996/SU160. + minItems: 1 + maxItems: 7 + items: + minItems: 11 + maxItems: 11 + txs-delta: $ref: /schemas/types.yaml#/definitions/uint32-array description: diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml index 647746e6f75f..beab81e462d2 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.yaml +++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml @@ -20,7 +20,7 @@ patternProperties: "^(keypad|m25p|max8952|max8997|max8998|mpmc),.*": true "^(pciclass|pinctrl-single|#pinctrl-single|PowerPC),.*": true "^(pl022|pxa-mmc|rcar_sound|rotary-encoder|s5m8767|sdhci),.*": true - "^(simple-audio-card|st-plgpio|st-spics|ts),.*": true + "^(simple-audio-card|st-plgpio|st-spics|ts|vsc8531),.*": true "^pool[0-3],.*": true # Keep list in alphabetical order. diff --git a/Documentation/driver-api/dpll.rst b/Documentation/driver-api/dpll.rst index be1fc643b645..83118c728ed9 100644 --- a/Documentation/driver-api/dpll.rst +++ b/Documentation/driver-api/dpll.rst @@ -198,26 +198,28 @@ be requested with the same attribute with ``DPLL_CMD_DEVICE_SET`` command. ================================== ====================================== Device may also provide ability to adjust a signal phase on a pin. -If pin phase adjustment is supported, minimal and maximal values that pin -handle shall be provide to the user on ``DPLL_CMD_PIN_GET`` respond -with ``DPLL_A_PIN_PHASE_ADJUST_MIN`` and ``DPLL_A_PIN_PHASE_ADJUST_MAX`` +If pin phase adjustment is supported, minimal and maximal values and +granularity that pin handle shall be provided to the user on +``DPLL_CMD_PIN_GET`` respond with ``DPLL_A_PIN_PHASE_ADJUST_MIN``, +``DPLL_A_PIN_PHASE_ADJUST_MAX`` and ``DPLL_A_PIN_PHASE_ADJUST_GRAN`` attributes. Configured phase adjust value is provided with ``DPLL_A_PIN_PHASE_ADJUST`` attribute of a pin, and value change can be requested with the same attribute with ``DPLL_CMD_PIN_SET`` command. - =============================== ====================================== - ``DPLL_A_PIN_ID`` configured pin id - ``DPLL_A_PIN_PHASE_ADJUST_MIN`` attr minimum value of phase adjustment - ``DPLL_A_PIN_PHASE_ADJUST_MAX`` attr maximum value of phase adjustment - ``DPLL_A_PIN_PHASE_ADJUST`` attr configured value of phase - adjustment on parent dpll device - ``DPLL_A_PIN_PARENT_DEVICE`` nested attribute for requesting - configuration on given parent dpll - device - ``DPLL_A_PIN_PARENT_ID`` parent dpll device id - ``DPLL_A_PIN_PHASE_OFFSET`` attr measured phase difference - between a pin and parent dpll device - =============================== ====================================== + ================================ ========================================== + ``DPLL_A_PIN_ID`` configured pin id + ``DPLL_A_PIN_PHASE_ADJUST_GRAN`` attr granularity of phase adjustment value + ``DPLL_A_PIN_PHASE_ADJUST_MIN`` attr minimum value of phase adjustment + ``DPLL_A_PIN_PHASE_ADJUST_MAX`` attr maximum value of phase adjustment + ``DPLL_A_PIN_PHASE_ADJUST`` attr configured value of phase + adjustment on parent dpll device + ``DPLL_A_PIN_PARENT_DEVICE`` nested attribute for requesting + configuration on given parent dpll + device + ``DPLL_A_PIN_PARENT_ID`` parent dpll device id + ``DPLL_A_PIN_PHASE_OFFSET`` attr measured phase difference + between a pin and parent dpll device + ================================ ========================================== All phase related values are provided in pico seconds, which represents time difference between signals phase. The negative value means that @@ -384,6 +386,8 @@ according to attribute purpose. frequencies ``DPLL_A_PIN_ANY_FREQUENCY_MIN`` attr minimum value of frequency ``DPLL_A_PIN_ANY_FREQUENCY_MAX`` attr maximum value of frequency + ``DPLL_A_PIN_PHASE_ADJUST_GRAN`` attr granularity of phase + adjustment value ``DPLL_A_PIN_PHASE_ADJUST_MIN`` attr minimum value of phase adjustment ``DPLL_A_PIN_PHASE_ADJUST_MAX`` attr maximum value of phase diff --git a/Documentation/filesystems/ext4/inodes.rst b/Documentation/filesystems/ext4/inodes.rst index cfc6c1659931..55cd5c380e92 100644 --- a/Documentation/filesystems/ext4/inodes.rst +++ b/Documentation/filesystems/ext4/inodes.rst @@ -297,6 +297,8 @@ The ``i_flags`` field is a combination of these values: - Inode has inline data (EXT4_INLINE_DATA_FL). * - 0x20000000 - Create children with the same project ID (EXT4_PROJINHERIT_FL). + * - 0x40000000 + - Use case-insensitive lookups for directory contents (EXT4_CASEFOLD_FL). * - 0x80000000 - Reserved for ext4 library (EXT4_RESERVED_FL). * - diff --git a/Documentation/filesystems/ext4/super.rst b/Documentation/filesystems/ext4/super.rst index 1b240661bfa3..9a59cded9bd7 100644 --- a/Documentation/filesystems/ext4/super.rst +++ b/Documentation/filesystems/ext4/super.rst @@ -671,7 +671,9 @@ following: * - 0x8000 - Data in inode (INCOMPAT_INLINE_DATA). * - 0x10000 - - Encrypted inodes are present on the filesystem. (INCOMPAT_ENCRYPT). + - Encrypted inodes can be present. (INCOMPAT_ENCRYPT). + * - 0x20000 + - Directories can be marked case-insensitive. (INCOMPAT_CASEFOLD). .. _super_rocompat: diff --git a/Documentation/filesystems/gfs2-glocks.rst b/Documentation/filesystems/gfs2/glocks.rst index ce5ff08cbd59..ce5ff08cbd59 100644 --- a/Documentation/filesystems/gfs2-glocks.rst +++ b/Documentation/filesystems/gfs2/glocks.rst diff --git a/Documentation/filesystems/gfs2.rst b/Documentation/filesystems/gfs2/index.rst index 1bc48a13430c..e5e195403561 100644 --- a/Documentation/filesystems/gfs2.rst +++ b/Documentation/filesystems/gfs2/index.rst @@ -4,6 +4,9 @@ Global File System 2 ==================== +Overview +======== + GFS2 is a cluster file system. It allows a cluster of computers to simultaneously use a block device that is shared between them (with FC, iSCSI, NBD, etc). GFS2 reads and writes to the block device like a local @@ -50,3 +53,12 @@ The following man pages are available from gfs2-utils: gfs2_convert to convert a gfs filesystem to GFS2 in-place mkfs.gfs2 to make a filesystem ============ ============================================= + +Implementation Notes +==================== + +.. toctree:: + :maxdepth: 1 + + glocks + uevents diff --git a/Documentation/filesystems/gfs2-uevents.rst b/Documentation/filesystems/gfs2/uevents.rst index f162a2c76c69..f162a2c76c69 100644 --- a/Documentation/filesystems/gfs2-uevents.rst +++ b/Documentation/filesystems/gfs2/uevents.rst diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst index af516e528ded..f4873197587d 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -89,9 +89,7 @@ Documentation for filesystem implementations. ext3 ext4/index f2fs - gfs2 - gfs2-uevents - gfs2-glocks + gfs2/index hfs hfsplus hpfs diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index 55e727b5f12e..3d9233f403db 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -105,10 +105,8 @@ occur; this capability aids both strategies. TLDR; Show Me the Code! ----------------------- -Code is posted to the kernel.org git trees as follows: -`kernel changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-symlink>`_, -`userspace changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-scan-service>`_, and -`QA test changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=repair-dirs>`_. +Kernel and userspace code has been fully merged as of October 2025. + Each kernel patchset adding an online repair function will use the same branch name across the kernel, xfsprogs, and fstests git repos. @@ -764,12 +762,8 @@ allow the online fsck developers to compare online fsck against offline fsck, and they enable XFS developers to find deficiencies in the code base. Proposed patchsets include -`general fuzzer improvements -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuzzer-improvements>`_, `fuzzing baselines -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuzz-baseline>`_, -and `improvements in fuzz testing comprehensiveness -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=more-fuzz-testing>`_. +<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuzz-baseline>`_. Stress Testing -------------- @@ -801,11 +795,6 @@ Success is defined by the ability to run all of these tests without observing any unexpected filesystem shutdowns due to corrupted metadata, kernel hang check warnings, or any other sort of mischief. -Proposed patchsets include `general stress testing -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=race-scrub-and-mount-state-changes>`_ -and the `evolution of existing per-function stress testing -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=refactor-scrub-stress>`_. - 4. User Interface ================= @@ -886,10 +875,6 @@ apply as nice of a priority to IO and CPU scheduling as possible. This measure was taken to minimize delays in the rest of the filesystem. No such hardening has been performed for the cron job. -Proposed patchset: -`Enabling the xfs_scrub background service -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-scan-service>`_. - Health Reporting ---------------- @@ -912,13 +897,6 @@ notifications and initiate a repair? *Answer*: These questions remain unanswered, but should be a part of the conversation with early adopters and potential downstream users of XFS. -Proposed patchsets include -`wiring up health reports to correction returns -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=corruption-health-reports>`_ -and -`preservation of sickness info during memory reclaim -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting>`_. - 5. Kernel Algorithms and Data Structures ======================================== @@ -1310,21 +1288,6 @@ Space allocation records are cross-referenced as follows: are there the same number of reverse mapping records for each block as the reference count record claims? -Proposed patchsets are the series to find gaps in -`refcount btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-refcount-gaps>`_, -`inode btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-inobt-gaps>`_, and -`rmap btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-rmapbt-gaps>`_ records; -to find -`mergeable records -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-mergeable-records>`_; -and to -`improve cross referencing with rmap -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-strengthen-rmap-checking>`_ -before starting a repair. - Checking Extended Attributes ```````````````````````````` @@ -1756,10 +1719,6 @@ For scrub, the drain works as follows: To avoid polling in step 4, the drain provides a waitqueue for scrub threads to be woken up whenever the intent count drops to zero. -The proposed patchset is the -`scrub intent drain series -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-drain-intents>`_. - .. _jump_labels: Static Keys (aka Jump Label Patching) @@ -2036,10 +1995,6 @@ The ``xfarray_store_anywhere`` function is used to insert a record in any null record slot in the bag; and the ``xfarray_unset`` function removes a record from the bag. -The proposed patchset is the -`big in-memory array -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=big-array>`_. - Iterating Array Elements ^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2172,10 +2127,6 @@ However, it should be noted that these repair functions only use blob storage to cache a small number of entries before adding them to a temporary ondisk file, which is why compaction is not required. -The proposed patchset is at the start of the -`extended attribute repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-xattrs>`_ series. - .. _xfbtree: In-Memory B+Trees @@ -2214,11 +2165,6 @@ xfiles enables reuse of the entire btree library. Btrees built atop an xfile are collectively known as ``xfbtrees``. The next few sections describe how they actually work. -The proposed patchset is the -`in-memory btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=in-memory-btrees>`_ -series. - Using xfiles as a Buffer Cache Target ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2459,14 +2405,6 @@ This enables the log to release the old EFI to keep the log moving forwards. EFIs have a role to play during the commit and reaping phases; please see the next section and the section about :ref:`reaping<reaping>` for more details. -Proposed patchsets are the -`bitmap rework -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-bitmap-rework>`_ -and the -`preparation for bulk loading btrees -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-prep-for-bulk-loading>`_. - - Writing the New Tree ```````````````````` @@ -2623,11 +2561,6 @@ The number of records for the inode btree is the number of xfarray records, but the record count for the free inode btree has to be computed as inode chunk records are stored in the xfarray. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - Case Study: Rebuilding the Space Reference Counts ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2716,11 +2649,6 @@ Reverse mappings are added to the bag using ``xfarray_store_anywhere`` and removed via ``xfarray_unset``. Bag members are examined through ``xfarray_iter`` loops. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - Case Study: Rebuilding File Fork Mapping Indices ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2757,11 +2685,6 @@ EXTENTS format instead of BMBT, which may require a conversion. Third, the incore extent map must be reloaded carefully to avoid disturbing any delayed allocation extents. -The proposed patchset is the -`file mapping repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-file-mappings>`_ -series. - .. _reaping: Reaping Old Metadata Blocks @@ -2843,11 +2766,6 @@ blocks. As stated earlier, online repair functions use very large transactions to minimize the chances of this occurring. -The proposed patchset is the -`preparation for bulk loading btrees -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-prep-for-bulk-loading>`_ -series. - Case Study: Reaping After a Regular Btree Repair ```````````````````````````````````````````````` @@ -2943,11 +2861,6 @@ When the walk is complete, the bitmap disunion operation ``(ag_owner_bitmap & btrees. These blocks can then be reaped using the methods outlined above. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - .. _rmap_reap: Case Study: Reaping After Repairing Reverse Mapping Btrees @@ -2972,11 +2885,6 @@ methods outlined above. The rest of the process of rebuildng the reverse mapping btree is discussed in a separate :ref:`case study<rmap_repair>`. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - Case Study: Rebuilding the AGFL ``````````````````````````````` @@ -3024,11 +2932,6 @@ more complicated, because computing the correct value requires traversing the forks, or if that fails, leaving the fields invalid and waiting for the fork fsck functions to run. -The proposed patchset is the -`inode -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-inodes>`_ -repair series. - Quota Record Repairs -------------------- @@ -3045,11 +2948,6 @@ checking are obviously bad limits and timer values. Quota usage counters are checked, repaired, and discussed separately in the section about :ref:`live quotacheck <quotacheck>`. -The proposed patchset is the -`quota -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quota>`_ -repair series. - .. _fscounters: Freezing to Fix Summary Counters @@ -3145,11 +3043,6 @@ long enough to check and correct the summary counters. | This bug was fixed in Linux 5.17. | +--------------------------------------------------------------------------+ -The proposed patchset is the -`summary counter cleanup -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-fscounters>`_ -series. - Full Filesystem Scans --------------------- @@ -3277,15 +3170,6 @@ Second, if the incore inode is stuck in some intermediate state, the scan coordinator must release the AGI and push the main filesystem to get the inode back into a loadable state. -The proposed patches are the -`inode scanner -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-iscan>`_ -series. -The first user of the new functionality is the -`online quotacheck -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck>`_ -series. - Inode Management ```````````````` @@ -3381,12 +3265,6 @@ To capture these nuances, the online fsck code has a separate ``xchk_irele`` function to set or clear the ``DONTCACHE`` flag to get the required release behavior. -Proposed patchsets include fixing -`scrub iget usage -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-iget-fixes>`_ and -`dir iget usage -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-dir-iget-fixes>`_. - .. _ilocking: Locking Inodes @@ -3443,11 +3321,6 @@ If the dotdot entry changes while the directory is unlocked, then a move or rename operation must have changed the child's parentage, and the scan can exit early. -The proposed patchset is the -`directory repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-dirs>`_ -series. - .. _fshooks: Filesystem Hooks @@ -3594,11 +3467,6 @@ The inode scan APIs are pretty simple: - ``xchk_iscan_teardown`` to finish the scan -This functionality is also a part of the -`inode scanner -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-iscan>`_ -series. - .. _quotacheck: Case Study: Quota Counter Checking @@ -3686,11 +3554,6 @@ needing to hold any locks for a long duration. If repairs are desired, the real and shadow dquots are locked and their resource counts are set to the values in the shadow dquot. -The proposed patchset is the -`online quotacheck -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck>`_ -series. - .. _nlinks: Case Study: File Link Count Checking @@ -3744,11 +3607,6 @@ shadow information. If no parents are found, the file must be :ref:`reparented <orphanage>` to the orphanage to prevent the file from being lost forever. -The proposed patchset is the -`file link count repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-nlinks>`_ -series. - .. _rmap_repair: Case Study: Rebuilding Reverse Mapping Records @@ -3828,11 +3686,6 @@ scan for reverse mapping records. 12. Free the xfbtree now that it not needed. -The proposed patchset is the -`rmap repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-rmap-btree>`_ -series. - Staging Repairs with Temporary Files on Disk -------------------------------------------- @@ -3971,11 +3824,6 @@ Once a good copy of a data file has been constructed in a temporary file, it must be conveyed to the file being repaired, which is the topic of the next section. -The proposed patches are in the -`repair temporary files -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-tempfiles>`_ -series. - Logged File Content Exchanges ----------------------------- @@ -4025,11 +3873,6 @@ The new ``XFS_SB_FEAT_INCOMPAT_EXCHRANGE`` incompatible feature flag in the superblock protects these new log item records from being replayed on old kernels. -The proposed patchset is the -`file contents exchange -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-updates>`_ -series. - +--------------------------------------------------------------------------+ | **Sidebar: Using Log-Incompatible Feature Flags** | +--------------------------------------------------------------------------+ @@ -4323,11 +4166,6 @@ To repair the summary file, write the xfile contents into the temporary file and use atomic mapping exchange to commit the new contents. The temporary file is then reaped. -The proposed patchset is the -`realtime summary repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-rtsummary>`_ -series. - Case Study: Salvaging Extended Attributes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -4369,11 +4207,6 @@ Salvaging extended attributes is done as follows: 4. Reap the temporary file. -The proposed patchset is the -`extended attribute repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-xattrs>`_ -series. - Fixing Directories ------------------ @@ -4448,11 +4281,6 @@ Unfortunately, the current dentry cache design doesn't provide a means to walk every child dentry of a specific directory, which makes this a hard problem. There is no known solution. -The proposed patchset is the -`directory repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-dirs>`_ -series. - Parent Pointers ``````````````` @@ -4612,11 +4440,6 @@ a :ref:`directory entry live update hook <liveupdate>` as follows: 7. Reap the temporary directory. -The proposed patchset is the -`parent pointers directory repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=pptrs-fsck>`_ -series. - Case Study: Repairing Parent Pointers ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -4662,11 +4485,6 @@ directory reconstruction: 8. Reap the temporary file. -The proposed patchset is the -`parent pointers repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=pptrs-fsck>`_ -series. - Digression: Offline Checking of Parent Pointers ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -4755,11 +4573,6 @@ connectivity checks: 4. Move on to examining link counts, as we do today. -The proposed patchset is the -`offline parent pointers repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=pptrs-fsck>`_ -series. - Rebuilding directories from parent pointers in offline repair would be very challenging because xfs_repair currently uses two single-pass scans of the filesystem during phases 3 and 4 to decide which files are corrupt enough to be @@ -4903,12 +4716,6 @@ Repairing the directory tree works as follows: 6. If the subdirectory has zero paths, attach it to the lost and found. -The proposed patches are in the -`directory tree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-directory-tree>`_ -series. - - .. _orphanage: The Orphanage @@ -4973,11 +4780,6 @@ Orphaned files are adopted by the orphanage as follows: 7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all resources. -The proposed patches are in the -`orphanage adoption -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage>`_ -series. - 6. Userspace Algorithms and Data Structures =========================================== @@ -5091,14 +4893,6 @@ first workqueue's workers until the backlog eases. This doesn't completely solve the balancing problem, but reduces it enough to move on to more pressing issues. -The proposed patchsets are the scrub -`performance tweaks -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-performance-tweaks>`_ -and the -`inode scan rebalance -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-iscan-rebalance>`_ -series. - .. _scrubrepair: Scheduling Repairs @@ -5179,20 +4973,6 @@ immediately. Corrupt file data blocks reported by phase 6 cannot be recovered by the filesystem. -The proposed patchsets are the -`repair warning improvements -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-better-repair-warnings>`_, -refactoring of the -`repair data dependency -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-data-deps>`_ -and -`object tracking -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-object-tracking>`_, -and the -`repair scheduling -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-scheduling>`_ -improvement series. - Checking Names for Confusable Unicode Sequences ----------------------------------------------- @@ -5372,6 +5152,8 @@ The extra flexibility enables several new use cases: This emulates an atomic device write in software, and can support arbitrary scattered writes. +(This functionality was merged into mainline as of 2025) + Vectorized Scrub ---------------- @@ -5393,13 +5175,7 @@ It is hoped that ``io_uring`` will pick up enough of this functionality that online fsck can use that instead of adding a separate vectored scrub system call to XFS. -The relevant patchsets are the -`kernel vectorized scrub -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=vectorized-scrub>`_ -and -`userspace vectorized scrub -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=vectorized-scrub>`_ -series. +(This functionality was merged into mainline as of 2025) Quality of Service Targets for Scrub ------------------------------------ diff --git a/Documentation/netlink/genetlink-c.yaml b/Documentation/netlink/genetlink-c.yaml index 5a234e9b5fa2..57f59fe23e3f 100644 --- a/Documentation/netlink/genetlink-c.yaml +++ b/Documentation/netlink/genetlink-c.yaml @@ -227,7 +227,7 @@ properties: Optional format indicator that is intended only for choosing the right formatting mechanism when displaying values of this type. - enum: [ hex, mac, fddi, ipv4, ipv6, uuid ] + enum: [ hex, mac, fddi, ipv4, ipv6, ipv4-or-v6, uuid ] # Start genetlink-c name-prefix: type: string diff --git a/Documentation/netlink/genetlink.yaml b/Documentation/netlink/genetlink.yaml index 7b1ec153e834..b020a537d8ac 100644 --- a/Documentation/netlink/genetlink.yaml +++ b/Documentation/netlink/genetlink.yaml @@ -185,7 +185,7 @@ properties: Optional format indicator that is intended only for choosing the right formatting mechanism when displaying values of this type. - enum: [ hex, mac, fddi, ipv4, ipv6, uuid ] + enum: [ hex, mac, fddi, ipv4, ipv6, ipv4-or-v6, uuid ] # Make sure name-prefix does not appear in subsets (subsets inherit naming) dependencies: diff --git a/Documentation/netlink/netlink-raw.yaml b/Documentation/netlink/netlink-raw.yaml index 246fa07bccf6..0166a7e4afbb 100644 --- a/Documentation/netlink/netlink-raw.yaml +++ b/Documentation/netlink/netlink-raw.yaml @@ -157,7 +157,7 @@ properties: Optional format indicator that is intended only for choosing the right formatting mechanism when displaying values of this type. - enum: [ hex, mac, fddi, ipv4, ipv6, uuid ] + enum: [ hex, mac, fddi, ipv4, ipv6, ipv4-or-v6, uuid ] struct: description: Name of the nested struct type. type: string diff --git a/Documentation/netlink/specs/conntrack.yaml b/Documentation/netlink/specs/conntrack.yaml index bef528633b17..db7cddcda50a 100644 --- a/Documentation/netlink/specs/conntrack.yaml +++ b/Documentation/netlink/specs/conntrack.yaml @@ -457,7 +457,7 @@ attribute-sets: name: labels type: binary - - name: labels mask + name: labels-mask type: binary - name: synproxy diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml index 3db59c965869..837112da6738 100644 --- a/Documentation/netlink/specs/devlink.yaml +++ b/Documentation/netlink/specs/devlink.yaml @@ -99,6 +99,8 @@ definitions: name: legacy - name: switchdev + - + name: switchdev-inactive - type: enum name: eswitch-inline-mode @@ -857,6 +859,14 @@ attribute-sets: name: health-reporter-burst-period type: u64 doc: Time (in msec) for recoveries before starting the grace period. + + # TODO: fill in the attributes in between + + - + name: param-reset-default + type: flag + doc: Request restoring parameter to its default value. + value: 183 - name: dl-dev-stats subset-of: devlink @@ -1791,6 +1801,7 @@ operations: - param-type # param-value-data is missing here as the type is variable - param-value-cmode + - param-reset-default - name: region-get diff --git a/Documentation/netlink/specs/dpll.yaml b/Documentation/netlink/specs/dpll.yaml index 80728f6f9bc8..78d0724d7e12 100644 --- a/Documentation/netlink/specs/dpll.yaml +++ b/Documentation/netlink/specs/dpll.yaml @@ -440,6 +440,12 @@ attribute-sets: doc: | Capable pin provides list of pins that can be bound to create a reference-sync pin pair. + - + name: phase-adjust-gran + type: u32 + doc: | + Granularity of phase adjustment, in picoseconds. The value of + phase adjustment must be a multiple of this granularity. - name: pin-parent-device @@ -616,6 +622,7 @@ operations: - capabilities - parent-device - parent-pin + - phase-adjust-gran - phase-adjust-min - phase-adjust-max - phase-adjust diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index 6a0fb1974513..0a2d2343f79a 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -1269,7 +1269,7 @@ attribute-sets: - name: hist type: nest - multi-attr: True + multi-attr: true nested-attributes: fec-hist - name: fec @@ -1823,6 +1823,73 @@ attribute-sets: type: uint enum: pse-event doc: List of events reported by the PSE controller + - + name: mse-capabilities + doc: MSE capabilities attribute set + attr-cnt-name: --ethtool-a-mse-capabilities-cnt + attributes: + - + name: max-average-mse + type: uint + - + name: max-peak-mse + type: uint + - + name: refresh-rate-ps + type: uint + - + name: num-symbols + type: uint + - + name: mse-snapshot + doc: MSE snapshot attribute set + attr-cnt-name: --ethtool-a-mse-snapshot-cnt + attributes: + - + name: average-mse + type: uint + - + name: peak-mse + type: uint + - + name: worst-peak-mse + type: uint + - + name: mse + attr-cnt-name: --ethtool-a-mse-cnt + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: capabilities + type: nest + nested-attributes: mse-capabilities + - + name: channel-a + type: nest + nested-attributes: mse-snapshot + - + name: channel-b + type: nest + nested-attributes: mse-snapshot + - + name: channel-c + type: nest + nested-attributes: mse-snapshot + - + name: channel-d + type: nest + nested-attributes: mse-snapshot + - + name: worst-channel + type: nest + nested-attributes: mse-snapshot + - + name: link + type: nest + nested-attributes: mse-snapshot operations: enum-model: directional @@ -2756,6 +2823,25 @@ operations: attributes: - header - context + - + name: mse-get + doc: Get PHY MSE measurement data and capabilities. + attribute-set: mse + do: &mse-get-op + request: + attributes: + - header + reply: + attributes: + - header + - capabilities + - channel-a + - channel-b + - channel-c + - channel-d + - worst-channel + - link + dump: *mse-get-op mcast-groups: list: diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index e00d3fa1c152..82bf5cb2617d 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -88,7 +88,7 @@ definitions: - name: napi-threaded type: enum - entries: [disabled, enabled] + entries: [disabled, enabled, busy-poll] attribute-sets: - @@ -291,7 +291,8 @@ attribute-sets: name: threaded doc: Whether the NAPI is configured to operate in threaded polling mode. If this is set to enabled then the NAPI context operates - in threaded polling mode. + in threaded polling mode. If this is set to busy-poll, then the + threaded polling mode also busy polls. type: u32 enum: napi-threaded - @@ -732,6 +733,29 @@ operations: - rx-bytes - tx-packets - tx-bytes + - rx-alloc-fail + - rx-hw-drops + - rx-hw-drop-overruns + - rx-csum-complete + - rx-csum-unnecessary + - rx-csum-none + - rx-csum-bad + - rx-hw-gro-packets + - rx-hw-gro-bytes + - rx-hw-gro-wire-packets + - rx-hw-gro-wire-bytes + - rx-hw-drop-ratelimits + - tx-hw-drops + - tx-hw-drop-errors + - tx-csum-none + - tx-needs-csum + - tx-hw-gso-packets + - tx-hw-gso-bytes + - tx-hw-gso-wire-packets + - tx-hw-gso-wire-bytes + - tx-hw-drop-ratelimits + - tx-stop + - tx-wake - name: bind-rx doc: Bind dmabuf to netdev diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml index cce88819ba71..17ad707fa0d5 100644 --- a/Documentation/netlink/specs/nftables.yaml +++ b/Documentation/netlink/specs/nftables.yaml @@ -915,7 +915,7 @@ attribute-sets: type: string doc: Name of set to use - - name: set id + name: set-id type: u32 byte-order: big-endian doc: ID of set to use diff --git a/Documentation/netlink/specs/psp.yaml b/Documentation/netlink/specs/psp.yaml index 944429e5c9a8..f3a57782d2cf 100644 --- a/Documentation/netlink/specs/psp.yaml +++ b/Documentation/netlink/specs/psp.yaml @@ -76,6 +76,83 @@ attribute-sets: name: spi doc: Security Parameters Index (SPI) of the association. type: u32 + - + name: stats + attributes: + - + name: dev-id + doc: PSP device ID. + type: u32 + checks: + min: 1 + - + name: key-rotations + type: uint + doc: | + Number of key rotations during the lifetime of the device. + Kernel statistic. + - + name: stale-events + type: uint + doc: | + Number of times a socket's Rx got shut down due to using + a key which went stale (fully rotated out). + Kernel statistic. + - + name: rx-packets + type: uint + doc: | + Number of successfully processed and authenticated PSP packets. + Device statistic (from the PSP spec). + - + name: rx-bytes + type: uint + doc: | + Number of successfully authenticated PSP bytes received, counting from + the first byte after the IV through the last byte of payload. + The fixed initial portion of the PSP header (16 bytes) + and the PSP trailer/ICV (16 bytes) are not included in this count. + Device statistic (from the PSP spec). + - + name: rx-auth-fail + type: uint + doc: | + Number of received PSP packets with unsuccessful authentication. + Device statistic (from the PSP spec). + - + name: rx-error + type: uint + doc: | + Number of received PSP packets with length/framing errors. + Device statistic (from the PSP spec). + - + name: rx-bad + type: uint + doc: | + Number of received PSP packets with miscellaneous errors + (invalid master key indicated by SPI, unsupported version, etc.) + Device statistic (from the PSP spec). + - + name: tx-packets + type: uint + doc: | + Number of successfully processed PSP packets for transmission. + Device statistic (from the PSP spec). + - + name: tx-bytes + type: uint + doc: | + Number of successfully processed PSP bytes for transmit, counting from + the first byte after the IV through the last byte of payload. + The fixed initial portion of the PSP header (16 bytes) + and the PSP trailer/ICV (16 bytes) are not included in this count. + Device statistic (from the PSP spec). + - + name: tx-error + type: uint + doc: | + Number of PSP packets for transmission with errors. + Device statistic (from the PSP spec). operations: list: @@ -177,6 +254,24 @@ operations: pre: psp-assoc-device-get-locked post: psp-device-unlock + - + name: get-stats + doc: Get device statistics. + attribute-set: stats + do: + request: + attributes: + - dev-id + reply: &stats-all + attributes: + - dev-id + - key-rotations + - stale-events + pre: psp-device-get-locked + post: psp-device-unlock + dump: + reply: *stats-all + mcast-groups: list: - diff --git a/Documentation/netlink/specs/rt-addr.yaml b/Documentation/netlink/specs/rt-addr.yaml index 3a582eac1629..163a106c41bb 100644 --- a/Documentation/netlink/specs/rt-addr.yaml +++ b/Documentation/netlink/specs/rt-addr.yaml @@ -86,17 +86,18 @@ attribute-sets: - name: address type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: local type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: label type: string - name: broadcast - type: binary + type: u32 + byte-order: big-endian display-hint: ipv4 - name: anycast diff --git a/Documentation/netlink/specs/rt-link.yaml b/Documentation/netlink/specs/rt-link.yaml index 2a23e9699c0b..6beeb6ee5adf 100644 --- a/Documentation/netlink/specs/rt-link.yaml +++ b/Documentation/netlink/specs/rt-link.yaml @@ -1707,11 +1707,11 @@ attribute-sets: - name: local type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: remote type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: ttl type: u8 @@ -1833,11 +1833,11 @@ attribute-sets: - name: local type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: remote type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: fwmark type: u32 @@ -1868,7 +1868,8 @@ attribute-sets: type: u32 - name: remote - type: binary + type: u32 + byte-order: big-endian display-hint: ipv4 - name: ttl @@ -1914,6 +1915,35 @@ attribute-sets: type: binary struct: ifla-geneve-port-range - + name: linkinfo-hsr-attrs + name-prefix: ifla-hsr- + attributes: + - + name: slave1 + type: u32 + - + name: slave2 + type: u32 + - + name: multicast-spec + type: u8 + - + name: supervision-addr + type: binary + display-hint: mac + - + name: seq-nr + type: u16 + - + name: version + type: u8 + - + name: protocol + type: u8 + - + name: interlink + type: u32 + - name: linkinfo-iptun-attrs name-prefix: ifla-iptun- attributes: @@ -1923,11 +1953,11 @@ attribute-sets: - name: local type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: remote type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: ttl type: u8 @@ -1957,7 +1987,8 @@ attribute-sets: display-hint: ipv6 - name: 6rd-relay-prefix - type: binary + type: u32 + byte-order: big-endian display-hint: ipv4 - name: 6rd-prefixlen @@ -2300,6 +2331,9 @@ sub-messages: value: geneve attribute-set: linkinfo-geneve-attrs - + value: hsr + attribute-set: linkinfo-hsr-attrs + - value: ipip attribute-set: linkinfo-iptun-attrs - diff --git a/Documentation/netlink/specs/rt-neigh.yaml b/Documentation/netlink/specs/rt-neigh.yaml index 2f568a6231c9..0f46ef313590 100644 --- a/Documentation/netlink/specs/rt-neigh.yaml +++ b/Documentation/netlink/specs/rt-neigh.yaml @@ -194,7 +194,7 @@ attribute-sets: - name: dst type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: lladdr type: binary diff --git a/Documentation/netlink/specs/rt-route.yaml b/Documentation/netlink/specs/rt-route.yaml index 1ecb3fadc067..33195db96746 100644 --- a/Documentation/netlink/specs/rt-route.yaml +++ b/Documentation/netlink/specs/rt-route.yaml @@ -87,11 +87,11 @@ attribute-sets: - name: dst type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: src type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: iif type: u32 @@ -101,14 +101,14 @@ attribute-sets: - name: gateway type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: priority type: u32 - name: prefsrc type: binary - display-hint: ipv4 + display-hint: ipv4-or-v6 - name: metrics type: nest diff --git a/Documentation/netlink/specs/rt-rule.yaml b/Documentation/netlink/specs/rt-rule.yaml index bebee452a950..7f03a44ab036 100644 --- a/Documentation/netlink/specs/rt-rule.yaml +++ b/Documentation/netlink/specs/rt-rule.yaml @@ -96,10 +96,12 @@ attribute-sets: attributes: - name: dst - type: u32 + type: binary + display-hint: ipv4-or-v6 - name: src - type: u32 + type: binary + display-hint: ipv4-or-v6 - name: iifname type: string diff --git a/Documentation/netlink/specs/wireguard.yaml b/Documentation/netlink/specs/wireguard.yaml new file mode 100644 index 000000000000..30479fc6bb69 --- /dev/null +++ b/Documentation/netlink/specs/wireguard.yaml @@ -0,0 +1,298 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) +--- +name: wireguard +protocol: genetlink-legacy + +doc: | + **Netlink protocol to control WireGuard network devices.** + + The below enums and macros are for interfacing with WireGuard, using generic + netlink, with family ``WG_GENL_NAME`` and version ``WG_GENL_VERSION``. It + defines two commands: get and set. Note that while they share many common + attributes, these two commands actually accept a slightly different set of + inputs and outputs. These differences are noted under the individual + attributes. +c-family-name: wg-genl-name +c-version-name: wg-genl-version +max-by-define: true + +definitions: + - + name-prefix: wg- + name: key-len + type: const + value: 32 + - + name: --kernel-timespec + type: struct + header: linux/time_types.h + members: + - + name: sec + type: u64 + doc: Number of seconds, since UNIX epoch. + - + name: nsec + type: u64 + doc: Number of nanoseconds, after the second began. + - + name: wgdevice-flags + name-prefix: wgdevice-f- + enum-name: wgdevice-flag + type: flags + entries: + - replace-peers + - + name: wgpeer-flags + name-prefix: wgpeer-f- + enum-name: wgpeer-flag + type: flags + entries: + - remove-me + - replace-allowedips + - update-only + - + name: wgallowedip-flags + name-prefix: wgallowedip-f- + enum-name: wgallowedip-flag + type: flags + entries: + - remove-me + +attribute-sets: + - + name: wgdevice + enum-name: wgdevice-attribute + name-prefix: wgdevice-a- + attr-cnt-name: --wgdevice-a-last + attributes: + - + name: unspec + type: unused + value: 0 + - + name: ifindex + type: u32 + - + name: ifname + type: string + checks: + max-len: 15 + - + name: private-key + type: binary + doc: Set to all zeros to remove. + display-hint: hex + checks: + exact-len: wg-key-len + - + name: public-key + type: binary + display-hint: hex + checks: + exact-len: wg-key-len + - + name: flags + type: u32 + doc: | + ``0`` or ``WGDEVICE_F_REPLACE_PEERS`` if all current peers should be + removed prior to adding the list below. + enum: wgdevice-flags + - + name: listen-port + type: u16 + doc: Set as ``0`` to choose randomly. + - + name: fwmark + type: u32 + doc: Set as ``0`` to disable. + - + name: peers + type: indexed-array + sub-type: nest + nested-attributes: wgpeer + doc: | + The index/type parameter is unused on ``SET_DEVICE`` operations and is + zero on ``GET_DEVICE`` operations. + - + name: wgpeer + enum-name: wgpeer-attribute + name-prefix: wgpeer-a- + attr-cnt-name: --wgpeer-a-last + attributes: + - + name: unspec + type: unused + value: 0 + - + name: public-key + type: binary + display-hint: hex + checks: + exact-len: wg-key-len + - + name: preshared-key + type: binary + doc: Set as all zeros to remove. + display-hint: hex + checks: + exact-len: wg-key-len + - + name: flags + type: u32 + doc: | + ``0`` and/or ``WGPEER_F_REMOVE_ME`` if the specified peer should not + exist at the end of the operation, rather than added/updated and/or + ``WGPEER_F_REPLACE_ALLOWEDIPS`` if all current allowed IPs of this + peer should be removed prior to adding the list below and/or + ``WGPEER_F_UPDATE_ONLY`` if the peer should only be set if it already + exists. + enum: wgpeer-flags + - + name: endpoint + type: binary + doc: struct sockaddr_in or struct sockaddr_in6 + checks: + min-len: 16 + - + name: persistent-keepalive-interval + type: u16 + doc: Set as ``0`` to disable. + - + name: last-handshake-time + type: binary + struct: --kernel-timespec + checks: + exact-len: 16 + - + name: rx-bytes + type: u64 + - + name: tx-bytes + type: u64 + - + name: allowedips + type: indexed-array + sub-type: nest + nested-attributes: wgallowedip + doc: | + The index/type parameter is unused on ``SET_DEVICE`` operations and is + zero on ``GET_DEVICE`` operations. + - + name: protocol-version + type: u32 + doc: | + Should not be set or used at all by most users of this API, as the + most recent protocol will be used when this is unset. Otherwise, + must be set to ``1``. + - + name: wgallowedip + enum-name: wgallowedip-attribute + name-prefix: wgallowedip-a- + attr-cnt-name: --wgallowedip-a-last + attributes: + - + name: unspec + type: unused + value: 0 + - + name: family + type: u16 + doc: IP family, either ``AF_INET`` or ``AF_INET6``. + - + name: ipaddr + type: binary + doc: Either ``struct in_addr`` or ``struct in6_addr``. + display-hint: ipv4-or-v6 + checks: + min-len: 4 + - + name: cidr-mask + type: u8 + - + name: flags + type: u32 + doc: | + ``WGALLOWEDIP_F_REMOVE_ME`` if the specified IP should be removed; + otherwise, this IP will be added if it is not already present. + enum: wgallowedip-flags + +operations: + enum-name: wg-cmd + name-prefix: wg-cmd- + list: + - + name: get-device + value: 0 + doc: | + Retrieve WireGuard device + ~~~~~~~~~~~~~~~~~~~~~~~~~ + + The command should be called with one but not both of: + + - ``WGDEVICE_A_IFINDEX`` + - ``WGDEVICE_A_IFNAME`` + + The kernel will then return several messages (``NLM_F_MULTI``). It is + possible that all of the allowed IPs of a single peer will not fit + within a single netlink message. In that case, the same peer will be + written in the following message, except it will only contain + ``WGPEER_A_PUBLIC_KEY`` and ``WGPEER_A_ALLOWEDIPS``. This may occur + several times in a row for the same peer. It is then up to the receiver + to coalesce adjacent peers. Likewise, it is possible that all peers will + not fit within a single message. So, subsequent peers will be sent in + following messages, except those will only contain ``WGDEVICE_A_IFNAME`` + and ``WGDEVICE_A_PEERS``. It is then up to the receiver to coalesce + these messages to form the complete list of peers. + + Since this is an ``NLA_F_DUMP`` command, the final message will always + be ``NLMSG_DONE``, even if an error occurs. However, this ``NLMSG_DONE`` + message contains an integer error code. It is either zero or a negative + error code corresponding to the errno. + attribute-set: wgdevice + flags: [uns-admin-perm] + + dump: + pre: wg-get-device-start + post: wg-get-device-done + request: + attributes: + - ifindex + - ifname + reply: &all-attrs + attributes: + - ifindex + - ifname + - private-key + - public-key + - flags + - listen-port + - fwmark + - peers + - + name: set-device + value: 1 + doc: | + Set WireGuard device + ~~~~~~~~~~~~~~~~~~~~ + + This command should be called with a wgdevice set, containing one but + not both of ``WGDEVICE_A_IFINDEX`` and ``WGDEVICE_A_IFNAME``. + + It is possible that the amount of configuration data exceeds that of the + maximum message length accepted by the kernel. In that case, several + messages should be sent one after another, with each successive one + filling in information not contained in the prior. Note that if + ``WGDEVICE_F_REPLACE_PEERS`` is specified in the first message, it + probably should not be specified in fragments that come after, so that + the list of peers is only cleared the first time but appended after. + Likewise for peers, if ``WGPEER_F_REPLACE_ALLOWEDIPS`` is specified in + the first message of a peer, it likely should not be specified in + subsequent fragments. + + If an error occurs, ``NLMSG_ERROR`` will reply containing an errno. + attribute-set: wgdevice + flags: [uns-admin-perm] + + do: + request: *all-attrs diff --git a/Documentation/networking/6pack.rst b/Documentation/networking/6pack.rst index bc5bf1f1a98f..66d5fd4fc821 100644 --- a/Documentation/networking/6pack.rst +++ b/Documentation/networking/6pack.rst @@ -94,7 +94,7 @@ kernels may lead to a compilation error because the interface to a kernel function has been changed in the 2.1.8x kernels. How to turn on 6pack support: -============================= +----------------------------- - In the linux kernel configuration program, select the code maturity level options menu and turn on the prompting for development drivers. diff --git a/Documentation/networking/arcnet-hardware.rst b/Documentation/networking/arcnet-hardware.rst index 3bf7f99cd7bb..20e5075d0d0e 100644 --- a/Documentation/networking/arcnet-hardware.rst +++ b/Documentation/networking/arcnet-hardware.rst @@ -4,18 +4,20 @@ ARCnet Hardware =============== +:Author: Avery Pennarun <apenwarr@worldvisions.ca> + .. note:: - 1) This file is a supplement to arcnet.txt. Please read that for general + 1) This file is a supplement to arcnet.rst. Please read that for general driver configuration help. 2) This file is no longer Linux-specific. It should probably be moved out of the kernel sources. Ideas? Because so many people (myself included) seem to have obtained ARCnet cards without manuals, this file contains a quick introduction to ARCnet hardware, -some cabling tips, and a listing of all jumper settings I can find. Please -e-mail apenwarr@worldvisions.ca with any settings for your particular card, -or any other information you have! +some cabling tips, and a listing of all jumper settings I can find. If you +have any settings for your particular card, and/or any other information you +have, do not hesitate to :ref:`email to netdev <arcnet-netdev>`. Introduction to ARCnet @@ -72,11 +74,10 @@ level of encapsulation is defined by RFC1201, which I call "packet splitting," that allows "virtual packets" to grow as large as 64K each, although they are generally kept down to the Ethernet-style 1500 bytes. -For more information on the advantages and disadvantages (mostly the -advantages) of ARCnet networks, you might try the "ARCnet Trade Association" -WWW page: +For more information on ARCnet networks, visit the "ARCNET Resource Center" +WWW page at: - http://www.arcnet.com + https://www.arcnet.cc Cabling ARCnet Networks @@ -3226,9 +3227,6 @@ Settings for IRQ Selection (Lower Jumper Line) Other Cards =========== -I have no information on other models of ARCnet cards at the moment. Please -send any and all info to: - - apenwarr@worldvisions.ca +I have no information on other models of ARCnet cards at the moment. Thanks. diff --git a/Documentation/networking/arcnet.rst b/Documentation/networking/arcnet.rst index 82fce606c0f0..cd43a18ad149 100644 --- a/Documentation/networking/arcnet.rst +++ b/Documentation/networking/arcnet.rst @@ -4,6 +4,8 @@ ARCnet ====== +:Author: Avery Pennarun <apenwarr@worldvisions.ca> + .. note:: See also arcnet-hardware.txt in this directory for jumper-setting @@ -30,18 +32,7 @@ Come on, be a sport! Send me a success report! (hey, that was even better than my original poem... this is getting bad!) - -.. warning:: - - If you don't e-mail me about your success/failure soon, I may be forced to - start SINGING. And we don't want that, do we? - - (You know, it might be argued that I'm pushing this point a little too much. - If you think so, why not flame me in a quick little e-mail? Please also - include the type of card(s) you're using, software, size of network, and - whether it's working or not.) - - My e-mail address is: apenwarr@worldvisions.ca +---- These are the ARCnet drivers for Linux. @@ -59,23 +50,14 @@ ARCnet 2.10 ALPHA, Tomasz's all-new-and-improved RFC1051 support has been included and seems to be working fine! +.. _arcnet-netdev: + Where do I discuss these drivers? --------------------------------- -Tomasz has been so kind as to set up a new and improved mailing list. -Subscribe by sending a message with the BODY "subscribe linux-arcnet YOUR -REAL NAME" to listserv@tichy.ch.uj.edu.pl. Then, to submit messages to the -list, mail to linux-arcnet@tichy.ch.uj.edu.pl. - -There are archives of the mailing list at: - - http://epistolary.org/mailman/listinfo.cgi/arcnet - -The people on linux-net@vger.kernel.org (now defunct, replaced by -netdev@vger.kernel.org) have also been known to be very helpful, especially -when we're talking about ALPHA Linux kernels that may or may not work right -in the first place. - +ARCnet discussions take place on netdev. Simply send your email to +netdev@vger.kernel.org and make sure to Cc: maintainer listed in +"ARCNET NETWORK LAYER" heading of Documentation/process/maintainers.rst. Other Drivers and Info ---------------------- @@ -523,17 +505,9 @@ can set up your network then: It works: what now? ------------------- -Send mail describing your setup, preferably including driver version, kernel -version, ARCnet card model, CPU type, number of systems on your network, and -list of software in use to me at the following address: - - apenwarr@worldvisions.ca - -I do send (sometimes automated) replies to all messages I receive. My email -can be weird (and also usually gets forwarded all over the place along the -way to me), so if you don't get a reply within a reasonable time, please -resend. - +Send mail following :ref:`arcnet-netdev`. Describe your setup, preferably +including driver version, kernel version, ARCnet card model, CPU type, number +of systems on your network, and list of software in use. It doesn't work: what now? -------------------------- diff --git a/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst b/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst index 6877a3260582..5aedbabb7382 100644 --- a/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst +++ b/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst @@ -28,6 +28,7 @@ these MAP frames and send them to appropriate PDN's. ================ a. MAP packet v1 (data / control) +--------------------------------- MAP header fields are in big endian format. @@ -54,6 +55,7 @@ Payload length includes the padding length but does not include MAP header length. b. Map packet v4 (data / control) +--------------------------------- MAP header fields are in big endian format. @@ -107,6 +109,7 @@ over which checksum is computed. Checksum value, indicates the checksum computed. c. MAP packet v5 (data / control) +--------------------------------- MAP header fields are in big endian format. @@ -134,6 +137,7 @@ Payload length includes the padding length but does not include MAP header length. d. Checksum offload header v5 +----------------------------- Checksum offload header fields are in big endian format. @@ -158,7 +162,10 @@ indicates that the calculated packet checksum is invalid. Reserved bits must be zero when sent and ignored when received. -e. MAP packet v1/v5 (command specific):: +e. MAP packet v1/v5 (command specific) +-------------------------------------- + +Packet format:: Bit 0 1 2-7 8 - 15 16 - 31 Function Command Reserved Pad Multiplexer ID Payload length @@ -181,6 +188,7 @@ Command types = ========================================== f. Aggregation +-------------- Aggregation is multiple MAP packets (can be data or command) delivered to rmnet in a single linear skb. rmnet will process the individual diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst index 7cfcd183054f..bcc02355f828 100644 --- a/Documentation/networking/device_drivers/ethernet/index.rst +++ b/Documentation/networking/device_drivers/ethernet/index.rst @@ -47,6 +47,7 @@ Contents: mellanox/mlx5/index meta/fbnic microsoft/netvsc + mucse/rnpgbe neterion/s2io netronome/nfp pensando/ionic diff --git a/Documentation/networking/device_drivers/ethernet/mucse/rnpgbe.rst b/Documentation/networking/device_drivers/ethernet/mucse/rnpgbe.rst new file mode 100644 index 000000000000..d35cf8a46b6c --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/mucse/rnpgbe.rst @@ -0,0 +1,17 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=========================================================== +Linux Base Driver for MUCSE(R) Gigabit PCI Express Adapters +=========================================================== + +Contents +======== + +- Identifying Your Adapter + +Identifying Your Adapter +======================== +The driver is compatible with devices based on the following: + + * MUCSE(R) Ethernet Controller N210 series + * MUCSE(R) Ethernet Controller N500 series diff --git a/Documentation/networking/devlink/devlink-eswitch-attr.rst b/Documentation/networking/devlink/devlink-eswitch-attr.rst index 08bb39ab1528..eafe09abc40c 100644 --- a/Documentation/networking/devlink/devlink-eswitch-attr.rst +++ b/Documentation/networking/devlink/devlink-eswitch-attr.rst @@ -39,6 +39,10 @@ The following is a list of E-Switch attributes. rules. * ``switchdev`` allows for more advanced offloading capabilities of the E-Switch to hardware. + * ``switchdev_inactive`` switchdev mode but starts inactive, doesn't allow traffic + until explicitly activated. This mode is useful for orchestrators that + want to prepare the device in switchdev mode but only activate it when + all configurations are done. * - ``inline-mode`` - enum - Some HWs need the VF driver to put part of the packet @@ -74,3 +78,12 @@ Example Usage # enable encap-mode with legacy mode $ devlink dev eswitch set pci/0000:08:00.0 mode legacy inline-mode none encap-mode basic + + # start switchdev mode in inactive state + $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev_inactive + + # setup switchdev configurations, representors, FDB entries, etc.. + ... + + # activate switchdev mode to allow traffic + $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst index 0a9c20d70122..ea17756dcda6 100644 --- a/Documentation/networking/devlink/devlink-params.rst +++ b/Documentation/networking/devlink/devlink-params.rst @@ -41,6 +41,16 @@ In order for ``driverinit`` parameters to take effect, the driver must support reloading via the ``devlink-reload`` command. This command will request a reload of the device driver. +Default parameter values +========================= + +Drivers may optionally export default values for parameters of cmode +``runtime`` and ``permanent``. For ``driverinit`` parameters, the last +value set by the driver will be used as the default value. Drivers can +also support resetting params with cmode ``runtime`` and ``permanent`` +to their default values. Resetting ``driverinit`` params is supported +by devlink core without additional driver support needed. + .. _devlink_params_generic: Generic configuration parameters @@ -151,3 +161,7 @@ own name. * - ``num_doorbells`` - u32 - Controls the number of doorbells used by the device. + * - ``max_mac_per_vf`` + - u32 + - Controls the maximum number of MAC address filters that can be assigned + to a Virtual Function (VF). diff --git a/Documentation/networking/devlink/i40e.rst b/Documentation/networking/devlink/i40e.rst index d3cb5bb5197e..51c887f0dc83 100644 --- a/Documentation/networking/devlink/i40e.rst +++ b/Documentation/networking/devlink/i40e.rst @@ -7,6 +7,40 @@ i40e devlink support This document describes the devlink features implemented by the ``i40e`` device driver. +Parameters +========== + +.. list-table:: Generic parameters implemented + :widths: 5 5 90 + + * - Name + - Mode + - Notes + * - ``max_mac_per_vf`` + - runtime + - Controls the maximum number of MAC addresses a VF can use + on i40e devices. + + By default (``0``), the driver enforces its internally calculated per-VF + MAC filter limit, which is based on the number of allocated VFS. + + If set to a non-zero value, this parameter acts as a strict cap: + the driver will use the user-provided value instead of its internal + calculation. + + **Important notes:** + + - This value **must be set before enabling SR-IOV**. + Attempting to change it while SR-IOV is enabled will return an error. + - MAC filters are a **shared hardware resource** across all VFs. + Setting a high value may cause other VFs to be starved of filters. + - This value is a **Administrative policy**. The hardware may return + errors when its absolute limit is reached, regardless of the value + set here. + + The default value is ``0`` (internal calculation is used). + + Info versions ============= diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index 0c58e5c729d9..35b12a2bfeba 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -99,5 +99,6 @@ parameters, info versions, and other features it supports. prestera qed sfc + stmmac ti-cpsw-switch zl3073x diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst index 0e5f9c76e514..4bba4d780a4a 100644 --- a/Documentation/networking/devlink/mlx5.rst +++ b/Documentation/networking/devlink/mlx5.rst @@ -218,6 +218,20 @@ parameters. * ``balanced`` : Merges fewer CQEs, resulting in a moderate compression ratio but maintaining a balance between bandwidth savings and performance * ``aggressive`` : Merges more CQEs into a single entry, achieving a higher compression rate and maximizing performance, particularly under high traffic loads + * - ``swp_l4_csum_mode`` + - string + - permanent + - Configure how the L4 checksum is calculated by the device when using + Software Parser (SWP) hints for header locations. + + * ``default`` : Use the device's default checksum calculation + mode. The driver will discover during init whether or + full_csum or l4_only is in use. Setting this value explicitly + from userspace is not allowed, but some firmware versions may + return this value on param read. + * ``full_csum`` : Calculate full checksum including the pseudo-header + * ``l4_only`` : Calculate L4-only checksum, excluding the pseudo-header + The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD`` Info versions diff --git a/Documentation/networking/devlink/stmmac.rst b/Documentation/networking/devlink/stmmac.rst new file mode 100644 index 000000000000..47e3ff10bc08 --- /dev/null +++ b/Documentation/networking/devlink/stmmac.rst @@ -0,0 +1,40 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================================= +stmmac (synopsys dwmac) devlink support +======================================= + +This document describes the devlink features implemented by the ``stmmac`` +device driver. + +Parameters +========== + +The ``stmmac`` driver implements the following driver-specific parameters. + +.. list-table:: Driver-specific parameters implemented + :widths: 5 5 5 85 + + * - Name + - Type + - Mode + - Description + * - ``phc_coarse_adj`` + - Boolean + - runtime + - Enable the Coarse timestamping mode, as defined in the DWMAC TRM. + A detailed explanation of this timestamping mode can be found in the + Socfpga Functionnal Description [1]. + + In Coarse mode, the ptp clock is expected to be fed by a high-precision + clock that is externally adjusted, and the subsecond increment used for + timestamping is set to 1/ptp_clock_rate. + + In Fine mode (i.e. Coarse mode == false), the ptp clock frequency is + continuously adjusted, but the subsecond increment is set to + 2/ptp_clock_rate. + + Coarse mode is suitable for PTP Grand Master operation. If unsure, leave + the parameter to False. + + [1] https://www.intel.com/content/www/us/en/docs/programmable/683126/21-2/functional-description-of-the-emac.html diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst index 7b2e69cd7ef0..5c79740a533b 100644 --- a/Documentation/networking/dsa/dsa.rst +++ b/Documentation/networking/dsa/dsa.rst @@ -1104,12 +1104,11 @@ health of the network and for discovery of other nodes. In Linux, both HSR and PRP are implemented in the hsr driver, which instantiates a virtual, stackable network interface with two member ports. The driver only implements the basic roles of DANH (Doubly Attached Node -implementing HSR) and DANP (Doubly Attached Node implementing PRP); the roles -of RedBox and QuadBox are not implemented (therefore, bridging a hsr network -interface with a physical switch port does not produce the expected result). +implementing HSR), DANP (Doubly Attached Node implementing PRP) and RedBox +(allows non-HSR devices to connect to the ring via Interlink ports). -A driver which is able of offloading certain functions of a DANP or DANH should -declare the corresponding netdev features as indicated by the documentation at +A driver which is able of offloading certain functions should declare the +corresponding netdev features as indicated by the documentation at ``Documentation/networking/netdev-features.rst``. Additionally, the following methods must be implemented: @@ -1120,6 +1119,14 @@ methods must be implemented: - ``port_hsr_leave``: function invoked when a given switch port leaves a DANP/DANH and returns to normal operation as a standalone port. +Note that the ``NETIF_F_HW_HSR_DUP`` feature relies on transmission towards +multiple ports, which is generally available whenever the tagging protocol uses +the ``dsa_xmit_port_mask()`` helper function. If the helper is used, the HSR +offload feature should also be set. The ``dsa_port_simple_hsr_join()`` and +``dsa_port_simple_hsr_leave()`` methods can be used as generic implementations +of ``port_hsr_join`` and ``port_hsr_leave``, if this is the only supported +offload feature. + TODO ==== diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index b270886c5f5d..af56c304cef4 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -242,6 +242,7 @@ Userspace to kernel: ``ETHTOOL_MSG_RSS_SET`` set RSS settings ``ETHTOOL_MSG_RSS_CREATE_ACT`` create an additional RSS context ``ETHTOOL_MSG_RSS_DELETE_ACT`` delete an additional RSS context + ``ETHTOOL_MSG_MSE_GET`` get MSE diagnostic data ===================================== ================================= Kernel to userspace: @@ -299,6 +300,7 @@ Kernel to userspace: ``ETHTOOL_MSG_RSS_CREATE_ACT_REPLY`` create an additional RSS context ``ETHTOOL_MSG_RSS_CREATE_NTF`` additional RSS context created ``ETHTOOL_MSG_RSS_DELETE_NTF`` additional RSS context deleted + ``ETHTOOL_MSG_MSE_GET_REPLY`` MSE diagnostic data ======================================== ================================= ``GET`` requests are sent by userspace applications to retrieve device @@ -2458,6 +2460,68 @@ Kernel response contents: For a description of each attribute, see ``TSCONFIG_GET``. +MSE_GET +======= + +Retrieves detailed Mean Square Error (MSE) diagnostic information from the PHY. + +Request Contents: + + ==================================== ====== ============================ + ``ETHTOOL_A_MSE_HEADER`` nested request header + ==================================== ====== ============================ + +Kernel Response Contents: + + ==================================== ====== ================================ + ``ETHTOOL_A_MSE_HEADER`` nested reply header + ``ETHTOOL_A_MSE_CAPABILITIES`` nested capability/scale info for MSE + measurements + ``ETHTOOL_A_MSE_CHANNEL_A`` nested snapshot for Channel A + ``ETHTOOL_A_MSE_CHANNEL_B`` nested snapshot for Channel B + ``ETHTOOL_A_MSE_CHANNEL_C`` nested snapshot for Channel C + ``ETHTOOL_A_MSE_CHANNEL_D`` nested snapshot for Channel D + ``ETHTOOL_A_MSE_WORST_CHANNEL`` nested snapshot for worst channel + ``ETHTOOL_A_MSE_LINK`` nested snapshot for link-wide aggregate + ==================================== ====== ================================ + +MSE Capabilities +---------------- + +This nested attribute reports the capability / scaling properties used to +interpret snapshot values. + + ============================================== ====== ========================= + ``ETHTOOL_A_MSE_CAPABILITIES_MAX_AVERAGE_MSE`` uint max avg_mse scale + ``ETHTOOL_A_MSE_CAPABILITIES_MAX_PEAK_MSE`` uint max peak_mse scale + ``ETHTOOL_A_MSE_CAPABILITIES_REFRESH_RATE_PS`` uint sample rate (picoseconds) + ``ETHTOOL_A_MSE_CAPABILITIES_NUM_SYMBOLS`` uint symbols per HW sample + ============================================== ====== ========================= + +The max-average/peak fields are included only if the corresponding metric +is supported by the PHY. Their absence indicates that the metric is not +available. + +See ``struct phy_mse_capability`` kernel documentation in +``include/linux/phy.h``. + +MSE Snapshot +------------ + +Each per-channel nest contains an atomic snapshot of MSE values for that +selector (channel A/B/C/D, worst channel, or link). + + ========================================== ====== =================== + ``ETHTOOL_A_MSE_SNAPSHOT_AVERAGE_MSE`` uint average MSE value + ``ETHTOOL_A_MSE_SNAPSHOT_PEAK_MSE`` uint current peak MSE + ``ETHTOOL_A_MSE_SNAPSHOT_WORST_PEAK_MSE`` uint worst-case peak MSE + ========================================== ====== =================== + +Within each channel nest, only the metrics supported by the PHY will be present. + +See ``struct phy_mse_snapshot`` kernel documentation in +``include/linux/phy.h``. + Request translation =================== diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index c775cababc8c..75db2251649b 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -131,10 +131,7 @@ Contents: vxlan x25 x25-iface - xfrm_device - xfrm_proc - xfrm_sync - xfrm_sysctl + xfrm/index xdp-rx-metadata xsk-tx-metadata diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index a06cb99d66dc..bc9a01606daf 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -673,6 +673,16 @@ tcp_moderate_rcvbuf - BOOLEAN Default: 1 (enabled) +tcp_rcvbuf_low_rtt - INTEGER + rcvbuf autotuning can over estimate final socket rcvbuf, which + can lead to cache trashing for high throughput flows. + + For small RTT flows (below tcp_rcvbuf_low_rtt usecs), we can relax + rcvbuf growth: Few additional ms to reach the final (and smaller) + rcvbuf is a good tradeoff. + + Default : 1000 (1 ms) + tcp_mtu_probing - INTEGER Controls TCP Packetization-Layer Path MTU Discovery. Takes three values: @@ -854,9 +864,18 @@ tcp_sack - BOOLEAN Default: 1 (enabled) +tcp_comp_sack_rtt_percent - INTEGER + Percentage of SRTT used for the compressed SACK feature. + See tcp_comp_sack_nr, tcp_comp_sack_delay_ns, tcp_comp_sack_slack_ns. + + Possible values : 1 - 1000 + + Default : 33 % + tcp_comp_sack_delay_ns - LONG INTEGER - TCP tries to reduce number of SACK sent, using a timer - based on 5% of SRTT, capped by this sysctl, in nano seconds. + TCP tries to reduce number of SACK sent, using a timer based + on tcp_comp_sack_rtt_percent of SRTT, capped by this sysctl + in nano seconds. The default is 1ms, based on TSO autosizing period. Default : 1,000,000 ns (1 ms) @@ -866,8 +885,9 @@ tcp_comp_sack_slack_ns - LONG INTEGER timer used by SACK compression. This gives extra time for small RTT flows, and reduces system overhead by allowing opportunistic reduction of timer interrupts. + Too big values might reduce goodput. - Default : 100,000 ns (100 us) + Default : 10,000 ns (10 us) tcp_comp_sack_nr - INTEGER Max number of SACK that can be compressed. @@ -1796,6 +1816,23 @@ icmp_errors_use_inbound_ifaddr - BOOLEAN Default: 0 (disabled) +icmp_errors_extension_mask - UNSIGNED INTEGER + Bitmask of ICMP extensions to append to ICMPv4 error messages + ("Destination Unreachable", "Time Exceeded" and "Parameter Problem"). + The original datagram is trimmed / padded to 128 bytes in order to be + compatible with applications that do not comply with RFC 4884. + + Possible extensions are: + + ==== ============================================================== + 0x01 Incoming IP interface information according to RFC 5837. + Extension will include the index, IPv4 address (if present), + name and MTU of the IP interface that received the datagram + which elicited the ICMP error. + ==== ============================================================== + + Default: 0x00 (no extensions) + igmp_max_memberships - INTEGER Change the maximum number of multicast groups we can subscribe to. Default: 20 @@ -3262,6 +3299,23 @@ error_anycast_as_unicast - BOOLEAN Default: 0 (disabled) +errors_extension_mask - UNSIGNED INTEGER + Bitmask of ICMP extensions to append to ICMPv6 error messages + ("Destination Unreachable" and "Time Exceeded"). The original datagram + is trimmed / padded to 128 bytes in order to be compatible with + applications that do not comply with RFC 4884. + + Possible extensions are: + + ==== ============================================================== + 0x01 Incoming IP interface information according to RFC 5837. + Extension will include the index, IPv6 address (if present), + name and MTU of the IP interface that received the datagram + which elicited the ICMP error. + ==== ============================================================== + + Default: 0x00 (no extensions) + xfrm6_gc_thresh - INTEGER (Obsolete since linux-4.14) The threshold at which we will start garbage collecting for IPv6 diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst index 7dd60366f4ff..4e008efebb35 100644 --- a/Documentation/networking/napi.rst +++ b/Documentation/networking/napi.rst @@ -263,7 +263,9 @@ are not well known). Busy polling is enabled by either setting ``SO_BUSY_POLL`` on selected sockets or using the global ``net.core.busy_poll`` and ``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling -also exists. +also exists. Threaded polling of NAPI also has a mode to busy poll for +packets (:ref:`threaded busy polling<threaded_busy_poll>`) using the NAPI +processing kthread. epoll-based busy polling ------------------------ @@ -426,6 +428,52 @@ Therefore, setting ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` is the recommended usage, because otherwise setting ``irq-suspend-timeout`` might not have any discernible effect. +.. _threaded_busy_poll: + +Threaded NAPI busy polling +-------------------------- + +Threaded NAPI busy polling extends threaded NAPI and adds support to do +continuous busy polling of the NAPI. This can be useful for forwarding or +AF_XDP applications. + +Threaded NAPI busy polling can be enabled on per NIC queue basis using Netlink. + +For example, using the following script: + +.. code-block:: bash + + $ ynl --family netdev --do napi-set \ + --json='{"id": 66, "threaded": "busy-poll"}' + +The kernel will create a kthread that busy polls on this NAPI. + +The user may elect to set the CPU affinity of this kthread to an unused CPU +core to improve how often the NAPI is polled at the expense of wasted CPU +cycles. Note that this will keep the CPU core busy with 100% usage. + +Once threaded busy polling is enabled for a NAPI, PID of the kthread can be +retrieved using Netlink so the affinity of the kthread can be set up. + +For example, the following script can be used to fetch the PID: + +.. code-block:: bash + + $ ynl --family netdev --do napi-get --json='{"id": 66}' + +This will output something like following, the pid `258` is the PID of the +kthread that is polling this NAPI. + +.. code-block:: bash + + $ {'defer-hard-irqs': 0, + 'gro-flush-timeout': 0, + 'id': 66, + 'ifindex': 2, + 'irq-suspend-timeout': 0, + 'pid': 258, + 'threaded': 'busy-poll'} + .. _threaded: Threaded NAPI diff --git a/Documentation/networking/net_cachelines/inet_connection_sock.rst b/Documentation/networking/net_cachelines/inet_connection_sock.rst index 8fae85ebb773..cc2000f55c29 100644 --- a/Documentation/networking/net_cachelines/inet_connection_sock.rst +++ b/Documentation/networking/net_cachelines/inet_connection_sock.rst @@ -12,8 +12,8 @@ struct inet_sock icsk_inet read_mostly r struct request_sock_queue icsk_accept_queue struct inet_bind_bucket icsk_bind_hash read_mostly tcp_set_state struct inet_bind2_bucket icsk_bind2_hash read_mostly tcp_set_state,inet_put_port -struct timer_list icsk_retransmit_timer read_write inet_csk_reset_xmit_timer,tcp_connect struct timer_list icsk_delack_timer read_mostly inet_csk_reset_xmit_timer,tcp_connect +struct timer_list icsk_keepalive_timer u32 icsk_rto read_write tcp_cwnd_validate,tcp_schedule_loss_probe,tcp_connect_init,tcp_connect,tcp_write_xmit,tcp_push_one u32 icsk_rto_min u32 icsk_rto_max read_mostly tcp_reset_xmit_timer diff --git a/Documentation/networking/net_cachelines/inet_sock.rst b/Documentation/networking/net_cachelines/inet_sock.rst index b11bf48fa2b3..4c72a28a7012 100644 --- a/Documentation/networking/net_cachelines/inet_sock.rst +++ b/Documentation/networking/net_cachelines/inet_sock.rst @@ -5,42 +5,43 @@ inet_sock struct fast path usage breakdown ========================================== -======================= ===================== =================== =================== ====================================================================================================== -Type Name fastpath_tx_access fastpath_rx_access comment -======================= ===================== =================== =================== ====================================================================================================== -struct sock sk read_mostly read_mostly tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data -struct ipv6_pinfo* pinet6 -be16 inet_sport read_mostly __tcp_transmit_skb -be32 inet_daddr read_mostly ip_select_ident_segs -be32 inet_rcv_saddr -be16 inet_dport read_mostly __tcp_transmit_skb -u16 inet_num -be32 inet_saddr -s16 uc_ttl read_mostly __ip_queue_xmit/ip_select_ttl -u16 cmsg_flags -struct ip_options_rcu* inet_opt read_mostly __ip_queue_xmit -u16 inet_id read_mostly ip_select_ident_segs -u8 tos read_mostly ip_queue_xmit -u8 min_ttl -u8 mc_ttl -u8 pmtudisc -u8:1 recverr -u8:1 is_icsk -u8:1 freebind -u8:1 hdrincl -u8:1 mc_loop -u8:1 transparent -u8:1 mc_all -u8:1 nodefrag -u8:1 bind_address_no_port -u8:1 recverr_rfc4884 -u8:1 defer_connect read_mostly tcp_sendmsg_fastopen -u8 rcv_tos -u8 convert_csum -int uc_index -int mc_index -be32 mc_addr -struct ip_mc_socklist* mc_list -struct inet_cork_full cork read_mostly __tcp_transmit_skb -struct local_port_range -======================= ===================== =================== =================== ====================================================================================================== +======================== ===================== =================== =================== ====================================================================================================== +Type Name fastpath_tx_access fastpath_rx_access comment +======================== ===================== =================== =================== ====================================================================================================== +struct sock sk read_mostly read_mostly tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data +struct ipv6_pinfo* pinet6 +struct ipv6_fl_socklist* ipv6_fl_list read_mostly tcp_v6_connect,__ip6_datagram_connect,udpv6_sendmsg,rawv6_sendmsg +be16 inet_sport read_mostly __tcp_transmit_skb +be32 inet_daddr read_mostly ip_select_ident_segs +be32 inet_rcv_saddr +be16 inet_dport read_mostly __tcp_transmit_skb +u16 inet_num +be32 inet_saddr +s16 uc_ttl read_mostly __ip_queue_xmit/ip_select_ttl +u16 cmsg_flags +struct ip_options_rcu* inet_opt read_mostly __ip_queue_xmit +u16 inet_id read_mostly ip_select_ident_segs +u8 tos read_mostly ip_queue_xmit +u8 min_ttl +u8 mc_ttl +u8 pmtudisc +u8:1 recverr +u8:1 is_icsk +u8:1 freebind +u8:1 hdrincl +u8:1 mc_loop +u8:1 transparent +u8:1 mc_all +u8:1 nodefrag +u8:1 bind_address_no_port +u8:1 recverr_rfc4884 +u8:1 defer_connect read_mostly tcp_sendmsg_fastopen +u8 rcv_tos +u8 convert_csum +int uc_index +int mc_index +be32 mc_addr +struct ip_mc_socklist* mc_list +struct inet_cork_full cork read_mostly __tcp_transmit_skb +struct local_port_range +======================== ===================== =================== =================== ====================================================================================================== diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst index 6e7b20afd2d4..beaf1880a19b 100644 --- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst +++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst @@ -102,7 +102,8 @@ u8 sysctl_tcp_app_win u8 sysctl_tcp_frto tcp_enter_loss u8 sysctl_tcp_nometrics_save TCP_LAST_ACK/tcp_update_metrics u8 sysctl_tcp_no_ssthresh_metrics_save TCP_LAST_ACK/tcp_(update/init)_metrics -u8 sysctl_tcp_moderate_rcvbuf read_mostly read_mostly tcp_tso_should_defer(tx);tcp_rcv_space_adjust(rx) +u8 sysctl_tcp_moderate_rcvbuf read_mostly tcp_rcvbuf_grow() +u32 sysctl_tcp_rcvbuf_low_rtt read_mostly tcp_rcvbuf_grow() u8 sysctl_tcp_tso_win_divisor read_mostly tcp_tso_should_defer(tcp_write_xmit) u8 sysctl_tcp_workaround_signed_windows tcp_select_window int sysctl_tcp_limit_output_bytes read_mostly tcp_small_queue_check(tcp_write_xmit) diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst index 2555e75e5cc1..4ab5d7b05cf1 100644 --- a/Documentation/networking/netconsole.rst +++ b/Documentation/networking/netconsole.rst @@ -88,7 +88,7 @@ for example: nc -u -l -p <port>' / 'nc -u -l <port> - or:: + or:: netcat -u -l -p <port>' / 'netcat -u -l <port> diff --git a/Documentation/networking/nfc.rst b/Documentation/networking/nfc.rst index 9aab3a88c9b2..401735006143 100644 --- a/Documentation/networking/nfc.rst +++ b/Documentation/networking/nfc.rst @@ -71,7 +71,8 @@ Userspace interface The userspace interface is divided in control operations and low-level data exchange operation. -CONTROL OPERATIONS: +Control operations +------------------ Generic netlink is used to implement the interface to the control operations. The operations are composed by commands and events, all listed below: @@ -100,7 +101,8 @@ relevant information such as the supported NFC protocols. All polling operations requested through one netlink socket are stopped when it's closed. -LOW-LEVEL DATA EXCHANGE: +Low-level data exchange +----------------------- The userspace must use PF_NFC sockets to perform any data communication with targets. All NFC sockets use AF_NFC:: diff --git a/Documentation/networking/smc-sysctl.rst b/Documentation/networking/smc-sysctl.rst index a874d007f2db..904a910f198e 100644 --- a/Documentation/networking/smc-sysctl.rst +++ b/Documentation/networking/smc-sysctl.rst @@ -71,3 +71,43 @@ smcr_max_conns_per_lgr - INTEGER acceptable value ranges from 16 to 255. Only for SMC-R v2.1 and later. Default: 255 + +smcr_max_send_wr - INTEGER + So-called work request buffers are SMCR link (and RDMA queue pair) level + resources necessary for performing RDMA operations. Since up to 255 + connections can share a link group and thus also a link and the number + of the work request buffers is decided when the link is allocated, + depending on the workload it can be a bottleneck in a sense that threads + have to wait for work request buffers to become available. Before the + introduction of this control the maximal number of work request buffers + available on the send path used to be hard coded to 16. With this control + it becomes configurable. The acceptable range is between 2 and 2048. + + Please be aware that all the buffers need to be allocated as a physically + continuous array in which each element is a single buffer and has the size + of SMC_WR_BUF_SIZE (48) bytes. If the allocation fails, we keep retrying + with half of the buffer count until it is ether successful or (unlikely) + we dip below the old hard coded value which is 16 where we give up much + like before having this control. + + Default: 16 + +smcr_max_recv_wr - INTEGER + So-called work request buffers are SMCR link (and RDMA queue pair) level + resources necessary for performing RDMA operations. Since up to 255 + connections can share a link group and thus also a link and the number + of the work request buffers is decided when the link is allocated, + depending on the workload it can be a bottleneck in a sense that threads + have to wait for work request buffers to become available. Before the + introduction of this control the maximal number of work request buffers + available on the receive path used to be hard coded to 16. With this control + it becomes configurable. The acceptable range is between 2 and 2048. + + Please be aware that all the buffers need to be allocated as a physically + continuous array in which each element is a single buffer and has the size + of SMC_WR_BUF_SIZE (48) bytes. If the allocation fails, we keep retrying + with half of the buffer count until it is ether successful or (unlikely) + we dip below the old hard coded value which is 16 where we give up much + like before having this control. + + Default: 48 diff --git a/Documentation/networking/statistics.rst b/Documentation/networking/statistics.rst index 518284e287b0..66b0ef941457 100644 --- a/Documentation/networking/statistics.rst +++ b/Documentation/networking/statistics.rst @@ -184,9 +184,11 @@ Protocol-related statistics can be requested in get commands by setting the `ETHTOOL_FLAG_STATS` flag in `ETHTOOL_A_HEADER_FLAGS`. Currently statistics are supported in the following commands: - - `ETHTOOL_MSG_PAUSE_GET` - `ETHTOOL_MSG_FEC_GET` + - `ETHTOOL_MSG_LINKSTATE_GET` - `ETHTOOL_MSG_MM_GET` + - `ETHTOOL_MSG_PAUSE_GET` + - `ETHTOOL_MSG_TSINFO_GET` debugfs ------- diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst index 36cc7afc2527..980c442d7161 100644 --- a/Documentation/networking/tls.rst +++ b/Documentation/networking/tls.rst @@ -280,6 +280,26 @@ If the record decrypted turns out to had been padded or is not a data record it will be decrypted again into a kernel buffer without zero copy. Such events are counted in the ``TlsDecryptRetry`` statistic. +TLS_TX_MAX_PAYLOAD_LEN +~~~~~~~~~~~~~~~~~~~~~~ + +Specifies the maximum size of the plaintext payload for transmitted TLS records. + +When this option is set, the kernel enforces the specified limit on all outgoing +TLS records. No plaintext fragment will exceed this size. This option can be used +to implement the TLS Record Size Limit extension [1]. + +* For TLS 1.2, the value corresponds directly to the record size limit. +* For TLS 1.3, the value should be set to record_size_limit - 1, since + the record size limit includes one additional byte for the ContentType + field. + +The valid range for this option is 64 to 16384 bytes for TLS 1.2, and 63 to +16384 bytes for TLS 1.3. The lower minimum for TLS 1.3 accounts for the +extra byte used by the ContentType field. + +[1] https://datatracker.ietf.org/doc/html/rfc8449 + Statistics ========== diff --git a/Documentation/networking/xfrm/index.rst b/Documentation/networking/xfrm/index.rst new file mode 100644 index 000000000000..7d866da836fe --- /dev/null +++ b/Documentation/networking/xfrm/index.rst @@ -0,0 +1,13 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============== +XFRM Framework +============== + +.. toctree:: + :maxdepth: 2 + + xfrm_device + xfrm_proc + xfrm_sync + xfrm_sysctl diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm/xfrm_device.rst index 122204da0fff..b0d85a5f57d1 100644 --- a/Documentation/networking/xfrm_device.rst +++ b/Documentation/networking/xfrm/xfrm_device.rst @@ -20,11 +20,15 @@ can radically increase throughput and decrease CPU utilization. The XFRM Device interface allows NIC drivers to offer to the stack access to the hardware offload. -Right now, there are two types of hardware offload that kernel supports. +Right now, there are two types of hardware offload that kernel supports: + * IPsec crypto offload: + * NIC performs encrypt/decrypt * Kernel does everything else + * IPsec packet offload: + * NIC performs encrypt/decrypt * NIC does encapsulation * Kernel and NIC have SA and policy in-sync @@ -34,7 +38,7 @@ Right now, there are two types of hardware offload that kernel supports. Userland access to the offload is typically through a system such as libreswan or KAME/raccoon, but the iproute2 'ip xfrm' command set can be handy when experimenting. An example command might look something -like this for crypto offload: +like this for crypto offload:: ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \ reqid 0x07 replay-window 32 \ @@ -42,7 +46,7 @@ like this for crypto offload: sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \ offload dev eth4 dir in -and for packet offload +and for packet offload:: ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \ reqid 0x07 replay-window 32 \ @@ -153,26 +157,26 @@ the packet's skb. At this point the data should be decrypted but the IPsec headers are still in the packet data; they are removed later up the stack in xfrm_input(). - find and hold the SA that was used to the Rx skb:: +1. Find and hold the SA that was used to the Rx skb:: - get spi, protocol, and destination IP from packet headers + /* get spi, protocol, and destination IP from packet headers */ xs = find xs from (spi, protocol, dest_IP) xfrm_state_hold(xs); - store the state information into the skb:: +2. Store the state information into the skb:: sp = secpath_set(skb); if (!sp) return; sp->xvec[sp->len++] = xs; sp->olen++; - indicate the success and/or error status of the offload:: +3. Indicate the success and/or error status of the offload:: xo = xfrm_offload(skb); xo->flags = CRYPTO_DONE; xo->status = crypto_status; - hand the packet to napi_gro_receive() as usual +4. Hand the packet to napi_gro_receive() as usual. In ESN mode, xdo_dev_state_advance_esn() is called from xfrm_replay_advance_esn() for RX, and xfrm_replay_overflow_offload_esn for TX. diff --git a/Documentation/networking/xfrm_proc.rst b/Documentation/networking/xfrm/xfrm_proc.rst index 973d1571acac..973d1571acac 100644 --- a/Documentation/networking/xfrm_proc.rst +++ b/Documentation/networking/xfrm/xfrm_proc.rst diff --git a/Documentation/networking/xfrm_sync.rst b/Documentation/networking/xfrm/xfrm_sync.rst index 6246503ceab2..dfc2ec0df380 100644 --- a/Documentation/networking/xfrm_sync.rst +++ b/Documentation/networking/xfrm/xfrm_sync.rst @@ -1,8 +1,8 @@ .. SPDX-License-Identifier: GPL-2.0 -==== -XFRM -==== +========= +XFRM sync +========= The sync patches work is based on initial patches from Krisztian <hidden@balabit.hu> and others and additional patches @@ -36,7 +36,7 @@ is not driven by packet arrival. - the replay sequence for both inbound and outbound 1) Message Structure ----------------------- +-------------------- nlmsghdr:aevent_id:optional-TLVs. @@ -83,31 +83,31 @@ when going from kernel to user space) A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS to get notified of these events. -2) TLVS reflect the different parameters: ------------------------------------------ +2) TLVS reflect the different parameters +---------------------------------------- a) byte value (XFRMA_LTIME_VAL) -This TLV carries the running/current counter for byte lifetime since -last event. + This TLV carries the running/current counter for byte lifetime since + last event. -b)replay value (XFRMA_REPLAY_VAL) +b) replay value (XFRMA_REPLAY_VAL) -This TLV carries the running/current counter for replay sequence since -last event. + This TLV carries the running/current counter for replay sequence since + last event. -c)replay threshold (XFRMA_REPLAY_THRESH) +c) replay threshold (XFRMA_REPLAY_THRESH) -This TLV carries the threshold being used by the kernel to trigger events -when the replay sequence is exceeded. + This TLV carries the threshold being used by the kernel to trigger events + when the replay sequence is exceeded. d) expiry timer (XFRMA_ETIMER_THRESH) -This is a timer value in milliseconds which is used as the nagle -value to rate limit the events. + This is a timer value in milliseconds which is used as the nagle + value to rate limit the events. -3) Default configurations for the parameters: ---------------------------------------------- +3) Default configurations for the parameters +-------------------------------------------- By default these events should be turned off unless there is at least one listener registered to listen to the multicast @@ -121,12 +121,14 @@ in case they are not specified. the two sysctls/proc entries are: a) /proc/sys/net/core/sysctl_xfrm_aevent_etime -used to provide default values for the XFRMA_ETIMER_THRESH in incremental -units of time of 100ms. The default is 10 (1 second) + + Used to provide default values for the XFRMA_ETIMER_THRESH in incremental + units of time of 100ms. The default is 10 (1 second) b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth -used to provide default values for XFRMA_REPLAY_THRESH parameter -in incremental packet count. The default is two packets. + + Used to provide default values for XFRMA_REPLAY_THRESH parameter + in incremental packet count. The default is two packets. 4) Message types ---------------- @@ -134,50 +136,51 @@ in incremental packet count. The default is two packets. a) XFRM_MSG_GETAE issued by user-->kernel. XFRM_MSG_GETAE does not carry any TLVs. -The response is a XFRM_MSG_NEWAE which is formatted based on what -XFRM_MSG_GETAE queried for. + The response is a XFRM_MSG_NEWAE which is formatted based on what + XFRM_MSG_GETAE queried for. + + The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. -The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. -* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved -* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved + * if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved + * if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved b) XFRM_MSG_NEWAE is issued by either user space to configure or kernel to announce events or respond to a XFRM_MSG_GETAE. -i) user --> kernel to configure a specific SA. + i) user --> kernel to configure a specific SA. -any of the values or threshold parameters can be updated by passing the -appropriate TLV. + any of the values or threshold parameters can be updated by passing the + appropriate TLV. -A response is issued back to the sender in user space to indicate success -or failure. + A response is issued back to the sender in user space to indicate success + or failure. -In the case of success, additionally an event with -XFRM_MSG_NEWAE is also issued to any listeners as described in iii). + In the case of success, additionally an event with + XFRM_MSG_NEWAE is also issued to any listeners as described in iii). -ii) kernel->user direction as a response to XFRM_MSG_GETAE + ii) kernel->user direction as a response to XFRM_MSG_GETAE -The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. + The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. -The threshold TLVs will be included if explicitly requested in -the XFRM_MSG_GETAE message. + The threshold TLVs will be included if explicitly requested in + the XFRM_MSG_GETAE message. -iii) kernel->user to report as event if someone sets any values or - thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above). - In such a case XFRM_AE_CU flag is set to inform the user that - the change happened as a result of an update. - The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. + iii) kernel->user to report as event if someone sets any values or + thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above). + In such a case XFRM_AE_CU flag is set to inform the user that + the change happened as a result of an update. + The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. -iv) kernel->user to report event when replay threshold or a timeout - is exceeded. + iv) kernel->user to report event when replay threshold or a timeout + is exceeded. In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout happened) is set to inform the user what happened. Note the two flags are mutually exclusive. The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. -Exceptions to threshold settings --------------------------------- +5) Exceptions to threshold settings +----------------------------------- If you have an SA that is getting hit by traffic in bursts such that there is a period where the timer threshold expires with no packets diff --git a/Documentation/networking/xfrm_sysctl.rst b/Documentation/networking/xfrm/xfrm_sysctl.rst index 47b9bbdd0179..7d0c4b17c0bd 100644 --- a/Documentation/networking/xfrm_sysctl.rst +++ b/Documentation/networking/xfrm/xfrm_sysctl.rst @@ -4,8 +4,8 @@ XFRM Syscall ============ -/proc/sys/net/core/xfrm_* Variables: -==================================== +/proc/sys/net/core/xfrm_* Variables +=================================== xfrm_acq_expires - INTEGER default 30 - hard timeout in seconds for acquire requests |
