Diffstat (limited to 'Documentation/filesystems')
| -rw-r--r-- | Documentation/filesystems/ext4/inodes.rst | 2 |
| -rw-r--r-- | Documentation/filesystems/ext4/super.rst | 4 |
| -rw-r--r-- | Documentation/filesystems/fscrypt.rst | 2 |
| -rw-r--r-- | Documentation/filesystems/gfs2/glocks.rst (renamed from Documentation/filesystems/gfs2-glocks.rst) | 0 |
| -rw-r--r-- | Documentation/filesystems/gfs2/index.rst (renamed from Documentation/filesystems/gfs2.rst) | 12 |
| -rw-r--r-- | Documentation/filesystems/gfs2/uevents.rst (renamed from Documentation/filesystems/gfs2-uevents.rst) | 0 |
| -rw-r--r-- | Documentation/filesystems/index.rst | 4 |
| -rw-r--r-- | Documentation/filesystems/iomap/operations.rst | 50 |
| -rw-r--r-- | Documentation/filesystems/porting.rst | 15 |
| -rw-r--r-- | Documentation/filesystems/ramfs-rootfs-initramfs.rst | 12 |
| -rw-r--r-- | Documentation/filesystems/resctrl.rst | 134 |
| -rw-r--r-- | Documentation/filesystems/xfs/xfs-online-fsck-design.rst | 238 |
12 files changed, 200 insertions, 273 deletions
diff --git a/Documentation/filesystems/ext4/inodes.rst b/Documentation/filesystems/ext4/inodes.rst index cfc6c1659931..55cd5c380e92 100644 --- a/Documentation/filesystems/ext4/inodes.rst +++ b/Documentation/filesystems/ext4/inodes.rst @@ -297,6 +297,8 @@ The ``i_flags`` field is a combination of these values: - Inode has inline data (EXT4_INLINE_DATA_FL). * - 0x20000000 - Create children with the same project ID (EXT4_PROJINHERIT_FL). + * - 0x40000000 + - Use case-insensitive lookups for directory contents (EXT4_CASEFOLD_FL). * - 0x80000000 - Reserved for ext4 library (EXT4_RESERVED_FL). * - diff --git a/Documentation/filesystems/ext4/super.rst b/Documentation/filesystems/ext4/super.rst index 1b240661bfa3..9a59cded9bd7 100644 --- a/Documentation/filesystems/ext4/super.rst +++ b/Documentation/filesystems/ext4/super.rst @@ -671,7 +671,9 @@ following: * - 0x8000 - Data in inode (INCOMPAT_INLINE_DATA). * - 0x10000 - - Encrypted inodes are present on the filesystem. (INCOMPAT_ENCRYPT). + - Encrypted inodes can be present. (INCOMPAT_ENCRYPT). + * - 0x20000 + - Directories can be marked case-insensitive. (INCOMPAT_CASEFOLD). .. _super_rocompat: diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst index 696a5844bfa3..70af896822e1 100644 --- a/Documentation/filesystems/fscrypt.rst +++ b/Documentation/filesystems/fscrypt.rst @@ -450,9 +450,7 @@ API, but the filenames mode still does. 
- CONFIG_CRYPTO_HCTR2 - Recommended: - arm64: CONFIG_CRYPTO_AES_ARM64_CE_BLK - - arm64: CONFIG_CRYPTO_POLYVAL_ARM64_CE - x86: CONFIG_CRYPTO_AES_NI_INTEL - - x86: CONFIG_CRYPTO_POLYVAL_CLMUL_NI - Adiantum - Mandatory: diff --git a/Documentation/filesystems/gfs2-glocks.rst b/Documentation/filesystems/gfs2/glocks.rst index ce5ff08cbd59..ce5ff08cbd59 100644 --- a/Documentation/filesystems/gfs2-glocks.rst +++ b/Documentation/filesystems/gfs2/glocks.rst diff --git a/Documentation/filesystems/gfs2.rst b/Documentation/filesystems/gfs2/index.rst index 1bc48a13430c..e5e195403561 100644 --- a/Documentation/filesystems/gfs2.rst +++ b/Documentation/filesystems/gfs2/index.rst @@ -4,6 +4,9 @@ Global File System 2 ==================== +Overview +======== + GFS2 is a cluster file system. It allows a cluster of computers to simultaneously use a block device that is shared between them (with FC, iSCSI, NBD, etc). GFS2 reads and writes to the block device like a local @@ -50,3 +53,12 @@ The following man pages are available from gfs2-utils: gfs2_convert to convert a gfs filesystem to GFS2 in-place mkfs.gfs2 to make a filesystem ============ ============================================= + +Implementation Notes +==================== + +.. toctree:: + :maxdepth: 1 + + glocks + uevents diff --git a/Documentation/filesystems/gfs2-uevents.rst b/Documentation/filesystems/gfs2/uevents.rst index f162a2c76c69..f162a2c76c69 100644 --- a/Documentation/filesystems/gfs2-uevents.rst +++ b/Documentation/filesystems/gfs2/uevents.rst diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst index af516e528ded..f4873197587d 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -89,9 +89,7 @@ Documentation for filesystem implementations. 
ext3 ext4/index f2fs - gfs2 - gfs2-uevents - gfs2-glocks + gfs2/index hfs hfsplus hpfs diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst index 387fd9cc72ca..da982ca7e413 100644 --- a/Documentation/filesystems/iomap/operations.rst +++ b/Documentation/filesystems/iomap/operations.rst @@ -135,6 +135,27 @@ These ``struct kiocb`` flags are significant for buffered I/O with iomap: * ``IOCB_DONTCACHE``: Turns on ``IOMAP_DONTCACHE``. +``struct iomap_read_ops`` +-------------------------- + +.. code-block:: c + + struct iomap_read_ops { + int (*read_folio_range)(const struct iomap_iter *iter, + struct iomap_read_folio_ctx *ctx, size_t len); + void (*submit_read)(struct iomap_read_folio_ctx *ctx); + }; + +iomap calls these functions: + + - ``read_folio_range``: Called to read in the range. This must be provided + by the caller. If this succeeds, iomap_finish_folio_read() must be called + after the range is read in, regardless of whether the read succeeded or + failed. + + - ``submit_read``: Submit any pending read requests. This function is + optional. + Internal per-Folio State ------------------------ @@ -182,6 +203,28 @@ The ``flags`` argument to ``->iomap_begin`` will be set to zero. The pagecache takes whatever locks it needs before calling the filesystem. +Both ``iomap_readahead`` and ``iomap_read_folio`` pass in a ``struct +iomap_read_folio_ctx``: + +.. code-block:: c + + struct iomap_read_folio_ctx { + const struct iomap_read_ops *ops; + struct folio *cur_folio; + struct readahead_control *rac; + void *read_ctx; + }; + +``iomap_readahead`` must set: + * ``ops->read_folio_range()`` and ``rac`` + +``iomap_read_folio`` must set: + * ``ops->read_folio_range()`` and ``cur_folio`` + +``ops->submit_read()`` and ``read_ctx`` are optional. ``read_ctx`` is used to +pass in any custom data the caller needs accessible in the ops callbacks for +fulfilling reads. 
+ Buffered Writes --------------- @@ -317,6 +360,9 @@ The fields are as follows: delalloc reservations to avoid having delalloc reservations for clean pagecache. This function must be supplied by the filesystem. + If this succeeds, iomap_finish_folio_write() must be called once writeback + completes for the range, regardless of whether the writeback succeeded or + failed. - ``writeback_submit``: Submit the previous built writeback context. Block based file systems should use the iomap_ioend_writeback_submit @@ -444,10 +490,6 @@ These ``struct kiocb`` flags are significant for direct I/O with iomap: Only meaningful for asynchronous I/O, and only if the entire I/O can be issued as a single ``struct bio``. - * ``IOCB_DIO_CALLER_COMP``: Try to run I/O completion from the caller's - process context. - See ``linux/fs.h`` for more details. - Filesystems should call ``iomap_dio_rw`` from ``->read_iter`` and ``->write_iter``, and set ``FMODE_CAN_ODIRECT`` in the ``->open`` function for the file. diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst index 7233b04668fc..d33429294252 100644 --- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -211,7 +211,7 @@ test and set for you. e.g.:: inode = iget_locked(sb, ino); - if (inode->i_state & I_NEW) { + if (inode_state_read_once(inode) & I_NEW) { err = read_inode_from_disk(inode); if (err < 0) { iget_failed(inode); @@ -1309,3 +1309,16 @@ a different length, use vfs_parse_fs_qstr(fc, key, &QSTR_LEN(value, len)) instead. + +--- + +**mandatory** + +vfs_mkdir() now returns a dentry - the one returned by ->mkdir(). If +that dentry is different from the dentry passed in, including if it is +an IS_ERR() dentry pointer, the original dentry is dput(). + +When vfs_mkdir() returns an error, and so both dputs() the original +dentry and doesn't provide a replacement, it also unlocks the parent. 
+Consequently the return value from vfs_mkdir() can be passed to +end_creating() and the parent will be unlocked precisely when necessary. diff --git a/Documentation/filesystems/ramfs-rootfs-initramfs.rst b/Documentation/filesystems/ramfs-rootfs-initramfs.rst index fa4f81099cb4..a9d271e171c3 100644 --- a/Documentation/filesystems/ramfs-rootfs-initramfs.rst +++ b/Documentation/filesystems/ramfs-rootfs-initramfs.rst @@ -290,11 +290,11 @@ Why cpio rather than tar? This decision was made back in December, 2001. The discussion started here: - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1538.html +- https://lore.kernel.org/lkml/a03cke$640$1@cesium.transmeta.com/ And spawned a second thread (specifically on tar vs cpio), starting here: - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1587.html +- https://lore.kernel.org/lkml/3C25A06D.7030408@zytor.com/ The quick and dirty summary version (which is no substitute for reading the above threads) is: @@ -310,7 +310,7 @@ the above threads) is: either way about the archive format, and there are alternative tools, such as: - http://freecode.com/projects/afio + https://linux.die.net/man/1/afio 2) The cpio archive format chosen by the kernel is simpler and cleaner (and thus easier to create and parse) than any of the (literally dozens of) @@ -331,12 +331,12 @@ the above threads) is: 5) Al Viro made the decision (quote: "tar is ugly as hell and not going to be supported on the kernel side"): - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1540.html + - https://lore.kernel.org/lkml/Pine.GSO.4.21.0112222109050.21702-100000@weyl.math.psu.edu/ explained his reasoning: - - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1550.html - - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1638.html + - https://lore.kernel.org/lkml/Pine.GSO.4.21.0112222240530.21702-100000@weyl.math.psu.edu/ + - https://lore.kernel.org/lkml/Pine.GSO.4.21.0112230849550.23300-100000@weyl.math.psu.edu/ and, most importantly, designed 
and implemented the initramfs code. diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst index b7f35b07876a..8c8ce678148a 100644 --- a/Documentation/filesystems/resctrl.rst +++ b/Documentation/filesystems/resctrl.rst @@ -17,17 +17,18 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS). This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo flag bits: -=============================================== ================================ -RDT (Resource Director Technology) Allocation "rdt_a" -CAT (Cache Allocation Technology) "cat_l3", "cat_l2" -CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2" -CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc" -MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local" -MBA (Memory Bandwidth Allocation) "mba" -SMBA (Slow Memory Bandwidth Allocation) "" -BMEC (Bandwidth Monitoring Event Configuration) "" -ABMC (Assignable Bandwidth Monitoring Counters) "" -=============================================== ================================ +=============================================================== ================================ +RDT (Resource Director Technology) Allocation "rdt_a" +CAT (Cache Allocation Technology) "cat_l3", "cat_l2" +CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2" +CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc" +MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local" +MBA (Memory Bandwidth Allocation) "mba" +SMBA (Slow Memory Bandwidth Allocation) "" +BMEC (Bandwidth Monitoring Event Configuration) "" +ABMC (Assignable Bandwidth Monitoring Counters) "" +SDCIAE (Smart Data Cache Injection Allocation Enforcement) "" +=============================================================== ================================ Historically, new features were made visible by default in /proc/cpuinfo. This resulted in the feature flags becoming hard to parse by humans. 
Adding a new @@ -72,6 +73,11 @@ The 'info' directory contains information about the enabled resources. Each resource has its own subdirectory. The subdirectory names reflect the resource names. +Most of the files in the resource's subdirectory are read-only, and +describe properties of the resource. Resources that support global +configuration options also include writable files that can be used +to modify those settings. + Each subdirectory contains the following files with respect to allocation: @@ -90,12 +96,19 @@ related to allocation: must be set when writing a mask. "shareable_bits": - Bitmask of shareable resource with other executing - entities (e.g. I/O). User can use this when - setting up exclusive cache partitions. Note that - some platforms support devices that have their - own settings for cache use which can over-ride - these bits. + Bitmask of shareable resource with other executing entities + (e.g. I/O). Applies to all instances of this resource. User + can use this when setting up exclusive cache partitions. + Note that some platforms support devices that have their + own settings for cache use which can over-ride these bits. + + When "io_alloc" is enabled, a portion of each cache instance can + be configured for shared use between hardware and software. + "bit_usage" should be used to see which portions of each cache + instance is configured for hardware use via "io_alloc" feature + because every cache instance can have its "io_alloc" bitmask + configured independently via "io_alloc_cbm". + "bit_usage": Annotated capacity bitmasks showing how all instances of the resource are used. The legend is: @@ -109,16 +122,16 @@ related to allocation: "H": Corresponding region is used by hardware only but available for software use. If a resource - has bits set in "shareable_bits" but not all - of these bits appear in the resource groups' - schematas then the bits appearing in - "shareable_bits" but no resource group will - be marked as "H". 
+ has bits set in "shareable_bits" or "io_alloc_cbm" + but not all of these bits appear in the resource + groups' schemata then the bits appearing in + "shareable_bits" or "io_alloc_cbm" but no + resource group will be marked as "H". "X": Corresponding region is available for sharing and - used by hardware and software. These are the - bits that appear in "shareable_bits" as - well as a resource group's allocation. + used by hardware and software. These are the bits + that appear in "shareable_bits" or "io_alloc_cbm" + as well as a resource group's allocation. "S": Corresponding region is used by software and available for sharing. @@ -136,6 +149,77 @@ related to allocation: "1": Non-contiguous 1s value in CBM is supported. +"io_alloc": + "io_alloc" enables system software to configure the portion of + the cache allocated for I/O traffic. File may only exist if the + system supports this feature on some of its cache resources. + + "disabled": + Resource supports "io_alloc" but the feature is disabled. + Portions of cache used for allocation of I/O traffic cannot + be configured. + "enabled": + Portions of cache used for allocation of I/O traffic + can be configured using "io_alloc_cbm". + "not supported": + Support not available for this resource. + + The feature can be modified by writing to the interface, for example: + + To enable:: + + # echo 1 > /sys/fs/resctrl/info/L3/io_alloc + + To disable:: + + # echo 0 > /sys/fs/resctrl/info/L3/io_alloc + + The underlying implementation may reduce resources available to + general (CPU) cache allocation. See architecture specific notes + below. Depending on usage requirements the feature can be enabled + or disabled. + + On AMD systems, io_alloc feature is supported by the L3 Smart + Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for + io_alloc is the highest CLOSID supported by the resource. 
When + io_alloc is enabled, the highest CLOSID is dedicated to io_alloc and + no longer available for general (CPU) cache allocation. When CDP is + enabled, io_alloc routes I/O traffic using the highest CLOSID allocated + for the instruction cache (CDP_CODE), making this CLOSID no longer + available for general (CPU) cache allocation for both the CDP_CODE + and CDP_DATA resources. + +"io_alloc_cbm": + Capacity bitmasks that describe the portions of cache instances to + which I/O traffic from supported I/O devices are routed when "io_alloc" + is enabled. + + CBMs are displayed in the following format: + + <cache_id0>=<cbm>;<cache_id1>=<cbm>;... + + Example:: + + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm + 0=ffff;1=ffff + + CBMs can be configured by writing to the interface. + + Example:: + + # echo 1=ff > /sys/fs/resctrl/info/L3/io_alloc_cbm + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm + 0=ffff;1=00ff + + # echo "0=ff;1=f" > /sys/fs/resctrl/info/L3/io_alloc_cbm + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm + 0=00ff;1=000f + + When CDP is enabled "io_alloc_cbm" associated with the CDP_DATA and CDP_CODE + resources may reflect the same values. For example, values read from and + written to /sys/fs/resctrl/info/L3DATA/io_alloc_cbm may be reflected by + /sys/fs/resctrl/info/L3CODE/io_alloc_cbm and vice versa. + Memory bandwidth(MB) subdirectory contains the following files with respect to allocation: diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index 8cbcd3c26434..3d9233f403db 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -105,10 +105,8 @@ occur; this capability aids both strategies. TLDR; Show Me the Code! 
----------------------- -Code is posted to the kernel.org git trees as follows: -`kernel changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-symlink>`_, -`userspace changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-scan-service>`_, and -`QA test changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=repair-dirs>`_. +Kernel and userspace code has been fully merged as of October 2025. + Each kernel patchset adding an online repair function will use the same branch name across the kernel, xfsprogs, and fstests git repos. @@ -249,7 +247,7 @@ sharing and lock acquisition rules as the regular filesystem. This means that scrub cannot take *any* shortcuts to save time, because doing so could lead to concurrency problems. In other words, online fsck is not a complete replacement for offline fsck, and -a complete run of online fsck may take longer than online fsck. +a complete run of online fsck may take longer than offline fsck. However, both of these limitations are acceptable tradeoffs to satisfy the different motivations of online fsck, which are to **minimize system downtime** and to **increase predictability of operation**. @@ -764,12 +762,8 @@ allow the online fsck developers to compare online fsck against offline fsck, and they enable XFS developers to find deficiencies in the code base. Proposed patchsets include -`general fuzzer improvements -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuzzer-improvements>`_, `fuzzing baselines -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuzz-baseline>`_, -and `improvements in fuzz testing comprehensiveness -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=more-fuzz-testing>`_. +<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuzz-baseline>`_. 
Stress Testing -------------- @@ -801,11 +795,6 @@ Success is defined by the ability to run all of these tests without observing any unexpected filesystem shutdowns due to corrupted metadata, kernel hang check warnings, or any other sort of mischief. -Proposed patchsets include `general stress testing -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=race-scrub-and-mount-state-changes>`_ -and the `evolution of existing per-function stress testing -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=refactor-scrub-stress>`_. - 4. User Interface ================= @@ -886,10 +875,6 @@ apply as nice of a priority to IO and CPU scheduling as possible. This measure was taken to minimize delays in the rest of the filesystem. No such hardening has been performed for the cron job. -Proposed patchset: -`Enabling the xfs_scrub background service -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-scan-service>`_. - Health Reporting ---------------- @@ -912,13 +897,6 @@ notifications and initiate a repair? *Answer*: These questions remain unanswered, but should be a part of the conversation with early adopters and potential downstream users of XFS. -Proposed patchsets include -`wiring up health reports to correction returns -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=corruption-health-reports>`_ -and -`preservation of sickness info during memory reclaim -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting>`_. - 5. Kernel Algorithms and Data Structures ======================================== @@ -1310,21 +1288,6 @@ Space allocation records are cross-referenced as follows: are there the same number of reverse mapping records for each block as the reference count record claims? 
-Proposed patchsets are the series to find gaps in -`refcount btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-refcount-gaps>`_, -`inode btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-inobt-gaps>`_, and -`rmap btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-rmapbt-gaps>`_ records; -to find -`mergeable records -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-detect-mergeable-records>`_; -and to -`improve cross referencing with rmap -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-strengthen-rmap-checking>`_ -before starting a repair. - Checking Extended Attributes ```````````````````````````` @@ -1756,10 +1719,6 @@ For scrub, the drain works as follows: To avoid polling in step 4, the drain provides a waitqueue for scrub threads to be woken up whenever the intent count drops to zero. -The proposed patchset is the -`scrub intent drain series -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-drain-intents>`_. - .. _jump_labels: Static Keys (aka Jump Label Patching) @@ -2036,10 +1995,6 @@ The ``xfarray_store_anywhere`` function is used to insert a record in any null record slot in the bag; and the ``xfarray_unset`` function removes a record from the bag. -The proposed patchset is the -`big in-memory array -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=big-array>`_. - Iterating Array Elements ^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2172,10 +2127,6 @@ However, it should be noted that these repair functions only use blob storage to cache a small number of entries before adding them to a temporary ondisk file, which is why compaction is not required. 
-The proposed patchset is at the start of the -`extended attribute repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-xattrs>`_ series. - .. _xfbtree: In-Memory B+Trees @@ -2214,11 +2165,6 @@ xfiles enables reuse of the entire btree library. Btrees built atop an xfile are collectively known as ``xfbtrees``. The next few sections describe how they actually work. -The proposed patchset is the -`in-memory btree -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=in-memory-btrees>`_ -series. - Using xfiles as a Buffer Cache Target ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2459,14 +2405,6 @@ This enables the log to release the old EFI to keep the log moving forwards. EFIs have a role to play during the commit and reaping phases; please see the next section and the section about :ref:`reaping<reaping>` for more details. -Proposed patchsets are the -`bitmap rework -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-bitmap-rework>`_ -and the -`preparation for bulk loading btrees -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-prep-for-bulk-loading>`_. - - Writing the New Tree ```````````````````` @@ -2623,11 +2561,6 @@ The number of records for the inode btree is the number of xfarray records, but the record count for the free inode btree has to be computed as inode chunk records are stored in the xfarray. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - Case Study: Rebuilding the Space Reference Counts ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2716,11 +2649,6 @@ Reverse mappings are added to the bag using ``xfarray_store_anywhere`` and removed via ``xfarray_unset``. Bag members are examined through ``xfarray_iter`` loops. 
-The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - Case Study: Rebuilding File Fork Mapping Indices ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2757,11 +2685,6 @@ EXTENTS format instead of BMBT, which may require a conversion. Third, the incore extent map must be reloaded carefully to avoid disturbing any delayed allocation extents. -The proposed patchset is the -`file mapping repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-file-mappings>`_ -series. - .. _reaping: Reaping Old Metadata Blocks @@ -2843,11 +2766,6 @@ blocks. As stated earlier, online repair functions use very large transactions to minimize the chances of this occurring. -The proposed patchset is the -`preparation for bulk loading btrees -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-prep-for-bulk-loading>`_ -series. - Case Study: Reaping After a Regular Btree Repair ```````````````````````````````````````````````` @@ -2943,11 +2861,6 @@ When the walk is complete, the bitmap disunion operation ``(ag_owner_bitmap & btrees. These blocks can then be reaped using the methods outlined above. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. - .. _rmap_reap: Case Study: Reaping After Repairing Reverse Mapping Btrees @@ -2972,11 +2885,6 @@ methods outlined above. The rest of the process of rebuildng the reverse mapping btree is discussed in a separate :ref:`case study<rmap_repair>`. -The proposed patchset is the -`AG btree repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_ -series. 
- Case Study: Rebuilding the AGFL ``````````````````````````````` @@ -3024,11 +2932,6 @@ more complicated, because computing the correct value requires traversing the forks, or if that fails, leaving the fields invalid and waiting for the fork fsck functions to run. -The proposed patchset is the -`inode -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-inodes>`_ -repair series. - Quota Record Repairs -------------------- @@ -3045,11 +2948,6 @@ checking are obviously bad limits and timer values. Quota usage counters are checked, repaired, and discussed separately in the section about :ref:`live quotacheck <quotacheck>`. -The proposed patchset is the -`quota -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quota>`_ -repair series. - .. _fscounters: Freezing to Fix Summary Counters @@ -3145,11 +3043,6 @@ long enough to check and correct the summary counters. | This bug was fixed in Linux 5.17. | +--------------------------------------------------------------------------+ -The proposed patchset is the -`summary counter cleanup -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-fscounters>`_ -series. - Full Filesystem Scans --------------------- @@ -3277,15 +3170,6 @@ Second, if the incore inode is stuck in some intermediate state, the scan coordinator must release the AGI and push the main filesystem to get the inode back into a loadable state. -The proposed patches are the -`inode scanner -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-iscan>`_ -series. -The first user of the new functionality is the -`online quotacheck -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck>`_ -series. 
- Inode Management ```````````````` @@ -3381,12 +3265,6 @@ To capture these nuances, the online fsck code has a separate ``xchk_irele`` function to set or clear the ``DONTCACHE`` flag to get the required release behavior. -Proposed patchsets include fixing -`scrub iget usage -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-iget-fixes>`_ and -`dir iget usage -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-dir-iget-fixes>`_. - .. _ilocking: Locking Inodes @@ -3443,11 +3321,6 @@ If the dotdot entry changes while the directory is unlocked, then a move or rename operation must have changed the child's parentage, and the scan can exit early. -The proposed patchset is the -`directory repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-dirs>`_ -series. - .. _fshooks: Filesystem Hooks @@ -3594,11 +3467,6 @@ The inode scan APIs are pretty simple: - ``xchk_iscan_teardown`` to finish the scan -This functionality is also a part of the -`inode scanner -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-iscan>`_ -series. - .. _quotacheck: Case Study: Quota Counter Checking @@ -3686,11 +3554,6 @@ needing to hold any locks for a long duration. If repairs are desired, the real and shadow dquots are locked and their resource counts are set to the values in the shadow dquot. -The proposed patchset is the -`online quotacheck -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck>`_ -series. - .. _nlinks: Case Study: File Link Count Checking @@ -3744,11 +3607,6 @@ shadow information. If no parents are found, the file must be :ref:`reparented <orphanage>` to the orphanage to prevent the file from being lost forever. -The proposed patchset is the -`file link count repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-nlinks>`_ -series. - .. 
_rmap_repair: Case Study: Rebuilding Reverse Mapping Records @@ -3828,11 +3686,6 @@ scan for reverse mapping records. 12. Free the xfbtree now that it not needed. -The proposed patchset is the -`rmap repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-rmap-btree>`_ -series. - Staging Repairs with Temporary Files on Disk -------------------------------------------- @@ -3971,11 +3824,6 @@ Once a good copy of a data file has been constructed in a temporary file, it must be conveyed to the file being repaired, which is the topic of the next section. -The proposed patches are in the -`repair temporary files -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-tempfiles>`_ -series. - Logged File Content Exchanges ----------------------------- @@ -4025,11 +3873,6 @@ The new ``XFS_SB_FEAT_INCOMPAT_EXCHRANGE`` incompatible feature flag in the superblock protects these new log item records from being replayed on old kernels. -The proposed patchset is the -`file contents exchange -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-updates>`_ -series. - +--------------------------------------------------------------------------+ | **Sidebar: Using Log-Incompatible Feature Flags** | +--------------------------------------------------------------------------+ @@ -4323,11 +4166,6 @@ To repair the summary file, write the xfile contents into the temporary file and use atomic mapping exchange to commit the new contents. The temporary file is then reaped. -The proposed patchset is the -`realtime summary repair -<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-rtsummary>`_ -series. - Case Study: Salvaging Extended Attributes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -4369,11 +4207,6 @@ Salvaging extended attributes is done as follows: 4. Reap the temporary file. 
-The proposed patchset is the
-`extended attribute repair
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-xattrs>`_
-series.
-
 Fixing Directories
 ------------------
 
@@ -4448,11 +4281,6 @@ Unfortunately, the current dentry cache design doesn't provide a means to walk
 every child dentry of a specific directory, which makes this a hard problem.
 There is no known solution.
 
-The proposed patchset is the
-`directory repair
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-dirs>`_
-series.
-
 Parent Pointers
 ```````````````
 
@@ -4612,11 +4440,6 @@ a :ref:`directory entry live update hook <liveupdate>` as follows:
 
 7. Reap the temporary directory.
 
-The proposed patchset is the
-`parent pointers directory repair
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=pptrs-fsck>`_
-series.
-
 Case Study: Repairing Parent Pointers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -4662,11 +4485,6 @@ directory reconstruction:
 
 8. Reap the temporary file.
 
-The proposed patchset is the
-`parent pointers repair
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=pptrs-fsck>`_
-series.
-
 Digression: Offline Checking of Parent Pointers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -4755,11 +4573,6 @@ connectivity checks:
 
 4. Move on to examining link counts, as we do today.
 
-The proposed patchset is the
-`offline parent pointers repair
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=pptrs-fsck>`_
-series.
-
 Rebuilding directories from parent pointers in offline repair would be very
 challenging because xfs_repair currently uses two single-pass scans of the
 filesystem during phases 3 and 4 to decide which files are corrupt enough to be
@@ -4903,12 +4716,6 @@ Repairing the directory tree works as follows:
 
 6. If the subdirectory has zero paths, attach it to the lost and found.
 
-The proposed patches are in the
-`directory tree repair
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-directory-tree>`_
-series.
-
-
 .. _orphanage:
 
 The Orphanage
@@ -4973,11 +4780,6 @@ Orphaned files are adopted by the orphanage as follows:
 7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all
    resources.
 
-The proposed patches are in the
-`orphanage adoption
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage>`_
-series.
-
 6. Userspace Algorithms and Data Structures
 ===========================================
 
@@ -5091,14 +4893,6 @@ first workqueue's workers until the backlog eases.
 This doesn't completely solve the balancing problem, but reduces it enough to
 move on to more pressing issues.
 
-The proposed patchsets are the scrub
-`performance tweaks
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-performance-tweaks>`_
-and the
-`inode scan rebalance
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-iscan-rebalance>`_
-series.
-
 .. _scrubrepair:
 
 Scheduling Repairs
@@ -5179,20 +4973,6 @@ immediately.
 Corrupt file data blocks reported by phase 6 cannot be recovered by the
 filesystem.
 
-The proposed patchsets are the
-`repair warning improvements
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-better-repair-warnings>`_,
-refactoring of the
-`repair data dependency
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-data-deps>`_
-and
-`object tracking
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-object-tracking>`_,
-and the
-`repair scheduling
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-scheduling>`_
-improvement series.
-
 Checking Names for Confusable Unicode Sequences
 -----------------------------------------------
 
@@ -5372,6 +5152,8 @@ The extra flexibility enables several new use cases:
   This emulates an atomic device write in software, and can support arbitrary
   scattered writes.
 
+(This functionality was merged into mainline as of 2025)
+
 Vectorized Scrub
 ----------------
 
@@ -5393,13 +5175,7 @@ It is hoped that ``io_uring`` will pick up enough of this functionality that
 online fsck can use that instead of adding a separate vectored scrub system call
 to XFS.
 
-The relevant patchsets are the
-`kernel vectorized scrub
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=vectorized-scrub>`_
-and
-`userspace vectorized scrub
-<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=vectorized-scrub>`_
-series.
+(This functionality was merged into mainline as of 2025)
 
 Quality of Service Targets for Scrub
 ------------------------------------
