user/sven/linux.git/include/linux/raid, branch v4.4.234

md/raid6 algorithms: delta syndrome functions

2015-04-21T22:00:41Z

v3: s-o-b comment, explanation of performance and descision for the start/stop implementation Implementing rmw functionality for RAID6 requires optimized syndrome calculation. Up to now we can only generate a complete syndrome. The target P/Q pages are always overwritten. With this patch we provide a framework for inplace P/Q modification. In the first place simply fill those functions with NULL values. xor_syndrome() has two additional parameters: start & stop. These will indicate the first and last page that are changing during a rmw run. That makes it possible to avoid several unneccessary loops and speed up calculation. The caller needs to implement the following logic to make the functions work. 1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source blocks inside P/Q between (and including) start and end. 2) modify any block with start <= block <= stop 3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of source blocks into P/Q between (and including) start and end. Pages between start and stop that won't be changed should be filled with a pointer to the kernel zero page. The reasons for not taking NULL pages are: 1) Algorithms cross the whole source data line by line. Thus avoid additional branches. 2) Having a NULL page avoids calculating the XOR P parity but still need calulation steps for the Q parity. Depending on the algorithm unrolling that might be only a difference of 2 instructions per loop. The benchmark numbers of the gen_syndrome() functions are displayed in the kernel log. Do the same for the xor_syndrome() functions. This will help to analyze performance problems and give an rough estimate how well the algorithm works. The choice of the fastest algorithm will still depend on the gen_syndrome() performance. With the start/stop page implementation the speed can vary a lot in real life. E.g. a change of page 0 & page 15 on a stripe will be harder to compute than the case where page 0 & page 1 are XOR candidates. To be not to enthusiatic about the expected speeds we will run a worse case test that simulates a change on the upper half of the stripe. So we do: 1) calculation of P/Q for the upper pages 2) continuation of Q for the lower (empty) pages Signed-off-by: Markus Stockhausen Signed-off-by: NeilBrown

Merge tag 'md/3.12' of git://neil.brown.name/md

2013-09-10T20:03:41Z

Pull md update from Neil Brown: "Headline item is multithreading for RAID5 so that more IO/sec can be supported on fast (SSD) devices. Also TILE-Gx SIMD suppor for RAID6 calculations and an assortment of bug fixes" * tag 'md/3.12' of git://neil.brown.name/md: raid5: only wakeup necessary threads md/raid5: flush out all pending requests before proceeding with reshape. md/raid5: use seqcount to protect access to shape in make_request. raid5: sysfs entry to control worker thread number raid5: offload stripe handle to workqueue raid5: fix stripe release order raid5: make release_stripe lockless md: avoid deadlock when dirty buffers during md_stop. md: Don't test all of mddev->flags at once. md: Fix apparent cut-and-paste error in super_90_validate raid6/test: replace echo -e with printf RAID: add tilegx SIMD implementation of raid6 md: fix safe_mode buglet. md: don't call md_allow_write in get_bitmap_file.

RAID: add tilegx SIMD implementation of raid6

2013-08-27T06:05:50Z

This change adds TILE-Gx SIMD instructions to the software raid (md), modeling the Altivec implementation. This is only for Syndrome generation; there is more that could be done to improve recovery, as in the recent Intel SSE3 recovery implementation. The code unrolls 8 times; this turns out to be the best on tilegx hardware among the set 1, 2, 4, 8 or 16. The code reads one cache-line of data from each disk, stores P and Q then goes to the next cache-line. The test code in sys/linux/lib/raid6/test reports 2008 MB/s data read rate for syndrome generation using 18 disks (16 data and 2 parity). It was 1512 MB/s before this SIMD optimizations. This is running on 1 core with all the data in cache. This is based on the paper The Mathematics of RAID-6. (http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf). Signed-off-by: Ken Steele Signed-off-by: Chris Metcalf Signed-off-by: NeilBrown

lib/raid6: add ARM-NEON accelerated syndrome calculation

2013-07-08T21:09:18Z

Rebased/reworked a patch contributed by Rob Herring that uses NEON intrinsics to perform the RAID-6 syndrome calculations. It uses the existing unroll.awk code to generate several unrolled versions of which the best performing one is selected at boot time. Signed-off-by: Ard Biesheuvel Acked-by: Nicolas Pitre Cc: hpa@linux.intel.com

UAPI: Remove empty Kbuild files

2013-01-03T01:36:10Z

Empty files can get deleted by the patch program, so remove empty Kbuild files and their links from the parent Kbuilds. Signed-off-by: David Howells Signed-off-by: Linus Torvalds

lib/raid6: Add AVX2 optimized gen_syndrome functions

2012-12-13T08:51:03Z

Add AVX2 optimized gen_syndrom functions, which is simply based on sse2.c written by hpa. Signed-off-by: Yuanhan Liu Reviewed-by: H. Peter Anvin Signed-off-by: Jim Kukunas Signed-off-by: NeilBrown

lib/raid6: Add AVX2 optimized recovery functions

2012-12-13T05:42:01Z

Optimize RAID6 recovery functions to take advantage of the 256-bit YMM integer instructions introduced in AVX2. The patch was tested and benchmarked before submission. However hardware is not yet released so benchmark numbers cannot be reported. Acked-by: "H. Peter Anvin" Signed-off-by: Jim Kukunas Signed-off-by: NeilBrown

UAPI: (Scripted) Disintegrate include/linux/raid

2012-10-09T08:49:02Z

Signed-off-by: David Howells Acked-by: Arnd Bergmann Acked-by: Thomas Gleixner Acked-by: Michael Kerrisk Acked-by: Paul E. McKenney Acked-by: Dave Jones

lib/raid6: Add SSSE3 optimized recovery functions

2012-05-22T03:54:18Z

Add SSSE3 optimized recovery functions, as well as a system for selecting the most appropriate recovery functions to use. Originally-by: H. Peter Anvin Signed-off-by: Jim Kukunas Signed-off-by: NeilBrown

md: add possibility to change data-offset for devices.

2012-05-20T23:27:00Z

When reshaping we can avoid costly intermediate backup by changing the 'start' address of the array on the device (if there is enough room). So as a first step, allow such a change to be requested through sysfs, and recorded in v1.x metadata. (As we didn't previous check that all 'pad' fields were zero, we need a new FEATURE flag for this. A (belatedly) check that all remaining 'pad' fields are zero to avoid a repeat of this) The new data offset must be requested separately for each device. This allows each to have a different change in the data offset. This is not likely to be used often but as data_offset can be set per-device, new_data_offset should be too. This patch also removes the 'acknowledged' arg to rdev_set_badblocks as it is never used and never will be. At the same time we add a new arg ('in_new') which is currently always zero but will be used more soon. When a reshape finishes we will need to update the data_offset and rdev->sectors. So provide an exported function to do that. Signed-off-by: NeilBrown