user/sven/linux.git/lib/raid6, branch v4.9.53

Merge tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md

2016-10-07T16:45:43Z

Pull MD updates from Shaohua Li:

"This update includes:
 - new AVX512 instruction based raid6 gen/recovery algorithm
 - a couple of md-cluster related bug fixes
 - fix a potential deadlock
 - set nonrotational bit for raid array with SSD
 - set correct max_hw_sectors for raid5/6, which hopefully can improve performance a little bit
 - other minor fixes"

* tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md: set rotational bit
  raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
  raid5: handle register_shrinker failure
  raid5: fix to detect failure of register_shrinker
  md: fix a potential deadlock
  md/bitmap: fix wrong cleanup
  raid5: allow arbitrary max_hw_sectors
  lib/raid6: Add AVX512 optimized xor_syndrome functions
  lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions
  lib/raid6: Add AVX512 optimized recovery functions
  lib/raid6: Add AVX512 optimized gen_syndrome functions
  md-cluster: make resync lock also could be interruptted
  md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang
  md-cluster: convert the completion to wait queue
  md-cluster: protect md_find_rdev_nr_rcu with rcu lock
  md-cluster: clean related infos of cluster
  md: changes for MD_STILL_CLOSED flag
  md-cluster: remove some unnecessary dlm_unlock_sync
  md-cluster: use FORCEUNLOCK in lockres_free
  md-cluster: call md_kick_rdev_from_array once ack failed

raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays

2016-09-26T23:18:21Z

Specify the aligned attribute on the char data[NDISKS][PAGE_SIZE], char recovi[PAGE_SIZE] and char recovj[PAGE_SIZE] arrays, so that these buffers are aligned to a page boundary. Without the alignment attributes, the test segfaults in userspace when NDISKS is changed from 16 to 4. The RAID stripes will be page aligned anyway, so we want to test what the kernel will actually execute.

Cc: H. Peter Anvin
Cc: Yu-cheng Yu
Signed-off-by: Gayatri Kammela
Reviewed-by: H. Peter Anvin
Signed-off-by: Shaohua Li
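As a rough illustration of the fix described above, the sketch below declares page-aligned static buffers with the GCC/Clang `aligned` attribute and checks their addresses. It is not the actual test.c: the values of NDISKS and PAGE_SIZE are assumptions, the second recovery buffer is called `recovj` here as an assumption, and the helpers `is_page_aligned` and `check_buffers` are hypothetical. One plausible reason alignment matters is that aligned vector loads and stores fault on misaligned addresses.

```c
#include <assert.h>
#include <stdint.h>

#define NDISKS    4      /* assumption: the small disk count from the commit text */
#define PAGE_SIZE 4096   /* assumption: a common page size, not taken from the source */

/* Buffers modeled on the arrays named in the commit message. */
static char data[NDISKS][PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
static char recovi[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
static char recovj[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));

/* Hypothetical helper: returns 1 if p lies on a PAGE_SIZE boundary. */
int is_page_aligned(const void *p)
{
    return ((uintptr_t)p % PAGE_SIZE) == 0;
}

/* Hypothetical helper: aborts if any buffer is not page aligned. */
void check_buffers(void)
{
    /* Each row is PAGE_SIZE bytes long, so every row inherits the alignment. */
    for (int i = 0; i < NDISKS; i++)
        assert(is_page_aligned(data[i]));
    assert(is_page_aligned(recovi));
    assert(is_page_aligned(recovj));
}
```

Because each row of `data` is exactly PAGE_SIZE bytes, aligning the array as a whole is enough to align every per-disk block.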

lib/raid6: Add AVX512 optimized xor_syndrome functions

2016-09-21T16:09:44Z

Optimize the RAID6 xor_syndrome functions to take advantage of the 512-bit ZMM integer instructions introduced in AVX512.

The AVX512 optimized xor_syndrome functions are based on sse2.c, written by hpa.

The patch was tested and benchmarked before submission on hardware that supports the AVX512 instructions.

Cc: H. Peter Anvin
Cc: Jim Kukunas
Cc: Fenghua Yu
Cc: Megha Dey
Signed-off-by: Gayatri Kammela
Reviewed-by: Fenghua Yu
Signed-off-by: Shaohua Li
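For readers unfamiliar with what an xor_syndrome routine computes, here is a minimal scalar sketch. It is not the AVX512 code from this commit. It assumes the conventional RAID-6 layout (data blocks first, then P, then Q), the GF(2^8) polynomial 0x11d, and that the blocks in the range start..stop hold the XOR of old and new data. The names `gf_mul2` and `xor_syndrome_ref` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing with the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Fold a delta into existing parity (read-modify-write update).
 * ptrs[start..stop] hold old-XOR-new data, ptrs[disks-2] is P and
 * ptrs[disks-1] is Q; both parity blocks are updated in place.
 */
void xor_syndrome_ref(int disks, int start, int stop,
                      size_t bytes, uint8_t **ptrs)
{
    assert(disks >= 3 && start >= 0 && start <= stop && stop <= disks - 3);

    uint8_t *p = ptrs[disks - 2];
    uint8_t *q = ptrs[disks - 1];

    for (size_t i = 0; i < bytes; i++) {
        uint8_t wp = ptrs[stop][i];
        uint8_t wq = wp;

        /* Horner's rule over the changed disks, highest index first. */
        for (int z = stop - 1; z >= start; z--) {
            uint8_t wd = ptrs[z][i];
            wp ^= wd;
            wq = gf_mul2(wq) ^ wd;
        }
        /* Unchanged lower disks contribute no data, only a power of 2. */
        for (int z = start - 1; z >= 0; z--)
            wq = gf_mul2(wq);

        p[i] ^= wp;
        q[i] ^= wq;
    }
}
```

The point of this form is that only the changed blocks and the two parity blocks are read, rather than the whole stripe.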

lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions

2016-09-21T16:09:44Z

Add the avx512 gen_syndrome and recovery functions so that the code can be compiled and tested successfully in userspace.

This patch was tested in userspace, and an improvement in performance was observed.

Cc: H. Peter Anvin
Cc: Jim Kukunas
Cc: Fenghua Yu
Signed-off-by: Megha Dey
Signed-off-by: Gayatri Kammela
Reviewed-by: Fenghua Yu
Signed-off-by: Shaohua Li

lib/raid6: Add AVX512 optimized recovery functions

2016-09-21T16:09:44Z

Optimize the RAID6 recovery functions to take advantage of the 512-bit ZMM integer instructions introduced in AVX512.

The AVX512 optimized recovery functions are based on recov_avx2.c, written by Jim Kukunas.

This patch was tested and benchmarked before submission on hardware that supports the AVX512 instructions.

Cc: Jim Kukunas
Cc: H. Peter Anvin
Cc: Fenghua Yu
Signed-off-by: Megha Dey
Signed-off-by: Gayatri Kammela
Reviewed-by: Fenghua Yu
Signed-off-by: Shaohua Li

lib/raid6: Add AVX512 optimized gen_syndrome functions

2016-09-21T16:09:44Z

Optimize the RAID6 gen_syndrome functions to take advantage of the 512-bit ZMM integer instructions introduced in AVX512.

The AVX512 optimized gen_syndrome functions are based on avx2.c, written by Yuanhan Liu, and sse2.c, written by hpa.

The patch was tested and benchmarked before submission on hardware that supports the AVX512 instructions.

Cc: H. Peter Anvin
Cc: Jim Kukunas
Cc: Fenghua Yu
Signed-off-by: Megha Dey
Signed-off-by: Gayatri Kammela
Reviewed-by: Fenghua Yu
Signed-off-by: Shaohua Li
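For orientation, the computation that a gen_syndrome routine performs can be written as a scalar loop. The sketch below is not the AVX512 implementation from this commit. It assumes the conventional RAID-6 layout (data blocks first, then P, then Q) and the GF(2^8) polynomial 0x11d, and the names `gf_mul2` and `gen_syndrome_ref` are hypothetical. A vectorized version would apply the same steps to many bytes per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing with the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Compute both parity blocks from scratch.
 * ptrs[0..disks-3] are data blocks, ptrs[disks-2] is P, ptrs[disks-1] is Q.
 * P is the plain XOR of the data; Q is the sum of 2^z * D_z in GF(2^8).
 */
void gen_syndrome_ref(int disks, size_t bytes, uint8_t **ptrs)
{
    assert(disks >= 3);

    int z0 = disks - 3;              /* index of the highest data disk */
    uint8_t *p = ptrs[disks - 2];
    uint8_t *q = ptrs[disks - 1];

    for (size_t i = 0; i < bytes; i++) {
        uint8_t wp = ptrs[z0][i];
        uint8_t wq = wp;

        /* Horner's rule: start at the highest disk and work down to disk 0. */
        for (int z = z0 - 1; z >= 0; z--) {
            uint8_t wd = ptrs[z][i];
            wp ^= wd;
            wq = gf_mul2(wq) ^ wd;
        }
        p[i] = wp;
        q[i] = wq;
    }
}
```

With two one-byte data blocks 0x01 and 0x80, this yields P = 0x81 and Q = 0x01 ^ (2 * 0x80) = 0x01 ^ 0x1d = 0x1c.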

RAID/s390: provide raid6 recovery optimization

2016-09-01T14:13:25Z

The XC instruction can be used to improve the speed of the raid6 recovery. The loops now operate on blocks of 256 bytes.

Signed-off-by: Martin Schwidefsky

RAID/s390: add SIMD implementation for raid6 gen/xor

2016-08-29T09:05:04Z

Using vector registers is slightly faster:

  raid6: vx128x8  gen() 19705 MB/s
  raid6: vx128x8  xor() 11886 MB/s
  raid6: using algorithm vx128x8 gen() 19705 MB/s
  raid6: .... xor() 11886 MB/s, rmw enabled

vs the software algorithms:

  raid6: int64x1  gen()  3018 MB/s
  raid6: int64x1  xor()  1429 MB/s
  raid6: int64x2  gen()  4661 MB/s
  raid6: int64x2  xor()  3143 MB/s
  raid6: int64x4  gen()  5392 MB/s
  raid6: int64x4  xor()  3509 MB/s
  raid6: int64x8  gen()  4441 MB/s
  raid6: int64x8  xor()  3207 MB/s
  raid6: using algorithm int64x4 gen() 5392 MB/s
  raid6: .... xor() 3509 MB/s, rmw enabled

Signed-off-by: Martin Schwidefsky

powerpc: Create disable_kernel_{fp,altivec,vsx,spe}()

2015-12-01T02:52:25Z

The enable_kernel_*() functions leave the relevant MSR bits enabled until we exit the kernel sometime later. Create disable versions that wrap the kernel use of FP, Altivec, VSX or SPE.

While we don't want to disable it normally for performance reasons (MSR writes are slow), it will be used for a debug boot option that does this and catches bad uses in other areas of the kernel.

Signed-off-by: Anton Blanchard
Signed-off-by: Michael Ellerman

md/raid6: delta syndrome for ARM NEON

2015-08-31T17:29:05Z

This implements XOR syndrome calculation using NEON intrinsics. As before, the module can be built for ARM and arm64 from the same source.

Relative performance on a Cortex-A57 based system:

  raid6: int64x1  gen()   905 MB/s
  raid6: int64x1  xor()   881 MB/s
  raid6: int64x2  gen()  1343 MB/s
  raid6: int64x2  xor()  1286 MB/s
  raid6: int64x4  gen()  1896 MB/s
  raid6: int64x4  xor()  1321 MB/s
  raid6: int64x8  gen()  1773 MB/s
  raid6: int64x8  xor()  1165 MB/s
  raid6: neonx1   gen()  1834 MB/s
  raid6: neonx1   xor()  1278 MB/s
  raid6: neonx2   gen()  2528 MB/s
  raid6: neonx2   xor()  1942 MB/s
  raid6: neonx4   gen()  2888 MB/s
  raid6: neonx4   xor()  2334 MB/s
  raid6: neonx8   gen()  2957 MB/s
  raid6: neonx8   xor()  2232 MB/s
  raid6: using algorithm neonx8 gen() 2957 MB/s
  raid6: .... xor() 2232 MB/s, rmw enabled

Cc: Markus Stockhausen
Cc: Neil Brown
Signed-off-by: Ard Biesheuvel
Signed-off-by: NeilBrown