<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/lib/raid6, branch v4.9.53</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.53</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.53'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-10-07T16:45:43Z</updated>
<entry>
<title>Merge tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md</title>
<updated>2016-10-07T16:45:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-07T16:45:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c23112e0395a89c8a52cd955442240de7fba46aa'/>
<id>urn:sha1:c23112e0395a89c8a52cd955442240de7fba46aa</id>
<content type='text'>
Pull MD updates from Shaohua Li:
 "This update includes:

   - new AVX512 instruction based raid6 gen/recovery algorithm

   - a couple of md-cluster related bug fixes

   - fix a potential deadlock

   - set nonrotational bit for raid array with SSD

   - set correct max_hw_sectors for raid5/6, which hopefuly can improve
     performance a little bit

   - other minor fixes"

* tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md: set rotational bit
  raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
  raid5: handle register_shrinker failure
  raid5: fix to detect failure of register_shrinker
  md: fix a potential deadlock
  md/bitmap: fix wrong cleanup
  raid5: allow arbitrary max_hw_sectors
  lib/raid6: Add AVX512 optimized xor_syndrome functions
  lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions
  lib/raid6: Add AVX512 optimized recovery functions
  lib/raid6: Add AVX512 optimized gen_syndrome functions
  md-cluster: make resync lock also could be interruptted
  md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang
  md-cluster: convert the completion to wait queue
  md-cluster: protect md_find_rdev_nr_rcu with rcu lock
  md-cluster: clean related infos of cluster
  md: changes for MD_STILL_CLOSED flag
  md-cluster: remove some unnecessary dlm_unlock_sync
  md-cluster: use FORCEUNLOCK in lockres_free
  md-cluster: call md_kick_rdev_from_array once ack failed
</content>
</entry>
<entry>
<title>raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays</title>
<updated>2016-09-26T23:18:21Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-09-26T21:37:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=099b548c429217a8306adbd1552d326615c9b903'/>
<id>urn:sha1:099b548c429217a8306adbd1552d326615c9b903</id>
<content type='text'>
Specifying the aligned attributes to the char data[NDISKS][PAGE_SIZE],
char recovi[PAGE_SIZE] and char recovi[PAGE_SIZE] arrays, so that all
malloc memory is page boundary aligned.

Without these alignment attributes, the test causes a segfault in
userspace when the NDISKS are changed to 4 from 16.

The RAID stripes will be page aligned anyway, so we want to test what
the kernel actually will execute.

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Yu-cheng Yu &lt;yu-cheng.yu@intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Add AVX512 optimized xor_syndrome functions</title>
<updated>2016-09-21T16:09:44Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-08-13T01:03:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=694dda62d28674c21c8186d43b8ca0139bf01585'/>
<id>urn:sha1:694dda62d28674c21c8186d43b8ca0139bf01585</id>
<content type='text'>
Optimize RAID6 xor_syndrome functions to take advantage of the 512-bit
ZMM integer instructions introduced in AVX512.

AVX512 optimized xor_syndrome functions, which is simply based on sse2.c
written by hpa.

The patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Megha Dey &lt;megha.dey@linux.intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions</title>
<updated>2016-09-21T16:09:44Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-08-13T01:03:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=161db5d165123a72792e2687ecfd8de146dbae1a'/>
<id>urn:sha1:161db5d165123a72792e2687ecfd8de146dbae1a</id>
<content type='text'>
Adding avx512 gen_syndrome and recovery functions so as to allow code to
be compiled and tested successfully in userspace.

This patch is tested in userspace and improvement in performace is
observed.

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Megha Dey &lt;megha.dey@linux.intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Add AVX512 optimized recovery functions</title>
<updated>2016-09-21T16:09:44Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-08-13T01:03:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13c520b2993c9faae6770264d33ff1e1ea4c2ceb'/>
<id>urn:sha1:13c520b2993c9faae6770264d33ff1e1ea4c2ceb</id>
<content type='text'>
Optimize RAID6 recovery functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.

AVX512 optimized recovery functions, which is simply based
on recov_avx2.c written by Jim Kukunas

This patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions

Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Megha Dey &lt;megha.dey@linux.intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Add AVX512 optimized gen_syndrome functions</title>
<updated>2016-09-21T16:09:44Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-08-13T01:03:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0a491c1296874a1aca51cc68452f12a4d950029'/>
<id>urn:sha1:e0a491c1296874a1aca51cc68452f12a4d950029</id>
<content type='text'>
Optimize RAID6 gen_syndrom functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.

AVX512 optimized gen_syndrom functions, which is simply based
on avx2.c written by Yuanhan Liu and sse2.c written by hpa.

The patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Megha Dey &lt;megha.dey@linux.intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>RAID/s390: provide raid6 recovery optimization</title>
<updated>2016-09-01T14:13:25Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2016-08-31T07:27:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f5b55fa1f81d518925d68b50d2316850c525d1ad'/>
<id>urn:sha1:f5b55fa1f81d518925d68b50d2316850c525d1ad</id>
<content type='text'>
The XC instruction can be used to improve the speed of the raid6
recovery. The loops now operate on blocks of 256 bytes.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>RAID/s390: add SIMD implementation for raid6 gen/xor</title>
<updated>2016-08-29T09:05:04Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2016-08-23T11:30:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=474fd6e80fe529e9adeeb7ea9d4e5d6c4da0b7fe'/>
<id>urn:sha1:474fd6e80fe529e9adeeb7ea9d4e5d6c4da0b7fe</id>
<content type='text'>
Using vector registers is slightly faster:

raid6: vx128x8  gen() 19705 MB/s
raid6: vx128x8  xor() 11886 MB/s
raid6: using algorithm vx128x8 gen() 19705 MB/s
raid6: .... xor() 11886 MB/s, rmw enabled

vs the software algorithms:

raid6: int64x1  gen()  3018 MB/s
raid6: int64x1  xor()  1429 MB/s
raid6: int64x2  gen()  4661 MB/s
raid6: int64x2  xor()  3143 MB/s
raid6: int64x4  gen()  5392 MB/s
raid6: int64x4  xor()  3509 MB/s
raid6: int64x8  gen()  4441 MB/s
raid6: int64x8  xor()  3207 MB/s
raid6: using algorithm int64x4 gen() 5392 MB/s
raid6: .... xor() 3509 MB/s, rmw enabled

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>powerpc: Create disable_kernel_{fp,altivec,vsx,spe}()</title>
<updated>2015-12-01T02:52:25Z</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2015-10-29T00:44:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dc4fbba11e4661a6a77a1f89ba32f9082e6395ff'/>
<id>urn:sha1:dc4fbba11e4661a6a77a1f89ba32f9082e6395ff</id>
<content type='text'>
The enable_kernel_*() functions leave the relevant MSR bits enabled
until we exit the kernel sometime later. Create disable versions
that wrap the kernel use of FP, Altivec VSX or SPE.

While we don't want to disable it normally for performance reasons
(MSR writes are slow), it will be used for a debug boot option that
does this and catches bad uses in other areas of the kernel.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>md/raid6: delta syndrome for ARM NEON</title>
<updated>2015-08-31T17:29:05Z</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2015-07-01T02:19:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0e833e697bcf4c2f3f7fb9fce39d08cd4439e5d7'/>
<id>urn:sha1:0e833e697bcf4c2f3f7fb9fce39d08cd4439e5d7</id>
<content type='text'>
This implements XOR syndrome calculation using NEON intrinsics.
As before, the module can be built for ARM and arm64 from the
same source.

Relative performance on a Cortex-A57 based system:

  raid6: int64x1  gen()   905 MB/s
  raid6: int64x1  xor()   881 MB/s
  raid6: int64x2  gen()  1343 MB/s
  raid6: int64x2  xor()  1286 MB/s
  raid6: int64x4  gen()  1896 MB/s
  raid6: int64x4  xor()  1321 MB/s
  raid6: int64x8  gen()  1773 MB/s
  raid6: int64x8  xor()  1165 MB/s
  raid6: neonx1   gen()  1834 MB/s
  raid6: neonx1   xor()  1278 MB/s
  raid6: neonx2   gen()  2528 MB/s
  raid6: neonx2   xor()  1942 MB/s
  raid6: neonx4   gen()  2888 MB/s
  raid6: neonx4   xor()  2334 MB/s
  raid6: neonx8   gen()  2957 MB/s
  raid6: neonx8   xor()  2232 MB/s
  raid6: using algorithm neonx8 gen() 2957 MB/s
  raid6: .... xor() 2232 MB/s, rmw enabled

Cc: Markus Stockhausen &lt;stockhausen@collogia.de&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
</entry>
</feed>
