<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/lib/raid6/Makefile, branch v4.12.6</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12.6</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12.6'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-10-07T16:45:43Z</updated>
<entry>
<title>Merge tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md</title>
<updated>2016-10-07T16:45:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-07T16:45:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c23112e0395a89c8a52cd955442240de7fba46aa'/>
<id>urn:sha1:c23112e0395a89c8a52cd955442240de7fba46aa</id>
<content type='text'>
Pull MD updates from Shaohua Li:
 "This update includes:

   - new AVX512 instruction based raid6 gen/recovery algorithm

   - a couple of md-cluster related bug fixes

   - fix a potential deadlock

   - set nonrotational bit for raid array with SSD

   - set correct max_hw_sectors for raid5/6, which hopefully can improve
     performance a little bit

   - other minor fixes"

* tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md: set rotational bit
  raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
  raid5: handle register_shrinker failure
  raid5: fix to detect failure of register_shrinker
  md: fix a potential deadlock
  md/bitmap: fix wrong cleanup
  raid5: allow arbitrary max_hw_sectors
  lib/raid6: Add AVX512 optimized xor_syndrome functions
  lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions
  lib/raid6: Add AVX512 optimized recovery functions
  lib/raid6: Add AVX512 optimized gen_syndrome functions
  md-cluster: make resync lock also could be interruptted
  md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang
  md-cluster: convert the completion to wait queue
  md-cluster: protect md_find_rdev_nr_rcu with rcu lock
  md-cluster: clean related infos of cluster
  md: changes for MD_STILL_CLOSED flag
  md-cluster: remove some unnecessary dlm_unlock_sync
  md-cluster: use FORCEUNLOCK in lockres_free
  md-cluster: call md_kick_rdev_from_array once ack failed
</content>
</entry>
<entry>
<title>lib/raid6: Add AVX512 optimized recovery functions</title>
<updated>2016-09-21T16:09:44Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-08-13T01:03:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13c520b2993c9faae6770264d33ff1e1ea4c2ceb'/>
<id>urn:sha1:13c520b2993c9faae6770264d33ff1e1ea4c2ceb</id>
<content type='text'>
Optimize RAID6 recovery functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.

The AVX512 optimized recovery functions are based on
recov_avx2.c, written by Jim Kukunas.

This patch was tested and benchmarked before submission on
hardware that has the AVX512 flags to support such instructions.

Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Megha Dey &lt;megha.dey@linux.intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Add AVX512 optimized gen_syndrome functions</title>
<updated>2016-09-21T16:09:44Z</updated>
<author>
<name>Gayatri Kammela</name>
<email>gayatri.kammela@intel.com</email>
</author>
<published>2016-08-13T01:03:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0a491c1296874a1aca51cc68452f12a4d950029'/>
<id>urn:sha1:e0a491c1296874a1aca51cc68452f12a4d950029</id>
<content type='text'>
Optimize RAID6 gen_syndrome functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.

The AVX512 optimized gen_syndrome functions are based on
avx2.c, written by Yuanhan Liu, and sse2.c, written by hpa.

The patch was tested and benchmarked before submission on
hardware that has the AVX512 flags to support such instructions.

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Megha Dey &lt;megha.dey@linux.intel.com&gt;
Signed-off-by: Gayatri Kammela &lt;gayatri.kammela@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>RAID/s390: provide raid6 recovery optimization</title>
<updated>2016-09-01T14:13:25Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2016-08-31T07:27:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f5b55fa1f81d518925d68b50d2316850c525d1ad'/>
<id>urn:sha1:f5b55fa1f81d518925d68b50d2316850c525d1ad</id>
<content type='text'>
The XC instruction can be used to improve the speed of the raid6
recovery. The loops now operate on blocks of 256 bytes.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>RAID/s390: add SIMD implementation for raid6 gen/xor</title>
<updated>2016-08-29T09:05:04Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2016-08-23T11:30:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=474fd6e80fe529e9adeeb7ea9d4e5d6c4da0b7fe'/>
<id>urn:sha1:474fd6e80fe529e9adeeb7ea9d4e5d6c4da0b7fe</id>
<content type='text'>
Using vector registers is slightly faster:

raid6: vx128x8  gen() 19705 MB/s
raid6: vx128x8  xor() 11886 MB/s
raid6: using algorithm vx128x8 gen() 19705 MB/s
raid6: .... xor() 11886 MB/s, rmw enabled

vs the software algorithms:

raid6: int64x1  gen()  3018 MB/s
raid6: int64x1  xor()  1429 MB/s
raid6: int64x2  gen()  4661 MB/s
raid6: int64x2  xor()  3143 MB/s
raid6: int64x4  gen()  5392 MB/s
raid6: int64x4  xor()  3509 MB/s
raid6: int64x8  gen()  4441 MB/s
raid6: int64x8  xor()  3207 MB/s
raid6: using algorithm int64x4 gen() 5392 MB/s
raid6: .... xor() 3509 MB/s, rmw enabled

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>powerpc: Only use -mabi=altivec if toolchain supports it</title>
<updated>2015-06-11T07:33:05Z</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2015-05-25T22:53:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1fb3f5a7ca599f322e6bf21272ad215301159aa0'/>
<id>urn:sha1:1fb3f5a7ca599f322e6bf21272ad215301159aa0</id>
<content type='text'>
The -mabi=altivec option is not recognised on LLVM, so use call cc-option
to check for support.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>Merge tag 'md/3.12' of git://neil.brown.name/md</title>
<updated>2013-09-10T20:03:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-10T20:03:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d7696f1b05f4aeb586c74868fe3da2731daca4b'/>
<id>urn:sha1:4d7696f1b05f4aeb586c74868fe3da2731daca4b</id>
<content type='text'>
Pull md update from Neil Brown:
 "Headline item is multithreading for RAID5 so that more IO/sec can be
  supported on fast (SSD) devices.  Also TILE-Gx SIMD support for RAID6
  calculations and an assortment of bug fixes"

* tag 'md/3.12' of git://neil.brown.name/md:
  raid5: only wakeup necessary threads
  md/raid5: flush out all pending requests before proceeding with reshape.
  md/raid5: use seqcount to protect access to shape in make_request.
  raid5: sysfs entry to control worker thread number
  raid5: offload stripe handle to workqueue
  raid5: fix stripe release order
  raid5: make release_stripe lockless
  md: avoid deadlock when dirty buffers during md_stop.
  md: Don't test all of mddev-&gt;flags at once.
  md: Fix apparent cut-and-paste error in super_90_validate
  raid6/test: replace echo -e with printf
  RAID: add tilegx SIMD implementation of raid6
  md: fix safe_mode buglet.
  md: don't call md_allow_write in get_bitmap_file.
</content>
</entry>
<entry>
<title>RAID: add tilegx SIMD implementation of raid6</title>
<updated>2013-08-27T06:05:50Z</updated>
<author>
<name>Ken Steele</name>
<email>ken@tilera.com</email>
</author>
<published>2013-08-07T16:39:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ae77cbc1e7b90473a2b0963bce0e1eb163873214'/>
<id>urn:sha1:ae77cbc1e7b90473a2b0963bce0e1eb163873214</id>
<content type='text'>
This change adds TILE-Gx SIMD instructions to the software raid
(md), modeling the Altivec implementation. This is only for Syndrome
generation; there is more that could be done to improve recovery,
as in the recent Intel SSE3 recovery implementation.

The code unrolls 8 times; this turns out to be the best on tilegx
hardware among the set 1, 2, 4, 8 or 16.  The code reads one
cache-line of data from each disk, stores P and Q then goes to the
next cache-line.

The test code in sys/linux/lib/raid6/test reports 2008 MB/s data
read rate for syndrome generation using 18 disks (16 data and 2
parity). It was 1512 MB/s before this SIMD optimization. This is
running on 1 core with all the data in cache.

This is based on the paper The Mathematics of RAID-6.
(http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf).

Signed-off-by: Ken Steele &lt;ken@tilera.com&gt;
Signed-off-by: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>lib/raid6: add ARM-NEON accelerated syndrome calculation</title>
<updated>2013-07-08T21:09:18Z</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2013-05-16T15:20:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b'/>
<id>urn:sha1:7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b</id>
<content type='text'>
Rebased/reworked a patch contributed by Rob Herring that uses
NEON intrinsics to perform the RAID-6 syndrome calculations.
It uses the existing unroll.awk code to generate several
unrolled versions of which the best performing one is selected
at boot time.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Acked-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Cc: hpa@linux.intel.com
</content>
</entry>
<entry>
<title>lib/raid6: build proper files on corresponding arch</title>
<updated>2012-12-13T08:51:04Z</updated>
<author>
<name>Yuanhan Liu</name>
<email>yuanhan.liu@linux.intel.com</email>
</author>
<published>2012-11-30T21:10:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4f8c55c5ad491dbc7b52ce08bb702ca39ce944cf'/>
<id>urn:sha1:4f8c55c5ad491dbc7b52ce08bb702ca39ce944cf</id>
<content type='text'>
The sse and avx2 code only exists on the x86 arch, and we don't need to
build altivec on x86. Both can be handled in lib/raid6/Makefile.

Proposed-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Yuanhan Liu &lt;yuanhan.liu@linux.intel.com&gt;
Reviewed-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
</feed>
