<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/dax, branch v4.9.110</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.110</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.110'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-02-28T09:18:33Z</updated>
<entry>
<title>device-dax: implement -&gt;split() to catch invalid munmap attempts</title>
<updated>2018-02-28T09:18:33Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2018-02-23T22:05:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=be38759eb2d697f9e907eddfbb8d33aa26d37acc'/>
<id>urn:sha1:be38759eb2d697f9e907eddfbb8d33aa26d37acc</id>
<content type='text'>
commit 9702cffdbf2129516db679e4467db81e1cd287da upstream.

Similar to how device-dax enforces that the 'address', 'offset', and
'len' parameters to mmap() be aligned to the device's fundamental
alignment, the same constraints apply to munmap().  Implement -&gt;split()
to fail munmap calls that violate the alignment constraint.

Otherwise, we later fail VM_BUG_ON checks in the unmap_page_range() path
with crash signatures of the form:

    vma ffff8800b60c8a88 start 00007f88c0000000 end 00007f88c0e00000
    next           (null) prev           (null) mm ffff8800b61150c0
    prot 8000000000000027 anon_vma           (null) vm_ops ffffffffa0091240
    pgoff 0 file ffff8800b638ef80 private_data           (null)
    flags: 0x380000fb(read|write|shared|mayread|maywrite|mayexec|mayshare|softdirty|mixedmap|hugepage)
    ------------[ cut here ]------------
    kernel BUG at mm/huge_memory.c:2014!
    [..]
    RIP: 0010:__split_huge_pud+0x12a/0x180
    [..]
    Call Trace:
     unmap_page_range+0x245/0xa40
     ? __vma_adjust+0x301/0x990
     unmap_vmas+0x4c/0xa0
     unmap_region+0xae/0x120
     ? __vma_rb_erase+0x11a/0x230
     do_munmap+0x276/0x410
     vm_munmap+0x6a/0xa0
     SyS_munmap+0x1d/0x30

Link: http://lkml.kernel.org/r/151130418681.4029.7118245855057952010.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reported-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>device-dax: fix sysfs duplicate warnings</title>
<updated>2017-08-07T01:59:43Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2017-07-19T00:49:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a3ff46097a1d05fc95e34f9c4ca493488d8fe766'/>
<id>urn:sha1:a3ff46097a1d05fc95e34f9c4ca493488d8fe766</id>
<content type='text'>
commit bbb3be170ac2891526ad07b18af7db226879a8e7 upstream.

Fix warnings of the form...

     WARNING: CPU: 10 PID: 4983 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
     sysfs: cannot create duplicate filename '/class/dax/dax12.0'
     Call Trace:
      dump_stack+0x63/0x86
      __warn+0xcb/0xf0
      warn_slowpath_fmt+0x5a/0x80
      ? kernfs_path_from_node+0x4f/0x60
      sysfs_warn_dup+0x62/0x80
      sysfs_do_create_link_sd.isra.2+0x97/0xb0
      sysfs_create_link+0x25/0x40
      device_add+0x266/0x630
      devm_create_dax_dev+0x2cf/0x340 [dax]
      dax_pmem_probe+0x1f5/0x26e [dax_pmem]
      nvdimm_bus_probe+0x71/0x120

...by reusing the namespace id for the device-dax instance name.

Now that we have decided that there will never by more than one
device-dax instance per libnvdimm-namespace parent device [1], we can
directly reuse the namepace ids. There are some possible follow-on
cleanups, but those are saved for a later patch to simplify the -stable
backport.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-December/008266.html

Fixes: 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem...")
Cc: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Reported-by: Dariusz Dokupil &lt;dariusz.dokupil@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>device-dax: fix cdev leak</title>
<updated>2017-05-20T12:28:41Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2017-03-17T18:48:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8dd114ef78c855814558472ed13e99357e098e33'/>
<id>urn:sha1:8dd114ef78c855814558472ed13e99357e098e33</id>
<content type='text'>
commit ed01e50acdd3e4a640cf9ebd28a7e810c3ceca97 upstream.

If device_add() fails, cleanup the cdev. Otherwise, we leak a kobj_map()
with a stale device number.

As Jason points out, there is a small possibility that userspace has
opened and mapped the device in the time between cdev_add() and the
device_add() failure. We need a new kill_dax_dev() helper to invalidate
any established mappings.

Fixes: ba09c01d2fa8 ("dax: convert to the cdev api")
Reported-by: Jason Gunthorpe &lt;jgunthorpe@obsidianresearch.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation</title>
<updated>2017-04-27T07:10:39Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2017-04-07T23:42:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c36eaa6ca346dc8d45e9125af62db75625b42a6f'/>
<id>urn:sha1:c36eaa6ca346dc8d45e9125af62db75625b42a6f</id>
<content type='text'>
commit 956a4cd2c957acf638ff29951aabaa9d8e92bbc2 upstream.

The following warning triggers with a new unit test that stresses the
device-dax interface.

 ===============================
 [ ERR: suspicious RCU usage.  ]
 4.11.0-rc4+ #1049 Tainted: G           O
 -------------------------------
 ./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 0
 2 locks held by fio/9070:
  #0:  (&amp;mm-&gt;mmap_sem){++++++}, at: [&lt;ffffffff8d0739d7&gt;] __do_page_fault+0x167/0x4f0
  #1:  (rcu_read_lock){......}, at: [&lt;ffffffffc03fbd02&gt;] dax_dev_huge_fault+0x32/0x620 [dax]

 Call Trace:
  dump_stack+0x86/0xc3
  lockdep_rcu_suspicious+0xd7/0x110
  ___might_sleep+0xac/0x250
  __might_sleep+0x4a/0x80
  __alloc_pages_nodemask+0x23a/0x360
  alloc_pages_current+0xa1/0x1f0
  pte_alloc_one+0x17/0x80
  __pte_alloc+0x1e/0x120
  __get_locked_pte+0x1bf/0x1d0
  insert_pfn.isra.70+0x3a/0x100
  ? lookup_memtype+0xa6/0xd0
  vm_insert_mixed+0x64/0x90
  dax_dev_huge_fault+0x520/0x620 [dax]
  ? dax_dev_huge_fault+0x32/0x620 [dax]
  dax_dev_fault+0x10/0x20 [dax]
  __do_fault+0x1e/0x140
  __handle_mm_fault+0x9af/0x10d0
  handle_mm_fault+0x16d/0x370
  ? handle_mm_fault+0x47/0x370
  __do_page_fault+0x28c/0x4f0
  trace_do_page_fault+0x58/0x2a0
  do_async_page_fault+0x1a/0xa0
  async_page_fault+0x28/0x30

Inserting a page table entry may trigger an allocation while we are
holding a read lock to keep the device instance alive for the duration
of the fault. Use srcu for this keep-alive protection.

Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>device-dax: fix pmd/pte fault fallback handling</title>
<updated>2017-03-30T07:41:27Z</updated>
<author>
<name>Dave Jiang</name>
<email>dave.jiang@intel.com</email>
</author>
<published>2017-03-10T20:24:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=eae72468c45d827484a2c0980a519f5220b1985c'/>
<id>urn:sha1:eae72468c45d827484a2c0980a519f5220b1985c</id>
<content type='text'>
commit 0134ed4fb9e78672ee9f7b18007114404c81e63f upstream.

Jeff Moyer reports:

    With a device dax alignment of 4KB or 2MB, I get sigbus when running
    the attached fio job file for the current kernel (4.11.0-rc1+).  If
    I specify an alignment of 1GB, it works.

    I turned on debug output, and saw that it was failing in the huge
    fault code.

     dax dax1.0: dax_open
     dax dax1.0: dax_mmap
     dax dax1.0: dax_dev_huge_fault: fio: write (0x7f08f0a00000 -
     dax dax1.0: __dax_dev_pud_fault: phys_to_pgoff(0xffffffffcf60
     dax dax1.0: dax_release

    fio config for reproduce:
    [global]
    ioengine=dev-dax
    direct=0
    filename=/dev/dax0.0
    bs=2m

    [write]
    rw=write

    [read]
    stonewall
    rw=read

The driver fails to fallback when taking a fault that is larger than
the device alignment, or handling a larger fault when a smaller
mapping is already established. While we could support larger
mappings for a device with a smaller alignment, that change is
too large for the immediate fix. The simplest change is to force
fallback until the fault size matches the alignment.

Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Cc: &lt;stable@vger.kernel.org&gt;
Reported-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>device-dax: fix private mapping restriction, permit read-only</title>
<updated>2016-12-07T01:42:37Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2016-12-07T01:03:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=325896ffdf90f7cbd59fb873b7ba20d60d1ddf3c'/>
<id>urn:sha1:325896ffdf90f7cbd59fb873b7ba20d60d1ddf3c</id>
<content type='text'>
Hugh notes in response to commit 4cb19355ea19 "device-dax: fail all
private mapping attempts":

  "I think that is more restrictive than you intended: haven't tried, but I
  believe it rejects a PROT_READ, MAP_SHARED, O_RDONLY fd mmap, leaving no
  way to mmap /dev/dax without write permission to it."

Indeed it does restrict read-only mappings, switch to checking
VM_MAYSHARE, not VM_SHARED.

Cc: &lt;stable@vger.kernel.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Pawel Lebioda &lt;pawel.lebioda@intel.com&gt;
Fixes: 4cb19355ea19 ("device-dax: fail all private mapping attempts")
Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>device-dax: fail all private mapping attempts</title>
<updated>2016-11-16T17:00:38Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2016-11-16T17:00:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4cb19355ea19995941ccaad115dbfac6b75215ca'/>
<id>urn:sha1:4cb19355ea19995941ccaad115dbfac6b75215ca</id>
<content type='text'>
The device-dax implementation originally tried to be tricky and allow
private read-only mappings, but in the process allowed writable
MAP_PRIVATE + MAP_NORESERVE mappings.  For simplicity and predictability
just fail all private mapping attempts since device-dax memory is
statically allocated and will never support overcommit.

Cc: &lt;stable@vger.kernel.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Reported-by: Pawel Lebioda &lt;pawel.lebioda@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>device-dax: check devm_nsio_enable() return value</title>
<updated>2016-10-28T21:35:25Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2016-10-28T21:34:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6a84fb4b4e439a8ef0ce19ec7e7661ad76f655c9'/>
<id>urn:sha1:6a84fb4b4e439a8ef0ce19ec7e7661ad76f655c9</id>
<content type='text'>
If the dax_pmem driver is passed a resource that is already busy the
driver probe attempt should fail with a message like the following:

  dax_pmem dax0.1: could not reserve region [mem 0x100000000-0x11fffffff]

However, if we do not catch the error we crash for the obvious reason of
accessing memory that is not mapped.

 BUG: unable to handle kernel paging request at ffffc90020001000
 IP: [&lt;ffffffff81496712&gt;] __memcpy+0x12/0x20
 [..]
 Call Trace:
  [&lt;ffffffff815c4960&gt;] ? nsio_rw_bytes+0x60/0x180
  [&lt;ffffffff815c6045&gt;] nd_pfn_validate+0x75/0x320
  [&lt;ffffffff815c63a9&gt;] nvdimm_setup_pfn+0xb9/0x5d0
  [&lt;ffffffff815c48ef&gt;] ? devm_nsio_enable+0xff/0x110
  [&lt;ffffffff815cb699&gt;] dax_pmem_probe+0x59/0x260

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory")
Reported-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>device-dax: fix percpu_ref_exit ordering</title>
<updated>2016-10-28T00:04:05Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2016-10-28T00:04:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=52e73eb2872c9af6f382b2b22954ca8214397a4e'/>
<id>urn:sha1:52e73eb2872c9af6f382b2b22954ca8214397a4e</id>
<content type='text'>
We need to wait until the percpu_ref is released before exit. Otherwise,
we sometimes lose the race and trigger this new warning that was added
in v4.9 (commit a67823c1ed10 "percpu-refcount: init -&gt;confirm_switch
member properly"):

 WARNING: CPU: 0 PID: 3629 at lib/percpu-refcount.c:107 percpu_ref_exit+0x51/0x60
 [..]
 Call Trace:
  [&lt;ffffffff814bf093&gt;] dump_stack+0x85/0xc2
  [&lt;ffffffff810b15db&gt;] __warn+0xcb/0xf0
  [&lt;ffffffff810b170d&gt;] warn_slowpath_null+0x1d/0x20
  [&lt;ffffffff814d70c1&gt;] percpu_ref_exit+0x51/0x60
  [&lt;ffffffffa005706a&gt;] dax_pmem_percpu_exit+0x1a/0x50 [dax_pmem]
  [&lt;ffffffff81615f1f&gt;] devm_action_release+0xf/0x20

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory")
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>nvdimm: make CONFIG_NVDIMM_DAX 'bool'</title>
<updated>2016-10-27T23:16:21Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2016-10-25T15:52:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=867dfe342118b1ea0256a85f7c0d9ceb0ead032a'/>
<id>urn:sha1:867dfe342118b1ea0256a85f7c0d9ceb0ead032a</id>
<content type='text'>
A bugfix just tried to address a randconfig build problem and introduced
a variant of the same problem: with CONFIG_LIBNVDIMM=y and
CONFIG_NVDIMM_DAX=m, the nvdimm module now fails to link:

drivers/nvdimm/built-in.o: In function `to_nd_device_type':
bus.c:(.text+0x1b5d): undefined reference to `is_nd_dax'
drivers/nvdimm/built-in.o: In function `nd_region_notify_driver_action.constprop.2':
region_devs.c:(.text+0x6b6c): undefined reference to `is_nd_dax'
region_devs.c:(.text+0x6b8c): undefined reference to `to_nd_dax'
drivers/nvdimm/built-in.o: In function `nd_region_probe':
region.c:(.text+0x70f3): undefined reference to `nd_dax_create'
drivers/nvdimm/built-in.o: In function `mode_show':
namespace_devs.c:(.text+0xa196): undefined reference to `is_nd_dax'
drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe':
(.text+0xa55f): undefined reference to `is_nd_dax'
drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe':
(.text+0xa56e): undefined reference to `to_nd_dax'

This reverts the earlier fix, making NVDIMM_DAX a 'bool' option again
as it should be (it gets linked into the libnvdimm module). To fix
the original problem, I'm adding a dependency on LIBNVDIMM to
DEV_DAX_PMEM, which ensures we can't have that one built-in if the
rest is a module.

Fixes: 4e65e9381c7a ("/dev/dax: fix Kconfig dependency build breakage")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
</feed>
