<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/hugetlb.h, branch v4.4.2</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-11-06T03:34:48Z</updated>
<entry>
<title>mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status</title>
<updated>2015-11-06T03:34:48Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>n-horiguchi@ah.jp.nec.com</email>
</author>
<published>2015-11-06T02:47:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d317b2b6536592a9b51fe65faed43d65ca9158e'/>
<id>urn:sha1:5d317b2b6536592a9b51fe65faed43d65ca9158e</id>
<content type='text'>
Currently there's no easy way to get per-process usage of hugetlb pages,
which is inconvenient because userspace applications which use hugetlb
typically want to control their processes on the basis of how much memory
(including hugetlb) they use.  So this patch simply provides easy access
to the info via /proc/PID/status.

Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Joern Engel &lt;joern@logfs.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlbfs: add hugetlbfs_fallocate()</title>
<updated>2015-09-08T22:35:28Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-09-08T22:01:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=70c3547e36f5c9fbc4caecfeca98f0effa6932c5'/>
<id>urn:sha1:70c3547e36f5c9fbc4caecfeca98f0effa6932c5</id>
<content type='text'>
This is based on the shmem version, but it has diverged quite a bit.  We
have no swap to worry about, nor the new file sealing.  Add
synchronication via the fault mutex table to coordinate page faults,
fallocate allocation and fallocate hole punch.

What this allows us to do is move physical memory in and out of a
hugetlbfs file without having it mapped.  This also gives us the ability
to support MADV_REMOVE since it is currently implemented using
fallocate().  MADV_REMOVE lets madvise() remove pages from the middle of
a hugetlbfs file, which wasn't possible before.

hugetlbfs fallocate only operates on whole huge pages.

Based on code by Dave Hansen.

Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlbfs: New huge_add_to_page_cache helper routine</title>
<updated>2015-09-08T22:35:28Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-09-08T22:01:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ab76ad540a50191308e5bb6b5e2d9e26c78616d3'/>
<id>urn:sha1:ab76ad540a50191308e5bb6b5e2d9e26c78616d3</id>
<content type='text'>
Currently, there is only a single place where hugetlbfs pages are added
to the page cache.  The new fallocate code be adding a second one, so
break the functionality out into its own helper.

Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlbfs: truncate_hugepages() takes a range of pages</title>
<updated>2015-09-08T22:35:28Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-09-08T22:01:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b5cec28d36f5ee6b4e6f68a0a40aa1e4045d6d99'/>
<id>urn:sha1:b5cec28d36f5ee6b4e6f68a0a40aa1e4045d6d99</id>
<content type='text'>
Modify truncate_hugepages() to take a range of pages (start, end)
instead of simply start.  If an end value of LLONG_MAX is passed, the
current "truncate" functionality is maintained.  Existing callers are
modified to pass LLONG_MAX as end of range.  By keying off end ==
LLONG_MAX, the routine behaves differently for truncate and hole punch.
Page removal is now synchronized with page allocation via faults by
using the fault mutex table.  The hole punch case can experience the
rare region_del error and must handle accordingly.

Add the routine hugetlb_fix_reserve_counts to fix up reserve counts in
the case where region_del returns an error.

Since the routine handles more than just the truncate case, it is
renamed to remove_inode_hugepages().  To be consistent, the routine
truncate_huge_page() is renamed remove_huge_page().

Downstream of remove_inode_hugepages(), the routine
hugetlb_unreserve_pages() is also modified to take a range of pages.
hugetlb_unreserve_pages is modified to detect an error from region_del and
pass it back to the caller.

Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hugetlb: expose hugetlb fault mutex for use by fallocate</title>
<updated>2015-09-08T22:35:28Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-09-08T22:01:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c672c7f29f2fdb73e1f72911bf499675c81fcdbb'/>
<id>urn:sha1:c672c7f29f2fdb73e1f72911bf499675c81fcdbb</id>
<content type='text'>
hugetlb page faults are currently synchronized by the table of mutexes
(htlb_fault_mutex_table).  fallocate code will need to synchronize with
the page fault code when it allocates or deletes pages.  Expose
interfaces so that fallocate operations can be synchronized with page
faults.  Minor name changes to be more consistent with other global
hugetlb symbols.

Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hugetlb: add cache of descriptors to resv_map for region_add</title>
<updated>2015-09-08T22:35:28Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-09-08T22:01:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5e9113731a3ce616e8b5aa128ffc1aeaa4942571'/>
<id>urn:sha1:5e9113731a3ce616e8b5aa128ffc1aeaa4942571</id>
<content type='text'>
hugetlbfs is used today by applications that want a high degree of
control over huge page usage.  Often, large hugetlbfs files are used to
map a large number huge pages into the application processes.  The
applications know when page ranges within these large files will no
longer be used, and ideally would like to release them back to the
subpool or global pools for other uses.  The fallocate() system call
provides an interface for preallocation and hole punching within files.
This patch set adds fallocate functionality to hugetlbfs.

fallocate hole punch will want to remove a specific range of pages.
When pages are removed, their associated entries in the region/reserve
map will also be removed.  This will break an assumption in the
region_chg/region_add calling sequence.  If a new region descriptor must
be allocated, it is done as part of the region_chg processing.  In this
way, region_add can not fail because it does not need to attempt an
allocation.

To prepare for fallocate hole punch, create a "cache" of descriptors
that can be used by region_add if necessary.  region_chg will ensure
there are sufficient entries in the cache.  It will be necessary to
track the number of in progress add operations to know a sufficient
number of descriptors reside in the cache.  A new routine region_abort
is added to adjust this in progress count when add operations are
aborted.  vma_abort_reservation is also added for callers creating
reservations with vma_needs_reservation/vma_commit_reservation.

[akpm@linux-foundation.org: fix typo in comment, use more cols]
Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: hugetlb: allow hugepages_supported to be architecture specific</title>
<updated>2015-07-17T23:39:52Z</updated>
<author>
<name>Dominik Dingel</name>
<email>dingel@linux.vnet.ibm.com</email>
</author>
<published>2015-07-17T23:23:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2531c8cf56a640cd7d17057df8484e570716a450'/>
<id>urn:sha1:2531c8cf56a640cd7d17057df8484e570716a450</id>
<content type='text'>
s390 has a constant hugepage size, by setting HPAGE_SHIFT we also change
e.g. the pageblock_order, which should be independent in respect to
hugepage support.

With this patch every architecture is free to define how to check
for hugepage support.

Signed-off-by: Dominik Dingel &lt;dingel@linux.vnet.ibm.com&gt;
Acked-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Michael Holzheu &lt;holzheu@linux.vnet.ibm.com&gt;
Cc: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: hugetlb: cleanup using paeg_huge_active()</title>
<updated>2015-04-15T23:35:19Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>n-horiguchi@ah.jp.nec.com</email>
</author>
<published>2015-04-15T23:14:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7e1f049efb86bd86c06b80eeac0ef80cdeb8c0e7'/>
<id>urn:sha1:7e1f049efb86bd86c06b80eeac0ef80cdeb8c0e7</id>
<content type='text'>
Now we have an easy access to hugepages' activeness, so existing helpers to
get the information can be cleaned up.

[akpm@linux-foundation.org: s/PageHugeActive/page_huge_active/]
Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlbfs: accept subpool min_size mount option and setup accordingly</title>
<updated>2015-04-15T23:35:18Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-04-15T23:13:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7ca02d0ae586fe7df59632966a64f3f1a756ef05'/>
<id>urn:sha1:7ca02d0ae586fe7df59632966a64f3f1a756ef05</id>
<content type='text'>
Make 'min_size=&lt;value&gt;' be an option when mounting a hugetlbfs.  This
option takes the same value as the 'size' option.  min_size can be
specified without specifying size.  If both are specified, min_size must
be less that or equal to size else the mount will fail.  If min_size is
specified, then at mount time an attempt is made to reserve min_size
pages.  If the reservation fails, the mount fails.  At umount time, the
reserved pages are released.

Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlbfs: add minimum size tracking fields to subpool structure</title>
<updated>2015-04-15T23:35:17Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2015-04-15T23:13:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c6a918200c4f4ebf74b7e1ae4fea9115c7b297f8'/>
<id>urn:sha1:c6a918200c4f4ebf74b7e1ae4fea9115c7b297f8</id>
<content type='text'>
hugetlbfs allocates huge pages from the global pool as needed.  Even if
the global pool contains a sufficient number pages for the filesystem size
at mount time, those global pages could be grabbed for some other use.  As
a result, filesystem huge page allocations may fail due to lack of pages.

Applications such as a database want to use huge pages for performance
reasons.  hugetlbfs filesystem semantics with ownership and modes work
well to manage access to a pool of huge pages.  However, the application
would like some reasonable assurance that allocations will not fail due to
a lack of huge pages.  At application startup time, the application would
like to configure itself to use a specific number of huge pages.  Before
starting, the application can check to make sure that enough huge pages
exist in the system global pools.  However, there are no guarantees that
those pages will be available when needed by the application.  What the
application wants is exclusive use of a subset of huge pages.

Add a new hugetlbfs mount option 'min_size=&lt;value&gt;' to indicate that the
specified number of pages will be available for use by the filesystem.  At
mount time, this number of huge pages will be reserved for exclusive use
of the filesystem.  If there is not a sufficient number of free pages, the
mount will fail.  As pages are allocated to and freeed from the
filesystem, the number of reserved pages is adjusted so that the specified
minimum is maintained.

This patch (of 4):

Add a field to the subpool structure to indicate the minimimum number of
huge pages to always be used by this subpool.  This minimum count includes
allocated pages as well as reserved pages.  If the minimum number of pages
for the subpool have not been allocated, pages are reserved up to this
minimum.  An additional field (rsv_hpages) is used to track the number of
pages reserved to meet this minimum size.  The hstate pointer in the
subpool is convenient to have when reserving and unreserving the pages.

Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
