<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/cpufreq, branch v3.18.51</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.18.51</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.18.51'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-06-20T03:47:45Z</updated>
<entry>
<title>cpufreq: intel_pstate: Fix -&gt;set_policy() interface for no_turbo</title>
<updated>2016-06-20T03:47:45Z</updated>
<author>
<name>Srinivas Pandruvada</name>
<email>srinivas.pandruvada@linux.intel.com</email>
</author>
<published>2016-06-08T00:38:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d05438b3ba85517631b5ea2e1c88172f84dffe8b'/>
<id>urn:sha1:d05438b3ba85517631b5ea2e1c88172f84dffe8b</id>
<content type='text'>
[ Upstream commit 983e600e88835f0321d1a0ea06f52d48b7b5a544 ]

When turbo is disabled, the -&gt;set_policy() interface is broken.

For example, when turbo is disabled and cpuinfo.max = 2900000 (full
max turbo frequency), setting the limits results in frequency less
than the requested one:
Set 1000000 KHz results in 0700000 KHz
Set 1500000 KHz results in 1100000 KHz
Set 2000000 KHz results in  1500000 KHz

This is because the limits-&gt;max_perf fraction is calculated using
the max turbo frequency as the reference, but when the max P-State is
capped in intel_pstate_get_min_max(), the reference is not the max
turbo P-State. This results in reducing max P-State.

One option is to always use max turbo as reference for calculating
limits. But this will not be correct. By definition the intel_pstate
sysfs limits, shows percentage of available performance. So when
BIOS has disabled turbo, the available performance is max non turbo.
So the max_perf_pct should still show 100%.

Signed-off-by: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
[ rjw : Subject &amp; changelog, rewrite in fewer lines of code ]
Cc: All applicable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>intel_pstate: Fix overflow in busy_scaled due to long delay</title>
<updated>2015-10-28T02:14:30Z</updated>
<author>
<name>Prarit Bhargava</name>
<email>prarit@redhat.com</email>
</author>
<published>2015-06-15T17:43:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e4f238fc4f13f1107f342a0bf12415b42de0a165'/>
<id>urn:sha1:e4f238fc4f13f1107f342a0bf12415b42de0a165</id>
<content type='text'>
[ Upstream commit 7180dddf7c32c49975c7e7babf2b60ed450cb760 ]

The kernel may delay interrupts for a long time which can result in timers
being delayed. If this occurs the intel_pstate driver will crash with a
divide by zero error:

divide error: 0000 [#1] SMP
Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc arc4 md4 nls_utf8 cifs dns_resolver tcp_lp bnep bluetooth rfkill fuse dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables intel_powerclamp coretemp vfat fat kvm_intel iTCO_wdt iTCO_vendor_support ipmi_devintf sr_mod kvm crct10dif_pclmul
 crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel cdc_ether lrw usbnet cdrom mii gf128mul glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr sb_edac edac_core ipmi_si ipmi_msghandler ioatdma wmi shpchp acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput dm_multipath sunrpc xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif_common ixgbe mgag200 syscopyarea sysfillrect sysimgblt mdio drm_kms_helper ttm igb drm ptp pps_core dca i2c_algo_bit megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod
CPU: 113 PID: 0 Comm: swapper/113 Tainted: G        W   --------------   3.10.0-229.1.2.el7.x86_64 #1
Hardware name: IBM x3950 X6 -[3837AC2]-/00FN827, BIOS -[A8E112BUS-1.00]- 08/27/2014
task: ffff880fe8abe660 ti: ffff880fe8ae4000 task.ti: ffff880fe8ae4000
RIP: 0010:[&lt;ffffffff814a9279&gt;]  [&lt;ffffffff814a9279&gt;] intel_pstate_timer_func+0x179/0x3d0
RSP: 0018:ffff883fff4e3db8  EFLAGS: 00010206
RAX: 0000000027100000 RBX: ffff883fe6965100 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000010 RDI: 000000002e53632d
RBP: ffff883fff4e3e20 R08: 000e6f69a5a125c0 R09: ffff883fe84ec001
R10: 0000000000000002 R11: 0000000000000005 R12: 00000000000049f5
R13: 0000000000271000 R14: 00000000000049f5 R15: 0000000000000246
FS:  0000000000000000(0000) GS:ffff883fff4e0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7668601000 CR3: 000000000190a000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff883fff4e3e58 ffffffff81099dc1 0000000000000086 0000000000000071
 ffff883fff4f3680 0000000000000071 fbdc8a965e33afee ffffffff810b69dd
 ffff883fe84ec000 ffff883fe6965108 0000000000000100 ffffffff814a9100
Call Trace:
 &lt;IRQ&gt;

 [&lt;ffffffff81099dc1&gt;] ? run_posix_cpu_timers+0x51/0x840
 [&lt;ffffffff810b69dd&gt;] ? trigger_load_balance+0x5d/0x200
 [&lt;ffffffff814a9100&gt;] ? pid_param_set+0x130/0x130
 [&lt;ffffffff8107df56&gt;] call_timer_fn+0x36/0x110
 [&lt;ffffffff814a9100&gt;] ? pid_param_set+0x130/0x130
 [&lt;ffffffff8107fdcf&gt;] run_timer_softirq+0x21f/0x320
 [&lt;ffffffff81077b2f&gt;] __do_softirq+0xef/0x280
 [&lt;ffffffff816156dc&gt;] call_softirq+0x1c/0x30
 [&lt;ffffffff81015d95&gt;] do_softirq+0x65/0xa0
 [&lt;ffffffff81077ec5&gt;] irq_exit+0x115/0x120
 [&lt;ffffffff81616355&gt;] smp_apic_timer_interrupt+0x45/0x60
 [&lt;ffffffff81614a1d&gt;] apic_timer_interrupt+0x6d/0x80
 &lt;EOI&gt;

 [&lt;ffffffff814a9c32&gt;] ? cpuidle_enter_state+0x52/0xc0
 [&lt;ffffffff814a9c28&gt;] ? cpuidle_enter_state+0x48/0xc0
 [&lt;ffffffff814a9d65&gt;] cpuidle_idle_call+0xc5/0x200
 [&lt;ffffffff8101d14e&gt;] arch_cpu_idle+0xe/0x30
 [&lt;ffffffff810c67c1&gt;] cpu_startup_entry+0xf1/0x290
 [&lt;ffffffff8104228a&gt;] start_secondary+0x1ba/0x230
Code: 42 0f 00 45 89 e6 48 01 c2 43 8d 44 6d 00 39 d0 73 26 49 c1 e5 08 89 d2 4d 63 f4 49 63 c5 48 c1 e2 08 48 c1 e0 08 48 63 ca 48 99 &lt;48&gt; f7 f9 48 98 4c 0f af f0 49 c1 ee 08 8b 43 78 c1 e0 08 44 29
RIP  [&lt;ffffffff814a9279&gt;] intel_pstate_timer_func+0x179/0x3d0
 RSP &lt;ffff883fff4e3db8&gt;

The kernel values for cpudata for CPU 113 were:

struct cpudata {
  cpu = 113,
  timer = {
    entry = {
      next = 0x0,
      prev = 0xdead000000200200
    },
    expires = 8357799745,
    base = 0xffff883fe84ec001,
    function = 0xffffffff814a9100 &lt;intel_pstate_timer_func&gt;,
    data = 18446612406765768960,
&lt;snip&gt;
    i_gain = 0,
    d_gain = 0,
    deadband = 0,
    last_err = 22489
  },
  last_sample_time = {
    tv64 = 4063132438017305
  },
  prev_aperf = 287326796397463,
  prev_mperf = 251427432090198,
  sample = {
    core_pct_busy = 23081,
    aperf = 2937407,
    mperf = 3257884,
    freq = 2524484,
    time = {
      tv64 = 4063149215234118
    }
  }
}

which results in the time between samples = last_sample_time - sample.time
= 4063149215234118 - 4063132438017305 = 16777216813 which is 16.777 seconds.

The duration between reads of the APERF and MPERF registers overflowed a s32
sized integer in intel_pstate_get_scaled_busy()'s call to div_fp().  The result
is that int_tofp(duration_us) == 0, and the kernel attempts to divide by 0.

While the kernel shouldn't be delaying for a long time, it can and does
happen and the intel_pstate driver should not panic in this situation.  This
patch changes the div_fp() function to use div64_s64() to allow for "long"
division.  This will avoid the overflow condition on long delays.

[v2]: use div64_s64() in div_fp()

Signed-off-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>cpufreq: dt: Tolerance applies on both sides of target voltage</title>
<updated>2015-10-28T02:14:09Z</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2015-09-02T09:06:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d3e32326c31c915dc6cd86f015deed2708e5d1bb'/>
<id>urn:sha1:d3e32326c31c915dc6cd86f015deed2708e5d1bb</id>
<content type='text'>
[ Upstream commit a2022001cebd0825b96aa0f3345ea3ad44ae79d4 ]

Tolerance applies on both sides of the target voltage, i.e. both min and
max sides. But while checking if a voltage is supported by the regulator
or not, we haven't taken care of tolerance on the lower side. Fix that.

Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Fixes: 045ee45c4ff2 ("cpufreq: cpufreq-dt: disable unsupported OPPs")
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>cpufreq: pcc: Enable autoload of pcc-cpufreq for ACPI processors</title>
<updated>2015-08-27T17:25:58Z</updated>
<author>
<name>Lenny Szubowicz</name>
<email>lszubowi@redhat.com</email>
</author>
<published>2014-11-13T18:51:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=57a45cb53cba2261d7e909f944a3ddd4d61e8ac2'/>
<id>urn:sha1:57a45cb53cba2261d7e909f944a3ddd4d61e8ac2</id>
<content type='text'>
[ Upstream commit 7e7e8fe69820c6fa31395dbbd8e348e3c69cd2a9 ]

The pcc-cpufreq driver is not automatically loaded on systems where
the platform's power management setting requires this driver.
Instead, on those systems no CPU frequency driver is registered and
active.

Make the autoloading matching criteria for loading the pcc-cpufreq
driver the same as done in acpi-cpufreq by commit c655affbd524d01
("ACPI / cpufreq: Add ACPI processor device IDs to acpi-cpufreq").

x86 CPU frequency drivers are now typically autoloaded by specifying
MODULE_DEVICE_TABLE entries and x86cpu model specific matching.
But pcc-cpufreq was omitted when acpi-cpufreq and other drivers were
changed to use this approach.

Both acpi-cpufreq and pcc-cpufreq depend on a distinct and mutually
exclusive set of ACPI methods which are not directly tied to specific
processor model numbers. Both of these drivers have init routines
which look for their required ACPI methods. As a result, only the
appropriate driver registers as the cpu frequency driver and the other
one ends up being unloaded.

Tested on various systems where acpi-cpufreq, intel_pstate, and
pcc-cpufreq are the expected cpu frequency drivers.

Signed-off-by: Lenny Szubowicz &lt;lszubowi@redhat.com&gt;
Signed-off-by: Joseph Szczypek &lt;joseph.szczypek@hp.com&gt;
Reported-by: Trinh Dao &lt;trinh.dao@hp.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>intel_pstate: set BYT MSR with wrmsrl_on_cpu()</title>
<updated>2015-07-04T03:02:15Z</updated>
<author>
<name>Joe Konno</name>
<email>joe.konno@intel.com</email>
</author>
<published>2015-05-12T14:59:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d36bec4cb0ae2ca617e9fba2c80467e9443f0d9c'/>
<id>urn:sha1:d36bec4cb0ae2ca617e9fba2c80467e9443f0d9c</id>
<content type='text'>
[ Upstream commit 0dd23f94251f49da99a6cbfb22418b2d757d77d6 ]

Commit 007bea098b86 (intel_pstate: Add setting voltage value for
baytrail P states.) introduced byt_set_pstate() with the assumption that
it would always be run by the CPU whose MSR is to be written by it.  It
turns out, however, that is not always the case in practice, so modify
byt_set_pstate() to enforce the MSR write done by it to always happen on
the right CPU.

Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.)
Signed-off-by: Joe Konno &lt;joe.konno@intel.com&gt;
Acked-by: Kristen Carlson Accardi &lt;kristen@linux.intel.com&gt;
Cc: 3.14+ &lt;stable@vger.kernel.org&gt; # 3.14+
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>cpufreq: Schedule work for the first-online CPU on resume</title>
<updated>2015-04-24T21:13:44Z</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2015-04-02T04:51:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5f44e3971fd6c83677a844f396a9e57b179e266f'/>
<id>urn:sha1:5f44e3971fd6c83677a844f396a9e57b179e266f</id>
<content type='text'>
[ Upstream commit c75de0ac0756d4b442f460e10461720c7c2412c2 ]

All CPUs leaving the first-online CPU are hotplugged out on suspend and
and cpufreq core stops managing them.

On resume, we need to call cpufreq_update_policy() for this CPU's policy
to make sure its frequency is in sync with cpufreq's cached value, as it
might have got updated by hardware during suspend/resume.

The policies are always added to the top of the policy-list. So, in
normal circumstances, CPU 0's policy will be the last one in the list.
And so the code checks for the last policy.

But there are cases where it will fail. Consider quad-core system, with
policy-per core. If CPU0 is hotplugged out and added back again, the
last policy will be on CPU1 :(

To fix this in a proper way, always look for the policy of the first
online CPU. That way we will be sure that we are calling
cpufreq_update_policy() for the only CPU that wasn't hotplugged out.

Cc: 3.15+ &lt;stable@vger.kernel.org&gt; # 3.15+
Fixes: 2f0aea936360 ("cpufreq: suspend governors on system suspend/hibernate")
Reported-by: Saravana Kannan &lt;skannan@codeaurora.org&gt;
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Acked-by: Saravana Kannan &lt;skannan@codeaurora.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>cpufreq: s3c: remove last use of resume_clocks callback</title>
<updated>2015-03-06T22:52:48Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2015-02-18T20:55:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8ebba75fe5b8d465cf02ff1b24c4db89bee0740e'/>
<id>urn:sha1:8ebba75fe5b8d465cf02ff1b24c4db89bee0740e</id>
<content type='text'>
commit 67fadaa2768716209ee19a8b8bf05bc3ac399445 upstream.

Commit 32726d2d550 ("ARM: SAMSUNG: Remove legacy clock code")
already removed the callback pointer, but there was one remaining
user:

drivers/cpufreq/s3c24xx-cpufreq.c: In function 's3c_cpufreq_resume_clocks':
drivers/cpufreq/s3c24xx-cpufreq.c:149:14: error: 'struct s3c_cpufreq_info' has no member named 'resume_clocks'
  cpu_cur.info-&gt;resume_clocks();
              ^

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Fixes: 32726d2d550 ("ARM: SAMSUNG: Remove legacy clock code")
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpufreq: s3c: remove incorrect __init annotations</title>
<updated>2015-03-06T22:52:48Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2015-02-18T20:55:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2b21e24709ed6e2e0fd7044745b1e800ffd98a67'/>
<id>urn:sha1:2b21e24709ed6e2e0fd7044745b1e800ffd98a67</id>
<content type='text'>
commit 61882b63171736571e1139ab5aa929e3bb336016 upstream.

The two functions s3c2416_cpufreq_driver_init and s3c_cpufreq_register
are marked init but are called from a context that might be run after
the __init sections are discarded, as the compiler points out:

WARNING: vmlinux.o(.data+0x1ad9dc): Section mismatch in reference from the variable s3c2416_cpufreq_driver to the function .init.text:s3c2416_cpufreq_driver_init()
WARNING: drivers/built-in.o(.text+0x35b5dc): Section mismatch in reference from the function s3c2410a_cpufreq_add() to the function .init.text:s3c_cpufreq_register()

This removes the __init markings.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpufreq: speedstep-smi: enable interrupts when waiting</title>
<updated>2015-03-06T22:52:48Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2015-02-09T18:38:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bfb4e3fb85206bd3a1601497f3b6d785af708124'/>
<id>urn:sha1:bfb4e3fb85206bd3a1601497f3b6d785af708124</id>
<content type='text'>
commit d4d4eda23794c701442e55129dd4f8f2fefd5e4d upstream.

On Dell Latitude C600 laptop with Pentium 3 850MHz processor, the
speedstep-smi driver sometimes loads and sometimes doesn't load with
"change to state X failed" message.

The hardware sometimes refuses to change frequency and in this case, we
need to retry later. I found out that we need to enable interrupts while
waiting. When we enable interrupts, the hardware blockage that prevents
frequency transition resolves and the transition is possible. With
disabled interrupts, the blockage doesn't resolve (no matter how long do
we wait). The exact reasons for this hardware behavior are unknown.

This patch enables interrupts in the function speedstep_set_state that can
be called with disabled interrupts. However, this function is called with
disabled interrupts only from speedstep_get_freqs, so it shouldn't cause
any problem.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpufreq: Set cpufreq_cpu_data to NULL before putting kobject</title>
<updated>2015-03-06T22:52:47Z</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2015-01-31T00:32:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=125c662fd50eefd9bd5400772639366f1f42b614'/>
<id>urn:sha1:125c662fd50eefd9bd5400772639366f1f42b614</id>
<content type='text'>
commit 6ffae8c06fab058d6c3f8ecb7f921327721034e7 upstream.

In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs
to be cleared before calling kobject_put(&amp;policy-&gt;kobj) and under
cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get()
in parallel with it, they can obtain a non-NULL policy from that after
kobject_put(&amp;policy-&gt;kobj) was executed.

Consider this case:

Thread A				Thread B
cpufreq_cpu_get()
  acquire cpufreq_driver_lock
  read-per-cpu cpufreq_cpu_data
					kobject_put(&amp;policy-&gt;kobj);
  kobject_get(&amp;policy-&gt;kobj);
					...
					per_cpu(&amp;cpufreq_cpu_data, cpu) = NULL

And this will result in a warning like this one:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47
 kobject_get+0x41/0x50()
 Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl
 lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon
 ...
 Call Trace:
  [&lt;ffffffff81661b14&gt;] dump_stack+0x46/0x58
  [&lt;ffffffff81072b61&gt;] warn_slowpath_common+0x81/0xa0
  [&lt;ffffffff81072c7a&gt;] warn_slowpath_null+0x1a/0x20
  [&lt;ffffffff812e16d1&gt;] kobject_get+0x41/0x50
  [&lt;ffffffff815262a5&gt;] cpufreq_cpu_get+0x75/0xc0
  [&lt;ffffffff81527c3e&gt;] cpufreq_update_policy+0x2e/0x1f0
  [&lt;ffffffff810b8cb2&gt;] ? up+0x32/0x50
  [&lt;ffffffff81381aa9&gt;] ? acpi_ns_get_node+0xcb/0xf2
  [&lt;ffffffff81381efd&gt;] ? acpi_evaluate_object+0x22c/0x252
  [&lt;ffffffff813824f6&gt;] ? acpi_get_handle+0x95/0xc0
  [&lt;ffffffff81360967&gt;] ? acpi_has_method+0x25/0x40
  [&lt;ffffffff81391e08&gt;] acpi_processor_ppc_has_changed+0x77/0x82
  [&lt;ffffffff81089566&gt;] ? move_linked_works+0x66/0x90
  [&lt;ffffffff8138e8ed&gt;] acpi_processor_notify+0x58/0xe7
  [&lt;ffffffff8137410c&gt;] acpi_ev_notify_dispatch+0x44/0x5c
  [&lt;ffffffff8135f293&gt;] acpi_os_execute_deferred+0x15/0x22
  [&lt;ffffffff8108c910&gt;] process_one_work+0x160/0x410
  [&lt;ffffffff8108d05b&gt;] worker_thread+0x11b/0x520
  [&lt;ffffffff8108cf40&gt;] ? rescuer_thread+0x380/0x380
  [&lt;ffffffff81092421&gt;] kthread+0xe1/0x100
  [&lt;ffffffff81092340&gt;] ? kthread_create_on_node+0x1b0/0x1b0
  [&lt;ffffffff81669ebc&gt;] ret_from_fork+0x7c/0xb0
  [&lt;ffffffff81092340&gt;] ? kthread_create_on_node+0x1b0/0x1b0
 ---[ end trace 89e66eb9795efdf7 ]---

The actual code flow is as follows:

 Thread A: Workqueue: kacpi_notify

 acpi_processor_notify()
   acpi_processor_ppc_has_changed()
         cpufreq_update_policy()
           cpufreq_cpu_get()
             kobject_get()

 Thread B: xenbus_thread()

 xenbus_thread()
   msg-&gt;u.watch.handle-&gt;callback()
     handle_vcpu_hotplug_event()
       vcpu_hotplug()
         cpu_down()
           __cpu_notify(CPU_POST_DEAD..)
             cpufreq_cpu_callback()
               __cpufreq_remove_dev_finish()
                 cpufreq_policy_put_kobj()
                   kobject_put()

cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data
under cpufreq_driver_lock, and once it gets a valid policy it expects it
to not be freed until cpufreq_cpu_put() is called.

But the race happens when another thread puts the kobject first and updates
cpufreq_cpu_data before or later. And so the first thread gets a valid policy
structure and before it does kobject_get() on it, the second one has already
done kobject_put().

Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that
too under locks.

Reported-by: Ethan Zhao &lt;ethan.zhao@oracle.com&gt;
Reported-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
