aboutsummaryrefslogtreecommitdiff
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'android-msm-shamrock-3.10-nougat-mr1-release' of ↵Rygebin2017-06-043-4/+18
|\ | | | | | | https://github.com/android/kernel_msm into cm-14.1
| * perf: Tighten (and fix) the grouping conditionPeter Zijlstra2017-04-241-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c3c87e770458aa004bd7ed3f29945ff436fd6511 upstream. The fix from 9fc81d87420d ("perf: Fix events installation during moving group") was incomplete in that it failed to recognise that creating a group with events for different CPUs is semantically broken -- they cannot be co-scheduled. Furthermore, it leads to real breakage where, when we create an event for CPU Y and then migrate it to form a group on CPU X, the code gets confused where the counter is programmed -- triggered in practice as well by me via the perf fuzzer. Fix this by tightening the rules for creating groups. Only allow grouping of counters that can be co-scheduled in the same context. This means for the same task and/or the same cpu. Fixes: 9fc81d87420d ("perf: Fix events installation during moving group") Bug: 34515362 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150123125834.090683288@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Change-Id: I72247f51388177f172845ccd4621debcd3158940
| * Merge "trace: resolve stack corruption due to string copy" into ↵Richard Chang2017-04-241-1/+1
| |\ | | | | | | | | | mnc-dr-dev-qcom-lego
| | * trace: resolve stack corruption due to string copyAmey Telawane2017-04-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Strcpy has no limit on string being copied which causes stack corruption leading to kernel panic. Use strlcpy to resolve the issue by providing length of string to be copied. CRs-fixed: 1048480 Bug: 35399704 Change-Id: Ib290b25f7e0ff96927b8530e5c078869441d409f Signed-off-by: Amey Telawane <ameyt@codeaurora.org>
| * | tracing: do not leak kernel addressesNick Desaulniers2017-04-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This likely breaks tracing tools like trace-cmd. It logs in the same format but now addresses are all 0x0. Bug: 34277115 Change-Id: Ifb0d4d2a184bf0d95726de05b1acee0287a375d9
| * | UPSTREAM: tracing: Fix trace_printk() to print when not using bprintk()Steven Rostedt (Red Hat)2017-04-241-0/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The trace_printk() code will allocate extra buffers if the compile detects that a trace_printk() is used. To do this, the format of the trace_printk() is saved to the __trace_printk_fmt section, and if that section is bigger than zero, the buffers are allocated (along with a message that this has happened). If trace_printk() uses a format that is not a constant, and thus something not guaranteed to be around when the print happens, the compiler optimizes the fmt out, as it is not used, and the __trace_printk_fmt section is not filled. This means the kernel will not allocate the special buffers needed for the trace_printk() and the trace_printk() will not write anything to the tracing buffer. Adding a "__used" to the variable in the __trace_printk_fmt section will keep it around, even though it is set to NULL. This will keep the string from being printed in the debugfs/tracing/printk_formats section as it is not needed. Reported-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()" Cc: stable@vger.kernel.org # v3.5+ Bug: 34277115 Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Change-Id: I10ce56caa41c7644d9d290d9ed272a6d156c938c
* / shamrock: Only expose su when daemon is runningTom Marshall2017-06-023-0/+38
|/ | | | | | | | | | | | | | It has been claimed that the PG implementation of 'su' has security vulnerabilities even when disabled. Unfortunately, the people that find these vulnerabilities often like to keep them private so they can profit from exploits while leaving users exposed to malicious hackers. In order to reduce the attack surface for vulnerabilites, it is therefore necessary to make 'su' completely inaccessible when it is not in use (except by the root and system users). Change-Id: I79716c72f74d0b7af34ec3a8054896c6559a181d
* Merge "ring-buffer: Prevent overflow of size in ring_buffer_resize() If the ↵TreeHugger Robot2016-12-141-5/+4
|\ | | | | | | size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE then the DIV_ROUND_UP() will return zero." into mnc-dr-dev-qcom-lego
| * ring-buffer: Prevent overflow of size in ring_buffer_resize()Sivacharan Paka2016-12-071-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE then the DIV_ROUND_UP() will return zero. Here's the details: # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb tracing_entries_write() processes this and converts kb to bytes. 18014398509481980 << 10 = 18446744073709547520 and this is passed to ring_buffer_resize() as unsigned long size. size = DIV_ROUND_UP(size, BUF_PAGE_SIZE); Where DIV_ROUND_UP(a, b) is (a + b - 1)/b BUF_PAGE_SIZE is 4080 and here 18446744073709547520 + 4080 - 1 = 18446744073709551599 where 18446744073709551599 is still smaller than 2^64 2^64 - 18446744073709551599 = 17 But now 18446744073709551599 / 4080 = 4521260802379792 and size = size * 4080 = 18446744073709551360 This is checked to make sure its still greater than 2 * 4080, which it is. Then we convert to the number of buffer pages needed. nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE) but this time size is 18446744073709551360 and 2^64 - (18446744073709551360 + 4080 - 1) = -3823 Thus it overflows and the resulting number is less than 4080, which makes 3823 / 4080 = 0 an nr_pages is set to this. As we already checked against the minimum that nr_pages may be, this causes the logic to fail as well, and we crash the kernel. There's no reason to have the two DIV_ROUND_UP() (that's just result of historical code changes), clean up the code and fix this bug. Change-Id: Ie0699ce4627a8c578f01a388d93f88083a6d7403 Cc: stable@vger.kernel.org # 3.5+ Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | perf: don't leave group_entry on sibling list (use-after-free)John Dias2016-12-071-0/+7
|/ | | | | | | | | | | | When perf_group_detach is called on a group leader, it should empty its sibling list. Otherwise, when a sibling is later deallocated, list_del_event() removes the sibling's group_entry from its current list, which can be the now-deallocated group leader's sibling list (use-after-free bug). Bug: 32402548 Change-Id: I99f6bc97c8518df1cb0035814368012ba72ab1f1
* perf: protect group_leader from races that cause ctx double-freeJohn Dias2016-11-211-0/+15
| | | | | | | | | | | | | | | | | | When moving a group_leader perf event from a software-context to a hardware-context, there's a race in checking and updating that context. The existing locking solution doesn't work; note that it tries to grab a lock inside the group_leader's context object, which you can only get at by going through a pointer that should be protected from these races. To avoid that problem, and to produce a simple solution, we can just use a lock per group_leader to protect all checks on the group_leader's context. The new lock is grabbed and released when no context locks are held. Bug: 30955111 Bug: 31095224 Change-Id: If37124c100ca6f4aa962559fba3bd5dbbec8e052
* BACKPORT: perf: Fix event->ctx lockingSivacharan Paka2016-11-211-36/+205
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been a few reported issues wrt. the lack of locking around changing event->ctx. This patch tries to address those. It avoids the whole rwsem thing; and while it appears to work, please give it some thought in review. What I did fail at is sensible runtime checks on the use of event->ctx, the RCU use makes it very hard. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit f63a8daa5812afef4f06c962351687e1ff9ccb2b) Bug: 30955111 Bug: 31095224 Change-Id: I8dfc0aae8d1206c177454e0093dacd82b6129c55 Signed-off-by: Joao Dias <joaodias@google.com> Git-repo: http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git Git-commit: f63a8daa5812afef4f06c962351687e1ff9ccb2b [rsiddoji@codeaurora.org: resloved some trival confilits] Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
* cgroup: prefer %pK to %pNick Desaulniers2016-10-201-1/+1
| | | | | | | Prevents leaking kernel pointers when using kptr_restrict. Bug: 30149174 Change-Id: I76d4132a0f47f4b0a9f042b8269e0f24edd111ed
* UPSTREAM: perf: Fix race in swevent hashPeter Zijlstra2016-10-191-19/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (cherry picked from commit 12ca6ad2e3a896256f086497a7c7406a547ee373) There's a race on CPU unplug where we free the swevent hash array while it can still have events on. This will result in a use-after-free which is BAD. Simply do not free the hash array on unplug. This leaves the thing around and no use-after-free takes place. When the last swevent dies, we do a for_each_possible_cpu() iteration anyway to clean these up, at which time we'll free it, so no leakage will occur. Change-Id: Ic95f4aa857e910239e47719930c81485446a5a15 Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Bug: 30952077
* BACKPORT: audit: fix a double fetch in audit_log_single_execve_arg()Paul Moore2016-10-191-170/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (cherry picked from commit 43761473c254b45883a64441dd0bc85a42f3645c) There is a double fetch problem in audit_log_single_execve_arg() where we first check the execve(2) argumnets for any "bad" characters which would require hex encoding and then re-fetch the arguments for logging in the audit record[1]. Of course this leaves a window of opportunity for an unsavory application to munge with the data. This patch reworks things by only fetching the argument data once[2] into a buffer where it is scanned and logged into the audit records(s). In addition to fixing the double fetch, this patch improves on the original code in a few other ways: better handling of large arguments which require encoding, stricter record length checking, and some performance improvements (completely unverified, but we got rid of some strlen() calls, that's got to be a good thing). As part of the development of this patch, I've also created a basic regression test for the audit-testsuite, the test can be tracked on GitHub at the following link: * https://github.com/linux-audit/audit-testsuite/issues/25 [1] If you pay careful attention, there is actually a triple fetch problem due to a strnlen_user() call at the top of the function. [2] This is a tiny white lie, we do make a call to strnlen_user() prior to fetching the argument data. I don't like it, but due to the way the audit record is structured we really have no choice unless we copy the entire argument at once (which would require a rather wasteful allocation). The good news is that with this patch the kernel no longer relies on this strnlen_user() value for anything beyond recording it in the log, we also update it with a trustworthy value whenever possible. Reported-by: Pengfei Wang <wpengfeinudt@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com> Change-Id: I10e979e94605e3cf8d461e3e521f8f9837228aa5 Bug: 30956807
* ARM: eve: add sharp specific driver for handling panic and logging.Shigeru Murai2016-05-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | 1. Add sharp specific driver to drivers/soc/qcom/sharp. (SHARP_SHLOG) When this is enabled from device tree, this driver saves some informations which is necessary for our bootloaders to save logs after reset occurs. This driver needs address of xtime for our bootloaders to know when reset occurs. Sub-functions of shrlog driver listed below are also enabled when device tree is set. 2. Save modem parameters to shrlog driver. (SHARP_SHLOG_MODEM) This driver saves some informations which are necessary for our bootloaders to save modem dump after reset occurs. 3. Save pstore parameters to shrlog driver. (SHARP_SHLOG_RAMOOPS) This driver saves some informations which are necessary for our bootloaders to save logs after reset occurs. 4. Add shrlog panic handler. (SHARP_HANDLE_PANIC) This handler turns reset to "warm reset" and set reset-reasons to hardwares for our bootloaders to know what causes reset. 5. Add shrlog subsystem reset handler. (SHARP_PANIC_ON_SUBSYS) This handler turns subsystem-restart to panic when subsystem crashes. Change-Id: I364aa87a66e2189e00b0892ce4d2ebbe448d0a36
* cpuset: Make cpusets restore on hotplugRiley Andrews2016-05-121-15/+33
| | | | | | | | | | | | This deliberately changes the behavior of the per-cpuset cpus file to not be effected by hotplug. When a cpu is offlined, it will be removed from the cpuset/cpus file. When a cpu is onlined, if the cpuset originally requested that that cpu was part of the cpuset, that cpu will be restored to the cpuset. The cpus files still have to be hierachical, but the ranges no longer have to be out of the currently online cpus, just the physically present cpus. Change-Id: I3efbae24a1f6384be1e603fb56f0d3baef61d924
* cpuset: Add allow_attach hook for cpusets on android.Riley Andrews2016-05-121-0/+18
| | | | Change-Id: Ic1b61b2bbb7ce74c9e9422b5e22ee9078251de21
* sched/fair: Fix fairness issue on migrationPeter Zijlstra2016-05-121-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pavan reported that in the presence of very light tasks (or cgroups) the placement of migrated tasks can cause severe fairness issues. The problem is that enqueue_entity() places the task before it updates time, thereby it can place the task far in the past (remember that light tasks will shoot virtual time forward at a high speed, so in relation to the pre-existing light task, we can land far in the past). This is done because update_curr() needs the current task, and we might be placing the current task. The obvious solution is to differentiate between the current and any other task; placing the current before we update time, and placing any other task after, such that !curr tasks end up at the current moment in time, and not in the past. CRs-Fixed: 911833 Change-Id: I6ed9765c227a46d6e062a9cfc5e3309bc5475e72 Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Tested-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ben Segall <bsegall@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: byungchul.park@lge.com Link: http://lkml.kernel.org/r/20160309120403.GK6344@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Git-commit: 3a47d5124a957358274e9ca7b115b2f3a914f56d Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [pkondeti@codeaurora.org: Resolved minor merge conflicts] Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* sched: turn off the TTWU_QUEUE featureSyed Rameez Mustafa2016-04-061-1/+1
| | | | | | | | | | | | | | | | While the feature TTWU_QUEUE has the advantage of reducing cache bouncing of runqueue locks, it has the side effect that runqueue statistics are not updated until the remote CPU has a chance to enqueue the task. Since there is no upper bound on the amount of time it can take the remote CPU to enqueue the task, several sequential wakeups can result in suboptimal task placement based on the stale statistics. Turn off the feature as the cost of sub-optimal placement is much higher than the cost of cache bouncing spinlocks for msm based systems. Change-Id: I0b85c0225237b2bc44f54934769f5e3750c0f3d6 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* FROMLIST: mm: mmap: Add new /proc tunable for mmap_base ASLR.dcashman2016-04-061-0/+22
| | | | | | | | | | | | | | | | | | (cherry picked from commit https://lkml.org/lkml/2015/12/21/337) ASLR only uses as few as 8 bits to generate the random offset for the mmap base address on 32 bit architectures. This value was chosen to prevent a poorly chosen value from dividing the address space in such a way as to prevent large allocations. This may not be an issue on all platforms. Allow the specification of a minimum number of bits so that platforms desiring greater ASLR protection may determine where to place the trade-off. Bug: 24047224 Signed-off-by: Daniel Cashman <dcashman@android.com> Signed-off-by: Daniel Cashman <dcashman@google.com> Change-Id: I66ac01c6f4f2c8dcfc84d1f1e99490b8385b3ed4 (cherry picked from commit 00ead9ddada26be1726539b1ead14abf974d235d)
* msm: null pointer dereferencingWish Wu2016-02-242-2/+9
| | | | | | | | | | | | | | | Prevent unintended kernel NULL pointer dereferencing. Orignal code: hlist_del_rcu(&event->hlist_entry); Fix: Adding pointer check: if(!hlist_unhashed(&p_event->hlist_entry)) hlist_del_rcu(&p_event->hlist_entry); Bug: 25364034 Change-Id: I89e2eb26ae9d1c08c76bdcf9935c238edbdf521a Signed-off-by: Yuan Lin <yualin@google.com>
* irq_work: Remove BUG_ON in irq_work_run()Peter Zijlstra2016-01-071-42/+4
| | | | | | | | | | | | | | | | | | | | | | | | Because of a collision with 8d056c48e486 ("CPU hotplug, smp: flush any pending IPI callbacks before CPU offline"), which ends up calling hotplug_cfd()->flush_smp_call_function_queue()->irq_work_run(), which is not from IRQ context. And since that already calls irq_work_run() from the hotplug path, remove our entire hotplug handling. Change-Id: I6dbaa98954f42e45d66a3e70a12d4d569444b6ca Reported-by: Stephen Warren <swarren@wwwdotorg.org> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-busatzs2gvz4v62258agipuf@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Git-commit: a77353e5eb56b6c6098bfce59aff1f449451b0b7 Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [sramana@codeaurora.org: trivial merge conflicts resolved] Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
* CPU hotplug, smp: flush any pending IPI callbacks before CPU offlineSrivatsa S. Bhat2016-01-071-8/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a race between the CPU offline code (within stop-machine) and the smp-call-function code, which can lead to getting IPIs on the outgoing CPU, *after* it has gone offline. Specifically, this can happen when using smp_call_function_single_async() to send the IPI, since this API allows sending asynchronous IPIs from IRQ disabled contexts. The exact race condition is described below. During CPU offline, in stop-machine, we don't enforce any rule in the _DISABLE_IRQ stage, regarding the order in which the outgoing CPU and the other CPUs disable their local interrupts. Due to this, we can encounter a situation in which an IPI is sent by one of the other CPUs to the outgoing CPU (while it is *still* online), but the outgoing CPU ends up noticing it only *after* it has gone offline. CPU 1 CPU 2 (Online CPU) (CPU going offline) Enter _PREPARE stage Enter _PREPARE stage Enter _DISABLE_IRQ stage = Got a device interrupt, and | Didn't notice the IPI the interrupt handler sent an | since interrupts were IPI to CPU 2 using | disabled on this CPU. smp_call_function_single_async() | = Enter _DISABLE_IRQ stage Enter _RUN stage Enter _RUN stage = Busy loop with interrupts | Invoke take_cpu_down() disabled. | and take CPU 2 offline = Enter _EXIT stage Enter _EXIT stage Re-enable interrupts Re-enable interrupts The pending IPI is noted immediately, but alas, the CPU is offline at this point. This of course, makes the smp-call-function IPI handler code running on CPU 2 unhappy and it complains about "receiving an IPI on an offline CPU". One real example of the scenario on CPU 1 is the block layer's complete-request call-path: __blk_complete_request() [interrupt-handler] raise_blk_irq() smp_call_function_single_async() However, if we look closely, the block layer does check that the target CPU is online before firing the IPI. So in this case, it is actually the unfortunate ordering/timing of events in the stop-machine phase that leads to receiving IPIs after the target CPU has gone offline. In reality, getting a late IPI on an offline CPU is not too bad by itself (this can happen even due to hardware latencies in IPI send-receive). It is a bug only if the target CPU really went offline without executing all the callbacks queued on its list. (Note that a CPU is free to execute its pending smp-call-function callbacks in a batch, without waiting for the corresponding IPIs to arrive for each one of those callbacks). So, fixing this issue can be broken up into two parts: 1. Ensure that a CPU goes offline only after executing all the callbacks queued on it. 2. Modify the warning condition in the smp-call-function IPI handler code such that it warns only if an offline CPU got an IPI *and* that CPU had gone offline with callbacks still pending in its queue. Achieving part 1 is straight-forward - just flush (execute) all the queued callbacks on the outgoing CPU in the CPU_DYING stage[1], including those callbacks for which the source CPU's IPIs might not have been received on the outgoing CPU yet. Once we do this, an IPI that arrives late on the CPU going offline (either due to the race mentioned above, or due to hardware latencies) will be completely harmless, since the outgoing CPU would have executed all the queued callbacks before going offline. Overall, this fix (parts 1 and 2 put together) additionally guarantees that we will see a warning only when the *IPI-sender code* is buggy - that is, if it queues the callback _after_ the target CPU has gone offline. [1]. The CPU_DYING part needs a little more explanation: by the time we execute the CPU_DYING notifier callbacks, the CPU would have already been marked offline. But we want to flush out the pending callbacks at this stage, ignoring the fact that the CPU is offline. So restructure the IPI handler code so that we can by-pass the "is-cpu-offline?" check in this particular case. (Of course, the right solution here is to fix CPU hotplug to mark the CPU offline _after_ invoking the CPU_DYING notifiers, but this requires a lot of audit to ensure that this change doesn't break any existing code; hence lets go with the solution proposed above until that is done). Change-Id: I0928593cf0bfca5972a6fcb37b943c73d5b32eb5 [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Suggested-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <mgalbraith@suse.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sachin Kamat <sachin.kamat@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Git-commit: 8d056c48e486249e6487910b83e0f3be7c14acf7 Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [sramana@codeaurora.org: merge conflicts resolved while porting to msm-3.10] Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
* Merge "rcu: Don't disable CPU hotplug during OOM notifiers"Linux Build Service Account2015-12-111-2/+0
|\
| * rcu: Don't disable CPU hotplug during OOM notifiersPaul E. McKenney2015-12-061-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RCU's rcu_oom_notify() disables CPU hotplug in order to stabilize the list of online CPUs, which it traverses. However, this is completely pointless because smp_call_function_single() will quietly fail if invoked on an offline CPU. Because the count of requests is incremented in the rcu_oom_notify_cpu() function that is remotely invoked, everything works nicely even in the face of concurrent CPU-hotplug operations. Furthermore, in recent kernels, invoking get_online_cpus() from an OOM notifier can result in deadlock. This commit therefore removes the call to get_online_cpus() and put_online_cpus() from rcu_oom_notify(). Change-Id: Ib5cfc417a7481ae0ee9ae2ee82b88a73b2d12a89 Reported-by: Marcin Ślusarz <marcin.slusarz@gmail.com> Reported-by: David Rientjes <rientjes@google.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com> Git-commit: 9a54f98e341d09793247a6e598012edefb5ae7cb Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
* | Merge "sched: encourage idle load balance and discourage active load balance"Linux Build Service Account2015-12-111-5/+11
|\ \ | |/ |/|
| * sched: encourage idle load balance and discourage active load balanceJoonwoo Park2015-11-251-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | Encourage IDLE and NEWLY_IDLE load balance by ignoring cache hotness and discourage active load balance when by increasing busy balancing failure threshold to initiate active load balancer in order to reduce scheduler latency and avoid unnecessary active migration within a same domain. Change-Id: I22f6aba11932ccbb82a436c0532589c46f9148ed Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> [pkondeti@codeaurora.org: resolved minor merge conflicts] Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* | sched: colocate related threadsSrivatsa Vaddagiri2015-11-204-19/+550
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide userspace interface for tasks to be grouped together as "related" threads. For example, all threads involved in updating display buffer could be tagged as related. Scheduler will attempt to provide special treatment for group of related threads such as: 1) Colocation of related threads in same "preferred" cluster 2) Aggregation of demand towards determination of cluster frequency This patch extends scheduler to provide best-effort colocation support for a group of related threads. Change-Id: Ic2cd769faf5da4d03a8f3cb0ada6224d0101a5f5 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* | sched: Consolidate cluster-specific informationSrivatsa Vaddagiri2015-11-205-236/+259
| | | | | | | | | | | | | | | | | | | | | | | | | | Many cluster-shared attributes like cur_freq, max_freq etc are needlessly maintained in per-cpu 'struct rq' currently. Consolidate them in cluster structure. Change-Id: I36e508082bb1e8a7c1a60e99902b5bc260f5f8f6 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* | sched: Introduce sched_clusterSrivatsa Vaddagiri2015-11-203-8/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A cluster is set of CPUs sharing some power controls and an L2 cache. This patch buids a list of clusters at bootup which are sorted by their max_power_cost. Organizing CPUs in terms of clusters helps optimize cpu selection logic in select_best_cpu() quite a bit. A subsequent patch modifies select_best_cpu() to make use of clusters. Change-Id: I3fecc542b36db014afc0375a1ea4c4802e2f4dba Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* | sched: initialize frequency domain cpumaskJoonwoo Park2015-11-201-0/+1
|/ | | | | | | | | | | | | | It's possible select_best_cpu() gets called before the first cpufreq notifier call. In such scenario select_best_cpu() can hang forever by not clearing search_cpus. Initialize frequency domain cpumask with the CPU of rq to avoid such scenario. CRs-fixed: 931349 Change-Id: If8d31c5477efe61ad7c6b336ba9e27ca6f556b63 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* Merge "rtc: alarm: set power-on alarm via timerfd"Linux Build Service Account2015-11-071-29/+89
|\
| * rtc: alarm: set power-on alarm via timerfdMao Jinlong2015-11-061-29/+89
| | | | | | | | | | | | | | | | | | Android does not support powering-up the phone through alarm. Set rtc alarm via timerfd to power-up the phone after alarm expiration. Change-Id: I781389c658fb00ba7f0ce089d706c10f202a7dc6 Signed-off-by: Mao Jinlong <c_jmao@codeaurora.org>
* | sched/cputime: fix a deadlock on 32bit systemsPavankumar Kondeti2015-11-031-6/+9
|/ | | | | | | | | | | | | | | | | | | | | cpu_hardirq_time and cpu_softirq_time are protected with seqlock on 32bit systems. There is a potential deadlock with this seqlock and rq->lock. CPU 1 CPU0 ========================== ======================== --> acquire CPU0 rq->lock --> __irq_enter() ----> task enqueue/dequeue ----> irqtime_account_irq() ------> update_rq_clock() ------> irq_time_write_begin() --------> irq_time_read() --------> sched_account_irqtime() (waiting for the seqlock (waiting for the CPU0 rq->lock) held in irq_time_write_begin() Fix this issue by dropping the seqlock before calling sched_account_irqtime() Change-Id: I29a33876e372f99435a57cc11eada9c8cfd59a3f Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* Merge upstream tag 'v3.10.84' into LA.BR.1.3.3Kaushal Kumar2015-09-3010-147/+56
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This merge brings us up-to-date as of upstream tag v3.10.84 * tag 'v3.10.84' (317 commits): Linux 3.10.84 fs: Fix S_NOSEC handling KVM: x86: make vapics_in_nmi_mode atomic MIPS: Fix KVM guest fixmap address x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A powerpc/perf: Fix book3s kernel to userspace backtraces arm: KVM: force execution of HCPTR access on VM exit Revert "crypto: talitos - convert to use be16_add_cpu()" crypto: talitos - avoid memleak in talitos_alg_alloc() sctp: Fix race between OOTB responce and route removal packet: avoid out of bounds read in round robin fanout packet: read num_members once in packet_rcv_fanout() bridge: fix br_stp_set_bridge_priority race conditions bridge: fix multicast router rlist endless loop sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context Linux 3.10.83 bus: mvebu: pass the coherency availability information at init time KVM: nSVM: Check for NRIPS support before updating control field ARM: clk-imx6q: refine sata's parent d_walk() might skip too much ipv6: update ip6_rt_last_gc every time GC is run ipv6: prevent fib6_run_gc() contention xfrm: Increase the garbage collector threshold Btrfs: make xattr replace operations atomic x86/microcode/intel: Guard against stack overflow in the loader fs: take i_mutex during prepare_binprm for set[ug]id executables hpsa: add missing pci_set_master in kdump path hpsa: refine the pci enable/disable handling sb_edac: Fix erroneous bytes->gigabytes conversion ACPICA: Utilities: Cleanup to remove useless ACPI_PRINTF/FORMAT_xxx helpers. ACPICA: Utilities: Cleanup to convert physical address printing formats. __ptrace_may_access() should not deny sub-threads include/linux/sched.h: don't use task->pid/tgid in same_thread_group/has_group_leader_pid netfilter: Zero the tuple in nfnl_cthelper_parse_tuple() netfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected get rid of s_files and files_lock fput: turn "list_head delayed_fput_list" into llist_head Linux 3.10.82 lpfc: Add iotag memory barrier pipe: iovec: Fix memory corruption when retrying atomic copy as non-atomic drm/mgag200: Reject non-character-cell-aligned mode widths tracing: Have filter check for balanced ops crypto: caam - fix RNG buffer cache alignment Linux 3.10.81 btrfs: cleanup orphans while looking up default subvolume btrfs: incorrect handling for fiemap_fill_next_extent return cfg80211: wext: clear sinfo struct before calling driver mm/memory_hotplug.c: set zone->wait_table to null after freeing it drm/i915: Fix DDC probe for passive adapters pata_octeon_cf: fix broken build ozwpan: unchecked signed subtraction leads to DoS ozwpan: divide-by-zero leading to panic ozwpan: Use proper check to prevent heap overflow MIPS: Fix enabling of DEBUG_STACKOVERFLOW ring-buffer-benchmark: Fix the wrong sched_priority of producer USB: serial: ftdi_sio: Add support for a Motion Tracker Development Board USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle block: fix ext_dev_lock lockdep report Input: elantech - fix detection of touchpads where the revision matches a known rate ALSA: usb-audio: add MAYA44 USB+ mixer control names ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420 iio: adis16400: Compute the scan mask from channel indices iio: adis16400: Use != channel indices for the two voltage channels iio: adis16400: Report pressure channel scale xen: netback: read hotplug script once at start of day. udp: fix behavior of wrong checksums net_sched: invoke ->attach() after setting dev->qdisc unix/caif: sk_socket can disappear when state is unlocked net: dp83640: fix broken calibration routine. bridge: fix parsing of MLDv2 reports ipv4: Avoid crashing in ip_error net: phy: Allow EEE for all RGMII variants Linux 3.10.80 fs/binfmt_elf.c:load_elf_binary(): return -EINVAL on zero-length mappings vfs: read file_handle only once in handle_to_path ACPI / init: Fix the ordering of acpi_reserve_resources() Input: elantech - fix semi-mt protocol for v3 HW rtlwifi: rtl8192cu: Fix kernel deadlock md/raid5: don't record new size if resize_stripes fails. svcrpc: fix potential GSSX_ACCEPT_SEC_CONTEXT decoding failures ARM: fix missing syscall trace exit ARM: dts: imx27: only map 4 Kbyte for fec registers crypto: s390/ghash - Fix incorrect ghash icv buffer handling. rt2x00: add new rt2800usb device DWA 130 libata: Ignore spurious PHY event on LPM policy change libata: Add helper to determine when PHY events should be ignored ext4: check for zero length extent explicitly ext4: convert write_begin methods to stable_page_writes semantics mmc: atmel-mci: fix bad variable type for clkdiv powerpc: Align TOC to 256 bytes usb: gadget: configfs: Fix interfaces array NULL-termination usb-storage: Add NO_WP_DETECT quirk for Lacie 059f:0651 devices USB: cp210x: add ID for KCF Technologies PRN device USB: pl2303: Remove support for Samsung I330 USB: visor: Match I330 phone more precisely xhci: gracefully handle xhci_irq dead device xhci: Solve full event ring by increasing TRBS_PER_SEGMENT to 256 xhci: fix isoc endpoint dequeue from advancing too far on transaction error target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST ASoC: wm8994: correct BCLK DIV 348 to 384 ASoC: wm8960: fix "RINPUT3" audio route error ASoC: mc13783: Fix wrong mask value used in mc13xxx_reg_rmw() calls ALSA: hda - Add headphone quirk for Lifebook E752 ALSA: hda - Add Conexant codecs CX20721, CX20722, CX20723 and CX20724 d_walk() might skip too much lib: Fix strnlen_user() to not touch memory after specified maximum hwmon: (ntc_thermistor) Ensure iio channel is of type IIO_VOLTAGE libceph: request a new osdmap if lingering request maps to no osd lguest: fix out-by-one error in address checking. fs, omfs: add NULL terminator in the end up the token list KVM: MMU: fix CR4.SMEP=1, CR0.WP=0 with shadow pages net: socket: Fix the wrong returns for recvmsg and sendmsg kernel: use the gnu89 standard explicitly staging, rtl8192e, LLVMLinux: Remove unused inline prototype staging: rtl8712, rtl8712: avoid lots of build warnings staging, rtl8192e, LLVMLinux: Change extern inline to static inline drm/i915: Fix declaration of intel_gmbus_{is_forced_bit/is_port_falid} staging: wlags49_h2: fix extern inline functions Linux 3.10.79 ACPICA: Utilities: Cleanup to enforce ACPI_PHYSADDR_TO_PTR()/ACPI_PTR_TO_PHYSADDR(). ACPICA: Tables: Change acpi_find_root_pointer() to use acpi_physical_address. revert "softirq: Add support for triggering softirq work on softirqs" sound/oss: fix deadlock in sequencer_ioctl(SNDCTL_SEQ_OUTOFBAND) mmc: card: Don't access RPMB partitions for normal read/write pinctrl: Don't just pretend to protect pinctrl_maps, do it for real drm/i915: Add missing MacBook Pro models with dual channel LVDS ARM: mvebu: armada-xp-openblocks-ax3-4: Disable internal RTC ARM: dts: imx23-olinuxino: Fix dr_mode of usb0 ARM: dts: imx28: Fix AUART4 TX-DMA interrupt name ARM: dts: imx25: Add #pwm-cells to pwm4 gpio: sysfs: fix memory leaks and device hotplug gpio: unregister gpiochip device before removing it xen/console: Update console event channel on resume mm/memory-failure: call shake_page() when error hits thp tail page nilfs2: fix sanity check of btree level in nilfs_btree_root_broken() ocfs2: dlm: fix race between purge and get lock resource Linux 3.10.78 ARC: signal handling robustify UBI: fix soft lockup in ubi_check_volume() Drivers: hv: vmbus: Don't wait after requesting offers ARM: dts: dove: Fix uart[23] reg property staging: panel: fix lcd type usb: gadget: printer: enqueue printer's response for setup request usb: host: oxu210hp: use new USB_RESUME_TIMEOUT 3w-sas: fix command completion race 3w-9xxx: fix command completion race 3w-xxxx: fix command completion race ext4: fix data corruption caused by unwritten and delayed extents rbd: end I/O the entire obj_request on error serial: of-serial: Remove device_type = "serial" registration ALSA: hda - Fix mute-LED fixed mode ALSA: emu10k1: Emu10k2 32 bit DMA mode ALSA: emu10k1: Fix card shortname string buffer overflow ALSA: emux: Fix mutex deadlock in OSS emulation ALSA: emux: Fix mutex deadlock at unloading ipv4: Missing sk_nulls_node_init() in ping_unhash(). Linux 3.10.77 s390: Fix build error nosave: consolidate __nosave_{begin,end} in <asm/sections.h> memstick: mspro_block: add missing curly braces C6x: time: Ensure consistency in __init wl18xx: show rx_frames_per_rates as an array as it really is lib: memzero_explicit: use barrier instead of OPTIMIZER_HIDE_VAR e1000: add dummy allocator to fix race condition between mtu change and netpoll ksoftirqd: Enable IRQs and call cond_resched() before poking RCU RCU pathwalk breakage when running into a symlink overmounting something drm/i915: cope with large i2c transfers drm/radeon: fix doublescan modes (v2) i2c: core: Export bus recovery functions IB/mlx4: Fix WQE LSO segment calculation IB/core: don't disallow registering region starting at 0x0 IB/core: disallow registering 0-sized memory region stk1160: Make sure current buffer is released mvsas: fix panic on expander attached SATA devices Drivers: hv: vmbus: Fix a bug in the error path in vmbus_open() xtensa: provide __NR_sync_file_range2 instead of __NR_sync_file_range xtensa: xtfpga: fix hardware lockup caused by LCD driver ACPICA: Utilities: split IO address types from data type models. drivers: parport: Kconfig: exclude arm64 for PARPORT_PC scsi: storvsc: Fix a bug in copy_from_bounce_buffer() UBI: fix check for "too many bytes" UBI: initialize LEB number variable UBI: fix out of bounds write UBI: account for bitflips in both the VID header and data tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH ext4: make fsync to sync parent dir in no-journal for real this time arm64: kernel: compiling issue, need delete read_current_timer() video: vgacon: Don't build on arm64 console: Disable VGA text console support on cris drivers: parport: Kconfig: exclude h8300 for PARPORT_PC parport: disable PC-style parallel port support on cris rtlwifi: rtl8192cu: Add new device ID rtlwifi: rtl8192cu: Add new USB ID ptrace: fix race between ptrace_resume() and wait_task_stopped() fs/binfmt_elf.c: fix bug in loading of PIE binaries Input: elantech - fix absolute mode setting on some ASUS laptops ALSA: emu10k1: don't deadlock in proc-functions usb: core: hub: use new USB_RESUME_TIMEOUT usb: host: sl811: use new USB_RESUME_TIMEOUT usb: host: xhci: use new USB_RESUME_TIMEOUT usb: host: isp116x: use new USB_RESUME_TIMEOUT usb: host: r8a66597: use new USB_RESUME_TIMEOUT usb: define a generic USB_RESUME_TIMEOUT macro usb: phy: Find the right match in devm_usb_phy_match ARM: S3C64XX: Use fixed IRQ bases to avoid conflicts on Cragganmore ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE power_supply: lp8788-charger: Fix leaked power supply on probe fail ring-buffer: Replace this_cpu_*() with __this_cpu_*() spi: spidev: fix possible arithmetic overflow for multi-transfer message cdc-wdm: fix endianness bug in debug statements MIPS: Hibernate: flush TLB entries earlier KVM: use slowpath for cross page cached accesses s390/hibernate: fix save and restore of kernel text section KVM: s390: Zero out current VMDB of STSI before including level3 data. usb: gadget: composite: enable BESL support Btrfs: fix inode eviction infinite loop after cloning into it Btrfs: fix log tree corruption when fs mounted with -o discard tcp: avoid looping in tcp_send_fin() tcp: fix possible deadlock in tcp_send_fin() ip_forward: Drop frames with attached skb->sk Linux 3.10.76 dcache: Fix locking bugs in backported "deal with deadlock in d_walk()" arc: mm: Fix build failure sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel x86: mm: move mmap_sem unlock from mm_fault_error() to caller vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS vm: add VM_FAULT_SIGSEGV handling support deal with deadlock in d_walk() move d_rcu from overlapping d_child to overlapping d_alias kconfig: Fix warning "‘jump’ may be used uninitialized" KVM: x86: SYSENTER emulation is broken netfilter: conntrack: disable generic tracking for known protocols Bluetooth: Ignore isochronous endpoints for Intel USB bootloader Bluetooth: Add support for Intel bootloader devices Bluetooth: btusb: Add IMC Networks (Broadcom based) Bluetooth: Add firmware update for Atheros 0cf3:311f Bluetooth: Enable Atheros 0cf3:311e for firmware upload mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) support splice: Apply generic position and size checks to each write jfs: fix readdir regression serial: 8250_dw: Fix deadlock in LCR workaround benet: Call dev_kfree_skby_any instead of kfree_skb. ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb. tg3: Call dev_kfree_skby_any instead of dev_kfree_skb. bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb. r8169: Call dev_kfree_skby_any instead of dev_kfree_skb. 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb. 8139cp: Call dev_kfree_skby_any instead of kfree_skb. tcp: tcp_make_synack() should clear skb->tstamp tcp: fix FRTO undo on cumulative ACK of SACKed range ipv6: Don't reduce hop limit for an interface tcp: prevent fetching dst twice in early demux code remove extra definitions of U32_MAX conditionally define U32_MAX Linux 3.10.75 pagemap: do not leak physical addresses to non-privileged userspace console: Fix console name size mismatch IB/mlx4: Saturate RoCE port PMA counters in case of overflow kernel.h: define u8, s8, u32, etc. limits net: llc: use correct size for sysctl timeout entries net: rds: use correct size for max unacked packets and bytes ipc: fix compat msgrcv with negative msgtyp core, nfqueue, openvswitch: fix compilation warning media: s5p-mfc: fix mmap support for 64bit arch iscsi target: fix oops when adding reject pdu ocfs2: _really_ sync the right range be2iscsi: Fix kernel panic when device initialization fails cifs: fix use-after-free bug in find_writable_file usb: xhci: apply XHCI_AVOID_BEI quirk to all Intel xHCI controllers cpuidle: ACPI: do not overwrite name and description of C0 dmaengine: omap-dma: Fix memory leak when terminating running transfer iio: imu: Use iio_trigger_get for indio_dev->trig assignment iio: inv_mpu6050: Clear timestamps fifo while resetting hardware fifo Defer processing of REQ_PREEMPT requests for blocked devices USB: ftdi_sio: Use jtag quirk for SNAP Connect E10 USB: ftdi_sio: Added custom PID for Synapse Wireless product radeon: Do not directly dereference pointers to BIOS area. writeback: fix possible underflow in write bandwidth calculation writeback: add missing INITIAL_JIFFIES init in global_update_bandwidth() mm/memory hotplug: postpone the reset of obsolete pgdat nbd: fix possible memory leak iwlwifi: dvm: run INIT firmware again upon .start() IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic IB/core: Avoid leakage from kernel to user space tcp: Fix crash in TCP Fast Open selinux: fix sel_write_enforce broken return value ALSA: hda - Fix headphone pin config for Lifebook T731 ALSA: usb - Creative USB X-Fi Pro SB1095 volume knob support ALSA: hda - Add one more node in the EAPD supporting candidate list Linux 3.10.74 net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5} powerpc/mpc85xx: Add ranges to etsec2 nodes hfsplus: fix B-tree corruption after insertion at position 0 dm: hold suspend_lock while suspending device during device deletion vt6655: RFbSetPower fix missing rate RATE_12M perf: Fix irq_work 'tail' recursion Revert "iwlwifi: mvm: fix failure path when power_update fails in add_interface" mac80211: drop unencrypted frames in mesh fwding mac80211: disable u-APSD queues by default nl80211: ignore HT/VHT capabilities without QoS/WMM tcm_qla2xxx: Fix incorrect use of __transport_register_session tcm_fc: missing curly braces in ft_invl_hw_context() ASoC: wm8955: Fix wrong value references for boolean kctl ASoC: adav80x: Fix wrong value references for boolean kctl ASoC: ak4641: Fix wrong value references for boolean kctl ASoC: wm8904: Fix wrong value references for boolean kctl ASoC: wm8903: Fix wrong value references for boolean kctl ASoC: wm2000: Fix wrong value references for boolean kctl ASoC: wm8731: Fix wrong value references for boolean kctl ASoC: tas5086: Fix wrong value references for boolean kctl ASoC: wm8960: Fix wrong value references for boolean kctl ASoC: cs4271: Fix wrong value references for boolean kctl ASoC: sgtl5000: remove useless register write clearing CHRGPUMP_POWERUP Change-Id: Ib7976ee2c7224e39074157e28db4158db40b00db Signed-off-by: Kaushal Kumar <kaushalk@codeaurora.org>
| * __ptrace_may_access() should not deny sub-threadsMark Grondona2015-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 73af963f9f3036dffed55c3a2898598186db1045 upstream. __ptrace_may_access() checks get_dumpable/ptrace_has_cap/etc if task != current, this can can lead to surprising results. For example, a sub-thread can't readlink("/proc/self/exe") if the executable is not readable. setup_new_exec()->would_dump() notices that inode_permission(MAY_READ) fails and then it does set_dumpable(suid_dumpable). After that get_dumpable() fails. (It is not clear why proc_pid_readlink() checks get_dumpable(), perhaps we could add PTRACE_MODE_NODUMPABLE) Change __ptrace_may_access() to use same_thread_group() instead of "task == current". Any security check is pointless when the tasks share the same ->mm. Signed-off-by: Mark Grondona <mgrondona@llnl.gov> Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Have filter check for balanced opsSteven Rostedt2015-06-291-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2cf30dc180cea808077f003c5116388183e54f9e upstream. When the following filter is used it causes a warning to trigger: # cd /sys/kernel/debug/tracing # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter -bash: echo: write error: Invalid argument # cat events/ext4/ext4_truncate_exit/filter ((dev==1)blocks==2) ^ parse_error: No error ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990() Modules linked in: bnep lockd grace bluetooth ... CPU: 3 PID: 1223 Comm: bash Tainted: G W 4.1.0-rc3-test+ #450 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0 0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea Call Trace: [<ffffffff816ed4f9>] dump_stack+0x4c/0x6e [<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0 [<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80 [<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81159065>] replace_preds+0x3c5/0x990 [<ffffffff811596b2>] create_filter+0x82/0xb0 [<ffffffff81159944>] apply_event_filter+0xd4/0x180 [<ffffffff81152bbf>] event_filter_write+0x8f/0x120 [<ffffffff811db2a8>] __vfs_write+0x28/0xe0 [<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0 [<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0 [<ffffffff811dc408>] vfs_write+0xb8/0x1b0 [<ffffffff811dc72f>] SyS_write+0x4f/0xb0 [<ffffffff816f5217>] system_call_fastpath+0x12/0x6a ---[ end trace e11028bd95818dcd ]--- Worse yet, reading the error message (the filter again) it says that there was no error, when there clearly was. The issue is that the code that checks the input does not check for balanced ops. That is, having an op between a closed parenthesis and the next token. This would only cause a warning, and fail out before doing any real harm, but it should still not caues a warning, and the error reported should work: # cd /sys/kernel/debug/tracing # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter -bash: echo: write error: Invalid argument # cat events/ext4/ext4_truncate_exit/filter ((dev==1)blocks==2) ^ parse_error: Meaningless filter expression And give no kernel warning. Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> [ luis: backported to 3.16: - unconditionally decrement cnt as the OP_NOT logic was introduced only by e12c09cf3087 ("tracing: Add NOT to filtering logic") ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ring-buffer-benchmark: Fix the wrong sched_priority of producerWang Long2015-06-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 108029323910c5dd1ef8fa2d10da1ce5fbce6e12 upstream. The producer should be used producer_fifo as its sched_priority, so correct it. Link: http://lkml.kernel.org/r/1433923957-67842-1-git-send-email-long.wanglong@huawei.com Signed-off-by: Wang Long <long.wanglong@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * revert "softirq: Add support for triggering softirq work on softirqs"Christoph Hellwig2015-05-171-131/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit fc21c0cff2f425891b28ff6fb6b03b325c977428 upstream. This commit was incomplete in that code to remove items from the per-cpu lists was missing and never acquired a user in the 5 years it has been in the tree. We're going to implement what it seems to try to archive in a simpler way, and this code is in the way of doing so. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhuix.pan@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ksoftirqd: Enable IRQs and call cond_resched() before poking RCUCalvin Owens2015-05-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 28423ad283d5348793b0c45cc9b1af058e776fd6 upstream. While debugging an issue with excessive softirq usage, I encountered the following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread infrastructure"): [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ] ...but despite this note, the patch still calls RCU with IRQs disabled. This seemingly innocuous change caused a significant regression in softirq CPU usage on the sending side of a large TCP transfer (~1 GB/s): when introducing 0.01% packet loss, the softirq usage would jump to around 25%, spiking as high as 50%. Before the change, the usage would never exceed 5%. Moving the call to rcu_note_context_switch() after the cond_sched() call, as it was originally before the hotplug patch, completely eliminated this problem. Signed-off-by: Calvin Owens <calvinowens@fb.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ptrace: fix race between ptrace_resume() and wait_task_stopped()Oleg Nesterov2015-05-061-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b72c186999e689cb0b055ab1c7b3cd8fffbeb5ed upstream. ptrace_resume() is called when the tracee is still __TASK_TRACED. We set tracee->exit_code and then wake_up_state() changes tracee->state. If the tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T) wrongly looks like another report from tracee. This confuses debugger, and since wait_task_stopped() clears ->exit_code the tracee can miss a signal. Test-case: #include <stdio.h> #include <unistd.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <pthread.h> #include <assert.h> int pid; void *waiter(void *arg) { int stat; for (;;) { assert(pid == wait(&stat)); assert(WIFSTOPPED(stat)); if (WSTOPSIG(stat) == SIGHUP) continue; assert(WSTOPSIG(stat) == SIGCONT); printf("ERR! extra/wrong report:%x\n", stat); } } int main(void) { pthread_t thread; pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); for (;;) kill(getpid(), SIGHUP); } assert(pthread_create(&thread, NULL, waiter, NULL) == 0); for (;;) ptrace(PTRACE_CONT, pid, 0, SIGCONT); return 0; } Note for stable: the bug is very old, but without 9899d11f6544 "ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix should use lock_task_sighand(child). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Pavel Labath <labath@google.com> Tested-by: Pavel Labath <labath@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ring-buffer: Replace this_cpu_*() with __this_cpu_*()Steven Rostedt2015-05-061-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 80a9b64e2c156b6523e7a01f2ba6e5d86e722814 upstream. It has come to my attention that this_cpu_read/write are horrible on architectures other than x86. Worse yet, they actually disable preemption or interrupts! This caused some unexpected tracing results on ARM. 101.356868: preempt_count_add <-ring_buffer_lock_reserve 101.356870: preempt_count_sub <-ring_buffer_lock_reserve The ring_buffer_lock_reserve has recursion protection that requires accessing a per cpu variable. But since preempt_disable() is traced, it too got traced while accessing the variable that is suppose to prevent recursion like this. The generic version of this_cpu_read() and write() are: #define this_cpu_generic_read(pcp) \ ({ typeof(pcp) ret__; \ preempt_disable(); \ ret__ = *this_cpu_ptr(&(pcp)); \ preempt_enable(); \ ret__; \ }) #define this_cpu_generic_to_op(pcp, val, op) \ do { \ unsigned long flags; \ raw_local_irq_save(flags); \ *__this_cpu_ptr(&(pcp)) op val; \ raw_local_irq_restore(flags); \ } while (0) Which is unacceptable for locations that know they are within preempt disabled or interrupt disabled locations. Paul McKenney stated that __this_cpu_() versions produce much better code on other architectures than this_cpu_() does, if we know that the call is done in a preempt disabled location. I also changed the recursive_unlock() to use two local variables instead of accessing the per_cpu variable twice. Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/20150317104038.312e73d1@gandalf.local.home Acked-by: Christoph Lameter <cl@linux.com> Reported-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de> Tested-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * move d_rcu from overlapping d_child to overlapping d_aliasAl Viro2015-04-293-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Ben Hutchings <ben@decadent.org.uk> [hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2: - Apply name changes in all the different places we use d_alias and d_child - Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()] Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * console: Fix console name size mismatchPeter Hurley2015-04-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 30a22c215a0007603ffc08021f2e8b64018517dd upstream. commit 6ae9200f2cab7 ("enlarge console.name") increased the storage for the console name to 16 bytes, but not the corresponding struct console_cmdline::name storage. Console names longer than 8 bytes cause read beyond end-of-string and failure to match console; I'm not sure if there are other unexpected consequences. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * perf: Fix irq_work 'tail' recursionPeter Zijlstra2015-04-131-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d525211f9d1be8b523ec7633f080f2116f5ea536 upstream. Vince reported a watchdog lockup like: [<ffffffff8115e114>] perf_tp_event+0xc4/0x210 [<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160 [<ffffffff810b7f10>] lock_release+0x130/0x260 [<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40 [<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80 [<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0 [<ffffffff811f71ce>] send_sigio+0xae/0x100 [<ffffffff811f72b7>] kill_fasync+0x97/0xf0 [<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0 [<ffffffff8115d103>] perf_pending_event+0x33/0x60 [<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80 [<ffffffff8114e448>] irq_work_run+0x18/0x40 [<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0 [<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80 Which is caused by an irq_work generating new irq_work and therefore not allowing forward progress. This happens because processing the perf irq_work triggers another perf event (tracepoint stuff) which in turn generates an irq_work ad infinitum. Avoid this by raising the recursion counter in the irq_work -- which effectively disables all software events (including tracepoints) from actually triggering again. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20150219170311.GH21418@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | suspend: Return error when pending wakeup source is found.Ruchi Kandoi2015-09-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | If a wakeup source is found to be pending in the last stage of suspend after syscore suspend then the device doesn't suspend but the error is not propogated which causes an error in the accounting for the number of suspend aborts and successful suspends. Change-Id: Ib63b4ead755127eaf03e3b303aab3c782ad02ed1 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> Git-commit: 6e9c6582376ac3c430bf7c89647313b2e685a597 Git-repo: https://android.googlesource.com/kernel/common.git Signed-off-by: Kaushal Kumar <kaushalk@codeaurora.org>
* | Power: Report suspend times from last_suspend_timejinqian2015-09-161-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | This node epxorts two values separated by space. From left to right: 1. time spent in suspend/resume process 2. time spent sleep in suspend state Change-Id: I2cb9a9408a5fd12166aaec11b935a0fd6a408c63 Git-commit: c2e60d7cefa8f573f318c0f17895da8e765f1532 Git-repo: https://android.googlesource.com/kernel/common.git Signed-off-by: Kaushal Kumar <kaushalk@codeaurora.org>
* | sched: fair: Return current CPU if search_cpu(s) is emptyJoonwoo Park2015-09-021-1/+4
| | | | | | | | | | | | | | | | | | | | | | If a task is affined to CPUs that are offline, select_best_cpu and best_small_task_cpu will attempt to select a cpu from an empty mask, resulting in corruption/crashes. Fix this by returning the current CPU if search_cpu(s) is empty. Change-Id: Ib56a8a2513b0ae7ec8414b5ed3134e88b24dd2ac Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* | sched: fix bug in small task CPU selectionPavankumar Kondeti2015-07-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The maximum capacity CPU is a fallback option for small tasks and selected only when all the allowed lower capacity CPUs are overloaded. When the first such fallback CPU is encountered during the search, its cluster cpumask is computed and removed from the search. cpumask_and(&fb_search_cpu, &search_cpu, &rq->freq_domain_cpumask); cpumask_andnot(&search_cpu, &search_cpu, &rq->freq_domain_cpumask); The iterator CPU is cleared from the search_cpu mask before this, due to which it is not set in fb_search_cpu mask. Later, this mask is used to construct a search mask for iterating over the lower capacity CPUs to find the busy CPU. cpumask_and(&search_cpu, tsk_cpus_allowed(p), cpu_online_mask); cpumask_andnot(&search_cpu, &search_cpu, &fb_search_cpu); The search_cpu mask has now a maximum capacity CPU due to the above mentioned bug and may get selected instead of a lower capacity CPU that can accommodate this small task under spill threshold. Change-Id: Id330af66542907e24dfe60a6424b195aa8623cea Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>