aboutsummaryrefslogtreecommitdiff
path: root/kernel/trace
Commit message (Collapse)AuthorAgeFilesLines
* tracing: power: Add trace events for core controlJunjie Wu2017-03-071-1/+2
| | | | | | | Add trace events for core control module. Change-Id: I36da5381709f81ef1ba82025cd9cf8610edef3fc Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* Merge remote-tracking branch 'aosp/android-msm-seed-3.10-nougat-mr1.1' into HEADArvin Quilao2017-01-081-5/+4
| | | | Change-Id: Ic50c8f4aea56d08a791d1a2b07fe69515757df22
* msm: null pointer dereferencingWish Wu2016-01-221-1/+4
| | | | | | | | | | | | | | | | | Prevent unintended kernel NULL pointer dereferencing. Code: hlist_del_rcu(&event->hlist_entry); Fix: Adding pointer check: if(!hlist_unhashed(&p_event->hlist_entry)) hlist_del_rcu(&p_event->hlist_entry); Bug: 25364034 Change-Id: Ib13a7400d4a36a4b08b0afc9b7d69c6027e741b6 Signed-off-by: Yuan Lin <yualin@google.com> Signed-off-by: Christian Bejram <cbejram@google.com> (cherry picked from commit 8e04ed2bc057656d343a28ab8b01718371ca9f42)
* msm: rtb: Add timestamp to rtb loggingVignesh Radhakrishnan2014-11-231-3/+11
| | | | | | | | | | | | | RTB logging currently doesn't log the time at which the logging was done. This can be useful to compare with dmesg during debug. The bytes for timestamp are taken by reducing the sentinel array size to three from eleven thus giving the extra 8 bytes to store time. This maintains the size of the layout at 32. Change-Id: Ifc7e4d2e89ed14d2a97467891ebefa9515983630 Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
* Merge upstream tag 'v3.10.49' into msm-3.10Ian Maund2014-08-205-50/+46
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * commit 'v3.10.49': (529 commits) Linux 3.10.49 ACPI / battery: Retry to get battery information if failed during probing x86, ioremap: Speed up check for RAM pages Score: Modify the Makefile of Score, remove -mlong-calls for compiling Score: The commit is for compiling successfully. Score: Implement the function csum_ipv6_magic score: normalize global variables exported by vmlinux.lds rtmutex: Plug slow unlock race rtmutex: Handle deadlock detection smarter rtmutex: Detect changes in the pi lock chain rtmutex: Fix deadlock detector for real ring-buffer: Check if buffer exists before polling drm/radeon: stop poisoning the GART TLB drm/radeon: fix typo in golden register setup on evergreen ext4: disable synchronous transaction batching if max_batch_time==0 ext4: clarify error count warning messages ext4: fix unjournalled bg descriptor while initializing inode bitmap dm io: fix a race condition in the wake up code for sync_io Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code clk: spear3xx: Use proper control register offset ... In addition to bringing in upstream commits, this merge also makes minor changes to mainitain compatibility with upstream: The definition of list_next_entry in qcrypto.c and ipa_dp.c has been removed, as upstream has moved the definition to list.h. The implementation of list_next_entry was identical between the two. irq.c, for both arm and arm64 architecture, has had its calls to __irq_set_affinity_locked updated to reflect changes to the API upstream. Finally, as we have removed the sleep_length member variable of the tick_sched struct, all changes made by upstream commit ec804bd do not apply to our tree and have been removed from this merge. Only kernel/time/tick-sched.c is impacted. Change-Id: I63b7e0c1354812921c94804e1f3b33d1ad6ee3f1 Signed-off-by: Ian Maund <imaund@codeaurora.org>
| * ring-buffer: Check if buffer exists before pollingSteven Rostedt (Red Hat)2014-07-173-10/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8b8b36834d0fff67fc8668093f4312dd04dcf21d upstream. The per_cpu buffers are created one per possible CPU. But these do not mean that those CPUs are online, nor do they even exist. With the addition of the ring buffer polling, it assumes that the caller polls on an existing buffer. But this is not the case if the user reads trace_pipe from a CPU that does not exist, and this causes the kernel to crash. Simple fix is to check the cpu against buffer bitmask against to see if the buffer was allocated or not and return -ENODEV if it is not. More updates were done to pass the -ENODEV back up to userspace. Link: http://lkml.kernel.org/r/5393DB61.6060707@oracle.com Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Remove ftrace_stop/start() from reading the trace fileSteven Rostedt (Red Hat)2014-07-091-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 099ed151675cd1d2dbeae1dac697975f6a68716d upstream. Disabling reading and writing to the trace file should not be able to disable all function tracing callbacks. There's other users today (like kprobes and perf). Reading a trace file should not stop those from happening. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Try again for saved cmdline if failed due to lockingSteven Rostedt (Red Hat)2014-07-061-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 379cfdac37923653c9d4242d10052378b7563005 upstream. In order to prevent the saved cmdline cache from being filled when tracing is not active, the comms are only recorded after a trace event is recorded. The problem is, a comm can fail to be recorded if the trace_cmdline_lock is held. That lock is taken via a trylock to allow it to happen from any context (including NMI). If the lock fails to be taken, the comm is skipped. No big deal, as we will try again later. But! Because of the code that was added to only record after an event, we may not try again later as the recording is made as a oneshot per event per CPU. Only disable the recording of the comm if the comm is actually recorded. Fixes: 7ffbd48d5cab "tracing: Cache comms only after an event occurred" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ftrace/module: Hardcode ftrace_module_init() call into load_module()Steven Rostedt (Red Hat)2014-06-071-23/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a949ae560a511fe4e3adf48fa44fefded93e5c2b upstream. A race exists between module loading and enabling of function tracer. CPU 1 CPU 2 ----- ----- load_module() module->state = MODULE_STATE_COMING register_ftrace_function() mutex_lock(&ftrace_lock); ftrace_startup() update_ftrace_function(); ftrace_arch_code_modify_prepare() set_all_module_text_rw(); <enables-ftrace> ftrace_arch_code_modify_post_process() set_all_module_text_ro(); [ here all module text is set to RO, including the module that is loading!! ] blocking_notifier_call_chain(MODULE_STATE_COMING); ftrace_init_module() [ tries to modify code, but it's RO, and fails! ftrace_bug() is called] When this race happens, ftrace_bug() will produces a nasty warning and all of the function tracing features will be disabled until reboot. The simple solution is to treate module load the same way the core kernel is treated at boot. To hardcode the ftrace function modification of converting calls to mcount into nops. This is done in init/main.c there's no reason it could not be done in load_module(). This gives a better control of the changes and doesn't tie the state of the module to its notifiers as much. Ftrace is special, it needs to be treated as such. The reason this would work, is that the ftrace_module_init() would be called while the module is in MODULE_STATE_UNFORMED, which is ignored by the set_all_module_text_ro() call. Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * blktrace: fix accounting of partially completed requestsRoman Pen2014-05-301-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit af5040da01ef980670b3741b3e10733ee3e33566 upstream. trace_block_rq_complete does not take into account that request can be partially completed, so we can get the following incorrect output of blkparser: C R 232 + 240 [0] C R 240 + 232 [0] C R 248 + 224 [0] C R 256 + 216 [0] but should be: C R 232 + 8 [0] C R 240 + 8 [0] C R 248 + 8 [0] C R 256 + 8 [0] Also, the whole output summary statistics of completed requests and final throughput will be incorrect. This patch takes into account real completion size of the request and fixes wrong completion accounting. Signed-off-by: Roman Pen <r.peniaev@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com> CC: linux-kernel@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge "ipc_logging: Validate the next page pointer"Linux Build Service Account2014-08-091-0/+4
|\ \
| * | ipc_logging: Validate the next page pointerZaheerulla Meer2014-08-061-0/+4
| | | | | | | | | | | | | | | | | | | | | Validate the pointer returned from get_next_page function. Change-Id: I32d7c0771d116186fffb206335993f6114e303f8 Signed-off-by: Zaheerulla Meer <zmeer@codeaurora.org>
* | | sched: add affinity, task load information to sched tracepointsSteve Muckle2014-07-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Knowing the affinity mask and CPU usage of a task is helpful in understanding the behavior of the system. Affinity information has been added to the enq_deq trace event, and the migration tracepoint now reports the load of the task migrated. Change-Id: I29d8a610292b4dfeeb8fe16174e9d4dc196649b7 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* | | trace, ring-buffer: Fix CPU hotplug callback registrationSrivatsa S. Bhat2014-07-011-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the tracing ring-buffer code by using this latter form of callback registration. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Git-commit: d39ad278a3001c860da4d7c13e51259b1904bec5 Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Osvaldo Banuelos <osvaldob@codeaurora.org>
* | | trace: ipc_logging: fix non-destructive read logicEric Holmberg2014-06-171-13/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When dumping logs through debugfs, a non-destructive read is performed and after all data has been printed, the call waits on a completion for new log data. When that event is completed, the entire contents of the logs are incorrectly printed. Fix the read logic to correctly update the non-destructive read pointer such that only the new data is printed to the logs. CRs-Fixed: 681317 Change-Id: Ia12787b576b18b9f5174cefe91eb283979c286f7 Signed-off-by: Eric Holmberg <eholmber@codeaurora.org>
* | | trace: ipc_logging: reset completion instead of reinitializingEric Holmberg2014-06-171-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | init_completion() is being used to reinitialize the read completion which will cause problems if complete_all() is ever used. Use INIT_COMPLETION instead of init_completion() to reinitialize the completion. Change-Id: I33a78559d9debb6c90a039bd9f22f391c6c15f27 Signed-off-by: Eric Holmberg <eholmber@codeaurora.org>
* | Merge "msm: ipc_logging: enhance log-extraction support"Linux Build Service Account2014-05-312-47/+231
|\ \
| * | msm: ipc_logging: enhance log-extraction supportEric Holmberg2014-05-272-47/+231
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the main goals of the IPC Logging is to allow extracting logs from memory dumps. The current implementation has the following limitations: 1) if logs are being dumped through debugfs at the time of the crash, it is not possible to extract logs that were already dumped 2) it is not always possible to tell the difference between empty, partially full, and full log pages resulting in extraction complications 3) if the system memory is not cleared between reboots, then the same log ID and log page may be extracted leading to duplicate log pages Add additional debugfs read index, timestamps for log pages, and change the read/write indices to easily differentiate between empty, partially full, and full log pages. CRs-Fixed: 670345 Change-Id: I695f6e0605f3755b868ede59851f5ca919f8fb0d Signed-off-by: Eric Holmberg <eholmber@codeaurora.org>
* | | Merge "msm: ipc_logging: add locking hierarchy"Linux Build Service Account2014-05-312-26/+26
|\| |
| * | msm: ipc_logging: add locking hierarchyEric Holmberg2014-05-272-26/+26
| | | | | | | | | | | | | | | | | | | | | | | | Add locking hierarchy to allow identification of deadlocks by code inspection. Change-Id: I41cd479d72be7b2c729fbf09b5b89180c0a11191 Signed-off-by: Eric Holmberg <eholmber@codeaurora.org>
* | | Merge "msm: ipc_logging: add client version support"Linux Build Service Account2014-05-312-1/+31
|\| |
| * | msm: ipc_logging: add client version supportEric Holmberg2014-05-272-1/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If clients use custom serialization functions, then they may need to define a version for deserialization support for log extraction. Add client version support. Change-Id: Id135f06d4142de39275b5d0caab88708d5496b5e Signed-off-by: Eric Holmberg <eholmber@codeaurora.org>
* | | Merge "msm: ipc_logging: remove internal details from header"Linux Build Service Account2014-05-312-18/+19
|\| |
| * | msm: ipc_logging: remove internal details from headerEric Holmberg2014-05-272-18/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some internal details no longer need to be shared between modules, so they can be moved to the main implementation file to avoid unnecessary coupling in the future. Change-Id: Ie8774f0c95f5ec6a56a95c57fea79032194eb05e Signed-off-by: Eric Holmberg <eholmber@codeaurora.org>
* | | ipc_logging: Allow logging to resume after device suspendSteven Cahail2014-05-271-4/+6
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPC log collection from userspace cannot resume after the device is put into suspend mode, because 0 is returned to userspace when this happens and is interpreted as EOF (end-of-file). Allow the signal generated during a suspend to be propagated back to the userspace application, rather than returning 0, so log collection can be resumed after a suspend. CRs-Fixed: 670289 Change-Id: I67cc4cc980d35bbdc3cbc521a3dd0c347be4abf4 Signed-off-by: Steven Cahail <scahail@codeaurora.org>
* | block/fs: keep track of the task that dirtied the pageVenkat Gopalakrishnan2014-05-061-14/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | Background writes happen in the context of a background thread. It is very useful to identify the actual task that generated the request instead of background task that submited the request. Hence keep track of the task when a page gets dirtied and dump this task info while tracing. Not all the pages in the bio are dirtied by the same task but most likely it will be, since the sectors accessed on the device must be adjacent. Change-Id: I6afba85a2063dd3350a0141ba87cf8440ce9f777 Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
* | Merge upstream linux-stable v3.10.36 into msm-3.10Ian Maund2014-04-235-34/+175
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * commit 'v3.10.36': (494 commits) Linux 3.10.36 netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages mm: close PageTail race net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE x86: fix boot on uniprocessor systems Input: cypress_ps2 - don't report as a button pads Input: synaptics - add manual min/max quirk for ThinkPad X240 Input: synaptics - add manual min/max quirk Input: mousedev - fix race when creating mixed device ext4: atomically set inode->i_flags in ext4_set_inode_flags() Linux 3.10.35 sched/autogroup: Fix race with task_groups list e100: Fix "disabling already-disabled device" warning xhci: Fix resume issues on Renesas chips in Samsung laptops Input: wacom - make sure touch_max is set for touch devices KVM: VMX: fix use after free of vmx->loaded_vmcs KVM: x86: handle invalid root_hpa everywhere KVM: MMU: handle invalid root_hpa at __direct_map Input: elantech - improve clickpad detection ARM: highbank: avoid L2 cache smc calls when PL310 is not present ... Change-Id: Ib68f565291702c53df09e914e637930c5d3e5310 Signed-off-by: Ian Maund <imaund@codeaurora.org>
| * tracing: Fix array size mismatch in format stringVaibhav Nagarnaik2014-03-312-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 87291347c49dc40aa339f587b209618201c2e527 upstream. In event format strings, the array size is reported in two locations. One in array subscript and then via the "size:" attribute. The values reported there have a mismatch. For e.g., in sched:sched_switch the prev_comm and next_comm character arrays have subscript values as [32] where as the actual field size is 16. name: sched_switch ID: 301 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1;signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[32]; offset:8; size:16; signed:1; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[32]; offset:40; size:16; signed:1; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; After bisection, the following commit was blamed: 92edca0 tracing: Use direct field, type and system names This commit removes the duplication of strings for field->name and field->type assuming that all the strings passed in __trace_define_field() are immutable. This is not true for arrays, where the type string is created in event_storage variable and field->type for all array fields points to event_storage. Use __stringify() to create a string constant for the type string. Also, get rid of event_storage and event_storage_mutex that are not needed anymore. also, an added benefit is that this reduces the overhead of events a bit more: text data bss dec hex filename 8424787 2036472 1302528 11763787 b3804b vmlinux 8420814 2036408 1302528 11759750 b37086 vmlinux.patched Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com Cc: Laurent Chavey <chavey@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Do not add event files for modules that fail tracepointsSteven Rostedt (Red Hat)2014-03-231-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 45ab2813d40d88fc575e753c38478de242d03f88 upstream. If a module fails to add its tracepoints due to module tainting, do not create the module event infrastructure in the debugfs directory. As the events will not work and worse yet, they will silently fail, making the user wonder why the events they enable do not display anything. Having a warning on module load and the events not visible to the users will make the cause of the problem much clearer. Link: http://lkml.kernel.org/r/20140227154923.265882695@goodmis.org Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT" Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ring-buffer: Fix first commit on sub-buffer having non-zero deltaSteven Rostedt (Red Hat)2014-02-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d651aa1d68a2f0a7ee65697b04c6a92f8c0a12f2 upstream. Each sub-buffer (buffer page) has a full 64 bit timestamp. The events on that page use a 27 bit delta against that timestamp in order to save on bits written to the ring buffer. If the time between events is larger than what the 27 bits can hold, a "time extend" event is added to hold the entire 64 bit timestamp again and the events after that hold a delta from that timestamp. As a "time extend" is always paired with an event, it is logical to just allocate the event with the time extend, to make things a bit more efficient. Unfortunately, when the pairing code was written, it removed the "delta = 0" from the first commit on a page, causing the events on the page to be slightly skewed. Fixes: 69d1b839f7ee "ring-buffer: Bind time extend and data events together" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ftrace: Have function graph only trace based on global_ops filtersSteven Rostedt2014-02-131-1/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 23a8e8441a0a74dd612edf81dc89d1600bc0a3d1 upstream. Doing some different tests, I discovered that function graph tracing, when filtered via the set_ftrace_filter and set_ftrace_notrace files, does not always keep with them if another function ftrace_ops is registered to trace functions. The reason is that function graph just happens to trace all functions that the function tracer enables. When there was only one user of function tracing, the function graph tracer did not need to worry about being called by functions that it did not want to trace. But now that there are other users, this becomes a problem. For example, one just needs to do the following: # cd /sys/kernel/debug/tracing # echo schedule > set_ftrace_filter # echo function_graph > current_tracer # cat trace [..] 0) | schedule() { ------------------------------------------ 0) <idle>-0 => rcu_pre-7 ------------------------------------------ 0) ! 2980.314 us | } 0) | schedule() { ------------------------------------------ 0) rcu_pre-7 => <idle>-0 ------------------------------------------ 0) + 20.701 us | } # echo 1 > /proc/sys/kernel/stack_tracer_enabled # cat trace [..] 1) + 20.825 us | } 1) + 21.651 us | } 1) + 30.924 us | } /* SyS_ioctl */ 1) | do_page_fault() { 1) | __do_page_fault() { 1) 0.274 us | down_read_trylock(); 1) 0.098 us | find_vma(); 1) | handle_mm_fault() { 1) | _raw_spin_lock() { 1) 0.102 us | preempt_count_add(); 1) 0.097 us | do_raw_spin_lock(); 1) 2.173 us | } 1) | do_wp_page() { 1) 0.079 us | vm_normal_page(); 1) 0.086 us | reuse_swap_page(); 1) 0.076 us | page_move_anon_rmap(); 1) | unlock_page() { 1) 0.082 us | page_waitqueue(); 1) 0.086 us | __wake_up_bit(); 1) 1.801 us | } 1) 0.075 us | ptep_set_access_flags(); 1) | _raw_spin_unlock() { 1) 0.098 us | do_raw_spin_unlock(); 1) 0.105 us | preempt_count_sub(); 1) 1.884 us | } 1) 9.149 us | } 1) + 13.083 us | } 1) 0.146 us | up_read(); When the stack tracer was enabled, it enabled all functions to be traced, which now the function graph tracer also traces. This is a side effect that should not occur. To fix this a test is added when the function tracing is changed, as well as when the graph tracer is enabled, to see if anything other than the ftrace global_ops function tracer is enabled. If so, then the graph tracer calls a test trampoline that will look at the function that is being traced and compare it with the filters defined by the global_ops. As an optimization, if there's no other function tracers registered, or if the only registered function tracers also use the global ops, the function graph infrastructure will call the registered function graph callback directly and not go through the test trampoline. Fixes: d2d45c7a03a2 "tracing: Have stack_tracer use a separate list of functions" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ftrace: Fix synchronization location disabling and freeing ftrace_opsSteven Rostedt2014-02-131-18/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a4c35ed241129dd142be4cadb1e5a474a56d5464 upstream. The synchronization needed after ftrace_ops are unregistered must happen after the callback is disabled from becing called by functions. The current location happens after the function is being removed from the internal lists, but not after the function callbacks were disabled, leaving the functions susceptible of being called after their callbacks are freed. This affects perf and any externel users of function tracing (LTTng and SystemTap). Fixes: cdbe61bfe704 "ftrace: Allow dynamically allocated function tracers" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ftrace: Synchronize setting function_trace_op with ftrace_trace_functionSteven Rostedt2014-02-131-4/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 405e1d834807e51b2ebd3dea81cb51e53fb61504 upstream. ftrace_trace_function is a variable that holds what function will be called directly by the assembly code (mcount). If just a single function is registered and it handles recursion itself, then the assembly will call that function directly without any helper function. It also passes in the ftrace_op that was registered with the callback. The ftrace_op to send is stored in the function_trace_op variable. The ftrace_trace_function and function_trace_op needs to be coordinated such that the called callback wont be called with the wrong ftrace_op, otherwise bad things can happen if it expected a different op. Luckily, there's no callback that doesn't use the helper functions that requires this. But there soon will be and this needs to be fixed. Use a set_function_trace_op to store the ftrace_op to set the function_trace_op to when it is safe to do so (during the update function within the breakpoint or stop machine calls). Or if dynamic ftrace is not being used (static tracing) then we have to do a bit more synchronization when the ftrace_trace_function is set as that takes affect immediately (as oppose to dynamic ftrace doing it with the modification of the trampoline). Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Check if tracing is enabled in trace_puts()Steven Rostedt (Red Hat)2014-02-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3132e107d608f8753240d82d61303c500fd515b4 upstream. If trace_puts() is used very early in boot up, it can crash the machine if it is called before the ring buffer is allocated. If a trace_printk() is used with no arguments, then it will be converted into a trace_puts() and suffer the same fate. Fixes: 09ae72348ecc "tracing: Add trace_puts() for even faster trace_printk() tracing" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Have trace buffer point back to trace_arraySteven Rostedt (Red Hat)2014-02-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit dced341b2d4f06668efaab33f88de5d287c0f45b upstream. The trace buffer has a descriptor pointer that goes back to the trace array. But it was never assigned. Luckily, nothing uses it (yet), but it will in the future. Although nothing currently uses this, if any of the new features get backported to older kernels, and because this is such a simple change, I'm marking it for stable too. Fixes: 12883efb670c "tracing: Consolidate max_tr into main trace_array structure" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge upstream linux-stable v3.10.28 into msm-3.10Ian Maund2014-03-2410-288/+766
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following commits have been reverted from this merge, as they are known to introduce new bugs and are currently incompatible with our audio implementation. Investigation of these commits is ongoing, and they are expected to be brought in at a later time: 86e6de7 ALSA: compress: fix drain calls blocking other compress functions (v6) 16442d4 ALSA: compress: fix drain calls blocking other compress functions This merge commit also includes a change in block, necessary for compilation. Upstream has modified elevator_init_fn to prevent race conditions, requring updates to row_init_queue and test_init_queue. * commit 'v3.10.28': (1964 commits) Linux 3.10.28 ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init() serial: amba-pl011: use port lock to guard control register access mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUAL md/raid5: Fix possible confusion when multiple write errors occur. md/raid10: fix two bugs in handling of known-bad-blocks. md/raid10: fix bug when raid10 recovery fails to recover a block. md: fix problem when adding device to read-only array with bitmap. drm/i915: fix DDI PLLs HW state readout code nilfs2: fix segctor bug that causes file system corruption thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only ftrace/x86: Load ftrace_ops in parameter not the variable holding it SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() writeback: Fix data corruption on NFS hwmon: (coretemp) Fix truncated name of alarm attributes vfs: In d_path don't call d_dname on a mount point staging: comedi: adl_pci9111: fix incorrect irq passed to request_irq() staging: comedi: addi_apci_1032: fix subdevice type/flags bug mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully GFS2: Increase i_writecount during gfs2_setattr_chown perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h perf scripting perl: Fix build error on Fedora 12 ARM: 7815/1: kexec: offline non panic CPUs on Kdump panic Linux 3.10.27 sched: Guarantee new group-entities always have weight sched: Fix hrtimer_cancel()/rq->lock deadlock sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining sched: Fix race on toggling cfs_bandwidth_used x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper SCSI: sd: Reduce buffer size for vpd request intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters. mac80211: move "bufferable MMPDU" check to fix AP mode scan ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS ACPI / TPM: fix memory leak when walking ACPI namespace mfd: rtsx_pcr: Disable interrupts before cancelling delayed works clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock clk: samsung: exynos4: Correct SRC_MFC register clk: clk-divider: fix divisor > 255 bug ahci: add PCI ID for Marvell 88SE9170 SATA controller parisc: Ensure full cache coherency for kmap/kunmap drm/nouveau/bios: make jump conditional ARM: shmobile: mackerel: Fix coherent DMA mask ARM: shmobile: armadillo: Fix coherent DMA mask ARM: shmobile: kzm9g: Fix coherent DMA mask ARM: dts: exynos5250: Fix MDMA0 clock number ARM: fix "bad mode in ... handler" message for undefined instructions ARM: fix footbridge clockevent device net: Loosen constraints for recalculating checksum in skb_segment() bridge: use spin_lock_bh() in br_multicast_set_hash_max netpoll: Fix missing TXQ unlock and and OOPS. net: llc: fix use after free in llc_ui_recvmsg virtio-net: fix refill races during restore virtio_net: don't leak memory or block when too many frags virtio-net: make all RX paths handle errors consistently virtio_net: fix error handling for mergeable buffers vlan: Fix header ops passthru when doing TX VLAN offload. net: rose: restore old recvmsg behavior rds: prevent dereference of a NULL device ipv6: always set the new created dst's from in ip6_rt_copy net: fec: fix potential use after free hamradio/yam: fix info leak in ioctl drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() net: inet_diag: zero out uninitialized idiag_{src,dst} fields ip_gre: fix msg_name parsing for recvfrom/recvmsg net: unix: allow bind to fail on mutex lock ipv6: fix illegal mac_header comparison on 32bit netvsc: don't flush peers notifying work during setting mtu tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 net: unix: allow set_peek_off to fail net: drop_monitor: fix the value of maxattr ipv6: don't count addrconf generated routes against gc limit packet: fix send path when running with proto == 0 virtio: delete napi structures from netdev before releasing memory macvtap: signal truncated packets tun: update file current position macvtap: update file current position macvtap: Do not double-count received packets rds: prevent BUG_ON triggered on congestion update to loopback net: do not pretend FRAGLIST support IPv6: Fixed support for blackhole and prohibit routes HID: Revert "Revert "HID: Fix logitech-dj: missing Unifying device issue"" gpio-rcar: R-Car GPIO IRQ share interrupt clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast irqchip: renesas-irqc: Fix irqc_probe error handling Linux 3.10.26 sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c ext4: fix bigalloc regression arm64: Use Normal NonCacheable memory for writecombine arm64: Do not flush the D-cache for anonymous pages arm64: Avoid cache flushing in flush_dcache_page() ARM: KVM: arch_timers: zero CNTVOFF upon return to host ARM: hyp: initialize CNTVOFF to zero clocksource: arch_timer: use virtual counters arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S arm64: dts: Reserve the memory used for secondary CPU release address arm64: check for number of arguments in syscall_get/set_arguments() arm64: fix possible invalid FPSIMD initialization state ... Change-Id: Ia0e5d71b536ab49ec3a1179d59238c05bdd03106 Signed-off-by: Ian Maund <imaund@codeaurora.org>
| * ftrace: Initialize the ftrace profiler for each possible cpuMiao Xie2014-01-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c4602c1c818bd6626178d6d3fcc152d9f2f48ac0 upstream. Ftrace currently initializes only the online CPUs. This implementation has two problems: - If we online a CPU after we enable the function profile, and then run the test, we will lose the trace information on that CPU. Steps to reproduce: # echo 0 > /sys/devices/system/cpu/cpu1/online # cd <debugfs>/tracing/ # echo <some function name> >> set_ftrace_filter # echo 1 > function_profile_enabled # echo 1 > /sys/devices/system/cpu/cpu1/online # run test - If we offline a CPU before we enable the function profile, we will not clear the trace information when we enable the function profile. It will trouble the users. Steps to reproduce: # cd <debugfs>/tracing/ # echo <some function name> >> set_ftrace_filter # echo 1 > function_profile_enabled # run test # cat trace_stat/function* # echo 0 > /sys/devices/system/cpu/cpu1/online # echo 0 > function_profile_enabled # echo 1 > function_profile_enabled # cat trace_stat/function* # run test # cat trace_stat/function* So it is better that we initialize the ftrace profiler for each possible cpu every time we enable the function profile instead of just the online ones. Link: http://lkml.kernel.org/r/1387178401-10619-1-git-send-email-miaox@cn.fujitsu.com Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ftrace: Fix function graph with loading of modulesSteven Rostedt (Red Hat)2013-12-041-29/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8a56d7761d2d041ae5e8215d20b4167d8aa93f51 upstream. Commit 8c4f3c3fa9681 "ftrace: Check module functions being traced on reload" fixed module loading and unloading with respect to function tracing, but it missed the function graph tracer. If you perform the following # cd /sys/kernel/debug/tracing # echo function_graph > current_tracer # modprobe nfsd # echo nop > current_tracer You'll get the following oops message: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 2910 at /linux.git/kernel/trace/ftrace.c:1640 __ftrace_hash_rec_update.part.35+0x168/0x1b9() Modules linked in: nfsd exportfs nfs_acl lockd ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables uinput snd_hda_codec_idt CPU: 2 PID: 2910 Comm: bash Not tainted 3.13.0-rc1-test #7 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007 0000000000000668 ffff8800787efcf8 ffffffff814fe193 ffff88007d500000 0000000000000000 ffff8800787efd38 ffffffff8103b80a 0000000000000668 ffffffff810b2b9a ffffffff81a48370 0000000000000001 ffff880037aea000 Call Trace: [<ffffffff814fe193>] dump_stack+0x4f/0x7c [<ffffffff8103b80a>] warn_slowpath_common+0x81/0x9b [<ffffffff810b2b9a>] ? __ftrace_hash_rec_update.part.35+0x168/0x1b9 [<ffffffff8103b83e>] warn_slowpath_null+0x1a/0x1c [<ffffffff810b2b9a>] __ftrace_hash_rec_update.part.35+0x168/0x1b9 [<ffffffff81502f89>] ? __mutex_lock_slowpath+0x364/0x364 [<ffffffff810b2cc2>] ftrace_shutdown+0xd7/0x12b [<ffffffff810b47f0>] unregister_ftrace_graph+0x49/0x78 [<ffffffff810c4b30>] graph_trace_reset+0xe/0x10 [<ffffffff810bf393>] tracing_set_tracer+0xa7/0x26a [<ffffffff810bf5e1>] tracing_set_trace_write+0x8b/0xbd [<ffffffff810c501c>] ? ftrace_return_to_handler+0xb2/0xde [<ffffffff811240a8>] ? __sb_end_write+0x5e/0x5e [<ffffffff81122aed>] vfs_write+0xab/0xf6 [<ffffffff8150a185>] ftrace_graph_caller+0x85/0x85 [<ffffffff81122dbd>] SyS_write+0x59/0x82 [<ffffffff8150a185>] ftrace_graph_caller+0x85/0x85 [<ffffffff8150a2d2>] system_call_fastpath+0x16/0x1b ---[ end trace 940358030751eafb ]--- The above mentioned commit didn't go far enough. Well, it covered the function tracer by adding checks in __register_ftrace_function(). The problem is that the function graph tracer circumvents that (for a slight efficiency gain when function graph trace is running with a function tracer. The gain was not worth this). The problem came with ftrace_startup() which should always be called after __register_ftrace_function(), if you want this bug to be completely fixed. Anyway, this solution moves __register_ftrace_function() inside of ftrace_startup() and removes the need to call them both. Reported-by: Dave Wysochanski <dwysocha@redhat.com> Fixes: ed926f9b35cd ("ftrace: Use counters to enable functions to trace") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * perf/ftrace: Fix paranoid level for enabling function tracerSteven Rostedt2013-11-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 12ae030d54ef250706da5642fc7697cc60ad0df7 upstream. The current default perf paranoid level is "1" which has "perf_paranoid_kernel()" return false, and giving any operations that use it, access to normal users. Unfortunately, this includes function tracing and normal users should not be allowed to enable function tracing by default. The proper level is defined at "-1" (full perf access), which "perf_paranoid_tracepoint_raw()" will only give access to. Use that check instead for enabling function tracing. Reported-by: Dave Jones <davej@redhat.com> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> CVE: CVE-2013-2930 Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Fix potential out-of-bounds in trace_get_user()Steven Rostedt2013-11-201-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream. Andrey reported the following report: ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3 ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3) Accessed by thread T13003: #0 ffffffff810dd2da (asan_report_error+0x32a/0x440) #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40) #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20) #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260) #4 ffffffff812a1065 (__fput+0x155/0x360) #5 ffffffff812a12de (____fput+0x1e/0x30) #6 ffffffff8111708d (task_work_run+0x10d/0x140) #7 ffffffff810ea043 (do_exit+0x433/0x11f0) #8 ffffffff810eaee4 (do_group_exit+0x84/0x130) #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30) #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b) Allocated by thread T5167: #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0) #1 ffffffff8128337c (__kmalloc+0xbc/0x500) #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90) #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0) #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40) #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430) #6 ffffffff8129b668 (finish_open+0x68/0xa0) #7 ffffffff812b66ac (do_last+0xb8c/0x1710) #8 ffffffff812b7350 (path_openat+0x120/0xb50) #9 ffffffff812b8884 (do_filp_open+0x54/0xb0) #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0) #11 ffffffff8129d4b7 (SyS_open+0x37/0x50) #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b) Shadow bytes around the buggy address: ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap redzone: fa Heap kmalloc redzone: fb Freed heap region: fd Shadow gap: fe The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;' Although the crash happened in ftrace_regex_open() the real bug occurred in trace_get_user() where there's an incrementation to parser->idx without a check against the size. The way it is triggered is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop that reads the last character stores it and then breaks out because there is no more characters. Then the last character is read to determine what to do next, and the index is incremented without checking size. Then the caller of trace_get_user() usually nulls out the last character with a zero, but since the index is equal to the size, it writes a nul character after the allocated space, which can corrupt memory. Luckily, only root user has write access to this file. Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ftrace: Check module functions being traced on reloadSteven Rostedt (Red Hat)2013-08-291-9/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8c4f3c3fa9681dc549cd35419b259496082fef8b upstream. There's been a nasty bug that would show up and not give much info. The bug displayed the following warning: WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230() Pid: 20903, comm: bash Tainted: G O 3.6.11+ #38405.trunk Call Trace: [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20 [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230 [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0 [<ffffffff811401cc>] ? kfree+0x2c/0x110 [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150 [<ffffffff81149f1e>] __fput+0xae/0x220 [<ffffffff8114a09e>] ____fput+0xe/0x10 [<ffffffff8105fa22>] task_work_run+0x72/0x90 [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0 [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff815c0f88>] int_signal+0x12/0x17 ---[ end trace 793179526ee09b2c ]--- It was finally narrowed down to unloading a module that was being traced. It was actually more than that. When functions are being traced, there's a table of all functions that have a ref count of the number of active tracers attached to that function. When a function trace callback is registered to a function, the function's record ref count is incremented. When it is unregistered, the function's record ref count is decremented. If an inconsistency is detected (ref count goes below zero) the above warning is shown and the function tracing is permanently disabled until reboot. The ftrace callback ops holds a hash of functions that it filters on (and/or filters off). If the hash is empty, the default means to filter all functions (for the filter_hash) or to disable no functions (for the notrace_hash). When a module is unloaded, it frees the function records that represent the module functions. These records exist on their own pages, that is function records for one module will not exist on the same page as function records for other modules or even the core kernel. Now when a module unloads, the records that represents its functions are freed. When the module is loaded again, the records are recreated with a default ref count of zero (unless there's a callback that traces all functions, then they will also be traced, and the ref count will be incremented). The problem is that if an ftrace callback hash includes functions of the module being unloaded, those hash entries will not be removed. If the module is reloaded in the same location, the hash entries still point to the functions of the module but the module's ref counts do not reflect that. With the help of Steve and Joern, we found a reproducer: Using uinput module and uinput_release function. cd /sys/kernel/debug/tracing modprobe uinput echo uinput_release > set_ftrace_filter echo function > current_tracer rmmod uinput modprobe uinput # check /proc/modules to see if loaded in same addr, otherwise try again echo nop > current_tracer [BOOM] The above loads the uinput module, which creates a table of functions that can be traced within the module. We add uinput_release to the filter_hash to trace just that function. Enable function tracincg, which increments the ref count of the record associated to uinput_release. Remove uinput, which frees the records including the one that represents uinput_release. Load the uinput module again (and make sure it's at the same address). This recreates the function records all with a ref count of zero, including uinput_release. Disable function tracing, which will decrement the ref count for uinput_release which is now zero because of the module removal and reload, and we have a mismatch (below zero ref count). The solution is to check all currently tracing ftrace callbacks to see if any are tracing any of the module's functions when a module is loaded (it already does that with callbacks that trace all functions). If a callback happens to have a module function being traced, it increments that records ref count and starts tracing that function. There may be a strange side effect with this, where tracing module functions on unload and then reloading a new module may have that new module's functions being traced. This may be something that confuses the user, but it's not a big deal. Another approach is to disable all callback hashes on module unload, but this leaves some ftrace callbacks that may not be registered, but can still have hashes tracing the module's function where ftrace doesn't know about it. That situation can cause the same bug. This solution solves that case too. Another benefit of this solution, is it is possible to trace a module's function on unload and load. Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.com Reported-by: Jörn Engel <joern@logfs.org> Reported-by: Dave Jones <davej@redhat.com> Reported-by: Steve Hodgson <steve@purestorage.com> Tested-by: Steve Hodgson <steve@purestorage.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing/uprobes: Fail to unregister if probe event files are in useSteven Rostedt (Red Hat)2013-08-291-13/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c6c2401d8bbaf9edc189b4c35a8cb2780b8b988e upstream. Uprobes suffer the same problem that kprobes have. There's a race between writing to the "enable" file and removing the probe. The probe checks for it being in use and if it is not, goes about deleting the probe and the event that represents it. But the problem with that is, after it checks if it is in use it can be enabled, and the deletion of the event (access to the probe) will fail, as it is in use. But the uprobe will still be deleted. This is a problem as the event can reference the uprobe that was deleted. The fix is to remove the event first, and check to make sure the event removal succeeds. Then it is safe to remove the probe. When the event exists, either ftrace or perf can enable the probe and prevent the event from being removed. Link: http://lkml.kernel.org/r/20130704034038.991525256@goodmis.org Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing/kprobes: Fail to unregister if probe event files are in useSteven Rostedt (Red Hat)2013-08-291-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 40c32592668b727cbfcf7b1c0567f581bd62a5e4 upstream. When a probe is being removed, it cleans up the event files that correspond to the probe. But there is a race between writing to one of these files and deleting the probe. This is especially true for the "enable" file. CPU 0 CPU 1 ----- ----- fd = open("enable",O_WRONLY); probes_open() release_all_trace_probes() unregister_trace_probe() if (trace_probe_is_enabled(tp)) return -EBUSY write(fd, "1", 1) __ftrace_set_clr_event() call->class->reg() (kprobe_register) enable_trace_probe(tp) __unregister_trace_probe(tp); list_del(&tp->list) unregister_probe_event(tp) <-- fails! free_trace_probe(tp) write(fd, "0", 1) __ftrace_set_clr_event() call->class->unreg (kprobe_register) disable_trace_probe(tp) <-- BOOM! A test program was written that used two threads to simulate the above scenario adding a nanosleep() interval to change the timings and after several thousand runs, it was able to trigger this bug and crash: BUG: unable to handle kernel paging request at 00000005000000f9 IP: [<ffffffff810dee70>] probes_open+0x3b/0xa7 PGD 7808a067 PUD 0 Oops: 0000 [#1] PREEMPT SMP Dumping ftrace buffer: --------------------------------- Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6 CPU: 1 PID: 2070 Comm: test-kprobe-rem Not tainted 3.11.0-rc3-test+ #47 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007 task: ffff880077756440 ti: ffff880076e52000 task.ti: ffff880076e52000 RIP: 0010:[<ffffffff810dee70>] [<ffffffff810dee70>] probes_open+0x3b/0xa7 RSP: 0018:ffff880076e53c38 EFLAGS: 00010203 RAX: 0000000500000001 RBX: ffff88007844f440 RCX: 0000000000000003 RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880076e52000 RBP: ffff880076e53c58 R08: ffff880076e53bd8 R09: 0000000000000000 R10: ffff880077756440 R11: 0000000000000006 R12: ffffffff810dee35 R13: ffff880079250418 R14: 0000000000000000 R15: ffff88007844f450 FS: 00007f87a276f700(0000) GS:ffff88007d480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000005000000f9 CR3: 0000000077262000 CR4: 00000000000007e0 Stack: ffff880076e53c58 ffffffff81219ea0 ffff88007844f440 ffffffff810dee35 ffff880076e53ca8 ffffffff81130f78 ffff8800772986c0 ffff8800796f93a0 ffffffff81d1b5d8 ffff880076e53e04 0000000000000000 ffff88007844f440 Call Trace: [<ffffffff81219ea0>] ? security_file_open+0x2c/0x30 [<ffffffff810dee35>] ? unregister_trace_probe+0x4b/0x4b [<ffffffff81130f78>] do_dentry_open+0x162/0x226 [<ffffffff81131186>] finish_open+0x46/0x54 [<ffffffff8113f30b>] do_last+0x7f6/0x996 [<ffffffff8113cc6f>] ? inode_permission+0x42/0x44 [<ffffffff8113f6dd>] path_openat+0x232/0x496 [<ffffffff8113fc30>] do_filp_open+0x3a/0x8a [<ffffffff8114ab32>] ? __alloc_fd+0x168/0x17a [<ffffffff81131f4e>] do_sys_open+0x70/0x102 [<ffffffff8108f06e>] ? trace_hardirqs_on_caller+0x160/0x197 [<ffffffff81131ffe>] SyS_open+0x1e/0x20 [<ffffffff81522742>] system_call_fastpath+0x16/0x1b Code: e5 41 54 53 48 89 f3 48 83 ec 10 48 23 56 78 48 39 c2 75 6c 31 f6 48 c7 RIP [<ffffffff810dee70>] probes_open+0x3b/0xa7 RSP <ffff880076e53c38> CR2: 00000005000000f9 ---[ end trace 35f17d68fc569897 ]--- The unregister_trace_probe() must be done first, and if it fails it must fail the removal of the kprobe. Several changes have already been made by Oleg Nesterov and Masami Hiramatsu to allow moving the unregister_probe_event() before the removal of the probe and exit the function if it fails. This prevents the tp structure from being used after it is freed. Link: http://lkml.kernel.org/r/20130704034038.819592356@goodmis.org Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: trace_remove_event_call() should fail if call/file is in useOleg Nesterov2013-08-291-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2816c551c796ec14620325b2c9ed75b9979d3125 upstream. Change trace_remove_event_call(call) to return the error if this call is active. This is what the callers assume but can't verify outside of the tracing locks. Both trace_kprobe.c/trace_uprobe.c need the additional changes, unregister_trace_probe() should abort if trace_remove_event_call() fails. The caller is going to free this call/file so we must ensure that nobody can use them after trace_remove_event_call() succeeds. debugfs should be fine after the previous changes and event_remove() does TRACE_REG_UNREGISTER, but still there are 2 reasons why we need the additional checks: - There could be a perf_event(s) attached to this tp_event, so the patch checks ->perf_refcount. - TRACE_REG_UNREGISTER can be suppressed by FTRACE_EVENT_FL_SOFT_MODE, so we simply check FTRACE_EVENT_FL_ENABLED protected by event_mutex. Link: http://lkml.kernel.org/r/20130729175033.GB26284@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_privateOleg Nesterov2013-08-291-32/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit bf682c3159c4d298d1126a56793ed3f5e80395f7 upstream. Change remove_event_file_dir() to clear ->i_private for every file we are going to remove. We need to check file->dir != NULL because event_create_dir() can fail. debugfs_remove_recursive(NULL) is fine but the patch moves it under the same check anyway for readability. spin_lock(d_lock) and "d_inode != NULL" check are not needed afaics, but I do not understand this code enough. tracing_open_generic_file() and tracing_release_generic_file() can go away, ftrace_enable_fops and ftrace_event_filter_fops() use tracing_open_generic() but only to check tracing_disabled. This fixes all races with event_remove() or instance_delete(). f_op->read/write/whatever can never use the freed file/call, all event/* files were changed to check and use ->i_private under event_mutex. Note: this doesn't not fix other problems, event_remove() can destroy the active ftrace_event_call, we need more changes but those changes are completely orthogonal. Link: http://lkml.kernel.org/r/20130728183527.GB16723@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Introduce remove_event_file_dir()Oleg Nesterov2013-08-291-24/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f6a84bdc75b5c11621dec58db73fe102cbaf40cc upstream. Preparation for the next patch. Extract the common code from remove_event_from_tracers() and __trace_remove_event_dirs() into the new helper, remove_event_file_dir(). The patch looks more complicated than it actually is, it also moves remove_subsystem() up to avoid the forward declaration. Link: http://lkml.kernel.org/r/20130726172547.GA3629@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Change f_start() to take event_mutex and verify i_private != NULLOleg Nesterov2013-08-291-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c5a44a1200c6eda2202434f25325e8ad19533fca upstream. trace_format_open() and trace_format_seq_ops are racy, nothing protects ftrace_event_call from trace_remove_event_call(). Change f_start() to take event_mutex and verify i_private != NULL, change f_stop() to drop this lock. This fixes nothing, but now we can change debugfs_remove("format") callers to nullify ->i_private and fix the the problem. Note: the usage of event_mutex is sub-optimal but simple, we can change this later. Link: http://lkml.kernel.org/r/20130726172543.GA3622@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Change event_filter_read/write to verify i_private != NULLOleg Nesterov2013-08-292-18/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e2912b091c26b8ea95e5e00a43a7ac620f6c94a6 upstream. event_filter_read/write() are racy, ftrace_event_call can be already freed by trace_remove_event_call() callers. 1. Shift mutex_lock(event_mutex) from print/apply_event_filter to the callers. 2. Change the callers, event_filter_read() and event_filter_write() to read i_private under this mutex and abort if it is NULL. This fixes nothing, but now we can change debugfs_remove("filter") callers to nullify ->i_private and fix the the problem. Link: http://lkml.kernel.org/r/20130726172540.GA3619@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Change event_enable/disable_read() to verify i_private != NULLOleg Nesterov2013-08-291-9/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit bc6f6b08dee5645770efb4b76186ded313f23752 upstream. tracing_open_generic_file() is racy, ftrace_event_file can be already freed by rmdir or trace_remove_event_call(). Change event_enable_read() and event_disable_read() to read and verify "file = i_private" under event_mutex. This fixes nothing, but now we can change debugfs_remove("enable") callers to nullify ->i_private and fix the the problem. Link: http://lkml.kernel.org/r/20130726172536.GA3612@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: Turn event/id->i_private into call->event.typeOleg Nesterov2013-08-291-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1a11126bcb7c93c289bf3218fa546fd3b0c0df8b upstream. event_id_read() is racy, ftrace_event_call can be already freed by trace_remove_event_call() callers. Change event_create_dir() to pass "data = call->event.type", this is all event_id_read() needs. ftrace_event_id_fops no longer needs tracing_open_generic(). We add the new helper, event_file_data(), to read ->i_private, it will have more users. Note: currently ACCESS_ONCE() and "id != 0" check are not needed, but we are going to change event_remove/rmdir to clear ->i_private. Link: http://lkml.kernel.org/r/20130726172532.GA3605@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>