summaryrefslogtreecommitdiff
path: root/tools/tracing/rtla
AgeCommit message (Collapse)Author
2026-01-13rtla: Fix parse_cpu_set() bug introduced by strtoi()Costa Shulyupin
The patch 'Replace atoi() with a robust strtoi()' introduced a bug in parse_cpu_set(), which relies on partial parsing of the input string. The function parses CPU specifications like '0-3,5' by incrementing a pointer through the string. strtoi() rejects strings with trailing characters, causing parse_cpu_set() to fail on any CPU list with multiple entries. Restore the original use of atoi() in parse_cpu_set(). Fixes: 7e9dfccf8f11 ("rtla: Replace atoi() with a robust strtoi()") Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20260112192642.212848-2-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Fix parse_cpu_set() return value documentationWander Lairson Costa
Correct the return value documentation for parse_cpu_set() function in utils.c. The comment incorrectly stated that the function returns 1 on success and 0 on failure, but the actual implementation returns 0 on success and 1 on failure, following the common error-on-nonzero convention used throughout the codebase. This documentation fix ensures that developers reading the code understand the correct return value semantics and prevents potential misuse of the function's return value in conditional checks. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-18-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Ensure null termination after read operations in utils.cWander Lairson Costa
Add explicit null termination and buffer initialization for read() operations in procfs_is_workload_pid() and get_self_cgroup() functions. The read() system call does not null-terminate the data it reads, and when the buffer is filled to capacity, subsequent string operations will read past the buffer boundary searching for a null terminator. In procfs_is_workload_pid(), explicitly set buffer[MAX_PATH-1] to '\0' to ensure the buffer is always null-terminated before passing it to strncmp(). In get_self_cgroup(), use memset() to zero the path buffer before reading, which ensures null termination when retval is less than MAX_PATH. Additionally, set path[MAX_PATH-1] to '\0' after the read to handle the case where the buffer is filled completely. These defensive buffer handling practices prevent potential buffer overruns and align with the ongoing buffer safety improvements across the rtla codebase. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-17-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Make stop_tracing variable volatileWander Lairson Costa
The stop_tracing global variable is accessed from both the signal handler context and the main program flow without synchronization. This creates a potential race condition where compiler optimizations could cache the variable value in registers, preventing the signal handler's updates from being visible to other parts of the program. Add the volatile qualifier to stop_tracing in both common.c and common.h to ensure all accesses to this variable bypass compiler optimizations and read directly from memory. This guarantees that when the signal handler sets stop_tracing, the change is immediately visible to the main program loop, preventing potential hangs or delayed shutdown when termination signals are received. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-16-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Add generated output files to gitignoreWander Lairson Costa
The rtla tool generates various output files during testing and execution, including custom trace outputs and histogram data. These files are artifacts of running the tool with different options and should not be tracked in version control. Add gitignore entries for custom_filename.txt, osnoise_irq_noise_hist.txt, osnoise_trace.txt, and timerlat_trace.txt to prevent accidentally committing these generated files. This aligns with the existing pattern of ignoring build artifacts and generated headers like *.skel.h. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-15-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Fix NULL pointer dereference in actions_parseWander Lairson Costa
The actions_parse() function uses strtok() to tokenize the trigger string, but does not check if the returned token is NULL before passing it to strcmp(). If the trigger parameter is an empty string or contains only delimiter characters, strtok() returns NULL, causing strcmp() to dereference a NULL pointer and crash the program. This issue can be triggered by malformed user input or edge cases in trigger string parsing. Add a NULL check immediately after the strtok() call to validate that a token was successfully extracted before using it. If no token is found, the function now returns -1 to indicate a parsing error. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-13-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Remove unused headersWander Lairson Costa
Remove unused includes for <errno.h> and <signal.h> to clean up the code and reduce unnecessary dependencies. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-12-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Remove redundant memset after callocWander Lairson Costa
The actions struct is allocated using calloc, which already returns zeroed memory. The subsequent memset call to zero the 'present' member is therefore redundant. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-10-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Use standard exit codes for result enumWander Lairson Costa
The result enum defines custom values for PASSED, ERROR, and FAILED. These values correspond to standard exit codes EXIT_SUCCESS and EXIT_FAILURE. Update the enum to use the standard macros EXIT_SUCCESS and EXIT_FAILURE to improve readability and adherence to standard C practices. The FAILED value is implicitly assigned EXIT_FAILURE + 1, so there is no need to assign an explicit value. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-9-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Replace atoi() with a robust strtoi()Wander Lairson Costa
The atoi() function does not perform error checking, which can lead to undefined behavior when parsing invalid or out-of-range strings. This can cause issues when parsing user-provided numerical inputs, such as signal numbers, PIDs, or CPU lists. To address this, introduce a new strtoi() helper function that safely converts a string to an integer. This function validates the input and checks for overflows, returning a negative value on failure. Replace all calls to atoi() with the new strtoi() function and add proper error handling to make the parsing more robust and prevent potential issues. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-5-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Introduce for_each_action() helperWander Lairson Costa
The for loop to iterate over the list of actions is used in more than one place. To avoid code duplication and improve readability, introduce a for_each_action() helper macro. Replace the open-coded for loops with the new helper. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-4-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Deduplicate cgroup path opening codeCosta Shulyupin
Both set_pid_cgroup() and set_comm_cgroup() functions contain identical code for opening the cgroup.procs file. Extract this common code into a new helper function open_cgroup_procs() to reduce code duplication and improve maintainability. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251224125058.1771519-1-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -H/--house-keeping option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -H/--house-keeping. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-8-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -P/--priority option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -P/--priority. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-7-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -e/--event option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -e/--event. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-6-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -d/--duration option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -d/--duration. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-5-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -D/--debug option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -D/--debug. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-4-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -C/--cgroup option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -C/--cgroup. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-3-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Consolidate -c/--cpus option parsingCosta Shulyupin
Each rtla tool duplicates parsing of -c/--cpus. Migrate the option parsing from individual tools to the common_parse_options(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-2-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Add common_parse_options()Costa Shulyupin
Each rtla tool duplicates parsing of many common options. This creates maintenance overhead and risks inconsistencies when updating these options. Add common_parse_options() to centralize parsing of options used across all tools. Common options to be migrated in future patches. Changes since v1: - restore opterr Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251209100047.2692515-1-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla/tests: Run Test::Harness in verbose modeTomas Glozar
Add -v flag to prove command to also print the names of tests that succeeded, not only those that failed, to allow easier debugging of the test suite. Also, drop printing the option and value to stdout in check_with_osnoise_options, which was a debugging print that was accidentally left in the final commit, and which would be otherwise now visible in make check output, as stdout is no longer suppressed. Suggested-by: Crystal Wood <crwood@redhat.com> Reviewed-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20251126144205.331954-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla/tests: Test BPF action programTomas Glozar
Add a test that implements a BPF program writing to a test map, which is attached to RTLA via --bpf-action to be executed on theshold overflow. A combination of --on-threshold shell with bpftool (which is always present if BPF support is enabled) is used to check whether the BPF program has executed successfully. Suggested-by: Crystal Wood <crwood@redhat.com> Link: https://lore.kernel.org/r/20251126144205.331954-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla/timerlat: Add example for BPF action programTomas Glozar
Add an example BPF action program that prints the measured latency to the tracefs buffer via bpf_printk(). A new Makefile target, "examples", is added to build the example. In addition, "sample/" subfolder is renamed to "example". If BPF skeleton support is unavailable or disabled, a warning will be displayed when building the BPF action program example. Link: https://lore.kernel.org/r/20251126144205.331954-4-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla/timerlat: Add --bpf-action optionTomas Glozar
Add option --bpf-action that allows the user to attach an external BPF program that will be executed via BPF tail call on latency threshold overflow. Executing additional BPF code on latency threshold overflow allows doing low-latency and in-kernel troubleshooting of the cause of the overflow. The option takes an argument, which is a path to a BPF ELF file expected to contain a function named "action_handler" in a section named "tp/timerlat_action" (the section is necessary for libbpf to assign the correct BPF program type to it). Link: https://lore.kernel.org/r/20251126144205.331954-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla/timerlat: Support tail call from BPF programTomas Glozar
Add a map to the rtla-timerlat BPF program that holds a file descriptor of another BPF program, to be executed on threshold overflow. timerlat_bpf_set_action() is added as an interface to set the program. Link: https://lore.kernel.org/r/20251126144205.331954-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07tools/rtla: Add common_usage()Costa Shulyupin
The rtla tools have significant code quadruplication in their usage functions. Each tool implements its own version of the same help text formatting and option descriptions, leading to maintenance overhead and inconsistencies. Documentation/tools/rtla/common_options.rst lists 14 common options. Add common_usage() infrastructure to consolidate help formatting. Subsequent patches will extend this to handle other common options. The refactored output is almost identical to the original, with the following changes: - add square brackets to specify optionality: `usage: [rtla] ...` - remove `-q` from timerlat hist because hist tools don't support it - minor spacing Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251124063204.845425-1-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-07rtla: Set stop threshold after all instances are enabledCrystal Wood
This avoids startup races where one of the instances hit a threshold before all instances were enabled, and thus tracing stops without the relevant event. In particular, this is not uncommon with the tests that set a very tight threshold and then complain if there's no analysis. This also ensures that we don't stop tracing during a warmup. The downside is a small chance of having an event over the threshold early in the output, without stopping on it, which could cause user confusion. This should be less likely if the warmup feature is used, but that doesn't eliminate the race window, just the odds of an unusual spike right at that moment. Signed-off-by: Crystal Wood <crwood@redhat.com> Link: https://lore.kernel.org/r/20251112152529.956778-6-crwood@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-06tools/rtla: Remove unused function declarationsCosta Shulyupin
Historically four function declarations remain orphaned or duplicated. Remove them to keep the source clean. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/r/20251012071133.290225-1-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21rtla/timerlat: Exit top main loop on any non-zero wait_retvalCrystal Wood
Comparing to exactly 1 will fail if more than one ring buffer event was seen since the last call to timerlat_bpf_wait(), which can happen in some race scenarios. Signed-off-by: Crystal Wood <crwood@redhat.com> Link: https://lore.kernel.org/r/20251112152529.956778-5-crwood@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21rtla/tests: Don't rely on matching ^1ALLCrystal Wood
The timerlat "top stop at failed action" test was relying on "ALL" being printed immediately after the "1" from the threshold action. Besides being fragile, this depends on stdbuf behavior, which is easy to miss when recreating the test outside of the framework for debugging purposes. Instead, use the expected/unexpected text mechanism from the corresponding osnoise test. Signed-off-by: Crystal Wood <crwood@redhat.com> Link: https://lore.kernel.org/r/20251112152529.956778-2-crwood@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21rtla: Fix -a overriding -t argumentIvan Pravdin
When running rtla as `rtla <timerlat|osnoise> <top|hist> -t custom_file.txt -a 100` -a options override trace output filename specified by -t option. Running the command above will create <timerlat|osnoise>_trace.txt file instead of custom_file.txt. Fix this by making sure that -a option does not override trace output filename even if it's passed after trace output filename is specified. Fixes: 173a3b014827 ("rtla/timerlat: Add the automatic trace option") Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/b6ae60424050b2c1c8709e18759adead6012b971.1762186418.git.ipravdin.official@gmail.com [ use capital letter in subject, as required by tracing subsystem ] Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21rtla: Fix -C/--cgroup interfaceIvan Pravdin
Currently, user can only specify cgroup to the tracer's thread the following ways: `-C[cgroup]` `-C[=cgroup]` `--cgroup[=cgroup]` If user tries to specify cgroup as `-C [cgroup]` or `--cgroup [cgroup]`, the parser silently fails and rtla's cgroup is used for the tracer threads. To make interface more user-friendly, allow user to specify cgroup in the aforementioned way, i.e. `-C [cgroup]` and `--cgroup [cgroup]`. Refactor identical logic between -t/--trace and -C/--cgroup into a common function. Change documentation to reflect this user interface change. Fixes: a957cbc02531 ("rtla: Add -C cgroup support") Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/16132f1565cf5142b5fbd179975be370b529ced7.1762186418.git.ipravdin.official@gmail.com [ use capital letter in subject, as required by tracing subsystem ] Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21tools/rtla: Replace osnoise_hist_usage("...") with fatal("...")Costa Shulyupin
A long time ago, when the usage help was short, it was a favor to the user to show it on error. Now that the usage help has become very long, it is too noisy to dump the complete help text for each typo after the error message itself. Replace osnoise_hist_usage("...") with fatal("...") on errors. Remove the already unused 'usage' argument from osnoise_hist_usage(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251011082738.173670-6-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21tools/rtla: Replace osnoise_top_usage("...") with fatal("...")Costa Shulyupin
A long time ago, when the usage help was short, it was a favor to the user to show it on error. Now that the usage help has become very long, it is too noisy to dump the complete help text for each typo after the error message itself. Replace osnoise_top_usage("...") with fatal("...") on errors. Remove the already unused 'usage' argument from osnoise_top_usage(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251011082738.173670-5-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21tools/rtla: Replace timerlat_hist_usage("...") with fatal("...")Costa Shulyupin
A long time ago, when the usage help was short, it was a favor to the user to show it on error. Now that the usage help has become very long, it is too noisy to dump the complete help text for each typo after the error message itself. Replace timerlat_hist_usage("...\n") with fatal("...") on errors. Remove the already unused 'usage' argument from timerlat_hist_usage(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251011082738.173670-4-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21tools/rtla: Replace timerlat_top_usage("...") with fatal("...")Costa Shulyupin
A long time ago, when the usage help was short, it was a favor to the user to show it on error. Now that the usage help has become very long, it is too noisy to dump the complete help text for each typo after the error message itself. Replace timerlat_top_usage("...\n") with fatal("...") on errors. Remove the already unused 'usage' argument from timerlat_top_usage(). Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251011082738.173670-3-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21tools/rtla: Add fatal() and replace error handling patternCosta Shulyupin
The code contains some technical debt in error handling, which complicates the consolidation of duplicated code. Introduce an fatal() function to replace the common pattern of err_msg() followed by exit(EXIT_FAILURE), reducing the length of an already long function. Further patches using fatal() follow. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251011082738.173670-2-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21rtla/tests: Fix osnoise test calling timerlatTomas Glozar
osnoise test "top stop at failed action" is calling timerlat instead of osnoise by mistake. Fix it so that it calls the correct RTLA subcommand. Fixes: 05b7e10687c6 ("tools/rtla: Add remaining support for osnoise actions") Reviewed-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20251007095341.186923-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-21rtla/tests: Extend action tests to 5sTomas Glozar
In non-BPF mode, it takes up to 1 second for RTLA to notice that tracing has been stopped. That means that action tests cannot have a 1 second duration, as the SIGALRM will be racing with the threshold overflow. Previously, non-BPF mode actions were buggy and always executed the action, even when stopping on duration or SIGINT, preventing this issue from manifesting. Now that this has been fixed, the tests have become flaky, and this has to be adjusted. Fixes: 4e26f84abfbb ("rtla/tests: Add tests for actions") Fixes: 05b7e10687c6 ("tools/rtla: Add remaining support for osnoise actions") Reviewed-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20251007095341.186923-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-20tools/rtla: Fix --on-threshold always triggeringTomas Glozar
Commit 8d933d5c89e8 ("rtla/timerlat: Add continue action") moved the code performing on-threshold actions (enabled through --on-threshold option) to inside the RTLA main loop. The condition in the loop does not check whether the threshold was actually exceeded or if stop tracing was requested by the user through SIGINT or duration. This leads to a bug where on-threshold actions are always performed, even when the threshold was not hit. (BPF mode is not affected, since it uses a different condition in the while loop.) Add a condition that checks for !stop_tracing before executing the actions. Also, fix incorrect brackets in hist_main_loop to match the semantics of top_main_loop. Fixes: 8d933d5c89e8 ("rtla/timerlat: Add continue action") Fixes: 2f3172f9dd58 ("tools/rtla: Consolidate code between osnoise/timerlat and hist/top") Reviewed-by: Crystal Wood <crwood@redhat.com> Reviewed-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20251007095341.186923-1-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-20rtla/timerlat_bpf: Stop tracing on user latencyTomas Glozar
rtla-timerlat allows a *thread* latency threshold to be set via the -T/--thread option. However, the timerlat tracer calls this *total* latency (stop_tracing_total_us), and stops tracing also when the return-to-user latency is over the threshold. Change the behavior of the timerlat BPF program to reflect what the timerlat tracer is doing, to avoid discrepancy between stopping collecting data in the BPF program and stopping tracing in the timerlat tracer. Cc: stable@vger.kernel.org Fixes: e34293ddcebd ("rtla/timerlat: Add BPF skeleton to collect samples") Reviewed-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20251006143100.137255-1-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-20tools/rtla: Fix unassigned nr_cpusCosta Shulyupin
In recently introduced timerlat_free(), the variable 'nr_cpus' is not assigned. Assign it with sysconf(_SC_NPROCESSORS_CONF) as done elsewhere. Remove the culprit: -Wno-maybe-uninitialized. The rest of the code is clean. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Fixes: 2f3172f9dd58 ("tools/rtla: Consolidate code between osnoise/timerlat and hist/top") Link: https://lore.kernel.org/r/20251002170846.437888-1-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-20tools/rtla: Remove unused optional option_indexCosta Shulyupin
The longindex argument of getopt_long() is optional and tied to the unused local variable option_index. Remove it to shorten the four longest functions and make the code neater. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251002123553.389467-2-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-11-20tools/rtla: Add for_each_monitored_cpu() helperCosta Shulyupin
The rtla tools have many instances of iterating over CPUs while checking if they are monitored. Add a for_each_monitored_cpu() helper macro to make the code more readable and reduce code duplication. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20251002123553.389467-1-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2025-10-05Merge tag 'trace-tools-v6.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tools updates from Steven Rostedt - This is mostly just consolidating code between osnoise/timerlat and top/hist for easier maintenance and less future divergence * tag 'trace-tools-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tools/rtla: Add remaining support for osnoise actions tools/rtla: Add test engine support for unexpected output tools/rtla: Fix -A option name in test comment tools/rtla: Consolidate code between osnoise/timerlat and hist/top tools/rtla: Create common_apply_config() tools/rtla: Move top/hist params into common struct tools/rtla: Consolidate common parameters into shared structure
2025-09-27rtla/actions: Fix condition for buffer reallocationWander Lairson Costa
The condition to check if the actions buffer needs to be resized was incorrect. The check `self->size >= self->len` would evaluate to true on almost every call to `actions_new()`, causing the buffer to be reallocated unnecessarily each time an action was added. Fix the condition to `self->len >= self.size`, ensuring that the buffer is only resized when it is actually full. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250915181101.52513-1-wander@redhat.com Fixes: 6ea082b171e00 ("rtla/timerlat: Add action on threshold feature") Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-09-27rtla: Fix buffer overflow in actions_parseIvan Pravdin
Currently, tests 3 and 13-22 in tests/timerlat.t fail with error: *** buffer overflow detected ***: terminated timeout: the monitored command dumped core The result of running `sudo make check` is tests/timerlat.t (Wstat: 0 Tests: 22 Failed: 11) Failed tests: 3, 13-22 Files=3, Tests=34, 140 wallclock secs ( 0.07 usr 0.01 sys + 27.63 cusr 27.96 csys = 55.67 CPU) Result: FAIL Fix buffer overflow in actions_parse to avoid this error. After this change, the tests results are tests/hwnoise.t ... ok tests/osnoise.t ... ok tests/timerlat.t .. ok All tests successful. Files=3, Tests=34, 186 wallclock secs ( 0.06 usr 0.01 sys + 41.10 cusr 44.38 csys = 85.55 CPU) Result: PASS Link: https://lore.kernel.org/164ffc2ec8edacaf1295789dad82a07817b6263d.1757034919.git.ipravdin.official@gmail.com Fixes: 6ea082b171e0 ("rtla/timerlat: Add action on threshold feature") Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-09-27tools/rtla: Add remaining support for osnoise actionsCrystal Wood
The basic functionality came with the consolidation; now hook up the command line options, and add documentation and tests. Cc: John Kacur <jkacur@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/20250907022325.243930-8-crwood@redhat.com Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Crystal Wood <crwood@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-09-27tools/rtla: Add test engine support for unexpected outputCrystal Wood
Add a check() parameter to indicate which text must not appear in the output. Simplify the code so that we can print failures as they happen rather than trying to figure out what went wrong after printing "not ok". This also means that "not ok" gets printed after the info rather than before, which seems more intuitive anyway. Cc: John Kacur <jkacur@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/20250907022325.243930-7-crwood@redhat.com Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Crystal Wood <crwood@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-09-27tools/rtla: Fix -A option name in test commentCrystal Wood
This was changed to --on-threshold when the patches were applied. Cc: John Kacur <jkacur@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/20250907022325.243930-6-crwood@redhat.com Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Crystal Wood <crwood@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>