1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
|
From ab79ec5b6389507b4970d68862abb95d0b2b94c9 Mon Sep 17 00:00:00 2001
From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Date: Mon, 17 Jun 2019 12:37:48 +0200
Subject: [PATCH] fix sparse node ids
CPU-ids can be sparse due to disabling a subset of CPUs.
On ppc64le this even will make the node_ids sparse, this is actually pretty
common on ppc64 when SMT is disabled.
Numad has the assumption of cpu/node-ids always being linear and due to that
accesses the 'node' array out of bounds. That triggers crashes like the
following:
Thread 1 "numad" received signal SIGSEGV, Segmentation fault.
#0 0x00000fb6cd2779f4 in bind_process_and_migrate_memory (p=0xfb6fc1e0f70)
at numad.c:998
#1 0x00000fb6cd27d148 in manage_loads () at numad.c:2225
#2 0x00000fb6cd2734dc in main (argc=<optimized out>, argv=<optimized out>)
at numad.c:2654
Instead of directly indexing with node_id we need to detect which array
element has the matching node_id and use that.
Forwarded: yes (https://pagure.io/numad/pull-request/3)
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1832915
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=930725
Last-Update: 2019-06-19
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
numad.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/numad.c b/numad.c
index a6a7a5d..524bf61 100644
--- a/numad.c
+++ b/numad.c
@@ -995,7 +995,18 @@ int bind_process_and_migrate_memory(process_data_p p) {
int node_id = 0;
while (nodes) {
if (ID_IS_IN_LIST(node_id, p->node_list_p)) {
- OR_LISTS(cpu_bind_list_p, cpu_bind_list_p, node[node_id].cpu_list_p);
+ int id = -1;
+ for (int node_ix = 0; (node_ix < num_nodes); node_ix++) {
+ if (node[node_ix].node_id == node_id) {
+ id = node_ix;
+ break;
+ }
+ }
+ if (id == -1) {
+ numad_log(LOG_CRIT, "Node %d is requested, but unknown\n", node_id);
+ exit(EXIT_FAILURE);
+ }
+ OR_LISTS(cpu_bind_list_p, cpu_bind_list_p, node[id].cpu_list_p);
nodes -= 1;
}
node_id += 1;
--
2.21.0
|