Interaction between KDB and LKCD.
Executive summary: Do not select CONFIG_KDB_CONTINUE_CATASTROPHIC=2 or
use KDB command 'sr c' without first patching LKCD to use KDB data.
Both KDB and LKCD try to stop all the other cpus, so the system is not
changing while it is being debugged or dumped. KDB will cope with cpus
that cannot be stopped; some versions of LKCD will just hang. In
particular, when LKCD is invoked from KDB, LKCD will attempt to stop
the other cpus again and may hang.
Some versions of LKCD detect that other cpus are not responding and
ignore them. This is almost as bad, because the data is changing while
it is being dumped. Also the method used to avoid hung cpus has been
known to cause an oops when LKCD has finished dumping.
LKCD does not know about several special cases on IA64, including INIT
and MCA backtraces, interrupt handlers, out of line code etc. LKCD
cannot capture cpu state on any cpu that is not responding to OS
interrupts, which means that any cpu that is spinning in a disabled
loop cannot be debugged. Any cpu that calls into SAL for MCA
rendezvous cannot be debugged. Even when LKCD captures IA64 state, the
user space lcrash code cannot unwind through any assembler code, which
rules out all the interesting cases.
KDB knows far more than LKCD about architecture peculiarities, stack
formats, interrupt handling etc. The methods used by KDB to stop the
other processors and capture their state are far more reliable than
those used by LKCD. KDB can capture INIT and MCA data on IA64, as well
as save the state of cpus before they enter SAL.
Rather than duplicating the complex KDB code in LKCD, LKCD can be
patched to use the information that has already been captured by KDB.
Obviously this only works when LKCD is invoked from KDB. If you invoke
LKCD directly from the console with SysRq-c, or the dump() function is
called from code outside KDB, then you get the old and broken LKCD
processing. Because lcrash uses the old unwind algorithm which cannot
unwind through IA64 assembler code, KDB kludges the saved state into
something that the old unwind algorithm can cope with. Calling LKCD
from KDB gives you a clean dump, but you have to patch LKCD first.
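As a rough user-space illustration of what "use the information that
has already been captured by KDB" means, the sketch below models the
per-cpu test that the indicative patches in this file apply before
copying a cpu's state into the dump. The struct and function names
(model_cpu_state, model_state_usable, model_count_dumpable) are
hypothetical stand-ins, not KDB or LKCD interfaces; only the four
conditions themselves are taken from the patches.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu record that KDB keeps. */
struct model_cpu_state {
	int seqno;         /* sequence number recorded with the saved state */
	const void *regs;  /* saved registers, NULL if none were captured */
	const void *task;  /* task that was running, NULL if unknown */
	int task_cpu;      /* cpu that the saved task claims to be on */
};

/* Return 1 if the saved state for 'cpu' passes all four checks. */
static int
model_state_usable(const struct model_cpu_state *s, int cpu, int current_seqno)
{
	if (s->seqno < current_seqno - 1)
		return 0;	/* too old relative to the current sequence number */
	if (s->regs == NULL || s->task == NULL)
		return 0;	/* nothing was captured for this cpu */
	if (s->task_cpu != cpu)
		return 0;	/* record does not belong to this cpu */
	return 1;
}

/* Count the cpus whose state would make it into the dump. */
static int
model_count_dumpable(const struct model_cpu_state *states, int ncpus,
		     int current_seqno)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < ncpus; ++cpu)
		n += model_state_usable(&states[cpu], cpu, current_seqno);
	return n;
}
```

A cpu that fails any check is skipped with a warning in the patches
rather than stalling the dump, which is the behaviour this model
counts.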
There are two ways to invoke LKCD from KDB. One way is manual, using
the KDB 'sr c' command. This is identical to doing SysRq-c from the
console except that it goes through KDB first, so LKCD can use the data
that KDB has captured. Obviously 'sr c' requires human intervention
and KDB must be on; it is up to the person doing the debugging whether
they want to take a dump.
The second way is to set CONFIG_KDB_CONTINUE_CATASTROPHIC=2. With this
setting, you automatically get a dump for catastrophic errors. A
catastrophic error is a panic, oops, NMI or other watchdog tripping, or
an INIT or MCA event on IA64. CONFIG_KDB_CONTINUE_CATASTROPHIC=2 has no
effect on debugging events such as breakpoints, single step etc. so it
does not interfere with manual debugging.
When CONFIG_KDB_CONTINUE_CATASTROPHIC=2 and KDB is on, a catastrophic
error will drop into KDB to allow manual debugging; typing 'go' will
then take a dump and force a reboot. With this setting and KDB off,
KDB detects a catastrophic error, does enough processing to capture the
state, takes a dump and forces a reboot - all automatic with no human
intervention.
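The behaviour described for CONFIG_KDB_CONTINUE_CATASTROPHIC=2 can be
summarised as a small decision function. This is only a user-space
model of the text above; the enum and function names are hypothetical
and do not exist in KDB.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum model_event {
	MODEL_EVENT_DEBUG,         /* breakpoint, single step etc. */
	MODEL_EVENT_CATASTROPHIC   /* panic, oops, NMI, INIT, MCA ... */
};

enum model_outcome {
	MODEL_UNCHANGED,           /* setting has no effect on this event */
	MODEL_DEBUG_THEN_DUMP,     /* enter KDB; 'go' dumps and reboots */
	MODEL_AUTO_DUMP            /* capture state, dump, reboot, unattended */
};

/* Model of what happens when CONFIG_KDB_CONTINUE_CATASTROPHIC=2. */
static enum model_outcome
model_catastrophic_2(enum model_event event, bool kdb_on)
{
	if (event != MODEL_EVENT_CATASTROPHIC)
		return MODEL_UNCHANGED;
	return kdb_on ? MODEL_DEBUG_THEN_DUMP : MODEL_AUTO_DUMP;
}
```

The two catastrophic outcomes correspond to the two configurations
listed next: KDB off for unattended dumps, KDB on when a human should
look first.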
For unattended and clean LKCD dumps, patch LKCD to use KDB data. Use
  CONFIG_DUMP=y
  CONFIG_KDB=y
  CONFIG_KDB_OFF=y
  CONFIG_KDB_CONTINUE_CATASTROPHIC=2
If you want human intervention before taking a dump, use
  CONFIG_DUMP=y
  CONFIG_KDB=y
  CONFIG_KDB_OFF=n
  CONFIG_KDB_CONTINUE_CATASTROPHIC=2
The following are indicative patches against lkcd 4.1, kernel 2.4.20.
You may have to modify the patches for other kernels or other
versions of lkcd.
diff -urp lkcd/drivers/dump/dump_base.c lkcd/drivers/dump/dump_base.c
--- lkcd/drivers/dump/dump_base.c Thu May 1 13:10:12 2003
+++ lkcd/drivers/dump/dump_base.c Fri Jun 20 12:28:16 2003
@@ -207,6 +207,9 @@
#include <asm/hardirq.h>
#include <linux/version.h>
#include <asm/system.h>
+#ifdef CONFIG_KDB
+#include <linux/kdb.h>
+#endif
/*
* -----------------------------------------------------------------------
@@ -852,6 +855,13 @@ dump_silence_system(void)
unsigned int stage = 0;
int cpu = smp_processor_id();
+#ifdef CONFIG_KDB
+ if (KDB_IS_RUNNING()) {
+ /* kdb is in control, the system is already silenced */
+ printk(KERN_ALERT "LKCD entered from KDB\n");
+ }
+#endif /* CONFIG_KDB */
+
if (in_interrupt()) {
printk(KERN_ALERT "Dumping from interrupt handler !\n");
printk(KERN_ALERT "Uncertain scenario - but will try my best\n");
@@ -861,6 +871,9 @@ dump_silence_system(void)
* another approach
*/
}
/* see if there's something to do before we re-enable interrupts */
+#ifdef CONFIG_KDB
+ if (!KDB_IS_RUNNING())
+#endif /* CONFIG_KDB */
(void)__dump_silence_system(stage);
@@ -905,6 +918,9 @@ dump_silence_system(void)
/* now increment the stage and do stuff after interrupts are enabled */
stage++;
+#ifdef CONFIG_KDB
+ if (!KDB_IS_RUNNING())
+#endif /* CONFIG_KDB */
(void)__dump_silence_system(stage);
/* time to leave */
diff -urp lkcd/drivers/dump/dump_i386.c lkcd/drivers/dump/dump_i386.c
--- lkcd/drivers/dump/dump_i386.c Tue Jul 9 07:14:11 2002
+++ lkcd/drivers/dump/dump_i386.c Fri Jun 20 12:29:12 2003
@@ -27,6 +27,10 @@
#include <asm/processor.h>
#include <asm/hardirq.h>
#include <linux/irq.h>
+#ifdef CONFIG_KDB
+#include <linux/kdb.h>
+#include <linux/kdbprivate.h>
+#endif /* CONFIG_KDB */
static int alloc_dha_stack(void)
{
@@ -119,6 +123,31 @@ save_other_cpu_states(void)
{
int i;
+#ifdef CONFIG_KDB
+ if (KDB_IS_RUNNING()) {
+ /* invoked from kdb, which has already saved all the state */
+ int cpu;
+ struct kdb_running_process *krp;
+ for (cpu = 0, krp = kdb_running_process; cpu < smp_num_cpus; ++cpu, ++krp) {
+ if (krp->seqno < kdb_seqno - 1 ||
+ !krp->regs ||
+ !krp->p ||
+ kdb_process_cpu(krp->p) != cpu) {
+ printk(KERN_WARNING "No KDB data for cpu %d, it will not be in the LKCD dump\n", cpu);
+ continue;
+ }
+ if (cpu == smp_processor_id())
+ continue; /* dumped by save_this_cpu_state */
+ // kdb_printf("%s: cpu %d task %p regs %p\n", __FUNCTION__, cpu, krp->p, krp->regs);
+ save_this_cpu_state(cpu, krp->regs, krp->p);
+ }
+ return;
+ }
+ printk(KERN_WARNING "This kernel supports KDB but LKCD was invoked directly, not via KDB.\n");
+ printk(KERN_WARNING "Falling back to the old and broken LKCD method of getting data from all cpus,\n");
+ printk(KERN_WARNING "do not be surprised if LKCD hangs.\n");
+#endif /* CONFIG_KDB */
+
if (smp_num_cpus > 1) {
atomic_set(&waiting_for_dump_ipi, smp_num_cpus-1);
for (i = 0; i < NR_CPUS; i++)
diff -urp lkcd/drivers/dump/dump_ia64.c lkcd/drivers/dump/dump_ia64.c
--- lkcd/drivers/dump/dump_ia64.c Tue Jul 9 07:14:11 2002
+++ lkcd/drivers/dump/dump_ia64.c Fri Jun 20 12:31:41 2003
@@ -30,6 +30,10 @@
#include <asm/processor.h>
#include <asm/hardirq.h>
#include <linux/irq.h>
+#ifdef CONFIG_KDB
+#include <linux/kdb.h>
+#include <linux/kdbprivate.h>
+#endif /* CONFIG_KDB */
extern unsigned long irq_affinity[];
@@ -75,6 +79,12 @@ save_this_cpu_state(int cpu, struct pt_r
if (tsk && dump_header_asm.dha_stack[cpu]) {
memcpy((void*)dump_header_asm.dha_stack[cpu], tsk, THREAD_SIZE);
+#ifdef CONFIG_KDB
+ if (KDB_IS_RUNNING()) {
+ static void kludge_for_broken_lcrash(int);
+ kludge_for_broken_lcrash(cpu);
+ }
+#endif /* CONFIG_KDB */
}
return;
}
@@ -107,6 +117,32 @@ save_other_cpu_states(void)
{
int i;
+#ifdef CONFIG_KDB
+ if (KDB_IS_RUNNING()) {
+ /* invoked from kdb, which has already saved all the state */
+ int cpu;
+ struct kdb_running_process *krp;
+ for (cpu = 0, krp = kdb_running_process; cpu < smp_num_cpus; ++cpu, ++krp) {
+ if (krp->seqno < kdb_seqno - 1 ||
+ !krp->regs ||
+ !krp->arch.sw ||
+ !krp->p ||
+ kdb_process_cpu(krp->p) != cpu) {
+ printk(KERN_WARNING "No KDB data for cpu %d, it will not be in the LKCD dump\n", cpu);
+ continue;
+ }
+ if (cpu == smp_processor_id())
+ continue; /* dumped by save_this_cpu_state */
+ // kdb_printf("%s: cpu %d task %p regs %p\n", __FUNCTION__, cpu, krp->p, krp->regs);
+ save_this_cpu_state(cpu, krp->regs, krp->p);
+ }
+ return;
+ }
+ printk(KERN_WARNING "This kernel supports KDB but LKCD was invoked directly, not via KDB.\n");
+ printk(KERN_WARNING "Falling back to the old and broken LKCD method of getting data from all cpus,\n");
+ printk(KERN_WARNING "do not be surprised if LKCD hangs.\n");
+#endif /* CONFIG_KDB */
+
if (smp_num_cpus > 1) {
atomic_set(&waiting_for_dump_ipi, smp_num_cpus-1);
for (i = 0; i < NR_CPUS; i++)
@@ -380,3 +416,131 @@ void * __dump_memcpy(void * dest, const
}
return(vp);
}
+
+#ifdef CONFIG_KDB
+/*
+ * lcrash is broken. It incorrectly assumes that all tasks are blocked, it
+ * assumes that all code is built by gcc (and therefore it cannot unwind through
+ * assembler code), it assumes that there is only one pt_regs at the base of the
+ * stack (where user space entered the kernel). Dumping from kdb (or any
+ * interrupt context) breaks all those assumptions, resulting in a good dump
+ * that lcrash cannot get any useful backtraces from.
+ *
+ * The real fix is to correct lcrash, using libunwind. That is not going to
+ * happen any time soon, so this kludge takes the kdb data and reformats it to
+ * suit the broken lcrash code. The task state is unwound past the interrupt
+ * frame (pt_regs) before kdb, then a switch_stack is synthesized in place of
+ * the pt_regs, using the unwound data. ksp is changed to point to this
+ * switch_stack, making it look like the task is blocked with no interrupt.
+ *
+ * This will not work when the interrupt occurred in a leaf function, with no
+ * save of b0. But the old unwind code in lcrash cannot cope with that either,
+ * so no change.
+ */
+
+static inline void *
+kludge_copy_addr(int cpu, void *addr, struct task_struct *p)
+{
+ return (char *)addr - (char *)p + (char *)(dump_header_asm.dha_stack[cpu]);
+}
+
+static void
+kludge_for_broken_lcrash(int cpu)
+{
+ struct kdb_running_process *krp = kdb_running_process + cpu;
+ struct task_struct *p, *p_copy;
+ struct switch_stack *sw, *sw_copy, *sw_new;
+ struct pt_regs *regs;
+ struct unw_frame_info info;
+ kdb_symtab_t symtab;
+ kdb_machreg_t sp;
+ int count, i;
+ char nat;
+
+ if (krp->seqno < kdb_seqno - 1 ||
+ !krp->regs ||
+ user_mode(krp->regs) ||
+ !krp->arch.sw ||
+ !krp->p ||
+ kdb_process_cpu(krp->p) != cpu)
+ return;
+ p = krp->p;
+ regs = krp->regs;
+ sw = krp->arch.sw;
+#if 0
+ {
+ char buf[80];
+ sprintf(buf, "btc %d\n", cpu);
+ kdb_parse(buf, regs);
+ }
+#endif
+
+ unw_init_frame_info(&info, p, sw);
+ count = 0;
+ do {
+ unw_get_sp(&info, &sp);
+ // kdb_printf("sp 0x%lx regs 0x%lx\n", sp, regs);
+ } while (sp < (kdb_machreg_t)regs && unw_unwind(&info) >= 0 && count++ < 200);
+ if (count >= 200) {
+ printk(KERN_WARNING "Unwind for process %d on cpu %d looped\n", p->pid, cpu);
+ return;
+ }
+
+ /* Must not touch the real stack data, kludge the data using the copies
+ * in dump_header_asm.
+ */
+ p_copy = kludge_copy_addr(cpu, p, p);
+ sw_new = (struct switch_stack *)((u64)(regs + 1) + 16) - 1;
+ sw_copy = kludge_copy_addr(cpu, sw_new, p);
+ // kdb_printf("p_copy 0x%p sw_new 0x%p sw_copy 0x%p\n", p_copy, sw_new, sw_copy);
+ memset(sw_copy, 0, sizeof(*sw_copy));
+
+ sw_copy->caller_unat = sw->caller_unat;
+ unw_access_ar(&info, UNW_AR_FPSR, &sw_copy->ar_fpsr, 0);
+ for (i = 2; i <= 5; ++i)
+ unw_access_fr(&info, i, &sw_copy->f2 + i - 2, 0);
+ for (i = 10; i <= 31; ++i)
+ unw_access_fr(&info, i, &sw_copy->f10 + i - 10, 0);
+ for (i = 4; i <= 7; ++i)
+ unw_access_gr(&info, i, &sw_copy->r4 + i - 4, &nat, 0);
+ for (i = 0; i <= 5; ++i)
+ unw_access_br(&info, i, &sw_copy->b0 + i, 0);
+ sw_copy->ar_pfs = *info.cfm_loc;
+ unw_access_ar(&info, UNW_AR_LC, &sw_copy->ar_lc, 0);
+ unw_access_ar(&info, UNW_AR_UNAT, &sw_copy->ar_unat, 0);
+ unw_access_ar(&info, UNW_AR_RNAT, &sw_copy->ar_rnat, 0);
+ /* FIXME: unwind.c returns the original bspstore, not the value that
+ * matches the current unwind state. Calculate our own value for the
+ * modified bspstore. This should work but does not
+ * unw_access_ar(&info, UNW_AR_BSPSTORE, &sw_copy->ar_bspstore, 0);
+ */
+ sw_copy->ar_bspstore = (unsigned long)ia64_rse_skip_regs((unsigned long *)info.bsp, (*info.cfm_loc >> 7) & 0x7f);
+ unw_access_pr(&info, &sw_copy->pr, 0);
+
+ /* lcrash cannot unwind through the new spinlock contention code and it
+ * is too important a case to ignore. So the kludge extracts the
+ * calling IP before saving the data.
+ */
+ if (kdbnearsym(regs->cr_iip, &symtab) &&
+ strncmp(symtab.sym_name, "ia64_spinlock_contention", 24) == 0)
+ unw_get_rp(&info, &sw_copy->b0);
+
+ p_copy->thread.ksp = (__u64)sw_new - 16;
+ dump_header_asm.dha_smp_regs[cpu] = *((struct pt_regs *)((unsigned long)p + THREAD_SIZE) - 1);
+#if 0
+ {
+ /* debug. Destructive overwrite of task, then bt the result in kdb to
+ * validate the modified task.
+ */
+ char buf[80];
+ memcpy(p, p_copy, THREAD_SIZE);
+ krp->regs = NULL;
+ krp->arch.sw = sw_new;
+ sprintf(buf, "btc %d\n", cpu);
+ kdb_parse(buf, NULL);
+ while(1){};
+ }
+#endif
+}
+
+#endif /* CONFIG_KDB */
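The address translation performed by kludge_copy_addr() in the patch
above can be tried out in user space. The sketch below is a simplified
model: model_copy_addr is a hypothetical name, and the two base
pointers stand in for the live task and for its copy in
dump_header_asm.dha_stack[cpu]. It shows only the offset arithmetic
that lets the kludge edit the copy while leaving the real stack alone.

```c
#include <stddef.h>

/*
 * Translate 'addr', which points somewhere inside the original region
 * starting at 'orig_base', to the same offset inside the copy that
 * starts at 'copy_base'.
 */
static void *
model_copy_addr(void *addr, void *orig_base, void *copy_base)
{
	ptrdiff_t offset = (char *)addr - (char *)orig_base;

	return (char *)copy_base + offset;
}
```

For instance, with two 64-byte buffers orig and copy, translating
orig + 10 yields copy + 10.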