                          Debugging on Linux for s/390 & z/Architecture
                                       by
                Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com)
                Copyright (C) 2000-2001 IBM Deutschland Entwicklung GmbH, IBM Corporation
                              Best viewed with fixed width fonts

Overview of Document:
=====================
This document is intended to give a good overview of how to debug
Linux for s/390 & z/Architecture. It isn't intended as a complete reference or a
tutorial on the fundamentals of C & assembly, & it doesn't go into
390 IO in any detail. It is intended to complement the documents in the
reference section below & any other worthwhile references you get.

It is intended, like the Enterprise Systems Architecture/390 Reference Summary,
to be printed out & used as a quick cheat sheet self help style reference when
problems occur.

Contents
========
Register Set
Address Spaces on Intel Linux
Address Spaces on Linux for s/390 & z/Architecture
The Linux for s/390 & z/Architecture Kernel Task Structure
Register Usage & Stackframes on Linux for s/390 & z/Architecture
A sample program with comments
Compiling programs for debugging on Linux for s/390 & z/Architecture
Figuring out gcc compile errors
Debugging Tools
objdump
strace
Performance Debugging
Debugging under VM
s/390 & z/Architecture IO Overview
Debugging IO on s/390 & z/Architecture under VM
GDB on s/390 & z/Architecture
Stack chaining in gdb by hand
Examining core dumps
ldd
Debugging modules
The proc file system
Starting points for debugging scripting languages etc.
SysRq
References
Special Thanks

Register Set
============
The current architectures have the following registers.

16 General purpose registers, 32 bit on s/390 & 64 bit on z/Architecture,
r0-r15 ( or gpr0-gpr15 ), used for arithmetic & addressing.

16 Control registers, 32 bit on s/390 & 64 bit on z/Architecture
( cr0-cr15, kernel usage only ), used for memory management,
interrupt control, debugging control etc.

16 Access registers ( ar0-ar15 ), 32 bit on both s/390 & z/Architecture,
not used by normal programs but potentially could
be used as temporary storage. Their main purpose is their 1 to 1
association with general purpose registers; they are used in
the kernel for copying data between kernel & user address spaces.
Access register 0 ( & access register 1 on z/Architecture, which needs a
64 bit pointer ) is currently used by the pthread library as a pointer to
the currently running thread's private area.

16 64 bit floating point registers ( fp0-fp15 ), IEEE & HFP floating
point format compliant on G5 upwards, & a Floating point control reg (FPC).
4 64 bit registers ( fp0,fp2,fp4 & fp6 ), HFP only, on older machines.
Note:
Linux (currently) always uses IEEE & emulates G5 IEEE format on older machines
( provided the kernel is configured for this ).


The PSW is the most important register on the machine. It
is 64 bit on s/390 & 128 bit on z/Architecture & serves the roles of
a program counter (pc), condition code register & memory space designator.
In IBM standard notation I am counting bit 0 as the MSB.
It has several advantages over a normal program counter
in that you can change address translation & program counter
in a single instruction. Changing address translation,
e.g. switching address translation off, requires that you
have a logical=physical mapping for the address you are
currently running at.

      Bit           Value
s/390 z/Architecture

0       0     Reserved ( must be 0 ).

1       1     Program Event Recording 1 PER enabled,
              PER is used to facilitate debugging e.g. single stepping.

2-4    2-4    Reserved ( must be 0 ).

5       5     Dynamic address translation 1=DAT on.

6       6     Input/Output interrupt Mask

7       7     External interrupt Mask used primarily for interprocessor signalling &
              clock interrupts.

8-11  8-11    PSW Key used for complex memory protection mechanism, not used under linux

12      12    1 on s/390 0 on z/Architecture

13      13    Machine Check Mask 1=enable machine check interrupts

14      14    Wait State, set this to 1 to stop the processor except for interrupts & give
              time to other LPARS; used in CPU idle in the kernel to increase overall
              usage of processor resources.

15      15    Problem state ( if set to 1 certain instructions are disabled ),
              all linux user programs run with this bit 1
              ( useful info for debugging under VM ).

16-17 16-17   Address Space Control

              00 Primary Space Mode when DAT on.
              The linux kernel currently runs in this mode, CR1 is affiliated with
              this mode & points to the primary segment table origin etc.

              01 Access register mode, this mode is used in functions to
              copy data between kernel & user space.

              10 Secondary space mode, not used in linux. However CR7, the
              register affiliated with this mode, is used, & normally
              CR13=CR7 to allow us to copy data between kernel & user space.
              We do this as follows:
              We set ar2 to 0 to designate its
              affiliated gpr ( gpr2 ) to point to primary=kernel space.
              We set ar4 to 1 to designate its
              affiliated gpr ( gpr4 ) to point to secondary=home=user space
              & then essentially do a memcopy(gpr2,gpr4,size) to
              copy data between the address spaces. The reason we use home space for the
              kernel & don't keep secondary space free is that code will not run in
              secondary space.

              11 Home Space Mode, all user programs run in this mode.
              It is affiliated with CR13.

18-19 18-19   Condition codes (CC)

20    20      Fixed point overflow mask if 1=FPU exceptions for this event
              occur ( normally 0 )

21    21      Decimal overflow mask if 1=FPU exceptions for this event occur
              ( normally 0 )

22    22      Exponent underflow mask if 1=FPU exceptions for this event occur
              ( normally 0 )

23    23      Significance Mask if 1=FPU exceptions for this event occur
              ( normally 0 )

24-31 24-30   Reserved, must be 0.

      31      Extended Addressing Mode
      32      Basic Addressing Mode
              Used to set addressing mode
              PSW 31   PSW 32
                0         0        24 bit
                0         1        31 bit
                1         1        64 bit

32             1=31 bit addressing mode 0=24 bit addressing mode ( for backward
               compatibility ), linux always runs with this bit set to 1

33-63          Instruction address.
      33-63    Reserved must be 0
      64-127   Address
               In 24 bits mode bits 64-103=0 bits 104-127 Address
               In 31 bits mode bits 64-96=0 bits 97-127 Address
               Note: unlike 31 bit mode on s/390 bit 96 must be zero
               when loading the address with LPSWE otherwise a
               specification exception occurs, LPSW is fully backward
               compatible.


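The s/390 column of the table above can be turned into a small decoder.
What follows is only a sketch: psw_bits, struct s390_psw_fields &
decode_s390_psw are made up names ( they are not kernel interfaces ), the
PSW is assumed to be held in a single 64 bit integer & only the s/390
( 64 bit PSW ) layout is handled.

```c
#include <stdint.h>

/* Extract bits first..last of a 64 bit s/390 PSW, IBM style: bit 0 is the MSB. */
static unsigned psw_bits(uint64_t psw, unsigned first, unsigned last)
{
        unsigned width = last - first + 1;

        return (unsigned)((psw >> (63 - last)) & ((1ULL << width) - 1));
}

/* Hypothetical container for the fields that matter most when debugging. */
struct s390_psw_fields {
        unsigned per;           /* bit 1      Program Event Recording     */
        unsigned dat;           /* bit 5      Dynamic address translation */
        unsigned io_mask;       /* bit 6      Input/Output interrupt mask */
        unsigned ext_mask;      /* bit 7      External interrupt mask     */
        unsigned wait;          /* bit 14     Wait state                  */
        unsigned problem_state; /* bit 15     1 = user program            */
        unsigned asc;           /* bits 16-17 Address space control       */
        unsigned cc;            /* bits 18-19 Condition code              */
        unsigned amode31;       /* bit 32     1 = 31 bit addressing       */
        uint32_t address;       /* bits 33-63 Instruction address         */
};

static struct s390_psw_fields decode_s390_psw(uint64_t psw)
{
        struct s390_psw_fields f;

        f.per           = psw_bits(psw, 1, 1);
        f.dat           = psw_bits(psw, 5, 5);
        f.io_mask       = psw_bits(psw, 6, 6);
        f.ext_mask      = psw_bits(psw, 7, 7);
        f.wait          = psw_bits(psw, 14, 14);
        f.problem_state = psw_bits(psw, 15, 15);
        f.asc           = psw_bits(psw, 16, 17);
        f.cc            = psw_bits(psw, 18, 19);
        f.amode31       = psw_bits(psw, 32, 32);
        f.address       = psw_bits(psw, 33, 63);
        return f;
}
```

For the made up value 0x070DD0008040037C this gives DAT on, problem state 1,
address space control 3 ( home space, i.e. a user program ), condition code 1
& instruction address 0x40037c.
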
Prefix Page(s)
--------------
This per cpu memory area is too intimately tied to the processor not to mention.
It exists between the real addresses 0-4095 on s/390 & 0-8191 on z/Architecture & is exchanged
with 1 page on s/390 or 2 pages on z/Architecture in absolute storage by the set
prefix instruction in linux's startup.
This page is mapped to a different prefix for each processor in an SMP configuration
( assuming the os designer is sane of course :-) ).
Bytes 0-512 ( 200 hex ) on s/390 & 0-512,4096-4544,4604-5119 currently on z/Architecture
are used by the processor itself for holding such information as exception indications &
entry points for exceptions.
Bytes after 0xc00 hex are used by linux for per processor globals on s/390 & z/Architecture
( there is a gap on z/Architecture too currently between 0xc00 & 0x1000 which linux uses ).
The closest thing to this on traditional architectures is the interrupt
vector table. This is a good thing & does simplify some of the kernel coding
however it means that we now cannot catch stray NULL pointers in the
kernel without hard coded checks.



Address Spaces on Intel Linux
=============================

The traditional Intel Linux is approximately mapped as follows ( forgive
the ascii art ).
0xFFFFFFFF 4GB Himem                        *****************
                                            *               *
                                            * Kernel Space  *
                                            *               *
                                            *****************          ****************
User Space Himem (typically 0xC0000000 3GB )*  User Stack   *          *              *
                                            *****************          *              *
                                            *  Shared Libs  *          * Next Process *
                                            *****************          *     to       *
                                            *               *    <==   *     Run      *  <==
                                            *  User Program *          *              *
                                            *   Data BSS    *          *              *
                                            *    Text       *          *              *
                                            *   Sections    *          *              *
0x00000000                                  *****************          ****************

Now it is easy to see that on Intel it is quite easy to recognise a kernel address
as being one at or above user space himem ( in this case 0xC0000000 ),
& addresses of less than this are the ones in the current running program on this
processor ( if an smp box ).
If using the virtual machine ( VM ) as a debugger it is quite difficult to
know which user process is running as the address space you are looking at
could be from any process in the run queue.

The limitation of Intel's addressing technique is that the linux
kernel uses a very simple real address to virtual addressing technique
of Real Address=Virtual Address-User Space Himem.
This means that on Intel the linux kernel can typically only address
Himem=0xFFFFFFFF-0xC0000000=1GB & this is all the RAM these machines
can typically use.
They can lower User Himem to 2GB or lower & thus be
able to use 2GB of RAM, however this shrinks the maximum size
of User Space from 3GB to 2GB. They have a no win limit of 4GB unless
they go to 64 Bit.


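The Intel mapping described above boils down to one subtraction & one
comparison. A minimal sketch, assuming the typical 3GB split; the names
USER_SPACE_HIMEM, is_kernel_address, kernel_virt_to_real & max_direct_ram
are made up for illustration & are not taken from the kernel source.

```c
#include <stdint.h>

/* Assumed typical split point between user & kernel space ( 3GB ). */
#define USER_SPACE_HIMEM 0xC0000000u

/* A kernel address is one at or above user space himem. */
static int is_kernel_address(uint32_t vaddr)
{
        return vaddr >= USER_SPACE_HIMEM;
}

/* Real Address = Virtual Address - User Space Himem. */
static uint32_t kernel_virt_to_real(uint32_t vaddr)
{
        return vaddr - USER_SPACE_HIMEM;
}

/* Number of bytes of RAM the kernel can reach with this simple mapping. */
static uint32_t max_direct_ram(void)
{
        return 0xFFFFFFFFu - USER_SPACE_HIMEM + 1u;
}
```

With the 3GB split max_direct_ram() is 0x40000000 ( 1GB ); moving
USER_SPACE_HIMEM down to 0x80000000 would double it at the cost of 1GB of
user space, which is the trade off described above.
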
On 390 our limitations & strengths make us slightly different.
For backward compatibility ( because of the psw address hi bit which
indicates whether we are in 31 or 24 bit mode ) we are only allowed to
use 31 bits (2GB) of our 32 bit addresses. However,
we use entirely separate address spaces for the user & kernel.

This means we can support 2GB of non Extended RAM on s/390, & more
with the Extended memory management swap device, &
currently 4TB of physical memory on z/Architecture.


Address Spaces on Linux for s/390 & z/Architecture
==================================================

Our addressing scheme is as follows


Himem 0x7fffffff 2GB on s/390    *****************          ****************
currently 0x3ffffffffff (2^42)-1 *  User Stack   *          *              *
on z/Architecture.               *****************          *              *
                                 *  Shared Libs  *          *              *
                                 *****************          *              *
                                 *               *          *    Kernel    *
                                 *  User Program *          *              *
                                 *   Data BSS    *          *              *
                                 *    Text       *          *              *
                                 *   Sections    *          *              *
0x00000000                       *****************          ****************

This also means that we need to look at the PSW problem state bit
or the addressing mode to decide whether we are looking at
user or kernel space.

Virtual Addresses on s/390 & z/Architecture
===========================================

A virtual address on s/390 is made up of 3 parts:
The SX ( segment index, roughly corresponding to the PGD & PMD in linux terminology )
being bits 1-11.
The PX ( page index, corresponding to the page table entry (pte) in linux terminology )
being bits 12-19.
The remaining bits BX ( the byte index, i.e. the offset in the page )
being bits 20 to 31.

On z/Architecture in linux we currently make up an address from 4 parts:
The region index bits (RX) 0-32, of which we currently use bits 22-32
The segment index (SX) being bits 33-43
The page index (PX) being bits 44-51
The byte index (BX) being bits 52-63

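The bit ranges above translate directly into shifts & masks once the IBM
bit numbers ( bit 0 = MSB ) are converted to shift counts. The sketch below
assumes nothing beyond those ranges; the struct & function names are made up
for illustration & do not come from pgtable.h.

```c
#include <stdint.h>

/* Parts of a 31 bit s/390 virtual address. */
struct s390_vaddr {
        unsigned sx;    /* bits 1-11,  11 bits */
        unsigned px;    /* bits 12-19,  8 bits */
        unsigned bx;    /* bits 20-31, 12 bits */
};

static struct s390_vaddr split_s390_vaddr(uint32_t va)
{
        struct s390_vaddr v;

        v.sx = (va >> 20) & 0x7ff;
        v.px = (va >> 12) & 0xff;
        v.bx = va & 0xfff;
        return v;
}

/* Parts of a z/Architecture virtual address as currently used by linux. */
struct zarch_vaddr {
        unsigned rx;    /* bits 22-32 of the region index, 11 bits */
        unsigned sx;    /* bits 33-43, 11 bits */
        unsigned px;    /* bits 44-51,  8 bits */
        unsigned bx;    /* bits 52-63, 12 bits */
};

static struct zarch_vaddr split_zarch_vaddr(uint64_t va)
{
        struct zarch_vaddr v;

        v.rx = (unsigned)((va >> 31) & 0x7ff);
        v.sx = (unsigned)((va >> 20) & 0x7ff);
        v.px = (unsigned)((va >> 12) & 0xff);
        v.bx = (unsigned)(va & 0xfff);
        return v;
}
```

The used parts add up to 11+8+12 = 31 bits on s/390 & 11+11+8+12 = 42 bits
on z/Architecture, which matches the himem values of 0x7fffffff &
0x3ffffffffff given earlier.
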
Notes:
1) s/390 has no PMD so the PMD is really the PGD also.
A lot of this stuff is defined in pgtable.h.

2) Also seeing as s/390's page tables are only 1k in size
( bits 12-19 x 4 bytes per pte ) we use 1 page ( 4k )
to make the best use of memory by updating 4 segment index
entries each time we mess with a PMD & use offsets
0,1024,2048 & 3072 in this page for our segment indexes.
On z/Architecture our page tables are now 2k in size
( bits 44-51 x 8 bytes per pte ), we do a similar trick
but only mess with 2 segment indices each time we mess with
a PMD.

3) Although z/Architecture supports up to a massive 5-level page table lookup we
can only use 3 currently on Linux ( as this is all the generic kernel
currently supports ), however this may change in future.
This allows us to access ( according to my sums )
4TB of virtual storage per process i.e.
4096*512(PTES)*1024(PMDS)*2048(PGD) = 4398046511104 bytes,
enough for another 2 or 3 years I think :-).
To do this we use a region-third-table designation type in
our address space control registers.


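The sum in note 3 is easy to check: each factor is a power of two
( 2^12 * 2^9 * 2^10 * 2^11 ) so the product is 2^42. A tiny sketch, with
made up constant names that simply mirror the factors quoted in the note:

```c
#include <stdint.h>

/* The four factors from note 3; the names are illustrative only. */
enum {
        PAGE_BYTES   = 4096,    /* 2^12 */
        PTES_PER_PMD = 512,     /* 2^9  */
        PMDS_PER_PGD = 1024,    /* 2^10 */
        PGD_ENTRIES  = 2048     /* 2^11 */
};

/* Bytes of virtual storage per process with the 3 level lookup. */
static uint64_t zarch_virtual_bytes(void)
{
        return (uint64_t)PAGE_BYTES * PTES_PER_PMD * PMDS_PER_PGD * PGD_ENTRIES;
}
```

The highest address is therefore zarch_virtual_bytes()-1 = 0x3ffffffffff,
the z/Architecture himem shown in the address space diagram.
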
The Linux for s/390 & z/Architecture Kernel Task Structure
==========================================================
Each process/thread under Linux for S390 has its own kernel task_struct
defined in linux/include/linux/sched.h
The S390 on initialisation & resuming of a process on a cpu sets
the __LC_KERNEL_STACK variable in the spare prefix area for this cpu
( which we use for per processor globals ).

The kernel stack pointer is intimately tied with the task structure for
each processor as follows.

                      s/390
            ************************
            *  1 page kernel stack *
            *        ( 4K )        *
            ************************
            *   1 page task_struct *
            *        ( 4K )        *
8K aligned  ************************

                 z/Architecture
            ************************
            *  2 page kernel stack *
            *        ( 8K )        *
            ************************
            *  2 page task_struct  *
            *        ( 8K )        *
16K aligned ************************

What this means is that we don't need to dedicate any register or global variable
to point to the current running process & can retrieve it with the following
very simple construct for s/390 & one very similar for z/Architecture.

static inline struct task_struct * get_current(void)
{
        struct task_struct *current;
        /* r15 is the kernel stack pointer, anding it with -8192 gives the */
        /* 8K aligned base of the block, which is where the task_struct is */
        __asm__("lhi   %0,-8192\n\t"
                "nr    %0,15"
                : "=r" (current) );
        return current;
}

i.e. just anding the current kernel stack pointer with the mask -8192.
Thankfully because Linux doesn't have support for nested IO interrupts
& our devices have large buffers which can survive interrupts being shut for
short amounts of time we don't need a separate stack for interrupts.


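The same masking can be tried out away from the machine, e.g. when working
out by hand which task_struct a kernel stack pointer from a dump belongs to.
This is only a sketch in portable C: task_base_from_sp & the two size
constants are made up names, & the sizes are taken from the diagrams above.

```c
#include <stdint.h>

#define S390_TASK_BLOCK_SIZE   8192u    /* 4K task_struct + 4K kernel stack */
#define ZARCH_TASK_BLOCK_SIZE 16384u    /* 8K task_struct + 8K kernel stack */

/*
 * Round a kernel stack pointer down to the aligned base of its block.
 * 'size' must be a power of two; ~(size - 1) is the same mask as -size,
 * i.e. -8192 in the s/390 code above.
 */
static uintptr_t task_base_from_sp(uintptr_t sp, uintptr_t size)
{
        return sp & ~(size - 1);
}
```

For example a made up s/390 stack pointer of 0x00abdf40 gives a task_struct
at 0x00abc000.
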
Register Usage & Stackframes on Linux for s/390 & z/Architecture
=================================================================
Overview:
---------
This is the code that gcc produces at the top & the bottom of
each function. It usually is fairly consistent & similar from
function to function & if you know its layout you can probably
make some headway in finding the ultimate cause of a problem
after a crash without a source level debugger.

Note: To follow stackframes requires a knowledge of C or Pascal &
limited knowledge of one assembly language.

It should be noted that there are some differences between the
s/390 & z/Architecture stack layouts as the z/Architecture stack layout didn't have
to maintain compatibility with older linkage formats.

Glossary:
---------
alloca:
This is a built in compiler function for runtime allocation
of extra space on the caller's stack which is obviously freed
up on function exit ( e.g. the caller may choose to allocate nothing
or a buffer of 4k if required for temporary purposes ), it generates
very efficient code ( a few cycles ) when compared to alternatives
like malloc.

automatics: These are local variables on the stack,
i.e. they aren't in registers & they aren't static.

back-chain:
This is a pointer to the stack pointer before entering a
framed function's ( see frameless function ) prologue, got by
dereferencing the address of the current stack pointer,
i.e. got by accessing the 32 bit value at the stack pointer's
current location.

base-pointer:
This is a pointer to the back of the literal pool which
is an area just behind each procedure used to store constants
in each function.

call-clobbered: The caller probably needs to save these registers if there
is something of value in them, on the stack or elsewhere before making a
call to another procedure so that it can restore it later.

epilogue:
The code generated by the compiler to return to the caller.

frameless-function
A frameless function in Linux for s390 & z/Architecture is one which doesn't
need more than the register save area ( 96 bytes on s/390, 160 on z/Architecture )
given to it by the caller.
A frameless function never:
1) Sets up a back chain.
2) Calls alloca.
3) Calls other normal functions
4) Has automatics.

GOT-pointer:
This is a pointer to the global-offset-table in ELF
( Executable & Linkable Format, Linux's most common executable format ),
all globals & shared library objects are found using this pointer.

lazy-binding
ELF shared libraries are typically only loaded when routines in the shared
library are actually first called at runtime. This is lazy binding.

procedure-linkage-table
This is a table found from the GOT which contains pointers to routines
in other shared libraries which can't be called by easier means.

prologue:
The code generated by the compiler to set up the stack frame.

outgoing-args:
This is extra area allocated on the stack of the calling function if the
parameters for the callee cannot all be put in registers, the same
area can be reused by each function the caller calls.

routine-descriptor:
A COFF executable format based concept of a procedure reference
actually being 8 bytes or more as opposed to a simple pointer to the routine.
This is typically defined as follows
Routine Descriptor offset 0=Pointer to Function
Routine Descriptor offset 4=Pointer to Table of Contents
The table of contents/TOC is roughly equivalent to a GOT pointer,
& it means that shared libraries etc. can be shared between several
environments each with their own TOC.


static-chain: This is used in nested functions, a concept adopted from pascal
by gcc & not used in ansi C or C++ ( although quite useful ). Basically it
is a pointer used to reference local variables of enclosing functions.
You might come across this stuff once or twice in your lifetime.

e.g.
The function below should return 11 though gcc may get upset & toss warnings
about unused variables.
int FunctionA(int a)
{
        int b;
        /* nested function ( a gcc extension ), it reaches b via the static chain */
        void FunctionC(int c)
        {
                b=c+1;
        }
        FunctionC(10);
        return(b);
}


s/390 & z/Architecture Register usage
=====================================
r0       used by syscalls/assembly                  call-clobbered
r1       used by syscalls/assembly                  call-clobbered
r2       argument 0 / return value 0                call-clobbered
r3       argument 1 / return value 1 (if long long) call-clobbered
r4       argument 2                                 call-clobbered
r5       argument 3                                 call-clobbered
r6       argument 4                                 saved
r7       pointer-to arguments 5 to ...              saved
r8       this & that                                saved
r9       this & that                                saved
r10      static-chain ( if nested function )        saved
r11      frame-pointer ( if function used alloca )  saved
r12      got-pointer                                saved
r13      base-pointer                               saved
r14      return-address                             saved
r15      stack-pointer                              saved

f0       argument 0 / return value ( float/double ) call-clobbered
f2       argument 1                                 call-clobbered
f4       z/Architecture argument 2                  saved
f6       z/Architecture argument 3                  saved
The remaining floating point registers
f1,f3,f5 & f7-f15 are call-clobbered.

Notes:
------
1) The only requirement is that registers which are used
by the callee are saved, e.g. the compiler is perfectly
capable of using r11 for purposes other than a
frame pointer if a frame pointer is not needed.
2) In functions with variable arguments e.g. printf the calling procedure
is identical to one without variable arguments & the same number of
parameters. However, the prologue of this function is somewhat more
hairy owing to it having to move these parameters to the stack to
get va_start, va_arg & va_end to work.
3) Access registers are currently unused by gcc but are used in
the kernel. Possibilities exist to use them at the moment for
temporary storage but it isn't recommended.
4) Only 4 of the floating point registers are used for
parameter passing as older machines such as G3 only have 4
& it keeps the stack frame compatible with other compilers.
However with IEEE floating point emulation under linux on the
older machines you are free to use the other 12.
5) A long long or double parameter cannot have the
first 4 bytes in a register & the second four bytes in the
outgoing args area. It must be purely in the outgoing args
area if crossing this boundary.
6) Floating point parameters are mixed with outgoing args
on the outgoing args area in the order they are passed in as parameters.
7) Floating point arguments 2 & 3 are saved in the outgoing args area for
z/Architecture


Stack Frame Layout
------------------
s/390     z/Architecture
0         0             back chain ( see glossary )
4         8             eos ( end of stack, not used on Linux for S390 used in other linkage formats )
8         16            glue used in other s/390 linkage formats for saved routine descriptors etc.
12        24            glue used in other s/390 linkage formats for saved routine descriptors etc.
16        32            scratch area
20        40            scratch area
24        48            saved r6 of caller function
28        56            saved r7 of caller function
32        64            saved r8 of caller function
36        72            saved r9 of caller function
40        80            saved r10 of caller function
44        88            saved r11 of caller function
48        96            saved r12 of caller function
52        104           saved r13 of caller function
56        112           saved r14 of caller function
60        120           saved r15 of caller function
64        128           saved f4 of caller function
72        132           saved f6 of caller function
80                      undefined
96        160           outgoing args passed from caller to callee
96+x      160+x         possible stack alignment ( 8 bytes desirable )
96+x+y    160+x+y       alloca space of caller ( if used )
96+x+y+z  160+x+y+z     automatics of caller ( if used )


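The s/390 column of the layout above can be written down as a C struct &
used to follow the back chain through a memory dump by hand. This is a
sketch under several assumptions: the struct & function names are made up,
the dump is a flat big endian byte image indexed by address, the back chain
sits at offset 0 as described in the glossary, & the top bit of a saved 31
bit return address is taken to be the addressing mode bit & is masked off.
demo_image is invented test data, not output from a real machine.

```c
#include <stddef.h>
#include <stdint.h>

/* s/390 ( 31 bit ) register save area, offsets as in the table above. */
struct s390_frame {
        uint32_t back_chain;    /*  0      */
        uint32_t eos;           /*  4      */
        uint32_t glue[2];       /*  8, 12  */
        uint32_t scratch[2];    /* 16, 20  */
        uint32_t gprs[10];      /* 24..60  saved r6..r15 */
        uint64_t f4;            /* 64      */
        uint64_t f6;            /* 72      */
        uint8_t  undefined[16]; /* 80..95  */
};

/* Offset of the saved r14 slot ( 56 ). */
#define R14_SLOT (offsetof(struct s390_frame, gprs) + (14 - 6) * sizeof(uint32_t))

/* s/390 is big endian, so read words byte by byte to stay host independent. */
static uint32_t read_be32(const uint8_t *p)
{
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Does a whole register save area starting at sp fit inside the image ? */
static int frame_in_range(size_t size, uint32_t sp)
{
        return sp <= size && size - sp >= sizeof(struct s390_frame);
}

/* Back chain stored at sp, or 0 if sp lies outside the image. */
static uint32_t frame_back_chain(const uint8_t *mem, size_t size, uint32_t sp)
{
        if (!frame_in_range(size, sp))
                return 0;
        return read_be32(mem + sp + offsetof(struct s390_frame, back_chain));
}

/* r14 slot of the frame at sp with the addressing mode bit cleared. */
static uint32_t frame_saved_r14(const uint8_t *mem, size_t size, uint32_t sp)
{
        if (!frame_in_range(size, sp))
                return 0;
        return read_be32(mem + sp + R14_SLOT) & 0x7fffffffu;
}

/* Count the frames reachable from sp; a 0 back chain ends the walk. */
static size_t back_chain_depth(const uint8_t *mem, size_t size, uint32_t sp)
{
        size_t depth = 0;

        /* the 1024 limit guards against a corrupt, looping chain */
        while (sp != 0 && frame_in_range(size, sp) && depth < 1024) {
                depth++;
                sp = read_be32(mem + sp);
        }
        return depth;
}

/* Invented 256 byte image: the frame at 0x10 chains back to the frame at 0x80. */
static const uint8_t demo_image[256] = {
        [0x13] = 0x80,                                                  /* back chain */
        [0xb8] = 0x80, [0xb9] = 0x40, [0xba] = 0x03, [0xbb] = 0xb4,     /* r14 slot   */
};
```

On the demo image the chain from 0x10 is 2 frames deep & the r14 slot of
the frame at 0x80 reads 0x004003b4. Going by the prologues in the sample
program below, a function stores its registers into the frame it was handed
before buying its own, so the r14 found in a frame is the return address
saved by the function called from that frame's owner.
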
A sample program with comments.
===============================

Comments on the function test
-----------------------------
1) It didn't need to set up a pointer to the constant pool gpr13 as it isn't used
( :-( ).
2) This is a frameless function & no stack is bought.
3) The compiler was clever enough to recognise that it could return the
value in r2 as well as use it for the passed in parameter ( :-) ).
4) The basr ( branch & save register ) trick works as follows: the instruction
has a special case where r0 as the branch target operand is understood as
the literal value 0, i.e. no branch is taken ( some risc architectures also do this ).
So now we are "branching" to the next address & the address of the next
instruction ( the new program counter ) is in r13. We then subtract the size of
the function prologue we have executed + the size of the literal pool to get
to the top of the literal pool.
0040037c int test(int b)
{                                                          # Function prologue below
  40037c:       90 de f0 34     stm     %r13,%r14,52(%r15) # Save registers r13 & r14
  400380:       0d d0           basr    %r13,%r0           # Set up pointer to constant pool using
  400382:       a7 da ff fa     ahi     %r13,-6            # basr trick
        return(5+b);
                                                           # Huge main program
  400386:       a7 2a 00 05     ahi     %r2,5              # add 5 to r2

                                                           # Function epilogue below
  40038a:       98 de f0 34     lm      %r13,%r14,52(%r15) # restore registers r13 & r14
  40038e:       07 fe           br      %r14               # return
}

Comments on the function main
-----------------------------
1) The compiler did this function optimally ( 8-) )

Literal pool for main.
400390: ff ff ff ec     .long 0xffffffec
main(int argc,char *argv[])
{                                                          # Function prologue below
  400394:       90 bf f0 2c     stm     %r11,%r15,44(%r15) # Save necessary registers
  400398:       18 0f           lr      %r0,%r15           # copy stack pointer to r0
  40039a:       a7 fa ff a0     ahi     %r15,-96           # Make area for callee saving
  40039e:       0d d0           basr    %r13,%r0           # Set up r13 to point to
  4003a0:       a7 da ff f0     ahi     %r13,-16           # literal pool
  4003a4:       50 00 f0 00     st      %r0,0(%r15)        # Save backchain

        return(test(5));                                   # Main Program Below
  4003a8:       58 e0 d0 00     l       %r14,0(%r13)       # load relative address of test from
                                                           # literal pool
  4003ac:       a7 28 00 05     lhi     %r2,5              # Set first parameter to 5
  4003b0:       4d ee d0 00     bas     %r14,0(%r14,%r13)  # jump to test setting r14 as return
                                                           # address using branch & save instruction.

                                                           # Function Epilogue below
  4003b4:       98 bf f0 8c     lm      %r11,%r15,140(%r15)# Restore necessary registers.
  4003b8:       07 fe           br      %r14               # return to do program exit
}


Compiler updates
----------------

main(int argc,char *argv[])
{
  4004fc:       90 7f f0 1c             stm     %r7,%r15,28(%r15)
  400500:       a7 d5 00 04             bras    %r13,400508
  400504:       00 40 04 f4             .long   0x004004f4
  # compiler now puts constant pool in code too so it saves an instruction
  400508:       18 0f                   lr      %r0,%r15
  40050a:       a7 fa ff a0             ahi     %r15,-96
  40050e:       50 00 f0 00             st      %r0,0(%r15)
        return(test(5));
  400512:       58 10 d0 00             l       %r1,0(%r13)
  400516:       a7 28 00 05             lhi     %r2,5
  40051a:       0d e1                   basr    %r14,%r1
  # compiler adds 1 extra instruction to epilogue this is done to
  # avoid processor pipeline stalls owing to data dependencies on g5 &
  # above as register 14 in the old code was needed directly after being loaded
  # by the lm   %r11,%r15,140(%r15) for the br %r14.
  40051c:       58 40 f0 98             l       %r4,152(%r15)
  400520:       98 7f f0 7c             lm      %r7,%r15,124(%r15)
  400524:       07 f4                   br      %r4
}


Hartmut ( our compiler developer ) also has been threatening to take out the
stack backchain in optimised code as this also causes pipeline stalls, you
have been warned.

64 bit z/Architecture code disassembly
647
--------------------------------------
648
 
649
If you understand the stuff above you'll understand the stuff
650
below too so I'll avoid repeating myself & just say that
651
some of the instructions have g's on the end of them to indicate
652
they are 64 bit & the stack offsets are a bigger,
653
the only other difference you'll find between 32 & 64 bit is that
654
we now use f4 & f6 for floating point arguments on 64 bit.
00000000800005b0 <test>:
int test(int b)
{
        return(5+b);
    800005b0:   a7 2a 00 05             ahi     %r2,5
    800005b4:   b9 14 00 22             lgfr    %r2,%r2 # downcast to integer
    800005b8:   07 fe                   br      %r14
    800005ba:   07 07                   bcr     0,%r7


}

00000000800005bc <main>:
main(int argc,char *argv[])
{
    800005bc:   eb bf f0 58 00 24       stmg    %r11,%r15,88(%r15)
    800005c2:   b9 04 00 1f             lgr     %r1,%r15
    800005c6:   a7 fb ff 60             aghi    %r15,-160
    800005ca:   e3 10 f0 00 00 24       stg     %r1,0(%r15)
        return(test(5));
    800005d0:   a7 29 00 05             lghi    %r2,5
    # brasl allows jumps > 64k & is overkill here, bras would do fine
    800005d4:   c0 e5 ff ff ff ee       brasl   %r14,800005b0 <test>
    800005da:   e3 40 f1 10 00 04       lg      %r4,272(%r15)
    800005e0:   eb bf f0 f8 00 04       lmg     %r11,%r15,248(%r15)
    800005e6:   07 f4                   br      %r4
}
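
The relative branches in the listings above ( bras, brasl & friends ) hold
their target as a signed count of halfwords from the address of the branch
instruction itself, so a target can be checked from the hex bytes alone.
Here is a minimal Python sketch of that arithmetic, the function name is
made up for illustration & it only covers this style of relative branch.

```python
def relative_branch_target(insn_addr, insn_bytes):
    """Target of a relative branch such as bras (4 bytes) or brasl (6 bytes).

    Everything after the first two bytes is a signed big-endian immediate,
    counted in halfwords from the address of the branch itself.
    """
    halfwords = int.from_bytes(insn_bytes[2:], "big", signed=True)
    return insn_addr + 2 * halfwords


# brasl %r14 at 800005d4 in the 64 bit main() above
print(hex(relative_branch_target(0x800005D4, bytes.fromhex("c0e5ffffffee"))))
# bras %r13 at 400500 in the 31 bit main() above
print(hex(relative_branch_target(0x400500, bytes.fromhex("a7d50004"))))
```

The two results should agree with the targets objdump printed, 800005b0 & 400508.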



Compiling programs for debugging on Linux for s/390 & z/Architecture
====================================================================
-gdwarf-2 now works & should be considered the default debugging
format for s/390 & z/Architecture as it is more reliable for debugging
shared libraries. Normal -g debugging also works much better now,
thanks to the IBM java compiler developers' bug reports.

Debug info is typically enabled by adding/appending the flags -g or -gdwarf-2 to the
CFLAGS & LDFLAGS variables in the Makefile of the program concerned.

If you are using gdb & would like accurate displays of registers &
stack traces, compile without optimisation, i.e. make sure
that there is no -O2 or similar on the CFLAGS line of the Makefile &
in the emitted gcc commands. Obviously this will produce worse code
( not advisable for shipment ) but it is an aid to the debugging process.

This aids debugging because the compiler will copy parameters passed
in registers onto the stack, so backtracing & looking at passed in
parameters will work. However some larger programs which use inline functions
will not compile without optimisation.

Debugging with optimisation has since improved a lot after some bugs
were fixed, please make sure you are using gdb-5.0 or later, developed
after Nov'2000.

Figuring out gcc compile errors
===============================
If you are getting a lot of syntax errors compiling a program & the problem
isn't blatantly obvious from the source,
it often helps to just preprocess the file, this is done with the -E
option in gcc.
This runs only the very first phase of compilation
( compilation in gcc is done in several stages & gcc calls many programs to
achieve its end result ), with the -E option gcc just calls the gcc preprocessor (cpp).
The c preprocessor does the following, it joins all the #included files together
recursively ( #include files can #include other files ) with the c file you wish to compile.
It puts the fully qualified path of each #included file in a line marker
( the lines starting with # in the output ) & it does macro expansion.
This is useful for debugging because
1) You can double check whether the files you expect to be included are the ones
that are being included ( e.g. double check that you aren't going to the i386 asm directory ).
2) You can check that macro definitions aren't clashing with typedefs.
3) You can check that definitions aren't being used before they are being included.
4) It helps put the line emitting the error under the microscope if it contains macros.

For convenience the Linux kernel's Makefile will do preprocessing automatically for you
if you suffix the file you want built with .i ( instead of .o )

e.g.
from the linux directory type
make arch/s390/kernel/signal.i
this will run

s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer
-fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce   -E arch/s390/kernel/signal.c
> arch/s390/kernel/signal.i

Now look at signal.i, you should see something like


# 1 "/home1/barrow/linux/include/asm/types.h" 1
typedef unsigned short umode_t;
typedef __signed__ char __s8;
typedef unsigned char __u8;
typedef __signed__ short __s16;
typedef unsigned short __u16;

If instead you are getting errors further down e.g.
unknown instruction:2515 "move.l" or better still unknown instruction:2515
"Fixme not implemented yet, call Martin" you are probably attempting to compile some code
meant for another architecture or code that is simply not implemented, with a fixme statement
stuck into the inline assembly code so that the author of the file now knows he has work to do.
To look at the assembly emitted by gcc just before it is about to call gas ( the gnu assembler )
use the -S option.
Again for your convenience the Linux kernel's Makefile will hold your hand &
do all this donkey work for you if you build the file with the .s suffix.
e.g.
from the Linux directory type
make arch/s390/kernel/signal.s

s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer
-fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce  -S arch/s390/kernel/signal.c
-o arch/s390/kernel/signal.s


This will output something like the following ( please note the constant pool & the useful comments
in the prologue to give you a hand at interpreting it ).

.LC54:
        .string "misaligned (__u16 *) in __xchg\n"
.LC57:
        .string "misaligned (__u32 *) in __xchg\n"
.L$PG1: # Pool sys_sigsuspend
.LC192:
        .long   -262401
.LC193:
        .long   -1
.LC194:
        .long   schedule-.L$PG1
.LC195:
        .long   do_signal-.L$PG1
        .align 4
.globl sys_sigsuspend
        .type    sys_sigsuspend,@function
sys_sigsuspend:
#       leaf function           0
#       automatics              16
#       outgoing args           0
#       need frame pointer      0
#       call alloca             0
#       has varargs             0
#       incoming args (stack)   0
#       function length         168
        STM     8,15,32(15)
        LR      0,15
        AHI     15,-112
        BASR    13,0
.L$CO1: AHI     13,.L$PG1-.L$CO1
        ST      0,0(15)
        LR    8,2
        N     5,.LC192-.L$PG1(13)

Adding -g to the above makes the output even more useful
e.g. typing
make CC:="s390-gcc -g" kernel/sched.s

which runs
s390-gcc -g -D__KERNEL__ -I/home/barrow/linux-2.3/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -fno-strength-reduce   -S kernel/sched.c -o kernel/sched.s

& also outputs stabs ( debugger ) info, from this info you can find out the
offsets & sizes of various elements in structures.
e.g. the stab for the structure
struct rlimit {
        unsigned long   rlim_cur;
        unsigned long   rlim_max;
};
is
.stabs "rlimit:T(151,2)=s8rlim_cur:(0,5),0,32;rlim_max:(0,5),32,32;;",128,0,0,0
from this stab you can see that
rlim_cur starts at bit offset 0 & is 32 bits in size
rlim_max starts at bit offset 32 & is 32 bits in size.
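
If you have a lot of members to look up, pulling the offsets out of the stab
string by eye gets tedious. The following is a rough Python sketch which
does it for simple struct stabs like the rlimit one above. The function name
is made up, & it assumes every member is laid out as
name:(type),bit offset,bit size; so bitfields & nested struct definitions
are not handled.

```python
import re


def stab_struct_fields(stab):
    """Map member name -> (byte offset, size in bytes) for a struct stab."""
    # Drop "name:T(n,m)=s" & the total struct size which follows it.
    body = stab.split("=s", 1)[1]
    body = re.sub(r"^\d+", "", body)
    fields = {}
    # Each member looks like name:<type info>,<bit offset>,<bit size>;
    for name, bit_offset, bit_size in re.findall(
            r"(\w+):\(\d+,\d+\)[^;]*?,(\d+),(\d+);", body):
        fields[name] = (int(bit_offset) // 8, int(bit_size) // 8)
    return fields


stab = "rlimit:T(151,2)=s8rlim_cur:(0,5),0,32;rlim_max:(0,5),32,32;;"
for name, (offset, size) in stab_struct_fields(stab).items():
    print("%-10s offset 0x%x size %d bytes" % (name, offset, size))
```

The offsets come out in bytes because that is what you want when adding them
to a structure's address in a debugger.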


Debugging Tools:
================

objdump
=======
This is a tool with many options, the most useful being ( if compiled with -g )
objdump --source <program or object file> > <listing file>


The whole kernel can be compiled like this ( doing this will make a 17MB kernel
& a 200 MB listing ), however you have to strip it with the strip command before
building the image to make it a more reasonable size to boot.

A source/assembly mixed dump of the kernel can be done with the line
objdump --source vmlinux > vmlinux.lst
Also if the file isn't compiled -g this will output as much debugging information
as it can ( e.g. function names ), however this is very slow as it spends lots
of time searching for debugging info. The following self explanatory line should be used
instead if the code isn't compiled -g, as it is much faster
objdump --disassemble-all --syms vmlinux > vmlinux.lst

As hard drive space is valuable most of us use the following approach.
1) Look at the emitted psw on the console to find the crash address in the kernel.
2) Look at the file System.map ( in the linux directory ) produced when building
the kernel to find the closest address at or below the current PSW to find the
offending function.
3) Use grep or similar to search the source tree looking for the source file
with this function if you don't know where it is.
4) Rebuild this object file with -g on, as an example suppose the file was
( arch/s390/kernel/signal.o )
5) Assuming the file with the erroneous function is signal.c, move to the base of the
Linux source tree.
6) rm arch/s390/kernel/signal.o
7) make arch/s390/kernel/signal.o
8) Watch the gcc command line emitted
9) Type it in again or alternatively cut & paste it on the console adding the -g option.
10) objdump --source arch/s390/kernel/signal.o > signal.lst
This will output the source & the assembly intermixed, as the snippet below shows.
This will unfortunately output addresses which aren't the same
as the kernel ones, you should be able to get around the mental arithmetic
by playing with the --adjust-vma parameter to objdump.
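
Steps 1 & 2 above ( going from a crash address to the function which contains
it ) are easy to script. Here is a small Python sketch, assuming System.map
lines have the usual "address type name" layout. The function names & two of
the three map lines are made up for illustration, only the do_signal line
appears elsewhere in this document.

```python
import bisect


def load_symbols(system_map_lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in system_map_lines:
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    return sorted(symbols)


def function_at(symbols, address):
    """Return (name, offset into it) for the closest symbol at or below address."""
    index = bisect.bisect_right([addr for addr, _ in symbols], address) - 1
    if index < 0:
        return None
    start, name = symbols[index]
    return name, address - start


# Only the do_signal line comes from this document, the other two are made up.
example_map = [
    "0001f000 T made_up_earlier_function",
    "0001f4e0 T do_signal",
    "0001f900 T made_up_later_function",
]
symbols = load_symbols(example_map)
# On 31 bit the top bit of the PSW address word is the addressing mode bit,
# so mask it off before looking the address up.
psw_address = 0x8001F4F2 & 0x7FFFFFFF
print(function_at(symbols, psw_address))
```

The offset it returns is the distance into the function, which is the number
to go looking for in the objdump listing of the object file.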




extern inline void spin_lock(spinlock_t *lp)
{
      a0:       18 34           lr      %r3,%r4
      a2:       a7 3a 03 bc     ahi     %r3,956
        __asm__ __volatile("    lhi   1,-1\n"
      a6:       a7 18 ff ff     lhi     %r1,-1
      aa:       1f 00           slr     %r0,%r0
      ac:       ba 01 30 00     cs      %r0,%r1,0(%r3)
      b0:       a7 44 ff fd     jm      aa
        saveset = current->blocked;
      b4:       d2 07 f0 68     mvc     104(8,%r15),972(%r4)
      b8:       43 cc
        return (set->sig[0] & mask) != 0;
}

11) If debugging under VM go down to that section in the document for more info.


I now have a tool which takes the pain out of --adjust-vma
& you are able to do something like
make arch/s390/kernel/traps.lst
& it automatically generates the correctly relocated entries for
the text segment in traps.lst.
This tool is now standard in linux distros in scripts/makelst

strace:
-------
Q. What is it ?
A. It is a tool for intercepting calls to the kernel & logging them
to a file & on the screen.

Q. What use is it ?
A. You can use it to find out what files a particular program opens.



Example 1
---------
If you wanted to know how ping works but didn't have the source, do
strace ping -c 1 127.0.0.1
& then look at the man pages for each of the syscalls below
( in fact this is sometimes easier than looking at some spaghetti
source which conditionally compiles for several architectures ).
Not everything that it throws out needs to make sense immediately.

Just looking quickly you can see that it is making up a RAW socket
for the ICMP protocol,
doing an alarm(10) for a 10 second timeout
& doing a gettimeofday call before & after each read to see
how long the replies took, & writing some text to stdout so the user
has an idea what is going on.

socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = 3
getuid()                                = 0
setuid(0)                               = 0
stat("/usr/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory)
stat("/usr/share/locale/libc/C", 0xbffff134) = -1 ENOENT (No such file or directory)
stat("/usr/local/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory)
getpid()                                = 353
setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [49152], 4) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(3, 1), ...}) = 0
mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40008000
ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0
write(1, "PING 127.0.0.1 (127.0.0.1): 56 d"..., 42PING 127.0.0.1 (127.0.0.1): 56 data bytes
) = 42
sigaction(SIGINT, {0x8049ba0, [], SA_RESTART}, {SIG_DFL}) = 0
sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {SIG_DFL}) = 0
gettimeofday({948904719, 138951}, NULL) = 0
sendto(3, "\10\0D\201a\1\0\0\17#\2178\307\36"..., 64, 0, {sin_family=AF_INET,
sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 64
sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0
sigaction(SIGALRM, {0x8049ba0, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0
alarm(10)                               = 0
recvfrom(3, "E\0\0T\0005\0\0@\1|r\177\0\0\1\177"..., 192, 0,
{sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84
gettimeofday({948904719, 160224}, NULL) = 0
recvfrom(3, "E\0\0T\0006\0\0\377\1\275p\177\0"..., 192, 0,
{sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84
gettimeofday({948904719, 166952}, NULL) = 0
write(1, "64 bytes from 127.0.0.1: icmp_se"...,
5764 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=28.0 ms

Example 2
---------
strace passwd 2>&1 | grep open
produces the following output
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/opt/kde/lib/libc.so.5", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libc.so.5", O_RDONLY)        = 3
open("/dev", O_RDONLY)                  = 3
open("/var/run/utmp", O_RDONLY)         = 3
open("/etc/passwd", O_RDONLY)           = 3
open("/etc/shadow", O_RDONLY)           = 3
open("/etc/login.defs", O_RDONLY)       = 4
open("/dev/tty", O_RDONLY)              = 4

The 2>&1 redirects stderr ( which is where strace writes its trace ) to stdout,
& grep then filters this input through the pipe for each line containing the string open.


Example 3
---------
Getting sophisticated
telnetd crashes & I don't know why
Steps
-----
1) Replace the following line in /etc/inetd.conf
telnet  stream  tcp     nowait  root    /usr/sbin/in.telnetd -h
with
telnet  stream  tcp     nowait  root    /blah

2) Create the file /blah with the following contents to start tracing telnetd
#!/bin/bash
/usr/bin/strace -o/t1 -f /usr/sbin/in.telnetd -h
3) chmod 700 /blah to make it executable only to root
4)
killall -HUP inetd
or ps aux | grep inetd
to get inetd's process id
& kill -HUP <inetd's process id> to restart it.

Important options
-----------------
-o is used to tell strace to output to a file, in our case t1 in the root directory
-f is to follow children,
e.g. in our case above telnetd will start the login process & subsequently a shell like bash.
You will be able to tell which is which from the process ID's listed on the left hand side
of the strace output.
-p will tell strace to attach to a running process, yup this can be done provided
it isn't being traced or debugged already & you have enough privileges.
The reason 2 processes cannot trace or debug the same program is that strace
becomes the parent process of the one being debugged & processes ( unlike people )
can have only one parent.


However the file /t1 will get big quite quickly.
To test it telnet 127.0.0.1

now look at what files in.telnetd execve'd
413   execve("/usr/sbin/in.telnetd", ["/usr/sbin/in.telnetd", "-h"], [/* 17 vars */]) = 0
414   execve("/bin/login", ["/bin/login", "-h", "localhost", "-p"], [/* 2 vars */]) = 0

Whey it worked!.
1018
 
1019
 
1020
Other hints:
1021
------------
1022
If the program is not very interactive ( i.e. not much keyboard input )
1023
& is crashing in one architecture but not in another you can do
1024
an strace of both programs under as identical a scenario as you can
1025
on both architectures outputting to a file then.
1026
do a diff of the two traces using the diff program
1027
i.e.
1028
diff output1 output2
1029
& maybe you'll be able to see where the call paths differed, this
1030
is possibly near the cause of the crash.
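
A plain diff of two strace logs tends to drown in process ids & pointer
values which differ between machines for boring reasons. The following is a
rough Python sketch of one way around that, it blanks these out before
comparing. The names normalise & diff_traces are made up for illustration &
the two patterns are only a starting point, extend them to suit your traces.

```python
import difflib
import re


def normalise(line):
    """Blank out details which differ between runs for boring reasons."""
    line = re.sub(r"^\d+\s+", "", line)             # leading pid from strace -f
    line = re.sub(r"0x[0-9a-fA-F]+", "ADDR", line)  # pointers differ per machine
    return line.rstrip()


def diff_traces(trace1, trace2):
    """Return only the added/removed lines between two lists of trace lines."""
    old = [normalise(line) for line in trace1]
    new = [normalise(line) for line in trace2]
    changed = []
    for line in difflib.unified_diff(old, new, "output1", "output2", lineterm=""):
        if line.startswith(("+++", "---", "@@")):
            continue                                # skip the diff headers
        if line.startswith(("+", "-")):
            changed.append(line)
    return changed


# Two made up traces which only really differ in the last call.
good = ['open("/etc/passwd", O_RDONLY) = 3',
        "mmap(0, 4096, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40008000",
        "read(3, buf, 10) = 10"]
bad = ['open("/etc/passwd", O_RDONLY) = 3',
       "mmap(0, 4096, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff08000",
       "read(3, buf, 10) = -1 EIO"]
for line in diff_traces(good, bad):
    print(line)
```

To use it on real logs read output1 & output2 with open(...).readlines() &
pass the two lists in.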

More info
---------
Look at man pages for strace & the various syscalls
e.g. man strace, man alarm, man socket.


Performance Debugging
=====================
gcc is capable of compiling in profiling code, just add the -pg option
to the CFLAGS ( -p is for the older prof tool ), this obviously affects program size & performance.
The resulting profile can be used by gprof, the gnu profiling tool.
gcov, the gnu code coverage tool, needs the -fprofile-arcs -ftest-coverage
options instead ( code coverage is a means of testing
code quality by checking if all the code in an executable is exercised by
a tester ).


Using top to find out where processes are sleeping in the kernel
----------------------------------------------------------------
To do this copy the System.map from the root directory where
the linux kernel was built to the /boot directory on your
linux machine.
Start top
Now type fU
You should see a new field called WCHAN which
tells you where each process is sleeping, here is a typical output.

 6:59pm  up 41 min,  1 user,  load average: 0.00, 0.00, 0.00
28 processes: 27 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  0.0% user,  0.1% system,  0.0% nice, 99.8% idle
Mem:   254900K av,   45976K used,  208924K free,       0K shrd,   28636K buff
Swap:       0K av,       0K used,       0K free                    8620K cached

  PID USER     PRI  NI  SIZE  RSS SHARE WCHAN     STAT  LIB %CPU %MEM   TIME COMMAND
  750 root      12   0   848  848   700 do_select S       0  0.1  0.3   0:00 in.telnetd
  767 root      16   0  1140 1140   964           R       0  0.1  0.4   0:00 top
    1 root       8   0   212  212   180 do_select S       0  0.0  0.0   0:00 init
    2 root       9   0     0    0     0 down_inte SW      0  0.0  0.0   0:00 kmcheck

The time command
----------------
Another related command is the time command which gives you an indication
of where a process is spending the majority of its time.
e.g.
time ping -c 5 nc
outputs
real    0m4.054s
user    0m0.010s
sys     0m0.010s
i.e. of the 4 seconds of real ( wall clock ) time only about 20ms was spent
on the cpu ( user + sys ), the rest was spent sleeping or waiting.

Debugging under VM
==================

Notes
-----
Addresses & values in the VM debugger are always hex never decimal.
Address ranges are of the format <start>-<end> or <start>.<length>
e.g. The address range 0x2000 to 0x3000 can be described as
2000-3000 or 2000.1000
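
If you are scripting around VM it helps to turn both range forms into plain
numbers. Here is a minimal Python sketch, the function name is made up & it
follows the example above in treating 2000.1000 as the same range as
2000-3000 ( check the CP help if the exact end address matters to you ).

```python
def parse_cp_range(text):
    """Turn 'start-end' or 'start.length' ( both hex ) into (start, end)."""
    if "-" in text:
        start, end = text.split("-", 1)
        return int(start, 16), int(end, 16)
    start, length = text.split(".", 1)
    return int(start, 16), int(start, 16) + int(length, 16)


for text in ("2000-3000", "2000.1000", "0197BC.4000"):
    start, end = parse_cp_range(text)
    print("%-12s -> 0x%x to 0x%x" % (text, start, end))
```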

The VM Debugger is case insensitive.

VM's strengths are usually other debuggers' weaknesses, you can get at any resource
no matter how sensitive e.g. memory management resources, change address translation
in the PSW. For kernel hacking you will reap dividends if you get good at it.

The VM Debugger displays operators but not operands, probably because some
of it was written when memory was expensive & the programmer was probably proud that
it fitted into 2k of memory & didn't want to shock hardcore VM'ers by
changing the interface :-), also the debugger displays useful information on the same line &
the author of the code probably felt that it was a good idea not to go over
the 80 columns on the screen.

As some of you are probably in a panic now, this isn't as unintuitive as it may seem
as the 390 instructions are easy to decode mentally & you can make a good guess at a lot
of them as all the operands are nibble ( half byte ) aligned, & if you have an objdump listing
as well it is quite easy to follow. If you don't have an objdump listing keep a copy of
the s/390 Reference Summary & look at pages 2 to 7 or alternatively the
s/390 principles of operation.
e.g. even I can guess that
0001AFF8' LR    180F        CC 0
is a ( load register ) lr r0,r15

Also it is very easy to tell the length of a 390 instruction from the 2 most significant
bits in the instruction ( not that this info is really useful except if you are trying to
make sense of a hexdump of code ).
Here is a table
Bits                    Instruction Length
------------------------------------------
00                          2 Bytes
01                          4 Bytes
10                          4 Bytes
11                          6 Bytes
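
The table above is enough to chop a hexdump of code up into instructions.
A small Python sketch follows, the function names are made up & the
example bytes are taken from the start of the 31 bit main() listing earlier
in this document.

```python
def insn_length(first_byte):
    """Instruction length in bytes, from the 2 most significant bits."""
    return (2, 4, 4, 6)[first_byte >> 6]


def split_instructions(code):
    """Split raw code bytes into a list of hex strings, one per instruction."""
    instructions = []
    position = 0
    while position < len(code):
        length = insn_length(code[position])
        instructions.append(code[position:position + length].hex())
        position += length
    return instructions


# lr %r0,%r15 ; ahi %r15,-96 ; st %r0,0(%r15)
print(split_instructions(bytes.fromhex("180fa7faffa05000f000")))
```

This only works if you start at an instruction boundary, start in the
middle of an instruction or in a constant pool & you will get rubbish.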




The debugger also displays other useful info on the same line such as the
addresses being operated on, destination addresses of branches & condition codes.
e.g.
00019736' AHI   A7DAFF0E    CC 1
000198BA' BRC   A7840004 -> 000198C2'   CC 0
000198CE' STM   900EF068 >> 0FA95E78    CC 2



Useful VM debugger commands
---------------------------

I suppose I'd better mention this before I start,
to list the current active traces do
Q TR
there can be a maximum of 255 of these per set
( more about trace sets later ).
To stop traces issue a
TR END.
To delete a particular breakpoint issue
TR DEL <trace number>

The PA1 key drops to CP mode so you can issue debugger commands,
doing alt c ( on my 3270 console at least ) clears the screen.
Hitting b <enter> comes back to the running operating system
from cp mode ( in our case linux ).
It is typically useful to add shortcuts to your profile.exec file
if you have one ( this is roughly equivalent to autoexec.bat in DOS ),
here are a few from mine.
/* this gives me command history on issuing f12 */
set pf12 retrieve
/* this continues */
set pf8 imm b
/* goes to trace set a */
set pf1 imm tr goto a
/* goes to trace set b */
set pf2 imm tr goto b
/* goes to trace set c */
set pf3 imm tr goto c



Instruction Tracing
-------------------
Setting a simple breakpoint
TR I PSWA <address>
To debug a particular function try
TR I R <function address range>
TR I on its own will single step.
TR I DATA <opcode> <optional range> will trace for particular mnemonics
e.g.
TR I DATA 4D R 0197BC.4000
will trace for BAS'es ( opcode 4D ) in the range 0197BC.4000
if you were inclined you could add traces for all branch instructions &
suffix them with the run option so you would have a backtrace on screen
when a program crashes.
TR BR <INTO or FROM> <address> will trace branches into or out of an address.
e.g.
TR BR INTO 0 is often quite useful if a program is getting awkward & deciding
to branch to 0 & crashing as this will stop at the address before it jumps to 0.
TR I R <address range> RUN cmd d g
single steps a range of addresses but stays running &
displays the gprs on each step.



Displaying & modifying Registers
--------------------------------
D G will display all the gprs
Adding an extra G to all the commands is necessary to access the full 64 bit
content in VM on z/Architecture, obviously this isn't required for access registers
as these are still 32 bit.
e.g. DGG instead of DG
D X will display all the control registers
D AR will display all the access registers
D AR4-7 will display access registers 4 to 7
CPU ALL D G will display the GPRS of all CPUS in the configuration
D PSW will display the current PSW
st PSW 2000 will put the value 2000 into the PSW &
crash your machine.
D PREFIX displays the prefix offset


Displaying Memory
-----------------
To display memory mapped using the current PSW's mapping try
D <range>
To make VM display a message each time it hits a particular address & continue
see the TR I PSWA ... cmd msg trick in the Hints section below.
D I<range> will disassemble/display a range of instructions.
ST <addr> <32 bit word> will store a 32 bit word at a 32 bit aligned address
D T<range> will display the EBCDIC in an address ( if you are that way inclined )
D R<range> will display real addresses ( without DAT ) but with prefixing.
There are other complex options to display if you need to get at say home space
but are in primary space, the easiest thing to do is to temporarily
modify the PSW to the other addressing mode, display the stuff & then
restore it.



Hints
-----
If you want to issue a debugger command without halting your virtual machine with the
PA1 key try prefixing the command with #CP e.g.
#cp tr i pswa 2000
also suffixing most debugger commands with RUN will cause them not
to stop, just display the mnemonic at the current instruction on the console.
If you have several breakpoints you want to put into your program &
you get fed up of cross referencing with System.map
you can do the following trick for several symbols.
grep do_signal System.map
which emits the following among other things
0001f4e0 T do_signal
now you can do

TR I PSWA 0001f4e0 cmd msg * do_signal
This sends a message to your own console each time do_signal is entered.
( As an aside I wrote a perl script once which automatically generated a REXX
script with breakpoints on every kernel procedure, this isn't a good idea
because there are thousands of these routines & VM can only set 255 breakpoints
at a time so you nearly had to spend as long pruning the file down as you would
entering the msg's by hand ), however, the trick might be useful for a single object file.
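
For a handful of symbols the trick can be scripted without running into the
255 limit. The following is a hedged Python sketch, not the perl script
mentioned above. The function name & the made up data symbol are for
illustration only, & the command text is simply copied from the do_signal
example.

```python
MAX_TRACES = 255    # VM's limit on traces per trace set


def trace_commands(system_map_lines, wanted):
    """Build one 'TR I PSWA ... cmd msg *' line per wanted text symbol."""
    commands = []
    for line in system_map_lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        address, symbol_type, name = parts
        # Only text ( code ) symbols make sense as instruction breakpoints.
        if symbol_type in ("T", "t") and name in wanted:
            commands.append("TR I PSWA %s cmd msg * %s" % (address, name))
    if len(commands) > MAX_TRACES:
        raise ValueError("VM can only set %d traces per set" % MAX_TRACES)
    return commands


# The second line is made up, to show that data symbols are skipped.
system_map = ["0001f4e0 T do_signal", "0002a000 D made_up_data_symbol"]
for command in trace_commands(system_map, {"do_signal", "made_up_data_symbol"}):
    print(command)
```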
On linux'es 3270 emulator x3270 there is a very useful option under the File menu,
Save Screens In File, this is very good for keeping a copy of traces.

From CMS help <command name> will give you online help on a particular command.
e.g.
HELP DISPLAY

Also CP has a file called profile.exec which automatically gets called
on startup of CMS ( like autoexec.bat ), keeping on the DOS analogy
CP has a feature similar to doskey, it may be useful for you to
use profile.exec to define some keystrokes.
e.g.
SET PF9 IMM B
This continues ( the b command ) on pressing F9.
SET PF10 ^
This sets up the ^ key
which can be used for ^c (ctrl-c),^z (ctrl-z) which can't be typed directly into some 3270 consoles.
SET PF11 ^-
This types the starting keystrokes for a sysrq see SysRq below.
SET PF12 RETRIEVE
This retrieves command history on pressing F12.


Sometimes in VM the display is set up to scroll automatically, this
can be very annoying if there are messages you wish to look at,
to stop this do
TERM MORE 255 255
This will nearly stop automatic screen updates, however it will
cause a denial of service if lots of messages go to the 3270 console,
so it would be foolish to use this as the default on a production machine.


Tracing particular processes
----------------------------
The kernel's text segment is intentionally at an address in memory where it will
very seldom collide with text segments of user programs ( thanks Martin ),
this simplifies debugging the kernel.
However it is quite common for user processes to have addresses which collide,
this can make debugging a particular process under VM painful under normal
circumstances as the process may change when doing a
TR I R <address range>.
Thankfully after reading VM's online help I figured out how to debug
particular processes in 31 bit mode, however, according to the current
VM online help documentation the method described below uses
TR STO or STD which don't currently work on z/Series while in
64-bit mode.

Your first problem is to find the STD ( segment table designation )
of the program you wish to debug.

There are several ways you can do this, here are a few
1) objdump --syms <program to be debugged> | grep main
To get the address of main in the program.
tr i pswa <address of main>
Start the program, if VM drops to CP on what looks like the entry
point of the main function this is most likely the process you wish to debug.
Now do a D X13 or D XG13 on z/Architecture.
On 31 bit the STD is bits 1-19 ( the STO segment table origin )
& 25-31 ( the STL segment table length ) of CR13.
now type
TR I R STD <CR13's value> 0.7fffffff
e.g.
TR I R STD 8F32E1FF 0.7fffffff
Another very useful variation is
TR STORE INTO STD <CR13's value> <address range>
for finding out when a particular variable changes.

An alternative way of finding the STD of a currently running process
is to do the following ( this method is more complex but
could be quite convenient if you aren't updating the kernel much &
so your kernel structures will stay constant for a reasonable period of
time ).

grep task /proc/<pid>/status
from this you should see something like
task: 0f160000 ksp: 0f161de8 pt_regs: 0f161f68
This now gives you a pointer to the task structure.
Now make CC:="s390-gcc -g" kernel/sched.s
to get the task_struct stabinfo
( task_struct is defined in include/linux/sched.h ).
Now we want to look at
task->active_mm->pgd
on my machine the active_mm in the task structure stab is
active_mm:(4,12),672,32
its offset is 672/8=84=0x54
the pgd member in the mm_struct stab is
pgd:(4,6)=*(29,5),96,32
so its offset is 96/8=12=0xc

so we'll
hexdump -s 0xf160054 /dev/mem | more
i.e. task_struct+active_mm offset
to look at the active_mm member
f160054 0fee cc60 0019 e334 0000 0000 0000 0011
hexdump -s 0x0feecc6c /dev/mem | more
i.e. active_mm+pgd offset
we get something like
feecc6c 0f2c 0000 0000 0001 0000 0001 0000 0010
now do
TR I R STD <pgd|0x7f> 0.7fffffff
i.e. the 0x7f is added because the pgd only
gives the page table origin & we need to set the low bits
to the maximum possible segment table length.
TR I R STD 0f2c007f 0.7fffffff
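
The arithmetic above is easy to get wrong at 3 in the morning, so here it is
as a short Python sketch using the values from my machine. The function
names are made up, & the two values marked as read from hexdump still have
to be read by hand ( or by your own script ) from /dev/mem.

```python
def member_address(struct_address, stab_bit_offset):
    """Address of a member, given the bit offset from its stab entry."""
    return struct_address + stab_bit_offset // 8


def std_for_31bit(pgd):
    """pgd with the low bits set to the maximum segment table length."""
    return pgd | 0x7F


task = 0x0F160000                           # from /proc/<pid>/status
print(hex(member_address(task, 672)))       # where to hexdump for active_mm
active_mm = 0x0FEECC60                      # read from the first hexdump
print(hex(member_address(active_mm, 96)))   # where to hexdump for pgd
pgd = 0x0F2C0000                            # read from the second hexdump
print("TR I R STD %08X 0.7fffffff" % std_for_31bit(pgd))
```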
on z/Architecture you'll probably need to do
TR I R STD <pgd|0x7> 0.ffffffffffffffff
to set the TableType to 0x1 & the Table length to 3.



Tracing Program Exceptions
--------------------------
If you get a crash which says something like
illegal operation or specification exception followed by a register dump
you can restart linux & trace these using the tr prog <range or value> trace option.



The most common ones you will normally be tracing for are
1=operation exception
2=privileged operation exception
4=protection exception
5=addressing exception
6=specification exception
10=segment translation exception
11=page translation exception

The full list of these is on page 22 of the current s/390 Reference Summary.
e.g.
tr prog 10 will trace segment translation exceptions.
tr prog on its own will trace all program interruption codes.

Trace Sets
----------
On starting VM you are initially in the INITIAL trace set.
You can do a Q TR to verify this.
Trace sets help if you have a complex tracing situation where you wish to wait for instance
till a driver is open before you start tracing IO, but know in your
heart that you are going to have to make several runs through the code till you
have a clue what's going on.

What you can do is
TR I PSWA <driver open address>
hit b to continue till breakpoint
reach the breakpoint
now do your
TR GOTO B
TR IO 7c08-7c09 inst int run
or whatever the IO channels you wish to trace are & hit b

To go back to the initial trace set do
TR GOTO INITIAL
& the TR I PSWA <driver open address> will be the only active breakpoint again.
1402
 
1403
 
1404
Tracing linux syscalls under VM
1405
-------------------------------
Syscalls are implemented on Linux for S390 by the Supervisor Call instruction (SVC). There are 256
possibilities of these as the instruction is made up of a 0x0A opcode byte & a second byte holding
the syscall number. They are traced using the simple command
TR SVC <optional value or range>
The syscalls are defined in linux/include/asm-s390/unistd.h
e.g. to trace all file opens just do
1412
TR SVC 5 ( as this is the syscall number of open )
1413
 
1414
 
1415
SMP Specific commands
1416
---------------------
To find out how many cpus you have
Q CPUS displays all the CPU's available to your virtual machine
To find the cpu that the current cpu VM debugger commands are being directed at do
Q CPU
to change the cpu that VM debugger commands are being directed at do
CPU <desired cpu no>

On an SMP guest, to issue a command to all CPUs try prefixing the command with cpu all.
To issue a command to a particular cpu try cpu <cpu number> e.g.
CPU 01 TR I R 2000.3000
If you are running on a guest with several cpus & you have an IO related problem
& cannot follow the flow of code but you know it isn't smp related,
from the bash prompt issue
shutdown -h now or halt.
do a Q CPUS to find out how many cpus you have
detach each one of them from cp except cpu 0
by issuing a
DETACH CPU 01-(number of cpus in configuration)
& boot linux again.
TR SIGP will trace inter processor signal processor instructions.
DEFINE CPU 01-(number in configuration)
will get your guest's cpus back.
1438
 
1439
 
1440
Help for displaying ascii textstrings
1441
-------------------------------------
On the very latest VM Nucleus'es VM can now display ascii
( thanks Neale for the hint ) by doing
D TX<lowaddr>.<len>
e.g.
D TX0.100

Alternatively
=============
Under older VM debuggers ( I love EBCDIC too ) you can use this little program I wrote, which
converts a command line of hex digits to ascii text. It can be compiled under linux &
you can copy the hex digits from your x3270 terminal to your xterm if you are debugging
from a linuxbox.

This is quite useful when looking at a parameter passed in as a text string
under VM ( unless you are good at decoding ASCII in your head ).
1457
 
1458
e.g. consider tracing an open syscall
1459
TR SVC 5
1460
We have stopped at a breakpoint
1461
000151B0' SVC   0A05     -> 0001909A'   CC 0
1462
 
D 20.8 to check the SVC old psw in the prefix area & see whether it was from userspace
( for the layout of the prefix area consult P18 of the s/390 Reference Summary
if you have it available ).
V00000020  070C2000 800151B2
The problem state bit wasn't set & it's also too early in the boot sequence
for it to be a userspace SVC. If it was, we would have to temporarily switch the
psw to user space addressing so we could get at the first parameter of the open in
gpr2.
1471
Next do a
1472
D G2
1473
GPR  2 =  00014CB4
1474
Now display what gpr2 is pointing to
1475
D 00014CB4.20
1476
V00014CB4  2F646576 2F636F6E 736F6C65 00001BF5
1477
V00014CC4  FC00014C B4001001 E0001000 B8070707
1478
 
1479
Alternatively you can do the more elegant
1480
D 0.20;BASE2
1481
BASE2 telling VM to use GPR2 as the base register.
1482
 
1483
 
Now copy the text till the first 00 hex ( which is the end of the string )
to an xterm & do hex2ascii on it.
hex2ascii 2F646576 2F636F6E 736F6C65 00
outputs
Decoded Hex:=/ d e v / c o n s o l e 0x00
We were opening the console device.

You can compile the code below yourself for practice :-),
/*
 *    hex2ascii.c
 *    a useful little tool for converting a hexadecimal command line to ascii
 *
 *    Author(s): Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com)
 *    (C) 2000 IBM Deutschland Entwicklung GmbH, IBM Corporation.
 */
#include <stdio.h>
#include <string.h>

int main(int argc,char *argv[])
{
  int cnt1,cnt2,len,toggle=0;
  int startcnt=1;
  unsigned char c,hex=0;

  /* -a gives compact output: no spaces & '.' for unprintable bytes */
  if(argc>1&&(strcmp(argv[1],"-a")==0))
     startcnt=2;
  printf("Decoded Hex:=");
  for(cnt1=startcnt;cnt1<argc;cnt1++)
  {
    len=strlen(argv[cnt1]);
    for(cnt2=0;cnt2<len;cnt2++)
    {
       /* convert one hex digit to its 4 bit value */
       c=argv[cnt1][cnt2];
       if(c>='0'&&c<='9')
          c=c-'0';
       else if(c>='A'&&c<='F')
          c=c-'A'+10;
       else if(c>='a'&&c<='f')
          c=c-'a'+10;
       switch(toggle)
       {
          case 0:
             /* first digit of a pair is the high nibble */
             hex=c<<4;
             toggle=1;
          break;
          case 1:
             /* second digit completes the byte */
             hex+=c;
             if(hex<32||hex>126)
             {
                if(startcnt==1)
                   printf("0x%02X ",(int)hex);
                else
                   printf(".");
             }
             else
             {
               printf("%c",hex);
               if(startcnt==1)
                  printf(" ");
             }
             toggle=0;
          break;
       }
    }
  }
  printf("\n");
  return 0;
}
1550
 
1551
 
1552
 
1553
 
1554
Stack tracing under VM
1555
----------------------
1556
A basic backtrace
1557
-----------------
1558
 
Here are the tricks I use; 9 out of 10 times they work pretty well.

When your backchain reaches a dead end
--------------------------------------
This can happen when an exception happens in the kernel & the kernel is entered twice.
If you reach the NULL pointer at the end of the back chain you should be
able to sniff further back if you follow the following tricks.
1) A kernel address should be easy to recognise since it is in
primary space & the problem state bit isn't set & also
the Hi bit of the address is set.
2) Another backchain should also be easy to recognise since it is an
address pointing to another address approximately 100 bytes or 0x70 hex
behind the current stackpointer.
1572
 
1573
 
1574
Here is some practice.
1575
boot the kernel & hit PA1 at some random time
1576
d g to display the gprs, this should display something like
1577
GPR  0 =  00000001  00156018  0014359C  00000000
1578
GPR  4 =  00000001  001B8888  000003E0  00000000
1579
GPR  8 =  00100080  00100084  00000000  000FE000
1580
GPR 12 =  00010400  8001B2DC  8001B36A  000FFED8
Note that GPR14 is a return address but as we are real men we are going to
trace the stack.
Display 0x40 bytes after the stack pointer ( GPR15 ) with
d 000FFED8.40
1584
 
1585
V000FFED8  000FFF38 8001B838 80014C8E 000FFF38
1586
V000FFEE8  00000000 00000000 000003E0 00000000
1587
V000FFEF8  00100080 00100084 00000000 000FE000
1588
V000FFF08  00010400 8001B2DC 8001B36A 000FFED8
1589
 
1590
 
Ah now look at what's in sp+56 (sp+0x38): this is 8001B36A, our saved r14 if
you look above at our stackframe, & it also agrees with GPR14.

now backchain
d 000FFF38.40
we are now taking the contents of SP to get our first backchain.
1597
 
1598
V000FFF38  000FFFA0 00000000 00014995 00147094
1599
V000FFF48  00147090 001470A0 000003E0 00000000
1600
V000FFF58  00100080 00100084 00000000 001BF1D0
1601
V000FFF68  00010400 800149BA 80014CA6 000FFF38
1602
 
1603
This displays a 2nd return address of 80014CA6
1604
 
1605
now do d 000FFFA0.40 for our 3rd backchain
1606
 
1607
V000FFFA0  04B52002 0001107F 00000000 00000000
1608
V000FFFB0  00000000 00000000 FF000000 0001107F
1609
V000FFFC0  00000000 00000000 00000000 00000000
1610
V000FFFD0  00010400 80010802 8001085A 000FFFA0
1611
 
1612
 
1613
our 3rd return address is 8001085A
1614
 
1615
as the 04B52002 looks suspiciously like rubbish it is fair to assume that the kernel entry routines
1616
for the sake of optimisation dont set up a backchain.
1617
 
1618
now look at System.map to see if the addresses make any sense.
1619
 
1620
grep -i 0001b3 System.map
1621
outputs among other things
1622
0001b304 T cpu_idle
1623
so 8001B36A
1624
is cpu_idle+0x66 ( quiet the cpu is asleep, don't wake it )
1625
 
1626
 
1627
grep -i 00014 System.map
1628
produces among other things
1629
00014a78 T start_kernel
so 80014CA6 is start_kernel+0x22e ( a hex number I can't work out in my head ).
1631
 
1632
grep -i 00108 System.map
1633
this produces
1634
00010800 T _stext
1635
so   8001085A is _stext+0x5a
1636
 
1637
Congrats you've done your first backchain.
1638
 
1639
 
1640
 
1641
s/390 & z/Architecture IO Overview
1642
==================================
1643
 
I am not going to give a course in 390 IO architecture as this would take me quite a
while & I'm no expert. Instead I'll give a 390 IO architecture summary for Dummies. If you have
the s/390 Principles of Operation available read that instead. If nothing else you may find a few
useful keywords in here & be able to use them on a web search engine like altavista to find
more useful information.

Unlike other bus architectures modern 390 systems do their IO using mostly
fibre optics, & devices such as tapes & disks can be shared between several mainframes.
Also S390 can support up to 65536 devices while a high end PC based system might be choking
with around 64. Here is some of the common IO terminology.
1654
 
1655
Subchannel:
This is the logical number most IO commands use to talk to an IO device. There can be up to
0x10000 (65536) of these in a configuration; typically there are a few hundred. Under VM
for simplicity they are allocated contiguously, however on the native hardware they are not;
they typically stay consistent between boots provided no new hardware is inserted or removed.
Under Linux for 390 we use these as IRQ's & also when issuing an IO command ( CLEAR SUBCHANNEL,
HALT SUBCHANNEL, MODIFY SUBCHANNEL, RESUME SUBCHANNEL, START SUBCHANNEL, STORE SUBCHANNEL &
TEST SUBCHANNEL ) we use this as the ID of the device we wish to talk to. The most
important of these instructions are START SUBCHANNEL ( to start IO ), TEST SUBCHANNEL ( to check
whether the IO completed successfully ) & HALT SUBCHANNEL ( to kill IO ). A subchannel
can have up to 8 channel paths to a device; this offers redundancy if one is not available.
1666
 
1667
 
1668
Device Number:
This number remains static & is closely tied to the hardware. There are 65536 of these;
they are made up of a CHPID ( Channel Path ID, the most significant 8 bits )
& another 8 least significant bits. These remain static even if more devices are inserted or removed
from the hardware. There is a 1 to 1 mapping between Subchannels & Device Numbers provided
devices aren't inserted or removed.
1674
 
1675
Channel Control Words:
CCWs are linked lists of instructions initially pointed to by an operation request block (ORB),
which is given to the START SUBCHANNEL (SSCH) instruction along with the subchannel number
for the IO subsystem to process while the CPU continues executing normal code.
These come in two flavours, Format 0 ( 24 bit, for backward
compatibility ) & Format 1 ( 31 bit ). These are typically used to issue read & write
( & many other instructions ); they consist of a length field & an absolute address field.
For each IO you typically get 1 or 2 interrupts, one for channel end ( primary status ) when the
channel is idle & the second for device end ( secondary status ); sometimes you get both
concurrently. You check how the IO went by issuing a TEST SUBCHANNEL at each interrupt,
from which you receive an Interruption Response Block (IRB). If you get channel & device end
status in the IRB without channel checks etc. your IO probably went okay. If you didn't you
probably need a doctor to examine the IRB & extended status word etc.
If an error occurs, more sophisticated control units have a facility known as
concurrent sense. This means that Extended sense information will
be presented in the Extended status word in the IRB; if not you have to issue a
subsequent SENSE CCW command after the test subchannel.
1692
 
1693
 
1694
TPI( Test pending interrupt) can also be used for polled IO but in multitasking multiprocessor
1695
systems it isn't recommended except for checking special cases ( i.e. non looping checks for
1696
pending IO etc. ).
1697
 
1698
Store Subchannel & Modify Subchannel can be used to examine & modify operating characteristics
1699
of a subchannel ( e.g. channel paths ).
1700
 
1701
Other IO related Terms:
1702
Sysplex: S390's Clustering Technology
1703
QDIO: S390's new high speed IO architecture to support devices such as gigabit ethernet,
1704
this architecture is also designed to be forward compatible with up & coming 64 bit machines.
1705
 
1706
 
1707
General Concepts
1708
 
1709
Input Output Processors (IOP's) are responsible for communicating between
1710
the mainframe CPU's & the channel & relieve the mainframe CPU's from the
1711
burden of communicating with IO devices directly, this allows the CPU's to
1712
concentrate on data processing.
1713
 
IOP's can use one or more links ( known as channel paths ) to talk to each
IO device. An IOP first checks for path availability & chooses an available one,
then starts ( & sometimes terminates ) IO.
There are two types of channel path: ESCON & the Parallel IO interface.
1718
 
1719
IO devices are attached to control units, control units provide the
1720
logic to interface the channel paths & channel path IO protocols to
1721
the IO devices, they can be integrated with the devices or housed separately
1722
& often talk to several similar devices ( typical examples would be raid
1723
controllers or a control unit which connects to 1000 3270 terminals ).
1724
 
1725
 
    +---------------------------------------------------------------+
    | +-----+ +-----+ +-----+ +-----+  +----------+  +----------+   |
    | | CPU | | CPU | | CPU | | CPU |  |  Main    |  | Expanded |   |
    | |     | |     | |     | |     |  |  Memory  |  |  Storage |   |
    | +-----+ +-----+ +-----+ +-----+  +----------+  +----------+   |
    |---------------------------------------------------------------|
    |   IOP        |      IOP      |       IOP                      |
    |---------------------------------------------------------------|
    | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C |
    +---------------------------------------------------------------+
         ||                                              ||
         ||  Bus & Tag Channel Path                      || ESCON
         ||  ======================                      || Channel
         ||  ||                  ||                      || Path
    +----------+               +----------+         +----------+
    |          |               |          |         |          |
    |    CU    |               |    CU    |         |    CU    |
    |          |               |          |         |          |
    +----------+               +----------+         +----------+
       |      |                     |                |       |
+----------+ +----------+      +----------+   +----------+ +----------+
|I/O Device| |I/O Device|      |I/O Device|   |I/O Device| |I/O Device|
+----------+ +----------+      +----------+   +----------+ +----------+
  CPU = Central Processing Unit
  C = Channel
  IOP = IO Processor
  CU = Control Unit
1753
 
The 390 IO systems come in 2 flavours; the current 390 machines support both.

The older 360 & 370 interface, sometimes called the Parallel I/O interface,
sometimes called Bus-and-Tag & sometimes the Original Equipment Manufacturers
Interface (OEMI).

This byte wide parallel channel path/bus has parity & data on the "Bus" cable
& control lines on the "Tag" cable. These can operate in byte multiplex mode for
sharing between several slow devices or burst mode & monopolize the channel for the
whole burst. Up to 256 devices can be addressed on one of these cables. These cables are
about one inch in diameter. The maximum unextended length supported by these cables is
125 metres but this can be extended up to 2km with a fibre optic channel extender
such as a 3044. The maximum burst speed supported is 4.5 megabytes per second, however
some really old processors support only transfer rates of 3.0, 2.0 & 1.0 MB/sec.
One of these paths can be daisy chained to up to 8 control units.
1769
 
1770
 
ESCON ( fibre optic; its later Fibre Channel based successor is called FICON )
was introduced by IBM in 1990. It has 2 fibre optic cables & uses either leds or lasers
for communication at a signaling rate of up to 200 megabits/sec. As 10 bits are transferred
for every 8 bits of info this drops to 160 megabits/sec ( 20 megabytes/sec ) & to 18.6 megabytes/sec once
control info & CRC are added. ESCON only operates in burst mode.

ESCON's typical max cable length is 3km for the led version & 20km for the laser version
known as XDF ( extended distance facility ). This can be further extended by using an
ESCON director which triples the above mentioned ranges. Unlike Bus & Tag, as ESCON is
serial it uses a packet switching architecture; the standard Bus & Tag control protocol
is however present within the packets. Up to 256 devices can be attached to each control
unit that uses one of these interfaces.
1783
 
Common 390 Devices include:
Network adapters, typically OSA2, 3172's, 2116's & OSA-E gigabit ethernet adapters.
Consoles, 3270 & 3215 ( a teletype emulated under linux for a line mode console ).
DASD's, direct access storage devices ( otherwise known as hard disks ).
Tape Drives.
CTC ( Channel to Channel Adapters ),
ESCON or Parallel cables used as a very high speed serial link
between 2 machines. We use 2 cables under linux to do a bi-directional serial link.
1792
 
1793
 
1794
Debugging IO on s/390 & z/Architecture under VM
1795
===============================================
1796
 
1797
Now we are ready to go on with IO tracing commands under VM
1798
 
1799
A few self explanatory queries:
1800
Q OSA
1801
Q CTC
1802
Q DISK ( This command is CMS specific )
1803
Q DASD
1804
 
1805
 
1806
 
1807
 
1808
 
1809
 
1810
Q OSA on my machine returns
1811
OSA  7C08 ON OSA   7C08 SUBCHANNEL = 0000
1812
OSA  7C09 ON OSA   7C09 SUBCHANNEL = 0001
1813
OSA  7C14 ON OSA   7C14 SUBCHANNEL = 0002
1814
OSA  7C15 ON OSA   7C15 SUBCHANNEL = 0003
1815
 
If you have a guest with certain privileges you may be able to see devices
which don't belong to you; to avoid this add the option V.
e.g.
Q V OSA

Now using the device numbers returned by this command we will
trace the IO starting up on the first device, 7c08 & 7c09.
In our simplest case we can trace the
start subchannels
like TR SSCH 7C08-7C09
or the halt subchannels
like TR HSCH 7C08-7C09
MSCH's, STSCH's, I think you can guess the rest.
1829
 
Ingo's favourite trick is tracing all the IO's & CCWs & spooling them into the reader of another
VM guest so he can ftp the logfile back to his own machine. I'll do a small bit of this & give you
a look at the output.

1) Spool stdout to VM reader
SP PRT TO (another vm guest ) or * for the local vm guest
2) Fill the reader with the trace
TR IO 7c08-7c09 INST INT CCW PRT RUN
3) Start up linux
i 00c
4) Finish the trace
TR END
5) close the reader
C PRT
6) list reader contents
RDRLIST
7) copy it to linux4's minidisk
RECEIVE / LOG TXT A1 ( replace
8) look at it
filel & press F11
You should see something like
1851
 
00020942' SSCH  B2334000    0048813C    CC 0    SCH 0000    DEV 7C08
          CPA 000FFDF0   PARM 00E2C9C4    KEY 0  FPI C0  LPM 80
          CCW    000FFDF0  E4200100 00487FE8   0000  E4240100 ........
          IDAL                                      43D8AFE8
          IDAL                                      0FB76000
00020B0A'   I/O DEV 7C08 -> 000197BC'   SCH 0000   PARM 00E2C9C4
00021628' TSCH  B2354000 >> 00488164    CC 0    SCH 0000    DEV 7C08
          CCWA 000FFDF8   DEV STS 0C  SCH STS 00  CNT 00EC
           KEY 0   FPI C0  CC 0   CTLS 4007
00022238' STSCH B2344000 >> 00488108    CC 0    SCH 0000    DEV 7C08
1862
 
If you don't like messing up your reader ( because you possibly booted from it )
you can alternatively spool it to another guest's reader.
1865
 
1866
 
1867
Other common VM device related commands
1868
---------------------------------------------
These commands are listed only because they have
been of use to me in the past & may be of use to
you too. For more complete info on each of the commands
type HELP <command> from CMS.
detaching devices
DET <devno range>
ATT <devno range> <guest>
attach a device to guest, * for your own guest
READY <devno> causes VM to issue a fake interrupt.

The VARY command is normally only available to VM administrators.
VARY ON PATH <path> TO <devno range>
VARY OFF PATH <path> FROM <devno range>
This is used to switch on or off channel paths to devices.

Q CHPID <channel path ID>
This displays state of devices using this channel path
D SCHIB <subchannel>
This displays the subchannel information SCHIB block for the device.
this I believe is also only available to administrators.
DEFINE CTC <devno>
defines a virtual CTC channel to channel connection
2 need to be defined on each guest for the CTC driver to use.
COUPLE devno userid remote devno
Joins a local virtual device to a remote virtual device
( commonly used for the CTC driver ).

Building a VM ramdisk under CMS which linux can use
def vfb-<blocksize> <subchannel> <number blocks>
blocksize is commonly 4096 for linux.
Formatting it
format <subchannel> <drive letter e.g. x> (blksize <blocksize>

Sharing a disk between multiple guests
LINK userid devno1 devno2 mode password
1904
 
1905
 
1906
 
1907
GDB on S390
1908
===========
N.B. if compiling for debugging gdb works better without optimisation
( see Compiling programs for debugging )

invocation
----------
gdb <victim program> <optional corefile>

Online help
-----------
help: gives help on commands
e.g.
help
help display
Note gdb's online help is very good, use it.
1923
 
1924
 
1925
Assembly
1926
--------
info registers: displays registers other than floating point.
info all-registers: displays floating points as well.
disassemble: disassembles
e.g.
disassemble without parameters will disassemble the current function
disassemble $pc $pc+10
( newer versions of gdb want a comma here, i.e. disassemble $pc,$pc+10 )
1933
 
1934
Viewing & modifying variables
1935
-----------------------------
1936
print or p: displays variable or register
1937
e.g. p/x $sp will display the stack pointer
1938
 
1939
display: prints variable or register each time program stops
1940
e.g.
1941
display/x $pc will display the program counter
1942
display argc
1943
 
1944
undisplay : undo's display's
1945
 
1946
info breakpoints: shows all current breakpoints
1947
 
info stack: shows stack back trace ( if this doesn't work too well, I'll show you the
stacktrace by hand below ).

info locals: displays local variables.

info args: display current procedure arguments.

set args: will set argc & argv each time the victim program is invoked.

set <variable>=value
e.g.
set argc=100
set $pc=0
1960
 
1961
 
1962
 
1963
Modifying execution
1964
-------------------
1965
step: steps n lines of sourcecode
1966
step steps 1 line.
1967
step 100 steps 100 lines of code.
1968
 
1969
next: like step except this will not step into subroutines
1970
 
1971
stepi: steps a single machine code instruction.
1972
e.g. stepi 100
1973
 
1974
nexti: steps a single machine code instruction but will not step into subroutines.
1975
 
1976
finish: will run until exit of the current routine
1977
 
1978
run: (re)starts a program
1979
 
1980
cont: continues a program
1981
 
1982
quit: exits gdb.
1983
 
1984
 
1985
breakpoints
1986
------------
1987
 
1988
break
1989
sets a breakpoint
1990
e.g.
1991
 
1992
break main
1993
 
1994
break *$pc
1995
 
1996
break *0x400618
1997
 
here's a really useful one for large programs
rbr <regexp>
Set a breakpoint for all functions matching REGEXP
e.g.
rbr 390
will set a breakpoint on all functions with 390 in their name.
2004
 
2005
info breakpoints
2006
lists all breakpoints
2007
 
2008
delete: delete breakpoint by number or delete them all
2009
e.g.
2010
delete 1 will delete the first breakpoint
2011
delete will delete them all
2012
 
watch: This will set a watchpoint ( usually hardware assisted ).
This will watch a variable till it changes
e.g.
watch cnt will watch the variable cnt till it changes.
As an aside, unfortunately gdb's architecture independent watchpoint code
is inconsistent & not very good; watchpoints usually work but not always.
2019
 
2020
info watchpoints: Display currently active watchpoints
2021
 
2022
condition: ( another useful one )
2023
Specify breakpoint number N to break only if COND is true.
2024
Usage is `condition N COND', where N is an integer and COND is an
2025
expression to be evaluated whenever breakpoint N is reached.
2026
 
2027
 
2028
 
2029
User defined functions/macros
2030
-----------------------------
define: ( Note this is very very useful, simple & powerful )
usage define <name> <list of commands> end
2033
 
2034
examples which you should consider putting into .gdbinit in your home directory
2035
define d
2036
stepi
2037
disassemble $pc $pc+10
2038
end
2039
 
2040
define e
2041
nexti
2042
disassemble $pc $pc+10
2043
end
2044
 
2045
 
2046
Other hard to classify stuff
2047
----------------------------
2048
signal n:
2049
sends the victim program a signal.
2050
e.g. signal 3 will send a SIGQUIT.
2051
 
2052
info signals:
2053
what gdb does when the victim receives certain signals.
2054
 
2055
list:
2056
e.g.
2057
list lists current function source
list 1,10 list first 10 lines of current file.
list test.c:1,10


directory:
Adds directories to be searched for source if gdb cannot find the source.
( note it is a bit sensitive about slashes )
e.g. To add the root of the filesystem to the searchpath do
directory //
2067
 
2068
 
call <function>
This calls a function in the victim program, this is pretty powerful
e.g.
(gdb) call printf("hello world")
outputs:
$1 = 11

You might now be thinking that the line above didn't work & something extra had to be done.
stdout is buffered, so flush it:
(gdb) call fflush(stdout)
hello world$2 = 0
As an aside the debugger also calls malloc & free under the hood
to make space for the "hello world" string.
2081
 
2082
 
2083
 
2084
hints
2085
-----
1) command completion works just like bash
( if you are a bad typist like me this really helps )
e.g. hit br <TAB> & cursor up & down :-).

2) if you have a debugging problem that takes a few steps to recreate
put the steps into a file called .gdbinit in your current working directory.
If you have defined a few extra useful user defined commands put these in
a .gdbinit in your home directory & they will be read each time gdb is launched.
2094
 
2095
A typical .gdbinit file might be.
2096
break main
2097
run
2098
break runtime_exception
2099
cont
2100
 
2101
 
2102
stack chaining in gdb by hand
2103
-----------------------------
This is done using the same trick described for VM
p/x (*($sp+56))&0x7fffffff gets the first return address.
this outputs
$5 = 0x528f18
on my machine.
Now you can use
info symbol (*($sp+56))&0x7fffffff
you might see something like.
rl_getc + 36 in section .text  telling you what is located at address 0x528f18
Now do.
p/x (*(*$sp+56))&0x7fffffff
This outputs
$6 = 0x528ed0
Now do.
info symbol (*(*$sp+56))&0x7fffffff
rl_read_key + 180 in section .text
now do
p/x (*(**$sp+56))&0x7fffffff
& so on.

For z/Architecture
replace 56 with 112 & leave out the &0x7fffffff
in the commands above, & do nasty casts to longs like the following
as gdb unfortunately deals with printed arguments as ints which
messes up everything.
i.e. here is a 3rd backchain dereference
p/x *(long *)(***(long ***)$sp+112)

Another good trick to look at addresses on the stack if you've somehow lost
the backchain is.
x/500xa $sp
This displays the 500 words above the stack pointer along with the name
of any known function they point into.
2139
 
2140
Disassembling instructions without debug info
2141
---------------------------------------------
gdb typically complains if there is a lack of debugging
symbols in the disassemble command with
"No function contains specified address." to get around
this do
x/<number lines to disassemble>xi <address>
e.g.
x/20xi 0x400730
2149
 
2150
 
2151
 
2152
Note: Remember gdb has history just like bash you don't need to retype the
2153
whole line just use the up & down arrows.
2154
 
2155
 
2156
 
2157
For more info
2158
-------------
2159
From your linuxbox do
2160
man gdb or info gdb.
2161
 
2162
core dumps
2163
----------
What is a core dump ?
A core dump is a file generated by the kernel ( if allowed ) which contains the registers
& all active pages of the program which has crashed.
From this file gdb will allow you to look at the registers & stack trace & memory of the
program as if it just crashed on your system. It is usually called core & created in the
current working directory.
This is very useful in that a customer can mail a core dump to a technical support department
& the technical support department can reconstruct what happened,
provided they have an identical copy of this program with debugging symbols compiled in &
the source base of this build is available.
In short it is far more useful than something like a crash log could ever hope to be.

In theory all that is missing to restart a core dumped program is a kernel patch which
will do the following.
1) Make a new kernel task structure
2) Reload all the dumped pages back into the kernel's memory management structures.
3) Do the required clock fixups
4) Get all files & network connections for the process back into an identical state ( really difficult ).
5) A few more difficult things I haven't thought of.
2183
 
2184
 
2185
 
Why have I never seen one ?
Probably because you haven't used the command
ulimit -c unlimited in bash
to allow core dumps, now do
ulimit -a
to verify that the limit was accepted.
2192
 
2193
A sample core dump
2194
To create this I'm going to do
2195
ulimit -c unlimited
2196
gdb
2197
to launch gdb (my victim app. ) now be bad & do the following from another
2198
telnet/xterm session to the same machine
2199
ps -aux | grep gdb
2200
kill -SIGSEGV 
2201
or alternatively use killall -SIGSEGV gdb if you have the killall command.
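
The same experiment can be tried on a harmless victim without a second session.
This is only a sketch, assuming a shell with the wait & kill builtins & the sleep
command; sleep stands in for gdb here & the variable names are mine. An exit status
of 139 is 128 + 11, i.e. the process was terminated by signal 11 ( SIGSEGV ).
Whether a core file actually appears depends on the limit set with ulimit.

```shell
# Start a harmless victim in the background ( sleep stands in for gdb ).
sleep 60 &
victim=$!
# Send it a segmentation fault signal, as kill -SIGSEGV would from another session.
kill -SEGV "$victim"
# Collect the exit status; wait reports 128 + signal number for a signalled child.
status=0
wait "$victim" || status=$?
echo "victim exit status: $status"
```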
2202
Now look at the core dump.
2203
./gdb ./gdb core
2204
Displays the following
2205
GNU gdb 4.18
2206
Copyright 1998 Free Software Foundation, Inc.
2207
GDB is free software, covered by the GNU General Public License, and you are
2208
welcome to change it and/or distribute copies of it under certain conditions.
2209
Type "show copying" to see the conditions.
2210
There is absolutely no warranty for GDB.  Type "show warranty" for details.
2211
This GDB was configured as "s390-ibm-linux"...
2212
Core was generated by `./gdb'.
2213
Program terminated with signal 11, Segmentation fault.
2214
Reading symbols from /usr/lib/libncurses.so.4...done.
2215
Reading symbols from /lib/libm.so.6...done.
2216
Reading symbols from /lib/libc.so.6...done.
2217
Reading symbols from /lib/ld-linux.so.2...done.
2218
#0  0x40126d1a in read () from /lib/libc.so.6
2219
Setting up the environment for debugging gdb.
2220
Breakpoint 1 at 0x4dc6f8: file utils.c, line 471.
2221
Breakpoint 2 at 0x4d87a4: file top.c, line 2609.
2222
(top-gdb) info stack
2223
#0  0x40126d1a in read () from /lib/libc.so.6
2224
#1  0x528f26 in rl_getc (stream=0x7ffffde8) at input.c:402
2225
#2  0x528ed0 in rl_read_key () at input.c:381
2226
#3  0x5167e6 in readline_internal_char () at readline.c:454
2227
#4  0x5168ee in readline_internal_charloop () at readline.c:507
2228
#5  0x51692c in readline_internal () at readline.c:521
2229
#6  0x5164fe in readline (prompt=0x7ffff810 "\177x\177\177x")
2230
    at readline.c:349
2231
#7  0x4d7a8a in command_line_input (prrompt=0x564420 "(gdb) ", repeat=1,
2232
    annotation_suffix=0x4d6b44 "prompt") at top.c:2091
2233
#8  0x4d6cf0 in command_loop () at top.c:1345
2234
#9  0x4e25bc in main (argc=1, argv=0x7ffffdf4) at main.c:635
2235
 
2236
 
2237
LDD
2238
===
2239
This is a program which lists the shared libraries which a library needs,
2240
Note you also get the relocations of the shared library text segments which
2241
help when using objdump --source.
2242
e.g.
2243
 ldd ./gdb
2244
outputs
2245
libncurses.so.4 => /usr/lib/libncurses.so.4 (0x40018000)
2246
libm.so.6 => /lib/libm.so.6 (0x4005e000)
2247
libc.so.6 => /lib/libc.so.6 (0x40084000)
2248
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
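
As a sketch of how these addresses get used: frame #0 of the sample core dump is at
0x40126d1a inside libc, & ldd reports libc loaded at 0x40084000, so subtracting the
two gives the offset of that instruction within the library, which is the address to
look for in the objdump output of libc. This assumes the library was loaded at the
same address in the crashed process as ldd reports; the function name lib_offset is
made up for this example.

```shell
# Offset of a runtime address inside a shared library:
#   offset = runtime address - load address reported by ldd
lib_offset() {
    # $1 = runtime address, $2 = library load address ( both hex with 0x prefix )
    printf '0x%x\n' $(( $1 - $2 ))
}

# read() frame from the core dump versus the libc base from ldd ./gdb
lib_offset 0x40126d1a 0x40084000
```

This prints 0xa2d1a.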
2249
 
2250
 
2251
Debugging shared libraries
2252
==========================
2253
Most programs use shared libraries, however it can be very painful
2254
when you single step instruction into a function like printf for the
2255
first time & you end up in functions like _dl_runtime_resolve this is
2256
the ld.so doing lazy binding, lazy binding is a concept in ELF where
2257
shared library functions are not loaded into memory unless they are
2258
actually used, great for saving memory but a pain to debug.
2259
To get around this either relink the program -static or exit gdb type
2260
export LD_BIND_NOW=true this will stop lazy binding & restart the gdb'ing
2261
the program in question.
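
If you would rather not leave LD_BIND_NOW exported in your shell, the variable can
be set for a single command only. A small sketch, with env standing in for the
command you would really start ( e.g. gdb on your program ); it only shows what the
child process sees, & the variable name bind_now_seen is mine.

```shell
# Set LD_BIND_NOW for one command only; the parent shell stays untouched.
# env stands in for the real command & prints the child's environment.
bind_now_seen=$(LD_BIND_NOW=1 env | grep '^LD_BIND_NOW=')
echo "child sees: $bind_now_seen"
```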
2262
 
2263
 
2264
 
2265
Debugging modules
2266
=================
2267
As modules are dynamically loaded into the kernel their address can be
2268
anywhere to get around this use the -m option with insmod to emit a load
2269
map which can be piped into a file if required.
2270
 
2271
The proc file system
2272
====================
2273
What is it ?.
2274
It is a filesystem created by the kernel with files which are created on demand
2275
by the kernel if read, or can be used to modify kernel parameters,
2276
it is a powerful concept.
2277
 
2278
e.g.
2279
 
2280
cat /proc/sys/net/ipv4/ip_forward
2281
On my machine outputs
0
telling me ip_forwarding is not on. To switch it on I can do
echo 1 >  /proc/sys/net/ipv4/ip_forward
cat it again
cat /proc/sys/net/ipv4/ip_forward
On my machine now outputs
1
IP forwarding is on.
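
The read half of this can be wrapped in a few lines of shell. A sketch only: the
function name forwarding_state is made up, & it takes the file to read as an optional
argument so that it can be tried on an ordinary file first.

```shell
# Report whether IP forwarding is on, given the proc file to read.
# The default path is the one used above.
forwarding_state() {
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 0
    fi
    case "$(cat "$file")" in
        0) echo "off" ;;
        *) echo "on" ;;
    esac
}

forwarding_state
```

Writing to the file still needs root, so the sketch deliberately only reads.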
There is a lot of useful info in here, best found by going in & having a look around,
so I'll take you through some entries I consider important.

All the processes running on the machine have their own entry defined by
/proc/
So let's have a look at the init process
cd /proc/1

cat cmdline
emits
init [2]

cd /proc/1/fd
This contains numerical entries for all the open files,
some of which you can cat, e.g. stdout (1).
2305
 
2306
cat /proc/29/maps
2307
on my machine emits
2308
 
2309
00400000-00478000 r-xp 00000000 5f:00 4103       /bin/bash
2310
00478000-0047e000 rw-p 00077000 5f:00 4103       /bin/bash
2311
0047e000-00492000 rwxp 00000000 00:00 0
2312
40000000-40015000 r-xp 00000000 5f:00 14382      /lib/ld-2.1.2.so
2313
40015000-40016000 rw-p 00014000 5f:00 14382      /lib/ld-2.1.2.so
2314
40016000-40017000 rwxp 00000000 00:00 0
2315
40017000-40018000 rw-p 00000000 00:00 0
2316
40018000-4001b000 r-xp 00000000 5f:00 14435      /lib/libtermcap.so.2.0.8
2317
4001b000-4001c000 rw-p 00002000 5f:00 14435      /lib/libtermcap.so.2.0.8
2318
4001c000-4010d000 r-xp 00000000 5f:00 14387      /lib/libc-2.1.2.so
2319
4010d000-40111000 rw-p 000f0000 5f:00 14387      /lib/libc-2.1.2.so
2320
40111000-40114000 rw-p 00000000 00:00 0
2321
40114000-4011e000 r-xp 00000000 5f:00 14408      /lib/libnss_files-2.1.2.so
2322
4011e000-4011f000 rw-p 00009000 5f:00 14408      /lib/libnss_files-2.1.2.so
2323
7fffd000-80000000 rwxp ffffe000 00:00 0
2324
 
2325
 
2326
Showing us the shared libraries init uses where they are in memory
2327
& memory access permissions for each virtual memory area.
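
The size of each virtual memory area is just end address minus start address. A
small sketch that prints it in kB for every line of a maps listing; the function name
maps_sizes & the [anonymous] label are my own, & it assumes a shell whose arithmetic
accepts hex constants, as bash does.

```shell
# Print the size in kB, permissions & backing file of each mapping read from stdin.
# Input lines are in maps format: start-end perms offset dev inode [path]
maps_sizes() {
    while read -r range perms offset dev inode path; do
        start=${range%-*}
        end=${range#*-}
        size_kb=$(( (0x$end - 0x$start) / 1024 ))
        echo "$size_kb kB $perms ${path:-[anonymous]}"
    done
}

# Two lines taken from the listing above.
maps_sizes <<'EOF'
00400000-00478000 r-xp 00000000 5f:00 4103       /bin/bash
0047e000-00492000 rwxp 00000000 00:00 0
EOF
```

This prints 480 kB for the bash text segment & 80 kB for the anonymous area after it.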
2328
 
2329
/proc/1/cwd is a softlink to the current working directory.
2330
/proc/1/root is the root of the filesystem for this process.
2331
 
2332
/proc/1/mem is the current running processes memory which you
2333
can read & write to like a file.
2334
strace uses this sometimes as it is a bit faster than the
2335
rather inefficent ptrace interface for peeking at DATA.
2336
 
2337
 
2338
cat status
2339
 
2340
Name:   init
2341
State:  S (sleeping)
2342
Pid:    1
2343
PPid:   0
2344
Uid:    0       0       0       0
2345
Gid:    0       0       0       0
2346
Groups:
2347
VmSize:      408 kB
2348
VmLck:         0 kB
2349
VmRSS:       208 kB
2350
VmData:       24 kB
2351
VmStk:         8 kB
2352
VmExe:       368 kB
2353
VmLib:         0 kB
2354
SigPnd: 0000000000000000
2355
SigBlk: 0000000000000000
2356
SigIgn: 7fffffffd7f0d8fc
2357
SigCgt: 00000000280b2603
2358
CapInh: 00000000fffffeff
2359
CapPrm: 00000000ffffffff
2360
CapEff: 00000000fffffeff
2361
 
2362
User PSW:    070de000 80414146
2363
task: 004b6000 tss: 004b62d8 ksp: 004b7ca8 pt_regs: 004b7f68
2364
User GPRS:
2365
00000400  00000000  0000000b  7ffffa90
2366
00000000  00000000  00000000  0045d9f4
2367
0045cafc  7ffffa90  7fffff18  0045cb08
2368
00010400  804039e8  80403af8  7ffff8b0
2369
User ACRS:
2370
00000000  00000000  00000000  00000000
2371
00000001  00000000  00000000  00000000
2372
00000000  00000000  00000000  00000000
2373
00000000  00000000  00000000  00000000
2374
Kernel BackChain  CallChain    BackChain  CallChain
2375
       004b7ca8   8002bd0c     004b7d18   8002b92c
2376
       004b7db8   8005cd50     004b7e38   8005d12a
2377
       004b7f08   80019114
2378
Showing among other things memory usage & status of some signals &
2379
the processes'es registers from the kernel task_structure
2380
as well as a backchain which may be useful if a process crashes
2381
in the kernel for some unknown reason.
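
The Sig* lines are hexadecimal bit masks in which bit n-1 ( counting from the least
significant bit ) stands for signal number n. A small sketch that turns such a mask
into a list of signal numbers; the function name decode_sigmask is my own & it
assumes a shell with 64 bit arithmetic such as bash.

```shell
# Print the signal numbers whose bits are set in a hex mask from /proc status output.
decode_sigmask() {
    # $1 = mask in hex without 0x prefix, e.g. the value after "SigCgt:"
    mask=$(( 0x$1 ))
    sig=1
    out=""
    while [ "$sig" -le 64 ]; do
        # Bit sig-1 set means signal number sig is in the mask.
        if [ $(( (mask >> (sig - 1)) & 1 )) -eq 1 ]; then
            out="$out $sig"
        fi
        sig=$(( sig + 1 ))
    done
    echo "${out# }"
}

# SigCgt value of init from the status output above
decode_sigmask 00000000280b2603
```

For the SigCgt mask above this prints 1 2 10 11 14 17 18 20 28 30, i.e. the signals
init has handlers installed for. In bash, kill -l followed by a number translates it
back to a signal name.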
2382
 
2383
Some driver debugging techniques
2384
================================
2385
debug feature
2386
-------------
2387
Some of our drivers now support a "debug feature" in
2388
/proc/s390dbf see s390dbf.txt in the linux/Documentation directory
2389
for more info.
2390
e.g.
2391
to switch on the lcs "debug feature"
2392
echo 5 > /proc/s390dbf/lcs/level
2393
& then after the error occured.
2394
cat /proc/s390dbf/lcs/sprintf >/logfile
2395
the logfile now contains some information which may help
2396
tech support resolve a problem in the field.
2397
 
2398
 
2399
 
2400
high level debugging network drivers
2401
------------------------------------
2402
ifconfig is a quite useful command
2403
it gives the current state of network drivers.
2404
 
2405
If you suspect your network device driver is dead
2406
one way to check is type
2407
ifconfig 
2408
e.g. tr0
2409
You should see something like
2410
tr0       Link encap:16/4 Mbps Token Ring (New)  HWaddr 00:04:AC:20:8E:48
2411
          inet addr:9.164.185.132  Bcast:9.164.191.255  Mask:255.255.224.0
2412
          UP BROADCAST RUNNING MULTICAST  MTU:2000  Metric:1
2413
          RX packets:246134 errors:0 dropped:0 overruns:0 frame:0
2414
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
2415
          collisions:0 txqueuelen:100
2416
 
2417
if the device doesn't say up
2418
try
2419
/etc/rc.d/init.d/network start
2420
( this starts the network stack & hopefully calls ifconfig tr0 up ).
2421
ifconfig looks at the output of /proc/net/dev & presents it in a more presentable form
2422
Now ping the device from a machine in the same subnet.
2423
if the RX packets count & TX packets counts don't increment you probably
2424
have problems.
2425
next
2426
cat /proc/net/arp
2427
Do you see any hardware addresses in the cache if not you may have problems.
2428
Next try
2429
ping -c 5  i.e. the Bcast field above in the output of
2430
ifconfig. Do you see any replies from machines other than the local machine
2431
if not you may have problems. also if the TX packets count in ifconfig
2432
hasn't incremented either you have serious problems in your driver
2433
(e.g. the txbusy field of the network device being stuck on )
2434
or you may have multiple network devices connected.
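
Since ifconfig takes its counters from /proc/net/dev, the two packet counts can also
be pulled out directly & compared before & after a ping. A sketch, assuming the
usual /proc/net/dev column layout ( interface name, 8 receive fields starting with
bytes & packets, then the transmit fields starting with bytes & packets ); the
function name if_packets is mine & the sample line is made up apart from the two
packet counts, which come from the ifconfig output above.

```shell
# Print the RX & TX packet counters of one interface from /proc/net/dev style input.
if_packets() {
    # $1 = interface name, input on stdin.
    # The colon after the name is turned into a space so that "tr0:1000" also splits.
    awk -v dev="$1" '{ sub(/:/, " ") } $1 == dev { print "rx=" $3, "tx=" $11 }'
}

# Made up sample line; on a live system use: if_packets tr0 < /proc/net/dev
sample='   tr0: 1000 246134 0 0 0 0 0 0 500 5 0 0 0 0 0 0'
printf '%s\n' "$sample" | if_packets tr0
```

For the sample line this prints rx=246134 tx=5. If neither number moves between two
runs while another machine is pinging you, the driver is a likely suspect.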
2435
 
2436
 
2437
chandev
2438
-------
2439
There is a new device layer for channel devices, some
2440
drivers e.g. lcs are registered with this layer.
2441
If the device uses the channel device layer you'll be
2442
able to find what interupts it uses & the current state
2443
of the device.
2444
See the manpage chandev.8 &type cat /proc/chandev for more info.
2445
 
2446
 
2447
 
2448
Starting points for debugging scripting languages etc.
2449
======================================================
2450
 
2451
bash/sh
2452
 
2453
bash -x 
2454
e.g. bash -x /usr/bin/bashbug
2455
displays the following lines as it executes them.
2456
+ MACHINE=i586
2457
+ OS=linux-gnu
2458
+ CC=gcc
2459
+ CFLAGS= -DPROGRAM='bash' -DHOSTTYPE='i586' -DOSTYPE='linux-gnu' -DMACHTYPE='i586-pc-linux-gnu' -DSHELL -DHAVE_CONFIG_H   -I. -I. -I./lib -O2 -pipe
2460
+ RELEASE=2.01
2461
+ PATCHLEVEL=1
2462
+ RELSTATUS=release
2463
+ MACHTYPE=i586-pc-linux-gnu
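
If bashbug isn't to hand, the same tracing can be seen on a throwaway two command
script. A sketch, assuming bash is installed & the default trace prefix ( PS4 ) of
"+ " is in use; the trace goes to stderr, so it is merged into stdout here to
capture it.

```shell
# Trace a two command script; bash -x writes each command to stderr prefixed with "+ ".
trace=$(bash -x -c 'MACHINE=i586; echo "$MACHINE"' 2>&1)
echo "$trace"
```

The captured text holds the traced commands ( + MACHINE=i586 & + echo i586 ) as well
as the normal output of the script.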
2464
 
2465
perl -d  runs the perlscript in a fully intercative debugger
2466
.
2467
Type 'h' in the debugger for help.
2468
 
2469
for debugging java type
2470
jdb  another fully interactive gdb style debugger.
2471
& type ? in the debugger for help.
2472
 
2473
 
2474
 
2475
SysRq
2476
=====
2477
This is now supported by linux for s/390 & z/Architecture.
2478
To enable it do compile the kernel with
2479
Kernel Hacking -> Magic SysRq Key Enabled
2480
echo "1" > /proc/sys/kernel/sysrq.
2481
On 390 all commands are prefixed with
2482
^-
2483
e.g.
2484
^-t will show tasks.
2485
^-? or some unknown command will display help.
2486
The sysrq key reading is very picky ( I have to type the keys in an
2487
 xterm session & paste them  into the x3270 console )
2488
& it may be wise to predefine the keys as described in the VM hints above
2489
 
2490
This is particularly useful for syncing disks unmounting & rebooting
2491
if the machine gets partially hung.
2492
 
2493
Read Documentation/sysrq.txt for more info
2494
 
2495
References:
2496
===========
2497
Enterprise Systems Architecture Reference Summary
2498
Enterprise Systems Architecture Principles of Operation
2499
Hartmut Penners s390 stack frame sheet.
2500
IBM Mainframe Channel Attachment a technology brief from a CISCO webpage
2501
Various bits of man & info pages of Linux.
2502
Linux & GDB source.
2503
Various info & man pages.
2504
CMS Help on tracing commands.
2505
Linux for s/390 Elf Application Binary Interface
2506
Linux for z/Series Elf Application Binary Interface ( Both Highly Recommended )
2507
z/Architecture Principles of Operation SA22-7832-00
2508
Enterprise Systems Architecture/390 Reference Summary SA22-7209-01 & the
2509
Enterprise Systems Architecture/390 Principles of Operation SA22-7201-05

Special Thanks
==============
Special thanks to Neale Ferguson who maintains a much
prettier HTML version of this page at
http://penguinvm.princeton.edu/notes.html#Debug390
