
Troubleshooting Linux

For the AIX platform:

How to Change Default System Dump Device in AIX

By default, AIX copies the system dump to paging space (/dev/hd6).

# sysdumpdev -l
primary              /dev/hd6
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE
dump compression     ON

# lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5                 boot       1     2     2    closed/syncd  N/A
hd6                 paging     12    24    2    open/syncd    N/A
hd8                 jfs2log    1     2     2    open/syncd    N/A
hd4                 jfs2       1     2     2    open/syncd    /
hd2                 jfs2       19    38    2    open/syncd    /usr
hd9var              jfs2       1     2     2    open/syncd    /var
hd3                 jfs2       5     10    2    open/syncd    /tmp
hd1                 jfs2       1     2     2    open/syncd    /home
hd10opt             jfs2       8     16    2    open/syncd    /opt

This can be a problem: after a crash, the system will not reboot automatically; instead, it will prompt for instructions on what to do with the dump. Luckily, it is easy to change this setting and allocate a dedicated logical volume for storing the system dump.
First, you need to know how big the sysdump logical volume should be.

# sysdumpdev -e
Estimated dump size in bytes: 483393536
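
To convert the byte estimate into megabytes, you can let the shell do the arithmetic (a quick check using the value reported above):

# echo $((483393536 / 1024 / 1024))
461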
 
So, in this case the LV should be at least 461 MB. If you take a closer look at the logical volume listing above, you'll notice that every value in the 'PPs' column is twice the value in the 'LPs' column. That can only mean our root volume group is mirrored, so we will actually need 461 MB twice, once on each disk. Let's check whether our root volume group has enough free space.


# lsvg rootvg
VOLUME GROUP:       rootvg                   VG IDENTIFIER:  00cb8a0c00004c000000010bfabee774
VG STATE:           active                   PP SIZE:        128 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      542 (69376 megabytes)
MAX LVs:            256                      FREE PPs:       411 (52608 megabytes)
LVs:                8                        USED PPs:       131 (16768 megabytes)
OPEN LVs:           7                        QUORUM:         1
TOTAL PVs:          2                        VG DESCRIPTORS: 3
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         2                        AUTO ON:        yes
MAX PPs per VG:     32512                                     
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
 
It seems we have about 51 GB free, which is more than enough. We also found out that there are 2 physical volumes in this volume group. Let's check the names of these PVs; we will need them soon.

# lsvg -p rootvg
rootvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk1            active            271         207         54..29..18..54..52
hdisk2            active            271         204         54..26..18..54..52
 
Now we will create two logical volumes, one on each PV, and set one as the primary dump device and the other as the secondary dump device. The reason for this is that we don't want to mirror the sysdump device, but we still need two copies in case one of the hard drives fails.
In this example the physical partition size is 128 MB, so 4 PPs (512 MB) per logical volume will be enough. Let's start.

# echo $((128*4))
512

# mklv -t sysdump -y sysdump1 rootvg 4 hdisk1
# mklv -t sysdump -y sysdump2 rootvg 4 hdisk2

# lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5                 boot       1     2     2    closed/syncd  N/A
hd6                 paging     12    24    2    open/syncd    N/A
hd8                 jfs2log    1     2     2    open/syncd    N/A
hd4                 jfs2       1     2     2    open/syncd    /
hd2                 jfs2       19    38    2    open/syncd    /usr
hd9var              jfs2       1     2     2    open/syncd    /var
hd3                 jfs2       5     10    2    open/syncd    /tmp
hd1                 jfs2       1     2     2    open/syncd    /home
hd10opt             jfs2       8     16    2    open/syncd    /opt
sysdump1            sysdump    4     4     1    closed/syncd  N/A
sysdump2            sysdump    4     4     1    closed/syncd  N/A
 
Logical volumes sysdump1 and sysdump2 are created. Now let's change the system dump settings, starting with the primary device.

# sysdumpdev -Pp /dev/sysdump1
 
And now the secondary.

# sysdumpdev -Ps /dev/sysdump2
  
Let's check that the new settings were applied.


# sysdumpdev -l
primary              /dev/sysdump1
secondary            /dev/sysdump2
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE
dump compression     ON


Everything looks fine. All we have to do now is wait for the system to crash to test our new settings.
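
If you would rather not wait, AIX can also start a dump on demand with the sysdumpstart command. Be warned that this halts the machine, so only use it on a test system:

# sysdumpstart -p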

Analyze a core file using the gdb tool on Unix:

Steps :

1. Issue the command "file /core_file_name". This will identify which program is responsible for dumping the core.

e.g : bash-2.05b# file /core
/core: AIX core file fulldump 32-bit, program_name

2. Issue the command "gdb -c core program_name".

e.g : bash-3.2# gdb -c core program_name

3. Now issue the command "bt" at the gdb prompt, which will list the stack trace.

e.g : (gdb) bt
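
Putting the three steps together, a typical session looks like this (program_name and the frames shown are illustrative placeholders, not output from a real system):

bash-3.2# gdb -c core program_name
(gdb) bt
#0  0xd0123456 in raise () from /usr/lib/libc.a
#1  0xd0123460 in abort () from /usr/lib/libc.a
#2  0x10000abc in main ()
(gdb) quit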

Analyze a core file using the dbx tool on AIX:

Steps :

1. Issue the file command with the absolute path of the core file to find which service generated the core.

e.g : bash-2.05b# file /home/core
/home/core: AIX core file fulldump 64-bit, service_abc 

Here service_abc is responsible for core generation.

2. Now run the dbx command with the absolute path of the service binary that crashed, followed by the path of the core file.

e.g : bash-2.05b# dbx /usr/local/service/service_abc /home/core

Here /usr/local/service/service_abc is the absolute path of the binary for the service "service_abc", and /home/core is the core file.

3. Now it's time to see the stack trace of the core. To get the stack trace, issue the command "where".

e.g : (dbx) where

pthread_kill(??, ??) at 0x20834
_p_raise(??) at 0x202a4
raise.raise(??) at 0x2b5fd8 [untrusted: /usr/lib/libc.a]
strtol.strtol(??, ??, ??) at 0xd0314378 [untrusted: /usr/lib/libc.a]
vterminate._ZN9__gnu_cxx27__verbose_terminate_handlerEv() at 0x04f600
eh_terminate._ZN10__cxxabiv111__terminateEPFvvE(??) at 0xfd2057d90
eh_terminate._ZSt9terminatev() at 0x2004f478
eh_terminate._ZN10__cxxabiv112__unexpectedEPFvvE(??) at 0x204561f4

dbx will print the stack trace in a similar format.
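
Beyond where, a few other standard dbx subcommands are often useful when inspecting a core (the annotations after the commands are ours):

(dbx) thread       - list the threads in the process
(dbx) registers    - show the register contents at the time of the crash
(dbx) map          - show the loaded modules and their addresses
(dbx) quit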


Generate a core file for a specific process manually:

Prerequisite : check the core file size limit with "ulimit -c" before running the commands below; if it reports 0, raise it (e.g. "ulimit -c unlimited").

1. gdb -q -p <PID>

2. generate-core-file

3. detach

4. quit

A core file named core.<PID> will be generated in your current directory.

For example :

[root@localhost ~]# gdb -q -p 2017

(gdb) generate-core-file

(gdb) detach

(gdb) quit

A core file named core.2017 will be generated in the current directory on the host.
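
On Linux, the gcore utility that ships with gdb does the same thing in one step (assuming gdb is installed and 2017 is a running PID):

[root@localhost ~]# gcore -o core 2017
Saved corefile core.2017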

Important commands for AIX and Solaris :



Mount CDROM in AIX :
# mount -v cdrfs -r /dev/cd0 /cdrom

If this command does not work, follow the steps mentioned below:

1. Find the parent adapter of the DVD or CD device:
$ lsdev -Cl cd0 -F parent
ide0

2. Find the slot containing the IDE bus:
$ lsslot -c slot

3. Remove the slot from this host:
$ rmdev -dl pci1 -R
cd0 deleted
ide0 deleted
pci1 deleted

4. Check that the device is gone; the following command should produce no output:
$ lsdev | grep cd

5. Recreate the virtual device mapping (this step applies on the VIO server):
$ mkvdev -vdev cd0 -vadapter vhost14 -dev vcd

6. Issue the cfgmgr command to rediscover the devices:
$ cfgmgr

7. Now mount the CD-ROM using the command:
$ mount -v cdrfs -r /dev/cd0 /cdrom
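
To verify that the CD-ROM is mounted, list the mounted filesystems:

$ mount | grep cd0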


Start sshd in AIX :
# startsrc -s sshd
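
To confirm that the daemon is running, query its status with lssrc (the PID shown is illustrative):

# lssrc -s sshd
Subsystem         Group            PID          Status
 sshd             ssh              123456       active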

Create a CD-ROM filesystem in AIX :
# mkdir cdrfs
# /usr/sbin/crfs -v cdrfs -g rootvg -A yes -a size=300M -d cd0 -m /cdrfs
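
Note that crfs only adds the filesystem definition to /etc/filesystems; you still need to mount it:

# mount /cdrfs
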
Creating a dump device:

# mklv -t sysdump -y sysdumpname rootvg 1 hdiskX
# sysdumpdev -Pp /dev/sysdumpname
# sysdumpdev -K

List the logical volumes (and their LP counts) in rootvg:

# lsvg -l rootvg

Start NFS services on AIX:

# startsrc -g nfs
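
You can check that the NFS subsystems came up with lssrc (PIDs elided; the exact subsystem list varies by configuration):

# lssrc -g nfs
Subsystem         Group            PID          Status
 biod             nfs              ...          active
 nfsd             nfs              ...          active
 rpc.mountd       nfs              ...          active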

Initiate a dump and restart of an LPAR (run from the HMC):

# chsysstate -m <managedSystem> -r lpar -o dumprestart --id <partitionID>

List all devices that share the same controller as hdiskx (here 00-09 is the controller's location code):

# lsdev | grep 00-09

Show the vital product data of hdiskx:

# lscfg -vpl hdiskx

Show the status and volume group that hdiskx is a member of:

# lspv | grep hdiskx



Check which Linux distribution you are using:
 
# cat /etc/*-release
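
On distributions that ship the LSB tools, lsb_release gives a cleaner summary (it is not installed everywhere, so treat it as an alternative):

# lsb_release -a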


Find the RAM size in AIX:

# bootinfo -r

or

# lsattr -El sys0 | grep realmem

To find out which PowerPC processor architecture the system has:

# lsattr -El proc0

To identify the maintenance/technology level of the operating system in AIX:

# oslevel -r

How to boot from another disk in AIX (switching the boot disk):

bash-3.2# lspv -L


It displays information about the physical volumes within the volume groups.

bash-3.2# bootlist -m normal hdisk0

Select the physical volume you want to boot from; here "hdisk0" is used.
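
You can display the current boot list with the -o flag to confirm the change (the output shown is typical):

bash-3.2# bootlist -m normal -o
hdisk0 blv=hd5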



Now reboot the machine; when it comes up, it will boot from the physical volume you selected.
bash-3.2# reboot
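
After the reboot, you can confirm which disk the system actually booted from:

bash-3.2# bootinfo -b
hdisk0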