Oracle Real Application Cluster
With the announcement of the availability and support for Oracle 10gR2 on Linux on POWER the so called Oracle Cluster Suite became available and supported, too. This finally means that you can install, deploy and run an Oracle Real Application Cluster (RAC) on Linux on POWER.
Without going into too much detail here, RAC has two main goals:
- Provide high availability for an Oracle database.
- Provide high scalability for an Oracle database.
This is achieved by building a cluster of several systems forming one logical database server. Each node within the cluster is able to access all the data and knows who's doing what on which data.
Oracle's main goal is to make this cluster, or multi-node database, look and behave like a single system, and therefore be as manageable as one.
The following few sections will give you an introduction to installing Oracle RAC on Linux on POWER systems.
Don't fear the RAC
My first contact with a parallel database designed to work like Oracle RAC was in fact its predecessor, Oracle Parallel Server (OPS). As far as I remember, installing, running and managing this product was more or less an art, and it was pain - a lot of pain and headaches.
Unfortunately I cannot prove that this is true because I never had the chance to see and work on an OPS cluster - but the bad reputation lasts.
So I was not sure what to expect when I was asked to look at RAC and some colleagues from Oracle tried to convince me that it is sooo easy to install and use. Well - yes, they are right!
Don't panic - and read the manual first!
Good documentation, and knowing where to find it, is a real gift - in my opinion. Oracle in fact has quite good documentation - the only thing you must do is take a look at it. The place to start is the original Oracle Clusterware documentation. Just believe me that it is really worthwhile taking a look at it - well, if you've installed your 3,345th RAC cluster you might not need it.
Furthermore have a look at Oracle's MetaLink, and especially at note 341507.1, which has some additional information about Oracle on Linux on POWER.
Planning your cluster
It is always good to think before you do something, as my mother always said. So you should think about things like "How many nodes do I want to run in a cluster?", "Which kind of shared storage do I want or need?" or "What requirements and prerequisites does RAC actually have?" I will not go into more detail on the first two questions because they are simply out of scope for this type of documentation - there are, I believe, hundreds of books (online or printed) dealing with them. I will concentrate on the last question because it fits perfectly into the purpose of this article.
So there are several things to consider when talking about the requirements/prerequisites of an Oracle RAC:
- Hardware prerequisites
- Software requirements
- Network requirements
- Operating System groups and users
- Kernel requirements
- Oracle's user environment settings
- Shared disk considerations
I will cover all of them.
Hardware requirements
They are easy.
- At least 1 GB physical memory.
- Enough SWAP space available
- Up to 2GB memory - 1.5 times the size of RAM (e.g. 1GB leads to 1.5GB SWAP).
- More than 2GB memory - 1 time the size of RAM (e.g. 12GB leads to 12GB SWAP).
- At least 400MB free space in /tmp. If you don't have 400MB free space, you can
- delete something from /tmp.
- resize the file system.
- use the TEMP and TMPDIR environment variables to point at a directory with at least 400MB of free space. Please note that you cannot use a shared disk device for TEMP and TMPDIR - this would make the installation fail.
- Up to 4GB available disk space for the Oracle software.
- Optional 1.2GB available disk space for a preconfigured database using file system storage.
That's it.
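The swap rule above is easy to get wrong when you do it in your head, so here is a minimal sketch that calculates it. The function name required_swap_mb is my own invention for this example and not part of any Oracle tool; it only encodes the two cases from the list above.

```shell
#!/bin/sh
# Sketch: compute the swap size (in MB) that the rule above asks for.
# required_swap_mb is a hypothetical helper name, not an Oracle tool.
required_swap_mb() {
    ram_mb=$1
    if [ "$ram_mb" -le 2048 ]; then
        # Up to 2GB of RAM: 1.5 times the size of RAM
        echo $(( ram_mb * 3 / 2 ))
    else
        # More than 2GB of RAM: the same size as RAM
        echo "$ram_mb"
    fi
}

# The two examples from the list above
required_swap_mb 1024     # 1GB RAM
required_swap_mb 12288    # 12GB RAM
```

On a real node you can compare the result with the MemTotal and SwapTotal lines in /proc/meminfo, and check the free space in /tmp with df -k /tmp.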
Software requirements
RAC is supported on Red Hat Enterprise Linux 4 Update 1 (RHEL4U1) or later and on Novell SUSE Linux Enterprise Server 9 Service Pack 2 (SLES9-SP2) or later.
For RHEL4 you must install the following packages:
- make-3.80-5 (not gmake!)
- gcc-3.4.3-22.1
- gcc-ppc32-3.4.3-22.1
- gcc-c++-3.4.3-22.1
- gcc-c++-ppc32-3.4.3-22.1
- glibc-2.3.4-2.9
- glibc-2.3.4-2.9 (64-Bit)
- libgcc-3.4.3-9.EL4
- libgcc-3.4.3-9.EL4.ppc64.rp
- libstdc++-3.4.3-9.EL4
- libstdc++-devel-3.4.3-9.EL4
- libaio-0.3.103-3
- libaio-0.3.103-3 (64-Bit)
- libaio-devel-0.3.103-3 (64-Bit)
- compat-libstdc++-33-3.2.3-47.3
- binutils-2.15.92.0.2-13
- perl-5.8.5-12.1
- tcl-8.4.7-2
- unzip-5.51-7
- zip-2.3-27
- tar-1.14-4
For SLES9 you must install the following packages:
- gcc-3.3.3-43.34
- gcc-64bit-9-200505240008
- gcc-c++-3.3.3-43.34
- glibc-2.3.3-98.47
- glibc-64bit-9-200506062240
- libgcc-3.3.3-43.34
- libgcc (64-bit) 9-200505240008
- libstdc++-3.3.3-43.34
- libstdc++-devel-3.3.3-43.34
- libaio-0.3.102-1.2
- libaio-64bit-9-200502241152
- libaio-devel-0.3.102-1.2
- libaio-devel-0.3.102-1.2 (64-bit)
- binutils-2.15.90.0.1.1-32.10
- binutils-64bit-9-200505240008
- make-3.80-184.1 (not Gmake-3.80-184.1)
- perl-5.8.3-32.4
- tcl-8.4.6-26.3
- unzip-5.50-345.1
- zip-2.3-732.4
- tar-1.13.25-325.3
Check if you've already installed all required packages or - if not - install them.
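A quick way to do this check is to ask rpm for each package name. The following is only a sketch: missing_packages is a hypothetical helper name, and it checks package names only, not the minimum versions or the 32-bit/64-bit variants from the lists above, so you still have to look at those yourself.

```shell
#!/bin/sh
# Sketch: print every package name that rpm does not report as installed.
# missing_packages is a hypothetical helper; it checks names only,
# not the minimum versions or architectures listed above.
missing_packages() {
    for pkg in "$@"; do
        # rpm -q exits with a non-zero status if the package is not installed
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage on a RHEL4 node (names taken from the list above):
# missing_packages make gcc gcc-c++ glibc libgcc libstdc++ libstdc++-devel \
#     libaio libaio-devel compat-libstdc++-33 binutils perl tcl unzip zip tar
```

If the function prints nothing, all names are known to rpm. Use rpm -q with a single name to see the installed version.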
In addition Oracle requires the IBM XLC Runtime Environment, which can be downloaded from IBM.
Don't forget to download and install the XL Optimization Libraries component as well!
If you are planning to compile your own C/C++ code, the IBM XL C/C++ compilers are required. Please note that the RTE and compiler versions used must match.
Finally you can use the following optional JDK versions with the Oracle JDBC/OCI drivers. They are not required for the installation because the Oracle Universal Installer (OUI) comes with its own Java or in other words, IBM Java 1.4.2 32-bit will be automatically installed.
- IBM Java 1.4.2 64-bit (SR1a) or later
- IBM Java 1.4.2 32-bit (SR1a) or later
- IBM Java 1.3.1 32-bit (SR8) or later (for SLES 9 only)
The JDKs are available for download from IBM.
Network requirements
That's a little bit tricky.
- Each node must have at least two network adapters.
- Ethernet to be precise, Token-Ring is not supported.
- The adapter configuration on each node must be identical.
- E.g. if eth0 points to the public network and eth1 will be used for the private interconnect on node1 it must be the same on all other nodes in the cluster!
- Use at least 1Gb Ethernet for the private interconnect - 100Mb Ethernet is possible but not recommended!
- You can use redundant adapters and build bonding devices for availability and performance reasons.
Ok, now let's have a look at the IP requirements of each node.
- Each node must have one public IP address which is resolvable via DNS - or if no DNS is available it must be specified in each node's /etc/hosts file!
- Each node must have one so-called Virtual IP address (VIP) which is resolvable via DNS - or if no DNS is available it must be specified in each node's /etc/hosts file! It must be on the same subnet as the public IP address!
- Clients connect to the VIP address of the cluster node and if one node fails this VIP address will failover to another node in the cluster.
- Each node must have one private IP address on a separate subnet used for cluster communication. Use the /etc/hosts file to associate private network names with private IP addresses.
So you'll need three IP addresses for each node. Here's an example for a two node cluster:
9.154.2.113 op710-1-lpar1.stuttgart.de.ibm.com op710-1-lpar1
9.154.2.114 op710-1-lpar2.stuttgart.de.ibm.com op710-1-lpar2
9.154.2.92 op710-1-lpar1-vip.stuttgart.de.ibm.com op710-1-lpar1-vip
9.154.2.93 op710-1-lpar2-vip.stuttgart.de.ibm.com op710-1-lpar2-vip
192.168.0.113 op710-1-lpar1-priv
192.168.0.114 op710-1-lpar2-priv
 | Please note...
You can configure the public and the private IP addresses on each node but do not configure the VIP address - just put it into your /etc/hosts file. During the installation the Oracle Universal Installer will configure the VIP on an interface! |
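Since every node needs three names, it is easy to forget one of them on one of the nodes. Here is a small sketch that lists the names missing from a hosts file. The helper name missing_host_entries is made up for this example, and it only looks at the file you give it, so it does not tell you anything about DNS.

```shell
#!/bin/sh
# Sketch: print every name that has no entry in the given hosts file.
# missing_host_entries is a hypothetical helper name.
missing_host_entries() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column
        # after the IP address
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" || echo "$name"
    done
}

# Usage: check the three names of every node, on every node
# for node in op710-1-lpar1 op710-1-lpar2; do
#     missing_host_entries /etc/hosts "$node" "$node-vip" "$node-priv"
# done
```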
Operating system groups and users
Like for any other "normal" Oracle installation you must configure three groups (dba, oinstall and oper) and at least one user (e.g. oracle) which will act as the software owner.
The oracle user must have oinstall as primary and dba and oper as supplemental groups.
Because we are planning to install and run a RAC cluster, these groups and the oracle user must be identical on each node! So either use the configuration tools available on RHEL4 or SLES9 to configure the required groups and the user, or use the command line on each node as shown in the next example - by the way, the password is "oracle".
/usr/sbin/groupadd -g 200 oinstall
/usr/sbin/groupadd -g 201 dba
/usr/sbin/groupadd -g 202 oper
/usr/sbin/useradd -u 200 -g oinstall -G dba,oper -m -p '$1$stZPv2Dd$7/1Y/VX5TF2r6vSMBR91q1' oracle
 | Tip!
The option -p lets you assign a password to a new user when creating it with useradd. This is very convenient in scripts. The password string should of course be encrypted - this can be done with the openssl command.
The following example creates an MD5-encrypted password string where the password itself is oracle:
bc1-js21-1-lpar2:~ # openssl passwd -1 oracle
$1$FgWroiFf$EAjd8hMWnJxEIkcl1IRx30 |
SSH equivalency
During the installation of Oracle the installer will perform some tasks on the local node as well as on the remote one(s). The Oracle installer will use ssh and scp for these tasks (if they are not available it will try to use rsh and rcp). For this to work, the oracle user on each node must be able to log in and use scp on every other node. Therefore SSH must be configured to allow this.
The first step is to create the required SSH keys for the user oracle and to add them to its authorized_keys2 file. Then this file must be merged to include all keys from all nodes.
Here's an example of how to create the keys. Note that with the -N option you can specify a passphrase - with two double quotes ("") the passphrase is empty, in other words you don't need a passphrase at all.
[oracle@op710-1-lpar1 ~]$ /usr/bin/ssh-keygen -f /home/oracle/.ssh/id_rsa -q -t rsa -N ""
[oracle@op710-1-lpar1 ~]$ /usr/bin/ssh-keygen -f /home/oracle/.ssh/id_dsa -q -t dsa -N ""
[oracle@op710-1-lpar1 ~]$ cat /home/oracle/.ssh/id_dsa.pub >> /home/oracle/.ssh/authorized_keys2
[oracle@op710-1-lpar1 ~]$ cat /home/oracle/.ssh/id_rsa.pub >> /home/oracle/.ssh/authorized_keys2
Ok, assuming this has been done on all cluster nodes it is time to exchange the keys.
On Node1:
ssh op710-1-lpar2 cat /home/oracle/.ssh/authorized_keys2 >> .ssh/authorized_keys2
The authenticity of host 'op710-1-lpar2 (9.154.2.114)' can't be established.
RSA key fingerprint is eb:3e:f7:ee:d4:34:35:60:aa:0f:1d:b6:f3:d3:50:83.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'op710-1-lpar2,9.154.2.114' (RSA) to the list of known hosts.
oracle@op710-1-lpar2's password:
And on Node2:
[oracle@op710-1-lpar2 ~]$ ssh op710-1-lpar1 cat /home/oracle/.ssh/authorized_keys2 >> .ssh/authorized_keys2
The authenticity of host 'op710-1-lpar1 (9.154.2.113)' can't be established.
RSA key fingerprint is 31:ff:b1:bc:1d:4a:8e:45:13:f9:43:9a:c0:5a:2e:22.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'op710-1-lpar1,9.154.2.113' (RSA) to the list of known hosts.
Noticed the difference? On Node2 I've not been asked for a password because the keys for this node were already there!
Now you should be able to use SSH to do something on the remote node(s) without the need of giving a password to login (but only for the Oracle user!).
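To make sure this really works everywhere before you start the installer, you can let ssh try each node without allowing it to ask any questions. This is just a sketch; check_ssh_equivalence is a name I made up, while BatchMode is a regular OpenSSH client option.

```shell
#!/bin/sh
# Sketch: print every node that cannot be reached without a prompt.
# check_ssh_equivalence is a hypothetical helper name.
check_ssh_equivalence() {
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password
        # or for the confirmation of an unknown host key
        ssh -o BatchMode=yes "$node" true </dev/null >/dev/null 2>&1 || echo "$node"
    done
}

# Usage as user oracle, on every node:
# check_ssh_equivalence op710-1-lpar1 op710-1-lpar2
```

If nothing is printed, the oracle user can reach all listed nodes without a password.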
 | Please note...
The first time you connect to a node via SSH the host key is not known to the system and you're asked whether you're sure you want to connect. This is not a big thing in an interactive session but the Oracle installer will fail if it gets this message during the installation! So take care that all host keys are known on all nodes (added to the known_hosts file) for the short AND the fully qualified names! Here's a little script I use - it saves me some time:
#!/bin/bash
#Defining some variables
DOMAIN="stuttgart.de.ibm.com"
PRIV="priv"
LIST="op710-1-lpar1 op710-1-lpar2"
for NODE in $LIST
do
ssh -o StrictHostKeyChecking=no $NODE date
ssh -o StrictHostKeyChecking=no $NODE-$PRIV date
ssh -o StrictHostKeyChecking=no $NODE.$DOMAIN date
done
|
One last configuration concerning SSH - to ensure the installer won't fail using X11 forwarding, create the config file in the Oracle user's .ssh directory with the following content:
[oracle@op710-1-lpar1 ~]$ cat .ssh/config
Host *
ForwardX11 no
Kernel settings
Just like for any other Oracle installation you must set some kernel parameters, and in fact they are the same as for any other Oracle installation. If you're interested in what exactly they do - have a look at the kernel documentation.
 | Please note...
Oracle states, that "The kernel parameter and shell limit values shown in the following section are recommended values only. For production database systems, Oracle recommends that you tune these values to optimize the performance of the system. Refer to your operating system documentation for more information about tuning kernel parameters." (taken from the Oracle Database Installation Guide). |
Here's a list of the recommended kernel parameters and their recommended values:
- kernel.shmall = 2097152
- kernel.shmmax = HALF SIZE OF PHYSICAL MEMORY IN BYTES
- kernel.shmmni = 4096
- kernel.sem = 250 32000 100 128
- fs.file-max = 65536
- net.ipv4.ip_local_port_range = 1024 65000
- net.core.rmem_default = 1048576
- net.core.rmem_max = 1048576
- net.core.wmem_default = 262144
- net.core.wmem_max = 262144
Check the above values using the sysctl command. If any of the above parameters already has a larger value, KEEP IT!
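The only value in the list that you have to calculate yourself is kernel.shmmax. Here is a minimal sketch for that, together with the "keep the larger value" rule. Both function names are hypothetical, and larger_value only makes sense for parameters with a single number (so not for kernel.sem).

```shell
#!/bin/sh
# Sketch: derive kernel.shmmax (half of the physical memory, in bytes)
# from a MemTotal value given in kB. shmmax_bytes is a hypothetical helper.
shmmax_bytes() {
    mem_kb=$1
    echo $(( mem_kb * 1024 / 2 ))
}

# Keep the larger of the current and the recommended value.
# Only meaningful for single-valued parameters.
larger_value() {
    if [ "$1" -ge "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Usage on a node:
# mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
# larger_value "$(sysctl -n kernel.shmmax)" "$(shmmax_bytes "$mem_kb")"
```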
To adjust the parameters and make them persistent add them to /etc/sysctl.conf. After that, use sysctl -p to activate them.
[root@op710-1-lpar1 ~]# cat /etc/sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
kernel.shmall = 2097152
kernel.shmmax = 1073741824
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
[root@op710-1-lpar1 ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
kernel.shmall = 2097152
kernel.shmmax = 1073741824
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
 | Please note...
On SLES9 systems you must run chkconfig boot.sysctl on in order to load the parameters on a system reboot! |
Oracle's user environment
Now it's time to increase the shell limits for the oracle user.
- Add the following lines to the /etc/security/limits.conf file:
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
- Add or edit the following line in the /etc/pam.d/login file, if it does not already exist:
session required /lib/security/pam_limits.so
- For the Bourne, Bash, or Korn shell, add the following lines to the /etc/profile file (or, on SUSE systems, to /etc/profile.local):
if [ $USER = "oracle" ]; then
if [ $SHELL = "/bin/ksh" ]; then
ulimit -p 16384
ulimit -n 65536
else
ulimit -u 16384 -n 65536
fi
fi
In addition, guard any stty command in the .bashrc of the oracle user as shown in the following lines. The reason is that an unguarded stty command in a hidden file like .bashrc will cause the installer to fail.
if [ -t 0 ]; then
stty intr ^C
fi
And finally set the correct umask for the oracle user in its .bash_profile (the Oracle installation guide uses umask 022).
Creating the required directories
Oracle requires the following directories to be created before the installation:
- ORACLE_BASE
- ORACLE_INVENTORY
- CRS_HOME
- ORACLE_HOME
The ORACLE_BASE directory is the top directory for the installation. Normally ORACLE_INVENTORY and ORACLE_HOME are located somewhere below it. Oracle itself recommends the following structure:
- /mount_point/app/oracle_sw_owner
The mount_point can be anywhere in your filesystem - e.g. /opt or /u01. The oracle_sw_owner is the user you've created earlier - e.g. oracle. Let's make an example:
- /opt/oracle/app/oracle
The ORACLE_HOME directory specifies the directory where you actually install the software. It is normally somewhere under ORACLE_BASE, for example:
- /ORACLE_BASE/product/10.2.0.1/rdbms
This says that my ORACLE_HOME is at /opt/oracle/app/oracle/product/10.2.0.1/rdbms - and the software will be installed there!
The CRS_HOME directory specifies where Oracle's clusterware will be installed. It is normally somewhere under ORACLE_BASE, for example:
- /ORACLE_BASE/product/crs
This says that my CRS_HOME is at /opt/oracle/app/oracle/product/crs - and the clusterware will be installed here!
Finally the ORACLE_INVENTORY directory is required for the - guess what - Oracle inventory which means that Oracle will look here to determine which Oracle software is already on the system etc.
The really good thing is that it does not have to be created now - the installer will do that for you!
Putting all together will lead to the following example:
[root@op710-1-lpar1 ~]# mkdir -p /opt/oracle/app/oracle/product/10/app/crs
[root@op710-1-lpar1 ~]# mkdir -p /opt/oracle/app/oracle/product/10/app/rdbms
[root@op710-1-lpar1 ~]# mkdir -p /opt/oracle/app/oracle/product/10/app/asm
[root@op710-1-lpar1 ~]# chown -R oracle:oinstall /opt/oracle/app/oracle
[root@op710-1-lpar1 ~]# chmod -R 755 /opt/oracle/app/oracle
In this example my ORACLE_BASE is /opt/oracle/app/oracle, my ORACLE_HOME is at /opt/oracle/app/oracle/product/10/app/rdbms and my CRS_HOME is /opt/oracle/app/oracle/product/10/app/crs.
Note that I've created a third directory called /opt/oracle/app/oracle/product/10/app/asm which I'll use for my ASM instance in order to separate it from the ORACLE_HOME.
Don't forget to set the right permissions on the directories!
 | Please note...
You can of course add ORACLE_HOME, like other environment variables, to the shell profile, which makes it persistent. Keep in mind that for RHEL4 the settings for ORACLE_HOME, ORACLE_SID and CRS_HOME should go into /home/oracle/.bash_profile - for SLES9 the correct location is /etc/profile.local! |
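As a sketch, the profile entries for the directories from my example could look like the following. The ORACLE_SID value orcl1 is purely a placeholder that I made up, and adding the bin directories to PATH is my own habit rather than a requirement mentioned above.

```shell
# Sketch of profile entries for the oracle user, using the example paths above.
# ORACLE_SID=orcl1 is a placeholder; use the instance name of the node.
ORACLE_BASE=/opt/oracle/app/oracle
ORACLE_HOME=$ORACLE_BASE/product/10/app/rdbms
CRS_HOME=$ORACLE_BASE/product/10/app/crs
ORACLE_SID=orcl1
# Assumption: having the Oracle binaries in PATH is convenient, not required
PATH=$ORACLE_HOME/bin:$CRS_HOME/bin:$PATH
export ORACLE_BASE ORACLE_HOME CRS_HOME ORACLE_SID PATH
```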
Configuring hangcheck
Before installing Oracle Real Application Clusters on Linux systems, verify that the hangcheck-timer module (hangcheck-timer) is loaded and configured correctly. hangcheck-timer monitors the Linux kernel for extended operating system hangs that could affect the reliability of a RAC node and cause a database corruption. If a hang occurs, then the module restarts the node in seconds.
You can use the hangcheck_tick and hangcheck_margin parameters to control the behavior of the module, as follows:
- The hangcheck_tick parameter defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default value is 60 seconds.
- The hangcheck_margin parameter defines how long the timer waits, in seconds, for a response from the kernel. The default value is 180 seconds.
If the kernel fails to respond within the sum of the hangcheck_tick and hangcheck_margin parameter values, the hangcheck-timer module restarts the system. Using the default values, the node would be restarted if the kernel fails to respond within 240 seconds.
To load the hangcheck-timer module, execute the following command:
- /sbin/insmod /lib/modules/kernel_version/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=30 hangcheck_margin=180
- e.g. /sbin/insmod /lib/modules/2.6.9-42.0.3.EL/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=30 hangcheck_margin=180
To load the module at each system restart add this line to:
- /etc/rc.d/rc.local for RHEL4 systems.
- /etc/init.d/boot.local for SLES9 systems.
 | Please note...
If you're upgrading your kernel, the kernel-version changes and therefore the modules for the new kernel are located at a different place. Don't forget to update rc.local or boot.local to reflect those changes! |
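One way to avoid editing the boot script after every kernel upgrade is to build the module path from the running kernel version. The following sketch only prints the command line instead of executing it; the two function names are hypothetical, and the defaults of 30 and 180 are simply the values from the example above.

```shell
#!/bin/sh
# Sketch: build the insmod command line for a given kernel version.
# hangcheck_insmod_cmd is a hypothetical helper name.
hangcheck_insmod_cmd() {
    kver=$1
    tick=${2:-30}       # hangcheck_tick, value from the example above
    margin=${3:-180}    # hangcheck_margin, value from the example above
    echo "/sbin/insmod /lib/modules/$kver/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=$tick hangcheck_margin=$margin"
}

# Seconds after which a hung node is restarted: tick plus margin
hangcheck_timeout() {
    echo $(( $1 + $2 ))
}

# Print the command for the kernel from the example above
hangcheck_insmod_cmd 2.6.9-42.0.3.EL
# Restart threshold for hangcheck_tick=30 and hangcheck_margin=180
hangcheck_timeout 30 180
```

Because rc.local and boot.local are shell scripts, you can get the same effect there by writing $(uname -r) in place of the kernel version in the insmod line.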
Checking storage requirements
Due to the fact that each node of the cluster must be able to access the same disks you must use some kind of shared storage like a SAN. In fact there are several storage requirements:
- The Oracle software itself - it could be located on each node's local storage or within a cluster file system. Please note that using NFS for the Oracle software is not supported on Linux on POWER.
- The Oracle Cluster Registry (OCR, called the CRS disk in this document) and the Voting disk. They must be shared, and the available options are either a cluster filesystem (like OCFS2 or IBM's GPFS) or shared raw devices.
- The database and recovery areas. They could be located on a shared cluster filesystem (e.g. GPFS or OCFS2), on shared raw disk partitions (not the recovery area) or using ASM.
The cluster file systems are out of scope for this document. I will show how to setup an Oracle RAC using shared raw disks for the CRS and the Voting disk and ASM for the database.
Device name persistence
One very important point within a Linux RAC implementation is that each node has the same view of the storage. For example the Voting and CRS disks, and of course the data disks too, must be configured identically on each node, meaning that the Voting disk on node1 must also be the Voting disk on node2. Unfortunately nobody guarantees that - for example - sdb is the same disk on each node (it could be sdd on one node and sdc on another, and after a reboot everything may change again). To make this consistent and persistent throughout the cluster you should configure Linux in such a way that, for example, the Voting disk will always be available under the same name. And this means - let's have some fun with udev.
Let's have a look at one of my nodes. Using fdisk shows me the following.
[root@op710-1-lpar1 ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 1 8001 41 PPC PReP Boot
/dev/sda2 2 14 104422+ 83 Linux
/dev/sda3 15 2610 20852370 8e Linux LVM
Disk /dev/sdb: 536 MB, 536870912 bytes
17 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 1037 * 512 = 530944 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 1011 524173 83 Linux
Disk /dev/sdc: 10.4 GB, 10485760000 bytes
64 heads, 32 sectors/track, 10000 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 10000 10239984 83 Linux
Disk /dev/sdd: 1048 MB, 1048576000 bytes
33 heads, 61 sectors/track, 1017 cylinders
Units = cylinders of 2013 * 512 = 1030656 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 1017 1023580 83 Linux
Ok, I have one system disk sda and three other disks sdb, sdc, sdd available to all nodes in my cluster. Unfortunately on the second node the naming is not the same! So I can't use the default setup.
The first step to get a persistent device name is to find out the scsi_id of each disk. Here's how:
[root@op710-1-lpar1 ~]# scsi_id -g -s /block/sdb
149455400000000000000000000000003004180040fd00008
[root@op710-1-lpar1 ~]# scsi_id -g -s /block/sdc
149455400000000000000000000000002004180020fd00008
[root@op710-1-lpar1 ~]# scsi_id -g -s /block/sdd
149455400000000000000000000000001004180030fd00008
Now that I know the scsi_ids I can use them to identify the disks on the system. Without going into much detail, I will add the following lines to /etc/udev/rules.d/51-by-id.rules - ah yes, it's a RHEL4 example.
[root@op710-1-lpar1 ~]# cat /etc/udev/rules.d/51-by-id.rules
...
KERNEL="sd*[!0-9]", PROGRAM="/sbin/scsi_id", RESULT="149455400000000000000000000000002004180020fd00008", SYMLINK="disk/by-name/data1"
KERNEL="sd*[!0-9]", PROGRAM="/sbin/scsi_id", RESULT="149455400000000000000000000000001004180030fd00008", SYMLINK="disk/by-name/crs"
KERNEL="sd*[!0-9]", PROGRAM="/sbin/scsi_id", RESULT="149455400000000000000000000000003004180040fd00008", SYMLINK="disk/by-name/voting"
KERNEL="sd*[!0-9]", PROGRAM="/sbin/scsi_id", SYMLINK="disk/by-id/scsi-%c"
KERNEL="sd*[0-9]", PROGRAM="/sbin/scsi_id", RESULT="149455400000000000000000000000002004180020fd00008", SYMLINK="disk/by-name/data1-p%n"
KERNEL="sd*[0-9]", PROGRAM="/sbin/scsi_id", RESULT="149455400000000000000000000000001004180030fd00008", SYMLINK="disk/by-name/crs-p%n"
KERNEL="sd*[0-9]", PROGRAM="/sbin/scsi_id", RESULT="149455400000000000000000000000003004180040fd00008", SYMLINK="disk/by-name/voting-p%n"
KERNEL="sd*[0-9]", PROGRAM="/sbin/scsi_id", SYMLINK="disk/by-id/scsi-%c-part%n"
...
I will use this configuration on both nodes. It simply says that udev should look for a device sd* without any numbers/partitions and get its scsi_id. If the result is one of the IDs above it will create a symlink to the device in /dev/disk/by-name and name it based on its future role (e.g. data1, crs, etc.). I'll do the same for each sd* device's partitions so I'll get a symlink in /dev/disk/by-name/ pointing to each partition of the disk.
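Typing these long IDs by hand is error-prone, so here is a sketch that prints the two by-name rules for one disk. The helper name udev_rules_for is hypothetical, and the rule syntax is simply copied from my RHEL4 listing above, so treat the output as a starting point and compare it with that listing before you use it.

```shell
#!/bin/sh
# Sketch: print the two by-name rules (whole disk and partitions) for one disk.
# udev_rules_for is a hypothetical helper; the rule syntax is copied from the
# RHEL4 example above.
udev_rules_for() {
    disk_id=$1      # output of scsi_id for the disk
    disk_name=$2    # role of the disk, e.g. crs, voting or data1
    printf '%s\n' \
        "KERNEL=\"sd*[!0-9]\", PROGRAM=\"/sbin/scsi_id\", RESULT=\"$disk_id\", SYMLINK=\"disk/by-name/$disk_name\"" \
        "KERNEL=\"sd*[0-9]\", PROGRAM=\"/sbin/scsi_id\", RESULT=\"$disk_id\", SYMLINK=\"disk/by-name/$disk_name-p%n\""
}

# Usage with the ID of my crs disk:
udev_rules_for 149455400000000000000000000000001004180030fd00008 crs
```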
After I've created these udev rules there are two little steps to go. First edit /etc/scsi_id.config, comment out the line saying options=-b and uncomment the line saying options=-g.
...
#options=-b
...
options=-g
...
Finally I'll have to restart udev with the command udevstart.
[root@op710-1-lpar1 ~]# udevstart
[root@op710-1-lpar1 ~]# ll /dev/disk/by-name/
total 0
lrwxrwxrwx 1 root root 9 Dec 29 15:45 crs -> ../../sdd
lrwxrwxrwx 1 root root 10 Dec 29 15:45 crs-p1 -> ../../sdd1
lrwxrwxrwx 1 root root 9 Dec 29 15:45 data1 -> ../../sdc
lrwxrwxrwx 1 root root 10 Dec 29 15:45 data1-p1 -> ../../sdc1
lrwxrwxrwx 1 root root 9 Dec 29 15:45 voting -> ../../sdb
lrwxrwxrwx 1 root root 10 Dec 29 15:45 voting-p1 -> ../../sdb1
Now I have persistent device names throughout my system regardless of what the kernel calls the devices.
 | Please note...
This works with SLES9, too. It's similar there but instead of using the KERNEL Option you should specify BUS="scsi". |
Voting and CRS
As mentioned I am planning to use raw disk devices for my Voting and CRS drives. Now that I know exactly which device I'm going to use for what (see above), it's time to bind them to raw devices. To do so I edit the /etc/sysconfig/rawdevices file on RHEL4 - on SLES9 it's called /etc/raw.
[root@op710-1-lpar1 ~]# cat /etc/sysconfig/rawdevices
# This file and interface are deprecated.
# Applications needing raw device access should open regular
# block devices with O_DIRECT.
# raw device bindings
# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw1 /dev/disk/by-name/crs-p1
/dev/raw/raw2 /dev/disk/by-name/voting-p1
Now I know that raw1 is my CRS disk and raw2 is my Voting disk. I'll distribute this file to each cluster node and start the raw devices service with service rawdevices start, or on a SLES9 system with rcraw start.
Don't forget to use chkconfig to enable the raw devices service on each reboot!!
One last step must be done because the device permissions are wrong. By default all raw devices belong to root and the group disk. Therefore I change this to oracle and oinstall or dba. To make this change persistent across reboots, edit the file /etc/udev/permissions.d/50-udev.permissions on a RHEL4 system, or /etc/udev/udev.permissions on SLES9.
[root@op710-1-lpar1 ~]# cat /etc/udev/permissions.d/50-udev.permissions | grep raw
# raw devices
raw/*:oracle:dba:0660
Data disks
Without going into too much detail, I am using ASM for my data disks. I know that you might think that a cluster filesystem like GPFS or raw devices are the only valid way to go, but I think that with ASM you'll get good results very fast.
First I am installing the ASM library driver and utilities, which are available from Oracle.
Afterwards I am configuring ASM with the oracleasm command.
[root@op710-1-lpar1 ~]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
Now you can add devices as ASM drives with the option createdisk.
[root@op710-1-lpar1 ~]# /etc/init.d/oracleasm createdisk VOL1 /dev/disk/by-name/data1-p1
Marking disk "/dev/disk/by-name/data1-p1" as an ASM disk: [ OK ]
I am doing this only on the first node. On the second node I am using the scandisks option.
[root@op710-1-lpar2 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
[root@op710-1-lpar2 ~]# /etc/init.d/oracleasm listdisks
VOL1
 | Please note...
If you are adding additional disks to the cluster, run oracleasm with the scandisks option! This will ensure that each node in the cluster detects the new ASM drive! |
That's it.
Installing Oracle RAC
Now that we've set up our systems it's time to do the installation. First of all download the code from Oracle to a place of your choice.
Unpack the code to a directory of your choice - here's an example for the clusterware:
bc1-mms:/export/oracle/clusterware # gunzip 10201_clusterware_lin_ppc.cpio.gz
bc1-mms:/export/oracle/clusterware # cpio -idmv < 10201_clusterware_lin_ppc.cpio
...skipping some output here...
bc1-mms:/export/oracle/clusterware # ll
total 390877
drwxr-xr-x 3 root root 120 Dec 29 16:12 .
drwxr-xr-x 9 nobody nobody 240 Nov 23 09:19 ..
-rwxr-xr-x 1 nobody nobody 399866368 Jun 20 2006 10201_clusterware_lin_ppc.cpio
drwxr-xr-x 7 51162 42424 200 Nov 15 2005 Disk1
Do the same for the database.
bc1-mms:/export/oracle/10gR2 # gunzip 10201_database_lin_ppc.cpio.gz
bc1-mms:/export/oracle/10gR2 # cpio -idmv < 10201_database_lin_ppc.cpio
...skipping some output here...
bc1-mms:/export/oracle/10gR2 # ll
total 888175
drwxr-xr-x 3 root root 160 Nov 23 09:19 .
drwxr-xr-x 9 nobody nobody 240 Nov 23 09:19 ..
-rwxr-xr-x 1 nobody nobody 908559872 Jun 20 2006 10201_database_lin_ppc.cpio
drwxr-xr-x 5 51162 42424 152 Nov 16 2005 Disk1
 | Please note...
Don't use any unpack options other than those shown here or in Oracle's documentation (for example gunzip -d), because this could cause the installation to fail with some very strange error messages (e.g. OUI-10133: Invalid staging area...) |
The installation itself requires three steps:
- Installation of Oracle's clusterware
- Installation of the ASM instance
- Installation of the Oracle database
Cluster verification utility (cluvfy)
On the clusterware Disk1 you'll find a subdirectory called cluvfy. In this directory you'll find the cluster verification utility (cluvfy) which helps you to check if you've setup your cluster correctly before you actually install it.
How to use cluvfy is documented in the installation guide.
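To give you an idea, the pre-installation check for the clusterware is started through the runcluvfy.sh script in that directory. The sketch below only prints the command for a list of nodes; cluvfy_precheck_cmd is a hypothetical helper name, and /mnt/Disk1 assumes that the clusterware media is mounted at /mnt. Please check the exact options against the installation guide for your release.

```shell
#!/bin/sh
# Sketch: build the pre-installation check command for a list of nodes.
# cluvfy_precheck_cmd is a hypothetical helper; /mnt/Disk1 is assumed to be
# the place where the clusterware media is mounted.
cluvfy_precheck_cmd() {
    nodes=""
    for node in "$@"; do
        # cluvfy expects the node names as one comma-separated list
        nodes="${nodes:+$nodes,}$node"
    done
    echo "/mnt/Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n $nodes -verbose"
}

# Print the command for my two nodes; run the printed line as user oracle
cluvfy_precheck_cmd op710-1-lpar1 op710-1-lpar2
```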
Installation of Oracle's clusterware
First we install the Oracle clusterware. Log in to one node of your choice and change to the directory where you've unpacked the installation files.
 | Tip!
Although it is not necessary, it is recommended that you always install from the same node. This ensures that all installation logs are in one place. |
[root@op710-1-lpar1 ~]# mount 9.154.2.86:/export/oracle/clusterware /mnt
[root@op710-1-lpar1 ~]# cd /mnt/Disk1/
[root@op710-1-lpar1 Disk1]# ls
cluvfy install response runInstaller stage upgrade
Ensure that each client can connect to your X server by using the xhost + command, then start the installer with the runInstaller script.
After a few seconds the Welcome screen appears. Select Next.

Specify the location of the Oracle Inventory and the operating system group. Press Next after you've made your selections.

Now you're asked for the CRS_HOME directory. Change as appropriate and press Next.

You're asked to add all nodes on which you're planning to install Oracle's clusterware. By default only the node where you've started the installation is listed here. To add more nodes, press Add.

You are asked for the public, the vip and the private nodename - as you've specified in /etc/hosts. After you've made your selections press OK.
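Since the installer relies on these names, it can help to confirm beforehand that all three are really present in /etc/hosts on each node. The sketch below is only an illustration: the function name missing_host_entries is made up, and the VIP name op710-1-lpar1-vip is an assumption (only the public and the -priv names appear in this article), so substitute the names you actually configured.

```shell
#!/bin/sh
# Print every given name that has no entry in the hosts file.
# Usage: missing_host_entries <hosts-file> <name>...
missing_host_entries() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }    # ignore comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" || printf '%s\n' "$name"
    done
}

# The -vip name is hypothetical; use your own public, VIP and private names.
missing_host_entries /etc/hosts op710-1-lpar1 op710-1-lpar1-vip op710-1-lpar1-priv
```

If the function prints nothing, all names were found; any name it prints is missing from the file.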

After you've added all nodes, press Next. The installer will now do some checks on the network and if anything is wrong here you won't get any further!

Check whether the installer detects the correct network usage. If not, correct it via Edit; otherwise press Next.

You're asked for the OCR file location. Remember that I've configured it as a raw device, /dev/raw/raw1. I am choosing External Redundancy because I am using a RAID array and I am not paranoid enough to add more redundancy here. After all selections, press Next.

The same now for the Voting disk. Make your selections and after that press Next.

At the end you'll see this Summary screen. Check that everything is all right, and if it is, press Install.

Now the Oracle clusterware is being installed. This will be quite fast.

As soon as it's finished you'll see this screen, asking you to run these scripts. Be careful: it says to run orainstRoot.sh on the first node and then on the other node. Afterwards run the root.sh script on the first node and then on the second node. Any other order will crash your installation!
 | Please note...
It sounds strange, but I've seen some installations fail when doing a su - root. To be on the safe side, use a new terminal session to connect to the system! |
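Because the order matters so much, it can help to write it down as a checklist before you start. The following sketch only prints the sequence and does not run anything on the nodes. The script paths are the ones from my example installation, and the function name print_root_script_order is made up for this illustration.

```shell
#!/bin/sh
# Paths taken from the example session in this article; adjust to your setup.
ORAINST_ROOT=/home/oracle/oraInventory/orainstRoot.sh
CRS_ROOT=/opt/oracle/app/oracle/product/10/app/crs/root.sh

# Print the root scripts in the required order: the first script on every
# node in turn, then the second script on every node in turn.
print_root_script_order() {
    step=1
    for script in "$ORAINST_ROOT" "$CRS_ROOT"; do
        for node in "$@"; do
            printf '%d. as root on %s: %s\n' "$step" "$node" "$script"
            step=$((step + 1))
        done
    done
}

print_root_script_order op710-1-lpar1 op710-1-lpar2
```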

Here's the output of both scripts. Remember: after the first script had finished, I ran it on the second node before I ran the second script on the first node!
[root@op710-1-lpar1 ~]# /home/oracle/oraInventory/orainstRoot.sh
Changing permissions of /home/oracle/oraInventory to 770.
Changing groupname of /home/oracle/oraInventory to oinstall.
The execution of the script is complete
[root@op710-1-lpar1 ~]# /opt/oracle/app/oracle/product/10/app/crs/root.sh
WARNING: directory '/opt/oracle/app/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
/etc/oracle does not exist. Creating it now.
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/oracle/app/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: op710-1-lpar1 op710-1-lpar1-priv op710-1-lpar1
node 2: op710-1-lpar2 op710-1-lpar2-priv op710-1-lpar2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
op710-1-lpar1
CSS is inactive on these nodes.
op710-1-lpar2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
Finally the clusterware does some additional configuration...

...and then SUCCESS!!!

Congratulations - you've just installed a RAC cluster!
The next steps involve installing the ASM instance and then the database. In fact it is like a local installation. So change to the Disk1 subdirectory of the directory where you unpacked the database and execute the runInstaller script.
After a few seconds the "normal" Oracle Universal Installer comes up - and yes, it is like any other Oracle installation. The only difference is that the installer automatically detects the cluster installation and offers you the choice of installing the software on all cluster nodes, as shown in the next picture.

 | Linking fails on Red Hat Enterprise Linux 4 U2 or greater..
On Red Hat Enterprise Linux (RHEL) Version 4 Update 2 systems the linking or relinking of Oracle fails. Although this bug has been known since U2, it is still present on RHEL4 U3 and U4. There's a patch available from Oracle. Log in to Oracle Metalink and search for patch number 4767801. After you've downloaded the patch to the system, unzip it and wait until you see the following error message during the installation.

Now copy the contents to $ORACLE_HOME/lib/stubs and $ORACLE_HOME/lib32/stubs. Here's an example:
[root@bc1-js21-1-lpar1 ~]# unzip 4767801.zip
[root@bc1-js21-1-lpar1 ~]# cp 4767801/stubs64/* $ORACLE_HOME/lib/stubs
[root@bc1-js21-1-lpar1 ~]# cp 4767801/stubs32/* $ORACLE_HOME/lib32/stubs
Finally resume the installation by pressing "RETRY" on the dialog. It should install Oracle without additional errors. |
After you've installed the ASM instance, install the database: execute the runInstaller script once more. As seen before, the Oracle installer detects that it is a cluster installation and therefore offers you the choice of installing the database on all nodes.
Finally you can make a quick check with the crs_stat command. You can find this command in the $CRS_HOME/bin directory.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE op71...par1
ora....R1.lsnr application ONLINE ONLINE op71...par1
ora....ar1.gsd application ONLINE ONLINE op71...par1
ora....ar1.ons application ONLINE ONLINE op71...par1
ora....ar1.vip application ONLINE ONLINE op71...par1
ora....SM2.asm application ONLINE ONLINE op71...par2
ora....R2.lsnr application ONLINE ONLINE op71...par2
ora....ar2.gsd application ONLINE ONLINE op71...par2
ora....ar2.ons application ONLINE ONLINE op71...par2
ora....ar2.vip application ONLINE ONLINE op71...par2
ora.rac.db application ONLINE ONLINE op71...par2
ora....c1.inst application ONLINE ONLINE op71...par1
ora....c2.inst application ONLINE ONLINE op71...par2
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/crs_stat
NAME=ora.op710-1-lpar1.ASM1.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar1
NAME=ora.op710-1-lpar1.LISTENER_OP710-1-LPAR1.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar1
NAME=ora.op710-1-lpar1.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar1
NAME=ora.op710-1-lpar1.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar1
NAME=ora.op710-1-lpar1.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar1
NAME=ora.op710-1-lpar2.ASM2.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
NAME=ora.op710-1-lpar2.LISTENER_OP710-1-LPAR2.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
NAME=ora.op710-1-lpar2.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
NAME=ora.op710-1-lpar2.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
NAME=ora.op710-1-lpar2.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
NAME=ora.rac.db
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
NAME=ora.rac.rac1.inst
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar1
NAME=ora.rac.rac2.inst
TYPE=application
TARGET=ONLINE
STATE=ONLINE on op710-1-lpar2
Well done.
Managing the cluster
After you've successfully installed Oracle RAC, it is important to check and test your cluster. I've already shown the first command, crs_stat, which gives you a quick overview of the services running or not running on all nodes.
There's another important command for managing the cluster (i.e. getting information about the status, starting and stopping cluster services etc.). It's called srvctl. Like crs_stat, the srvctl command is located in the $CRS_HOME/bin directory.
You can use srvctl for example to query the status of the database.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl status database -d rac -v
Instance rac1 is running on node op710-1-lpar1
Instance rac2 is running on node op710-1-lpar2
Here are more examples of getting status information about the cluster.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl status nodeapps -n op710-1-lpar1
VIP is running on node: op710-1-lpar1
GSD is running on node: op710-1-lpar1
Listener is running on node: op710-1-lpar1
ONS daemon is running on node: op710-1-lpar1
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl status nodeapps -n op710-1-lpar2
VIP is running on node: op710-1-lpar2
GSD is running on node: op710-1-lpar2
Listener is running on node: op710-1-lpar2
ONS daemon is running on node: op710-1-lpar2
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl status asm -n op710-1-lpar1
ASM instance +ASM1 is running on node op710-1-lpar1.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl status asm -n op710-1-lpar2
ASM instance +ASM2 is running on node op710-1-lpar2.
You can stop and start services, instances and all cluster services with the srvctl command. If you want to stop the instance on one node, you can issue the following command.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl stop instance -d rac -i rac2
This will stop the instance rac2 of the database rac. The next command stops the node applications on the second node:
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl stop nodeapps -n op710-1-lpar2
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE op71...par1
ora....R1.lsnr application ONLINE ONLINE op71...par1
ora....ar1.gsd application ONLINE ONLINE op71...par1
ora....ar1.ons application ONLINE ONLINE op71...par1
ora....ar1.vip application ONLINE ONLINE op71...par1
ora....SM2.asm application OFFLINE OFFLINE
ora....R2.lsnr application OFFLINE OFFLINE
ora....ar2.gsd application OFFLINE OFFLINE
ora....ar2.ons application OFFLINE OFFLINE
ora....ar2.vip application OFFLINE OFFLINE
ora.rac.db application ONLINE ONLINE op71...par2
ora....c1.inst application ONLINE ONLINE op71...par1
ora....c2.inst application OFFLINE OFFLINE
The example above stops the node applications (VIP, GSD, listener and ONS) on node op710-1-lpar2. As you can see in the crs_stat output, the ASM instance and the database instance rac2 on that node are offline as well.
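On a larger cluster the crs_stat -t listing gets long, so a small filter that prints only the resources that are not ONLINE can be handy. This is just a sketch under the assumption that the output has the column layout shown in this article (two header lines, then Name, Type, Target, State and an optional Host column). The function name offline_resources is made up for this example.

```shell
#!/bin/sh
# Read `crs_stat -t` output on stdin and print the names of all resources
# whose State column (4th field) is not ONLINE. The two header lines are skipped.
offline_resources() {
    awk 'NR > 2 && NF >= 4 && $4 != "ONLINE" { print $1 }'
}

# Sample input: a shortened copy of a crs_stat -t listing from this article.
offline_resources <<'EOF'
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE op71...par1
ora....SM2.asm application OFFLINE OFFLINE
ora....R2.lsnr application ONLINE ONLINE op71...par2
ora....c2.inst application OFFLINE OFFLINE
EOF
```

On a live system you would pipe the real command into it, for example $CRS_HOME/bin/crs_stat -t | offline_resources.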
To (re)start the CRS services, use srvctl with the start option.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl start nodeapps -n op710-1-lpar2
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE op71...par1
ora....R1.lsnr application ONLINE ONLINE op71...par1
ora....ar1.gsd application ONLINE ONLINE op71...par1
ora....ar1.ons application ONLINE ONLINE op71...par1
ora....ar1.vip application ONLINE ONLINE op71...par1
ora....SM2.asm application OFFLINE OFFLINE
ora....R2.lsnr application ONLINE ONLINE op71...par2
ora....ar2.gsd application ONLINE ONLINE op71...par2
ora....ar2.ons application ONLINE ONLINE op71...par2
ora....ar2.vip application ONLINE ONLINE op71...par2
ora.rac.db application ONLINE ONLINE op71...par2
ora....c1.inst application ONLINE ONLINE op71...par1
ora....c2.inst application OFFLINE OFFLINE
As you can see, the services are started by this command, but ASM and the database instance rac2 are still offline. So first start ASM and then the instance.
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl start asm -n op710-1-lpar2
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/srvctl start instance -d rac -i rac2
[oracle@op710-1-lpar1 ~]$ /opt/oracle/app/oracle/product/10/app/crs/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE op71...par1
ora....R1.lsnr application ONLINE ONLINE op71...par1
ora....ar1.gsd application ONLINE ONLINE op71...par1
ora....ar1.ons application ONLINE ONLINE op71...par1
ora....ar1.vip application ONLINE ONLINE op71...par1
ora....SM2.asm application ONLINE ONLINE op71...par2
ora....R2.lsnr application ONLINE ONLINE op71...par2
ora....ar2.gsd application ONLINE ONLINE op71...par2
ora....ar2.ons application ONLINE ONLINE op71...par2
ora....ar2.vip application ONLINE ONLINE op71...par2
ora.rac.db application ONLINE ONLINE op71...par2
ora....c1.inst application ONLINE ONLINE op71...par1
ora....c2.inst application ONLINE ONLINE op71...par2
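The three start commands used above can be wrapped into one small function so that the order (node applications, then ASM, then the instance) is always kept and a failing step stops the sequence. This is only a sketch: the function name start_node is made up, and the srvctl path is the one from my installation. Setting SRVCTL=echo gives a dry run that merely prints the arguments.

```shell
#!/bin/sh
# Path as used in this article; override SRVCTL (e.g. SRVCTL=echo) for a dry run.
SRVCTL=${SRVCTL:-/opt/oracle/app/oracle/product/10/app/crs/bin/srvctl}

# Bring one node back: node applications first, then ASM, then the instance.
# Each step only runs if the previous one succeeded.
# Usage: start_node <node> <database> <instance>
start_node() {
    node=$1
    db=$2
    inst=$3
    "$SRVCTL" start nodeapps -n "$node" &&
    "$SRVCTL" start asm -n "$node" &&
    "$SRVCTL" start instance -d "$db" -i "$inst"
}

# Example call (commented out so that nothing is started by accident):
# start_node op710-1-lpar2 rac rac2
```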
 | Please note...
To see all options you can use with srvctl, run $CRS_HOME/bin/srvctl -h. This will give you a (quite long) list of possible options. |
Links
Below are some links which might be helpful: