Preparing Linux for the installation of Oracle RAC
Linux® preparation before the installation of Oracle RAC consists of preparing users, specifying various parameters and settings, setting up the network, setting up the file system, and performing some disk tasks.
To prepare Linux for Oracle RAC installation, complete these tasks in the order that they are listed.
Creating users and authentication parameters
- Log in with root authority.
- Create the user named oracle on all of the nodes.
- Create the user group named oinstall on all of the nodes. (One way to perform both steps is shown in the sketch after this list.)
- Use an editor such as vi to add these lines to the /etc/security/limits.conf file, to increase the limits for open files and processes:
oracle hard nofile 65536
oracle soft nproc 2047
oracle hard nproc 16384
# End of file
- Check these four files for the following line, and add it if the line is not already present:
session required pam_limits.so
- /etc/pam.d/sshd
- /etc/pam.d/login
- /etc/pam.d/su
- /etc/pam.d/xdm
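A minimal sketch of the user and group creation steps from the top of this list, run with root authority on every node. The dba group is included here because it is assigned to the ASM data devices later in this document; the exact command options are an assumption, not a requirement from the study.
# create the installation and database groups
groupadd oinstall
groupadd dba
# create the oracle user with oinstall as its primary group and dba as a secondary group
useradd -g oinstall -G dba -m oracle
passwd oracle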
Using ulimit for shell settings
ulimit -n 65536
ulimit -u 16384
export OPATCH_PLATFORM_ID=211
The environment variable OPATCH_PLATFORM_ID indicates Linux on IBM® System z®.
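These commands affect only the current shell. A common way to make them persistent (an assumption here, not a step prescribed by the study) is to add them to the oracle user's shell profile, for example ~oracle/.bash_profile:
# raise the shell limits and set the patch platform ID at login
ulimit -n 65536
ulimit -u 16384
export OPATCH_PLATFORM_ID=211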
Setting Linux kernel parameters
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
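A sketch of how these parameters are typically applied and made persistent, assuming they are added to /etc/sysctl.conf:
# reload the kernel parameters from /etc/sysctl.conf
sysctl -p
# verify an individual parameter
sysctl kernel.sem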
Setting up ssh user equivalence
Oracle uses the ssh protocol for issuing remote commands and uses scp to copy files during the installation. Therefore, the users oracle and root need user equivalence: the ability to use ssh to move from one node to another without authenticating with passwords or passphrases. Set up ssh user equivalence between each of the interfaces on all of the nodes in the cluster with public and private authentication key pairs that are generated with the ssh-keygen command. Instructions on how to use ssh-keygen to create public and private security keys are available in the ssh-keygen man page.
When the ssh-keygen command issues a prompt to enter a passphrase, just press Enter so that no passphrase is required.
For example, an installation with two Oracle RAC servers has a total of twenty-four key pairs (three interfaces: public, VIP, and interconnect; two users: root and oracle; and four directions: node1 to node1, node1 to node2, and the reverse).
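A minimal sketch of the key exchange for one user and one direction, using the host names from this study; repeat it for each user, interface, and direction:
# run as the oracle user on db-node1
ssh-keygen -t rsa                  # press Enter at the passphrase prompts
cat ~/.ssh/id_rsa.pub | ssh db-node2 'mkdir -p ~/.ssh; cat >> ~/.ssh/authorized_keys'
ssh db-node2 hostname              # should return db-node2 without a password prompt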
Setting up the network for Oracle RAC
Each server node in the cluster needs two physical connections and three IP interfaces, which must be set up before beginning the installation. As discussed earlier in Network device type for the Oracle interconnect, the private interconnect between server nodes should use the device that is fastest and can handle the most throughput and traffic. For this reason, the study used HiperSockets on IBM System z.
For the public access, a 1 Gb Ethernet connection was used on each server, configured to have two interfaces, the second one being an alias. The interface configuration for the public connection included these settings:
NAME='IBM OSA Express Network card (0.0.07c0)'
IPADDR='10.10.10.200'
NETMASK='255.255.255.0'
By using the IP address 10.10.10.200 and a netmask of 255.255.255.0, the hardware device remains available to any other interface with an address of the form 10.10.10.xxx, and the clusterware startup scripts create an alias for the public interface when the node starts CRS. For example, /etc/hosts entries for a node's public interface and its VIP alias take this form:
10.10.10.200 rac-node1 rac-node1.pdl.pok.ibm.com
10.10.10.202 rac-node1vip rac-node1vip.pdl.pok.ibm.com
In an Oracle RAC system, the string vip in an interface name stands for virtual IP and identifies its role. Having two interfaces for the same Ethernet connection supports immediate failover within the cluster: if a node is not responding, the VIP address is moved to another node in the cluster faster than a hardware timeout could be recognized and processed. The /etc/hosts entries for the public, VIP, and private interconnect addresses of both nodes in this study were:
10.10.10.200 db-node1 db-node1.pdl.pok.ibm.com
10.10.10.201 db-node2 db-node2.pdl.pok.ibm.com
10.10.10.202 db-node1vip db-node1vip.pdl.pok.ibm.com
10.10.10.203 db-node2vip db-node2vip.pdl.pok.ibm.com
10.10.50.200 db-node1priv db-node1priv.pdl.pok.ibm.com
10.10.50.201 db-node2priv db-node2priv.pdl.pok.ibm.com
The ifconfig output for the public interface net0 and its alias net0:1 looks like this:
net0 Link encap:Ethernet HWaddr 00:14:5E:78:1D:14
inet addr:10.10.10.200 Bcast:10.10.10.255 Mask:255.255.255.0
inet6 addr: fe80::14:5e00:578:1d14/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1492 Metric:1
RX packets:12723074 errors:0 dropped:0 overruns:0 frame:0
TX packets:13030111 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2164181117 (2063.9 Mb) TX bytes:5136519940 (4898.5 Mb)
net0:1 Link encap:Ethernet HWaddr 00:14:5E:78:1D:14
inet addr:10.10.10.203 Bcast:10.10.10.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1492 Metric:1
The following udev rule renames the interface for the OSA device at 0.0.07c0 to net0:
SUBSYSTEM=="net", ACTION=="add", ENV{PHYSDEVPATH}=="*0.0.07c0", IMPORT="/lib/udev/rename_netiface %k net0"
A separate physical connection becomes the private interconnect used for Oracle cache fusion, where the nodes in the cluster exchange cache memory and messages in order to maintain a unified cache for the cluster. This study used IBM System z HiperSockets in the first installation and named the interface db-node1priv, because the Oracle convention is to call the interconnect the private connection.
Choosing the type of connectivity for the private interconnect is an important decision for a new installation; the objectives are high speed and the ability to transfer large amounts of data.
The interconnect addresses for both nodes are shown in the /etc/hosts example above. The first node (node1) uses IP address 10.10.50.200 and host name db-node1priv. The second node (node2) uses IP address 10.10.50.201 and host name db-node2priv.
Using a 10.x.x.x network for the external (public) interfaces
The setup for the study required circumventing an Oracle RAC requirement that the external RAC IP address and its alias come from the range of public IP addresses. The setup needed to use an IP address of the form 10.10.x.x, which falls in a private (internal) address range.
Oracle RAC would not work until a change was made in $CRS_HOME/bin/racgvip. To do the same thing on your system, either set the variable DEFAULTGW to an IP address that can always be pinged successfully (it is never actually used; Oracle only checks whether it is reachable), or search for FAIL_WHEN_DEFAULTGW_NOT_FOUND and change its value to 0.
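A sketch of locating the relevant settings in the script; the path and variable names are taken from the description above, so verify them against your own copy of racgvip:
# show the lines that control the default gateway check
grep -n "DEFAULTGW" $CRS_HOME/bin/racgvip
# then edit the script so that, for example, FAIL_WHEN_DEFAULTGW_NOT_FOUND=0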
Using udev when preparing to install Automatic Storage Management (ASM)
To set up shared storage for Oracle RAC on IBM System z, some attributes of the DASD storage devices must be modified in Linux. In SLES, the configuration files in /etc/udev/rules.d are read shortly after the kernel is loaded, before Linux creates the file structures for disk storage and before it assigns names and attributes to the known disk devices and partitions.
The udev rules can be used to change the ownership of the block devices that are used for the OCR and voting disks. It is also necessary to alter the attributes of the block devices that will be given to ASM to manage as shared storage for data. Shared DASD that is used for data and managed by ASM must be assigned the owner oracle and the group dba.
For this study, a new rules file was created with a high number (98) at the start of its file name, so that it is read last and the setup changes are not overwritten by other startup processing. The udev rule syntax differs even between SLES 10 SP1 and SLES 10 SP2, so check the man pages or documentation for udev on your system to ensure that the rules work as expected.
# for partitions import parent information
KERNEL=="*[0-9]", IMPORT{parent}=="ID_*"
# OCR disks
KERNEL=="dasdf1", OWNER="oracle", GROUP="oinstall", MODE="0660"
KERNEL=="dasdp1", OWNER="oracle", GROUP="oinstall", MODE="0660"
# VOTING DISKS
KERNEL=="dasdg1", OWNER="oracle", GROUP="oinstall", MODE="0660"
KERNEL=="dasdq1", OWNER="oracle", GROUP="oinstall", MODE="0660"
# ASM data disks
KERNEL=="dasdh1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="dasdi1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="dasdj1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="dasdk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="dasdm1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="dasdn1", OWNER="oracle", GROUP="dba", MODE="0660"
After the rules file is in place, restart udev so that the rules take effect:
/etc/init.d/boot.udev restart
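A quick check that the new ownership and mode were applied (a suggested verification, not a step from the study):
# OCR and voting devices should show owner oracle, group oinstall; ASM data devices show group dba
ls -l /dev/dasdf1 /dev/dasdg1 /dev/dasdh1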
Setting up persistent names for disk devices
Linux assigns names to all devices that it discovers at startup, in the order in which it discovers them, starting with dasda (or sda for SCSI) and continuing in that pattern. Even with a small number of disks used from a SAN, the order can change from one Linux startup to the next. For example, if one disk in the sequence becomes unavailable, all the disks that follow it shift to a different name in the series. The naming order might change in a way that affects individual nodes differently, which makes management of the disks complicated and error-prone.
Producing device names that are the same on different Linux systems and persistent across reboots requires unambiguous names such as /dev/disk/by-path, /dev/disk/by-id, or /dev/disk/by-uuid. The problem is that those names did not fit into the spaces provided for them in the ASM GUI installer. It is possible to use these names with the silent install method, which runs a script and uses a response file to complete the installation. The problem with the silent install approach for a first installation is that there is no interactive error checking, so if any of the input is unacceptable there is no way to remove a failed installation.
This study employed a workaround for this issue: the dasd= parameter in /etc/zipl.conf was used to fix the order in which the disks are discovered at startup. With the order controlled this way, it is possible to use the partition device names with confidence that the naming is consistent among the nodes and will not change with a reboot.
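A minimal sketch of the idea, with hypothetical device numbers; the dasd= list must match the devices on your own system:
# /etc/zipl.conf (excerpt): fix the discovery order of the DASD devices
parameters = "root=/dev/dasda1 dasd=0201,0202,0301-0306"
# rewrite the boot loader after editing the file
zipl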
When Linux runs as a guest on z/VM®, the order of the disks is controlled by the order in which they are defined to the guest, for example in the user directory.
The disks that ASM manages do not appear in /etc/fstab and are therefore easy to lose track of, so careful disk management is needed to avoid errors.
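One way to keep an inventory of the DASD devices and their partitions on Linux on System z (a suggestion, not a step from the study):
# list the DASD devices known to this Linux system
lsdasd
# show all block devices and partitions that the kernel knows about
cat /proc/partitions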