Vishwanath Nayak : EXADATA non rolling Patching

QFSDP (Quarterly Full Stack Download Patch), called QFSD in the rest of this post

The patch number is the date of the patch in year, month, day order (YYMMDD), for example 160709.
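As a small illustration of the numbering scheme, the sketch below expands a six-digit label into a full date. The function name `decode_patch_date` is made up for this example, and it assumes the label always belongs to the 2000s.

```shell
# Hypothetical helper: expand a YYMMDD patch label into YYYY-MM-DD.
# Assumption: the two-digit year is always 20xx.
decode_patch_date() {
    label=$1
    case $label in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # exactly six digits
        *) echo "expected six digits, got: $label" >&2; return 1 ;;
    esac
    yy=$(printf '%s' "$label" | cut -c1-2)
    mm=$(printf '%s' "$label" | cut -c3-4)
    dd=$(printf '%s' "$label" | cut -c5-6)
    printf '20%s-%s-%s\n' "$yy" "$mm" "$dd"
}

decode_patch_date 160709   # prints 2016-07-09
```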

This patch includes

- BP (Bundle Patch) -- For the GI and RDBMS homes on the compute nodes. Use the OPatch utility to apply the patch. The OPatch version should be the same across all the compute nodes (GI and RDBMS homes). Running catbundle.sql requires downtime. ./datapatch.sh logs in to the database and runs the necessary scripts, such as catbundle.sql.

- YUM Update (ISO Patching) -- For the compute nodes, including firmware updates. Use dbnodeupdate.sh to apply this patch. While the patch is applied the compute node reboots several times (twice in some cases, up to five in others), and the complete reboot sequence can take about an hour. An error may be reported after the last reboot; you can ignore it, as it is due to an Oracle coding issue. Patch all the compute nodes first, then proceed with cell server patching.

- OJVM Patch -- For the compute nodes. This patch is required only if the customer uses Java libraries.

- IB Switch Patch -- For the InfiniBand switches. This patch is applied with the patchmgr utility, but the utility must be invoked from the ILOM console. You log in to the ILOM console using the spsh utility.

- Cell Patch -- For the cell servers. This patch is applied with the patchmgr utility.
Run the exachk utility first to verify the cell status and check for hardware issues. If there are any issues, fix them before patching.
As prerequisites for rolling patching, the ASM power limit should be at least 4, the disk repair time should be at least 8 hours (the default is 3.6 hours), and ADVM should be set to the GI version.
Cell patching is invoked from a compute node. On that compute node, the cell_groups file (check it with cat /etc/cell_groups) must have an entry for every cell server, otherwise the patch is not applied to all the cell servers.
Reboot the cell servers one by one to catch any boot problems, and run the patchmgr precheck. Then apply the patch with patchmgr, which picks up the cell server IPs from the cell_groups file and patches the cell servers one by one.
Oracle quotes 4 hours for each cell server, but realistically patching can be completed in about 3.5 hours for all the cell servers; it depends on the ASM resync operation.
After patching, the flash cache is dropped, so you have to recreate it, otherwise you will face performance issues.

- PDU Patch -- For the power distribution units (PDUs). Check how many PDUs the rack has. This patch is released once a year, so it appears in the QFSD only once a year. This patch is applied using patchmgr.

High level steps
===============

Cell server patch
-----------------
Check for critical alerts and the status of the disks using dcli
Clear the cell critical alerts, if any
Check the ssh equivalence for both the compute and cell nodes
Check the compute node and cell node uptime; if the uptime is more than 128 days, a reboot is recommended
Stop and disable the CRS
Reboot the cell and compute nodes
Make all griddisks inactive and shut down the cell services
Unzip the QFSD patches
Apply the cell patch
Check imageinfo once the cell patch has completed and activate the griddisks

Bundle Patch (BP): Grid and RDBMS patching
------------------------------------------
Log on to each compute node; the bundle patch can be applied on the nodes in parallel
Apply the JDBC patch on GI only

ISO Patching (Compute Node Patching)
------------------------------------
Make sure that all NFS and ZFS file systems are unmounted and commented out in /etc/fstab
Run the precheck, which reports any conflicting RPMs that need to be removed
Back up the file system, reboot the node and update the image
Bring up the clusterware stack and enable the CRS

IB5 critical workaround
-----------------------
IB switches are not upgraded in every QFSD release. They are upgraded only if your switch is below a specific version and a critical fix is provided
As the root user, locate the IB switches using the command below
ibswitches
Then ssh to the switch
Open the spsh console and run the commands listed under Technical steps

Technical steps
===============

GI_HOME: /oracle_crs/product/11.2.0.4/crs_1
ORACLE_HOME: /oracle/product/11.2.0.4/db_1

Patch location: /oracle/depot/JULY2016-QFSDP/

drwxr-xr-x 2 oracle dba 4096 Sep 16 15:17 16486998
drwxr-xr-x 2 oracle dba 4096 Sep 16 15:18 23727132
drwxr-xr-x 3 oracle dba 4096 Sep 20 09:18 23274210

Cell Patching
==============
Check for any critical alert and status of disks using DCLI
------------------------------------------------------------

dcli -g /root/cell_group -l root "cellcli -e list alerthistory where endTime=null and alertShortName=Hardware and alertType=stateful and severity=critical"

Check the alert history above and the output of the two commands below before you stop the cluster
dcli -g /root/cell_group -l root "cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome"
dcli -g /root/cell_group -l root "cellcli -e list cell attributes cellsrvStatus,msStatus,rsStatus detail"
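The griddisk listing can be long, so a filter helps when scanning it. The sketch below prints only the griddisks whose asmdeactivationoutcome is not Yes. The function name is hypothetical, and it assumes each dcli output line has the layout "<cell>: <griddisk> <asmmodestatus> <asmdeactivationoutcome>".

```shell
# Hypothetical filter for the griddisk listing above.
# Assumed input layout per line:
#   <cell>: <griddisk name> <asmmodestatus> <asmdeactivationoutcome>
# Prints every line whose last field is not "Yes" and returns non-zero
# if at least one such line was found.
report_unsafe_griddisks() {
    awk 'NF >= 4 && $NF != "Yes" { print; bad = 1 } END { exit bad + 0 }'
}

# Intended use on a compute node (not executed here):
# dcli -g /root/cell_group -l root "cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome" | report_unsafe_griddisks
```

An empty result with a zero exit status means every listed griddisk reported Yes.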

Clear the cell critical alerts, if any
--------------------------------------

dcli -g /root/cell_group -l root "cellcli -e DROP ALERTHISTORY ALL"

Check the ssh equivalence for both compute and cell node
--------------------------------------------------------

dcli -g /root/cell_group -l root "hostname -i"
dcli -g /root/dbs_group -l root "hostname -i"
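With many hosts it is easy to miss one that did not answer. The sketch below compares a group file with saved dcli output and lists the hosts that produced no line. The function name and the /tmp file are made up for the example; it assumes one host per line in the group file and that dcli prefixes each output line with "<host>: ".

```shell
# Hypothetical helper: list hosts from a dcli group file that have no
# "<host>: ..." line in a file holding saved dcli output.
# Assumptions: one host per line in the group file; dcli prefixes every
# output line with "<host>: ".
hosts_without_reply() {
    group_file=$1
    output_file=$2
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        # index() == 1 means the line starts with exactly "<host>: "
        awk -v h="$host" 'index($0, h ": ") == 1 { found = 1 } END { exit !found }' \
            "$output_file" || printf '%s\n' "$host"
    done < "$group_file"
}

# Intended use (not executed here):
# dcli -g /root/cell_group -l root "hostname -i" > /tmp/cell_reply.txt
# hosts_without_reply /root/cell_group /tmp/cell_reply.txt
```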

Check the compute node and cell node uptime; if the uptime is more than 128 days, a reboot is recommended
---------------------------------------------------------------------------------------------------------
dcli -g /root/cell_group -l root "uptime"
dcli -g /root/dbs_group -l root "uptime"
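To apply the 128-day rule without reading every line, the uptime output can be filtered. This is a minimal sketch: the function name is hypothetical, and it assumes the usual "<host>: HH:MM:SS up N days, ..." layout that dcli produces around the uptime command.

```shell
# Hypothetical filter: print "<host> <days>" for every host whose uptime
# exceeds a limit (default 128 days).
# Assumed input layout: "<host>: 10:15:01 up 130 days,  3:12, ..."
hosts_over_uptime() {
    limit=${1:-128}
    awk -v limit="$limit" '
        {
            for (i = 1; i < NF; i++) {
                # "up N day(s)," appears only when the host has been up >= 1 day
                if ($i == "up" && $(i + 2) ~ /^days?,?$/ && $(i + 1) + 0 > limit) {
                    host = $1
                    sub(/:$/, "", host)
                    print host, $(i + 1)
                }
            }
        }'
}

# Intended use (not executed here):
# dcli -g /root/cell_group -l root "uptime" | hosts_over_uptime
# dcli -g /root/dbs_group -l root "uptime" | hosts_over_uptime 128
```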

Stop and disable the CRS
-------------------------
dcli -g /root/dbs_group -l root "/oracle_crs/product/11.2.0.4/crs_1/bin/crsctl check crs"
dcli -g /root/dbs_group -l root "/oracle_crs/product/11.2.0.4/crs_1/bin/crsctl stop crs -f"
dcli -g /root/dbs_group -l root "/oracle_crs/product/11.2.0.4/crs_1/bin/crsctl disable crs"

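Reboot the cell and compute nodes
---------------------------------
Reboot the cells first: if the node running dcli is listed in dbs_group, it goes down with the second command.

dcli -g /root/cell_group -l root "shutdown -F -r now"
dcli -g /root/dbs_group -l root "shutdown -F -r now"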

Make all griddisks inactive and shut down the cell services
-----------------------------------------------------------
dcli -g /root/cell_group -l root "cellcli -e alter griddisk all inactive"
dcli -g /root/cell_group -l root "cellcli -e alter cell shutdown services all"

Unzip the QFSD patches

Apply the cell patch using the below commands
-------------------------------------------------

./patchmgr -cells /root/cell_group -reset_force
./patchmgr -cells /root/cell_group -cleanup
./patchmgr -cells /root/cell_group -patch_check_prereq
./patchmgr -cells /root/cell_group -patch

Check imageinfo once the cell patch has completed, then activate the griddisks
-------------------------------------------------------------------------------
dcli -g /root/cell_group -l root imageinfo
dcli -g /root/cell_group -l root "cellcli -e alter griddisk all active"
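A quick way to confirm that the patch landed everywhere is to check that all cells report the same active image. The sketch below does that on the imageinfo output. Both function names are hypothetical, and it assumes that imageinfo prints a line of the form "Active image version: <version>", which dcli prefixes with "<cell>: ".

```shell
# Hypothetical helpers for the imageinfo output above.
# Assumed input line: "<cell>: Active image version: <version>"

# Print the distinct active image versions found on stdin.
active_image_versions() {
    awk -F': ' '$2 == "Active image version" { print $3 }' | sort -u
}

# Succeed only if exactly one distinct version is reported.
cells_on_same_image() {
    count=$(active_image_versions | wc -l | tr -d ' ')
    [ "$count" -eq 1 ]
}

# Intended use (not executed here):
# dcli -g /root/cell_group -l root imageinfo | cells_on_same_image \
#     || echo "cells report different active image versions"
```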

Bundle Patch (BP)
=================
Grid and RDBMS patching
=======================
Log on to each compute node; the bundle patch can be applied on the nodes in parallel

GI_HOME: /oracle_crs/product/11.2.0.4/crs_1
ORACLE_HOME: /oracle/product/11.2.0.4/db_1

% /oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch version
% /oracle/product/11.2.0.4/db_1/OPatch/opatch version

/oracle/product/11.2.0.4/db_1/OPatch/opatch lspatches -oh /oracle/product/11.2.0.4/db_1
% /oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch lsinventory -detail -oh /oracle_crs/product/11.2.0.4/crs_1
% /oracle/product/11.2.0.4/db_1/OPatch/opatch lsinventory -detail -oh /oracle/product/11.2.0.4/db_1

% unzip p23274515_112040_Linux-x86-64.zip
# chown -R oracle:oinstall /u01/app/oracle/patches/23274515

export ORACLE_HOME=/oracle_crs/product/11.2.0.4/crs_1

/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir <UNZIPPED_PATCH_LOCATION>/23274515/23061511
/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir <UNZIPPED_PATCH_LOCATION>/23274515/23054319
/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir <UNZIPPED_PATCH_LOCATION>/23274515/22502505

/oracle/product/11.2.0.4/db_1/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir <UNZIPPED_PATCH_LOCATION>/23274515/23061511
/oracle/product/11.2.0.4/db_1/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir <UNZIPPED_PATCH_LOCATION>/23274515/23054319/custom/server/23054319

# /oracle_crs/product/11.2.0.4/crs_1/crs/install/rootcrs.pl -unlock

/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch napply -oh /oracle_crs/product/11.2.0.4/crs_1 -local <UNZIPPED_PATCH_LOCATION>/23274515/23061511
/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch napply -oh /oracle_crs/product/11.2.0.4/crs_1 -local <UNZIPPED_PATCH_LOCATION>/23274515/23054319
/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch napply -oh /oracle_crs/product/11.2.0.4/crs_1 -local <UNZIPPED_PATCH_LOCATION>/23274515/22502505

Apply JDBC patch on GI only
$ cd <PATCH_TOP_DIR>/23727132

/oracle_crs/product/11.2.0.4/crs_1/OPatch/opatch apply -local

export ORACLE_HOME=/oracle/product/11.2.0.4/db_1

<UNZIPPED_PATCH_LOCATION>/23274515/23054319/custom/server/23054319/custom/scripts/prepatch.sh -dbhome /oracle/product/11.2.0.4/db_1
/oracle/product/11.2.0.4/db_1/OPatch/opatch napply -oh /oracle/product/11.2.0.4/db_1 -local <UNZIPPED_PATCH_LOCATION>/23274515/23061511
/oracle/product/11.2.0.4/db_1/OPatch/opatch napply -oh /oracle/product/11.2.0.4/db_1 -local <UNZIPPED_PATCH_LOCATION>/23274515/23054319/custom/server/23054319
<UNZIPPED_PATCH_LOCATION>/23274515/23054319/custom/server/23054319/custom/scripts/postpatch.sh -dbhome /oracle/product/11.2.0.4/db_1

Make sure that cell patching has completed and the griddisks are active before running the commands below
=========================================================================================================
/oracle_crs/product/11.2.0.4/crs_1/rdbms/install/rootadd_rdbms.sh
/oracle_crs/product/11.2.0.4/crs_1/crs/install/rootcrs.pl -patch

ISO Patching (Compute Node Patching)
====================================
Stop the clusterware
dcli -g /root/dbs_group -l root "/oracle_crs/product/11.2.0.4/crs_1/bin/crsctl check crs"
dcli -g /root/dbs_group -l root "/oracle_crs/product/11.2.0.4/crs_1/bin/crsctl stop crs -f"

Make sure that all NFS and ZFS file systems are unmounted and commented out in /etc/fstab

Unzip p16486998_121232_Linux-x86-64.zip and copy the patch to all compute nodes

The command below runs the precheck and reports any conflicting RPMs that need to be removed

./dbnodeupdate.sh -u -l /ora01/patches/23274210/Infrastructure/12.1.2.3.2/ExadataDatabaseServer_OL6/p23564643_121232_Linux-x86-64.zip -v -N

The command below backs up the file system, reboots the node and updates the image
./dbnodeupdate.sh -u -l /ora01/patches/23274210/Infrastructure/12.1.2.3.2/ExadataDatabaseServer_OL6/p23564643_121232_Linux-x86-64.zip

The command below brings up the clusterware stack and enables the CRS
./dbnodeupdate.sh -c

IB5 critical workaround
=======================
IB switches are not upgraded in every QFSD release. They are upgraded only if your switch is below a specific version and a critical fix is provided

As the root user, locate the IB switches using the command below
ibswitches
Then ssh to the switch
Open the spsh console and run the commands below

-> set /SP/services/http secureredirect=disabled
-> set /SP/services/http servicestate=disabled
-> set /SP/services/https servicestate=disabled
-> exit

Vishwanath Nayak

Sunday, March 18, 2018
