Auto detect java.lang.OutOfMemoryError

Oracle WebLogic Server

While running WebLogic managed servers, we hit java.lang.OutOfMemoryError (OOM) conditions, and sometimes the process went down in the middle of the night when no one was monitoring it. This severely hurt our productivity. To deal with this kind of issue, we created the script below. It runs as a cron job every 10 minutes, detects an out-of-memory error or a dead server process, backs up the error logs for later analysis, and then bounces the managed server automatically.

You can schedule the autocheck.sh script as a cron job, as shown below, so that it checks for OOM errors or process failures every 10 minutes.

-bash-3.2$ crontab -e

*/10 * * * * /opt/apps/mgr/autocheck.sh
:wq 

-bash-3.2$ crontab -l 
*/10 * * * * /opt/apps/mgr/autocheck.sh
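
A restart can take longer than the 10-minute cron interval, so two runs of autocheck.sh could overlap. Below is a minimal sketch of a guard against that. The lock directory /tmp/autocheck.lock and the helper name run_locked are assumptions for illustration, not part of the original script.

```shell
#!/bin/bash
# Hypothetical guard: not part of the original autocheck.sh.
# LOCKDIR is an assumed path; pick any location writable by the cron user.
LOCKDIR="${LOCKDIR:-/tmp/autocheck.lock}"

# Run the given command only if no other run holds the lock.
# mkdir is atomic, so only one process can create the directory.
run_locked() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        "$@"
        local status=$?
        rmdir "$LOCKDIR"
        return $status
    fi
    echo "$(date) :Previous run still active, skipping."
    return 1
}

# Example: run_locked /opt/apps/mgr/autocheck.sh
```

Saved as a small wrapper script that ends with the example call, this could replace the direct crontab entry. If a run is killed before it removes the lock, the stale directory has to be deleted by hand.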


Below is the script for autocheck.sh

Change ENVNAME, WORKDIR, msName, Log_LOC, Tmp_LOC, Cac_LOC, and the email addresses to match your environment.

#!/bin/bash
##########
ENVNAME="Production Env:"
COUNTER="0"
WORKDIR="/opt/apps/mgr"
msName="myprodserver1"
Log_LOC="$WORKDIR/WLS/user_projects/domains/wls_mydomain/servers/AdminServer/logs"
Tmp_LOC="$WORKDIR/WLS/user_projects/domains/wls_mydomain/servers/AdminServer/tmp"
Cac_LOC="$WORKDIR/WLS/user_projects/domains/wls_mydomain/servers/AdminServer/cache"
StartScritDir="$WORKDIR/WLS/user_projects/domains/wls_mydomain/bin"
##############################################
## You can add any number of log files to monitor
## for the specified out of memory errors
##############################################
LogFiles=( AdminServer.log AdminServer-diagnostic.log wls_app1266.log )
##########
###NOTIFICATIONS###
# "To" address for all notification mails (replace "(at)" with "@" in real addresses)
EMAIL="notification_list(at)techpaste.com"
# CC list for the notification mail
CCList="mymail(at)techpaste.com"
# "From" address for the notification mail
FromAdd="mymail(at)techpaste.com"
##########
######End to be modified######

######Do not make modifications below######
###########################
#Functions
##########################
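OOMCheck() {
# Count lines containing the OOM marker across all monitored log files
for logfile in "${LogFiles[@]}" ;do
if [ -f "$Log_LOC/$logfile" ]; then
OOMCount=$(grep -c "java.lang.OutOfMemoryError" "$Log_LOC/$logfile")
COUNTER=$((COUNTER + OOMCount))
fi
done
export COUNTER
}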
Clean() {
echo "`date` :Clearing Temp and Cache Folders...";
# :? aborts if the variable is empty, so rm can never expand to /*
if [ -d "$Tmp_LOC" ]; then
rm -rf "${Tmp_LOC:?}"/*
fi
if [ -d "$Cac_LOC" ]; then
rm -rf "${Cac_LOC:?}"/*
fi
}

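CleanLogs() {
# Replace any earlier archive with a fresh one, then remove the live log
for logfile in "${LogFiles[@]}" ;do
if [ -f "$Log_LOC/$logfile.tar.gz" ]; then
rm -f "$Log_LOC/$logfile.tar.gz"
fi
if [ -f "$Log_LOC/$logfile" ]; then
# -C stores the file under its bare name and avoids tar's leading-slash warning
tar -czf "$Log_LOC/$logfile.tar.gz" -C "$Log_LOC" "$logfile"
rm -f "$Log_LOC/$logfile"
fi
done
}

BackupLogs() {
# Archive each monitored log next to the original
for logfile in "${LogFiles[@]}" ;do
if [ -f "$Log_LOC/$logfile" ]; then
tar -czf "$Log_LOC/$logfile.tar.gz" -C "$Log_LOC" "$logfile"
fi
done
}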

ProcCheck() {
# Find the PID(s) of the running java process for this server, ignoring zombies
PID=`ps -eaf | grep -v grep | grep java | grep "$msName" | grep -v "<defunct>" | awk '{ print $2 }'`
export PID
}
KillProc() {
ProcCheck
# Nothing to kill if the process is already down
if [ -n "$PID" ]; then
kill -9 $PID
fi
}

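StartServ() {
# Stop if the start script directory is missing
cd "$StartScritDir" || exit 1
# Adjust the script name to match your domain; stock WebLogic domains ship
# startWebLogic.sh and startManagedWebLogic.sh (file names are case-sensitive)
./startWeblogic.sh "$msName"
}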

Main() {
OOMCheck
ProcCheck
if [[ "$COUNTER" != "0" || "$PID" == "" ]] ; then
echo "`date` :Out Of Memory Condition Detected or Server Process is Down..."
echo "`date` :Bouncing the Env..";
echo "`date` :Backing up logs for future reference.."
BackupLogs
echo "$ENVNAME Seems To Be Down. Auto Restarting the server..." | mail -s "$(echo -e "Auto-Msg: $ENVNAME :Java Process Is Down Or OutOfMemory Error\nContent-Type: text/html")" $EMAIL -c $CCList -- -f $FromAdd
KillProc
Clean
CleanLogs
StartServ
echo "$ENVNAME : Server restarted and env back online." | mail -s "$(echo -e "Auto-Msg:[RESTART-COMPLETED] $ENVNAME :Java Process Is Down Or OutOfMemory Error\nContent-Type: text/html")" $EMAIL -c $CCList -- -f $FromAdd

else
echo "`date` :Exiting, No OOM or Server Process downtime found...";
exit 0
fi
}
Main
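
Note that CleanLogs deletes the previous <logfile>.tar.gz before creating a new one, so each bounce overwrites the backup from the last incident. If you want to keep the logs from every incident, one option is to put a timestamp in the archive name. This is only a sketch: the function name BackupLogsStamped and the archive naming scheme are assumptions, not part of the original script.

```shell
#!/bin/bash
# Hypothetical variant of BackupLogs that never overwrites an earlier archive.
# Usage: BackupLogsStamped <log_dir> <logfile>...
BackupLogsStamped() {
    local log_dir="$1"; shift
    local stamp logfile
    stamp=$(date +%Y%m%d_%H%M%S)
    for logfile in "$@"; do
        if [ -f "$log_dir/$logfile" ]; then
            # -C stores the file under its bare name instead of the full path
            tar -czf "$log_dir/$logfile.$stamp.tar.gz" -C "$log_dir" "$logfile"
        fi
    done
}

# Example: BackupLogsStamped "$Log_LOC" "${LogFiles[@]}"
```

With this approach the archives accumulate, so you would probably want to prune old ones from time to time.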

 

Sample Output:

-bash-3.2$ ./autocheck.sh
Wed Aug 28 08:44:10 PDT 2013 :Exiting, No OOM or Server Process downtime found...
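
To try the detection logic without touching a live server, you can point the same grep at a throwaway log file. The sketch below is a test aid only: CountOOM is a hypothetical helper that mirrors OOMCheck but takes the log directory and file names as arguments instead of reading Log_LOC and LogFiles.

```shell
#!/bin/bash
# Hypothetical helper mirroring OOMCheck; prints the total number of
# lines containing java.lang.OutOfMemoryError across the given files.
# Usage: CountOOM <log_dir> <logfile>...
CountOOM() {
    local log_dir="$1"; shift
    local total=0 count logfile
    for logfile in "$@"; do
        # Skip files that do not exist
        [ -f "$log_dir/$logfile" ] || continue
        # grep -c exits non-zero on zero matches but still prints 0
        count=$(grep -c "java.lang.OutOfMemoryError" "$log_dir/$logfile" || true)
        total=$((total + count))
    done
    echo "$total"
}

# Example with a throwaway directory:
#   tmp=$(mktemp -d)
#   echo "java.lang.OutOfMemoryError: Java heap space" > "$tmp/test.log"
#   CountOOM "$tmp" test.log
```

Any non-zero count is what makes the Main function in autocheck.sh decide to bounce the server.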

Please let me know in the comments if you have any doubts about how to run it. I will try to reply to all your queries as soon as I can.


4 Responses

  1. Ahmad Azkan says:

    Hello,
    Stumbled across your site while searching for something similar.
    May I use your codes and make changes as necessary?

    Regards
    Ahmad
