WebLogic automatic server migration

Oracle WebLogic Server

When designing a highly available production system, you must eliminate single points of failure and you must provide mechanisms to ensure your system continues to operate without breaking your service level agreements. As you know, WebLogic Server clustering provides an excellent first line of defense against single points of failure and provides automatic failover when a server instance or machine fails.
However, several situations may require automatic migration of either a whole WebLogic Server instance or an individual service. If you have singleton services running in your cluster, a server process or machine failure may require you to migrate that service to another running instance in the cluster. If you lose a machine, you may need to migrate one or more WebLogic Server instances to another machine to allow the system to continue to service requests without prolonged periods of degraded performance.

In this section, we talk about three types of migration: service migration, whole server migration, and admin server migration.

Service Level Migration

WebLogic Server supports running singleton services in a cluster such that the service will always run in one server at a time. The most popular singleton service is a JMS server, which typically includes one or more destinations. WebLogic Server also provides application developers the ability to build their own custom singleton services. We discuss both topics in this section.

WebLogic JMS provides clustering facilities to allow you to build JMS applications that are resilient to server failure. However, the fact remains that when a server hosting one or more JMS destinations fails, it is very likely that the destinations contain undelivered messages. If the JMS messages represent time-sensitive tasks that need to be processed, your job as a WebLogic Server administrator is to provide a failover mechanism that allows those trapped messages to be delivered in a timely fashion. JMS service migration is one way to achieve this.

Migrating JMS Services

WebLogic Server supports both manual and automatic JMS service migration. If you are using JTA transactions with your JMS application, you will also need to migrate the JTA service. To set up manual JMS service migration, you will need to perform a number of steps. There are many options and variations on the configuration; we list the primary steps to get JMS service migration working.

1. Create your machines and assign the managed servers to the appropriate machines.
2. WebLogic Server automatically creates migratable targets for your clustered managed servers. However, you still need to configure them to make sure that the correct User-Preferred Server is selected and that the Service Migration Policy is set to Manual Service Migration Only.
3. Create and target custom persistent stores for each migratable target. These will be used to store any persistent JMS messages.
4. Create your JMS servers and target them to the migratable targets.
5. If any migration policies were modified, you need to restart the admin server and any managed servers affected.
6. To manually migrate a JMS service, use the migratable targets’ Control tab. You can also use WLST (or a custom JMX program) to perform manual service migration.

 

Manual service migration is good for situations where you have an external HA framework that can invoke a migration script. Many WebLogic Server installations simply do not need an external HA framework, however, so the fact that WebLogic Server provides a framework for automatic migration is a real benefit to administrators.
Setting up automatic JMS service migration requires the same steps as setting up manual migration plus a couple more. Before we cover the steps required to perform automatic migration, we need to talk about leasing and automatic service migration policies.

When performing automatic migration, WebLogic Server needs a leasing mechanism to ensure that the service only runs on one server at a time. WebLogic Server supports two leasing mechanisms:

Database-Based Leasing

This style of leasing relies on a highly available database to coordinate the actions of the servers in the cluster. It is important that you ensure that the database is always available and reachable by each migratable server. A migratable server is only as reliable as the database. If a migratable server is unable to reach the database, it will shut itself down.
Leasing information is maintained in a database table. The schema definition for the table is located in a database vendor–specific directory underneath the $WL_HOME/server/db directory in a file called leasing.ddl. You must configure a nontransactional data source to access the leasing information. To tell WebLogic Server about the database configuration, use the cluster’s Migration Configuration tab to set the Migration Basis to Database and set the Data Source for Automatic Migration attribute to point to the nontransactional data source for the database where you created the leasing table. Change the Auto Migration Table Name attribute if you named the leasing table something other than the default value of ACTIVE.

Consensus-Based Leasing

This style of leasing keeps the leasing table in-memory. One server in the cluster is designated as the cluster leader. The cluster leader controls leasing in that it holds a copy of the leasing table in-memory and other servers in the cluster communicate with the cluster leader to determine lease information. The leasing table is replicated across the cluster to ensure high availability should the cluster leader go down.

To tell WebLogic Server you want to use consensus-based leasing, use the cluster’s Migration Configuration tab to set the Migration Basis to Consensus. Note that consensus-based leasing requires the use of node manager on every machine hosting managed servers within the cluster.

***Database leasing requires a highly available database. Your migratable targets are only as reliable as the database. If the database becomes unavailable, the migratable servers will shut themselves down.
***Consensus leasing requires the use of the node manager on every machine hosting managed servers in your cluster.
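To make the leasing idea concrete, here is a minimal sketch of a lease table in Python. It is not WebLogic Server's implementation: the `LeaseTable` class, its method names, and the 30-second lease length are all hypothetical, chosen only to show how a lease keeps a service on one server at a time, whether the table lives in a database or in the cluster leader's memory.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    owner: str         # server currently holding the lease
    expires_at: float  # time after which the lease counts as stale


class LeaseTable:
    """Toy stand-in for the leasing table (hypothetical, for illustration)."""

    def __init__(self, lease_seconds: float = 30.0) -> None:
        self.lease_seconds = lease_seconds  # assumed value, not a WebLogic default
        self._leases: Dict[str, Lease] = {}

    def try_acquire(self, service: str, server: str, now: float) -> bool:
        """Grant or renew the lease unless another server holds a live one."""
        lease = self._leases.get(service)
        if lease is not None and lease.owner != server and lease.expires_at > now:
            return False  # someone else still owns the service
        self._leases[service] = Lease(server, now + self.lease_seconds)
        return True

    def owner(self, service: str, now: float) -> Optional[str]:
        """Return the server that currently holds a live lease, if any."""
        lease = self._leases.get(service)
        if lease is None or lease.expires_at <= now:
            return None
        return lease.owner
```

In this model, a server that stops renewing (because it crashed or lost contact with the table) simply lets its lease expire, and the next candidate that calls `try_acquire` takes over the service.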

WebLogic Server has two automatic service migration policies from which to choose.

Auto-Migrate Exactly-Once Services: With this policy, WebLogic Server will try to keep the service running if at least one candidate server is available in the cluster, even when an administrator shuts down a server on which the service is running. Note that this can lead to all migratable targets running on a single server.

Auto-Migrate Failure Recovery Services: With this option, WebLogic Server will not try to migrate services when the User-Preferred Server (UPS) is shut down by the administrator. If the UPS goes down for any other reason, WebLogic Server will try to migrate the service to another candidate server. If the candidate server also goes down, WebLogic Server will first try to reactivate the service on the UPS before searching for another candidate server.
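The difference between the two policies can be sketched as a small decision function. This is only a model of the behavior described above, not WebLogic Server code; the `Policy` enum and `next_host` function are hypothetical names. The text does not say what happens under the failure recovery policy when an administrator stops a host other than the UPS, so the sketch treats that case like a failure, which is an assumption.

```python
from enum import Enum
from typing import List, Optional


class Policy(Enum):
    EXACTLY_ONCE = "Auto-Migrate Exactly-Once Services"
    FAILURE_RECOVERY = "Auto-Migrate Failure Recovery Services"


def next_host(policy: Policy, ups: str, stopped: str,
              admin_shutdown: bool, running: List[str]) -> Optional[str]:
    """Pick where the service should run after `stopped` goes down.

    Returns None when the service stays down (no migration).
    """
    candidates = [s for s in running if s != stopped]
    if policy is Policy.FAILURE_RECOVERY:
        # An administrative shutdown of the UPS does not trigger migration.
        if stopped == ups and admin_shutdown:
            return None
        # When a non-UPS host goes down, try the UPS first.
        if ups in candidates:
            return ups
    # Otherwise use any remaining candidate, if there is one.
    return candidates[0] if candidates else None
```

For example, with the UPS set to Server1, an administrative shutdown of Server1 returns Server2 under `EXACTLY_ONCE` but `None` under `FAILURE_RECOVERY`.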

For our purposes with migrating JMS servers that only contain uniform distributed destination members, we will select the Auto-Migrate Failure Recovery Services option. This means that if we plan to shut a server down for an extended period of time, we will need to manually migrate the service before shutting the server down; otherwise, the service will be unavailable. This is fine because the only reason we want to migrate the service is to process any message stuck in the queue. Our application will continue to function without the service because the other distributed destination members are still available.

If the JMS destinations had not been part of a distributed destination and our application depended on access to the destinations, we would have selected the Auto-Migrate Exactly-Once Services option to ensure that the destinations were made available as quickly as possible to prevent our application from failing for an extended period of time.
So, now that you understand the leasing and automatic service migration policies, you are ready to configure automatic JMS service migration. As with manual migration, there are many options and variations on the configuration; we list the primary steps to get automatic JMS service migration working.

1. Create your machines and assign the managed servers to the appropriate machines.
2. Use the cluster’s Migration Configuration tab to set the Migration Basis to either Database or Consensus. If you choose Database:
a. Create the leasing table, as described previously.
b. Create a nontransactional JDBC data source for the servers to use to access the leasing table.
c. On the cluster’s Migration Configuration tab, set the Data Source for Auto Migration attribute to point to your nontransactional data source and verify that the Auto Migration Table Name attribute is set to the name of your leasing table.

If you choose Consensus:
a. Make sure that you configure the node manager on each machine that hosts the cluster’s managed servers.
3. WebLogic Server automatically creates migratable targets for your clustered managed servers. However, you still need to use the migratable targets’ Migration Configuration tab to make sure that the correct User-Preferred Server is selected and that the Service Migration Policy is set to either Auto-Migrate Exactly-Once Services or Auto-Migrate Failure Recovery Services.
4. Create and target custom persistent stores for each migratable target. These will be used to store any persistent JMS messages.
5. Create your JMS servers and target them to the migratable targets.
6. Restart the admin server and any managed servers to pick up the new migration policy settings.
7. Even with automatic migration configured, you can still manually migrate a JMS server using the migratable targets’ Control tab (or WLST), if desired.

One important thing to note is that, as of the time of writing, WebLogic Server does not support automatic failback of migrated JMS services. You will need to perform this task manually, either via the WebLogic Console or a WLST script.
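As a starting point for such a script, the helper below formats a call to the WLST `migrate` command. It is a sketch built on assumptions: the argument order (source server, destination, source-down flag, destination-down flag, migration type) and the type names follow the WLST command reference as best recalled, so check them against the reference for your release before use. The function name `wlst_migrate_call` and the server names are hypothetical.

```python
ALLOWED_TYPES = {"all", "jms", "jta", "server"}  # assumed set of migration types


def wlst_migrate_call(source_server: str, destination: str,
                      source_down: bool, destination_down: bool,
                      migration_type: str = "jms") -> str:
    """Return the text of a WLST migrate(...) call; does not contact a domain."""
    if migration_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown migration type: {migration_type}")

    def flag(value: bool) -> str:
        # The flags are written here as the quoted strings 'true' / 'false'.
        return "true" if value else "false"

    return (f"migrate('{source_server}', '{destination}', "
            f"'{flag(source_down)}', '{flag(destination_down)}', "
            f"'{migration_type}')")


# Failing the JMS services back from Server2 to Server1, both servers running:
print(wlst_migrate_call("Server2", "Server1", False, False))
```

The generated line is meant to be run inside a WLST session that is already connected to the admin server.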

 

Migrating the JTA Service

When machines fail, you need to be able to bring up services on other machines. Migrating the JTA service can play a critical role in recovery from a failure scenario. In-flight transactions can hold locks on the underlying resources. If the transaction manager is not available to recover these transactions, resources may hold on to these locks for long periods of time, making it difficult for the application to function properly. JTA service migration is possible only if the server’s default persistent store (where the JTA logs are kept) is accessible to the server to which the service will migrate. Once you guarantee this, migration is simple, although you must be careful how you share these files. Distributed file systems such as NFS typically do not provide the necessary semantics to guarantee the integrity and content of transaction logs. Typically, this means using some higher-end means of sharing the files, such as a multi-ported disk or storage area network (SAN).

Like JMS service migration, you can configure either manual or automatic JTA service migration. Because one of the most common use cases for JTA service migration is using it in conjunction with JMS service migration, we only point out the additional steps needed to allow JTA service migration. To add manual JTA service migration to a cluster already configured to support manual JMS service migration, you must ensure that each managed server’s default persistent store is accessible from the other managed servers to which you want to be able to migrate the JTA service. By default, WebLogic Server expects to locate a server’s default persistent store in the $DOMAIN_HOME/servers/<server-name>/data/store/default directory. For example, say that you want to be able to migrate Server2’s JTA service to Server1 in the event of failure. That means that on Server1’s machine, the directory $DOMAIN_HOME/servers/Server2/data/store/default must contain Server2’s default persistent store. The Server2 directory structure would typically not exist on Server1’s machine, so you would need to create it there and back it with the shared storage. Be aware, though, that a Server2 directory structure on Server1’s machine might confuse another WebLogic Server administrator, so you might want to think twice before adopting this strategy.

A better approach might be to store the default persistent store directories for all managed servers using a common mount point outside the domain directories on each machine (for example, /mount/<server-name>/defaultstore). Once you do this, you need to reconfigure your managed server to use this directory by using the server’s Service Configuration tab and setting the Directory attribute to the absolute path to the directory on the shared file system.
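The two directory layouts discussed above can be written down as a pair of path helpers. This is only an illustration: the function names are hypothetical, and the `/mount` root and `/domains/prod` domain home are example values, not defaults.

```python
import posixpath


def default_store_dir(domain_home: str, server_name: str) -> str:
    """Default location of a server's default persistent store."""
    return posixpath.join(domain_home, "servers", server_name,
                          "data", "store", "default")


def shared_store_dir(mount_root: str, server_name: str) -> str:
    """Per-server store directory under a common mount point."""
    return posixpath.join(mount_root, server_name, "defaultstore")


for server in ("Server1", "Server2"):
    print(server, default_store_dir("/domains/prod", server),
          shared_store_dir("/mount", server))
```

With the shared layout, the path for Server2’s store is the same on every machine, so whichever server takes over the JTA service finds the transaction logs in the expected place.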

To add automatic JTA service migration to your cluster already configured to support automatic JMS service migration, you must do the following things:

1. Ensure that each managed server’s default persistent store is accessible via shared disk, as we just described in our discussion of manual JTA service migration.
2. On each server’s Migration Configuration tab, enable the Automatic JTA Migration Enabled checkbox.
3. Restart the managed servers to pick up this change.
4. Even with automatic migration configured, you can manually migrate the service, if desired.

Before we move on to discuss whole server migration, we need to discuss migrating custom singleton services that your application may contain.

Automatic migration of a custom singleton service relies on the same sort of configuration needed for automatic JMS and JTA service migration. You must perform the following steps to enable automatic migration of your singleton service:

1. Create your machines and assign the managed servers to the appropriate machines.
2. Use the cluster’s Migration Configuration tab to set the Migration Basis to either Database or Consensus.

If you choose Database:
a. Create the leasing table, as described previously.
b. Create a nontransactional JDBC data source for the servers to use to access the leasing table.
c. On the cluster’s Migration Configuration tab, set the Data Source for Auto Migration attribute to point to your nontransactional data source and verify that the Auto Migration Table Name attribute is set to the name of your leasing table.

If you choose Consensus:
a. Make sure that you configure the node manager on each machine that hosts the cluster’s managed servers.

3. Restart the admin server and any managed servers to pick up the new migration policy settings.
4. Even with automatic migration configured, you can manually migrate a singleton service using the singleton service’s Control tab (or WLST), if desired.

Whole Server Migration

Although service migration provides a great framework for ensuring availability of critical services during failure conditions, it does not change the fact that one or more servers in your cluster have failed and are not available to process incoming requests. If you haven’t oversized your cluster to handle such failures gracefully, your applications could experience service level degradation until the failed managed servers are restarted. For extended periods of server downtime (for example, hardware failure), it is often desirable to restart managed servers on another machine to limit your exposure to service level degradation.

WebLogic Server provides a whole server migration (WSM) framework that supports restarting managed servers on different machines. Before we dive into the details of configuring WSM, let’s talk about some of the requirements for using WSM.

– WSM uses a floating IP address, also known as a virtual IP address, for each migratable server. This means that the migratable server candidate machines have to be in the same subnet (because the virtual IP address must be valid on all candidate machines).

– WSM requires the use of the node manager. You must make sure the node manager on each candidate machine is properly initialized with the security-related files it needs to authenticate and accept commands from the admin server.

– WSM uses the node manager to migrate the virtual IP address and assign it to the target machine. As such, the default configuration assumes that the machines are similar; specifically, it assumes the following:

* The netmask associated with the virtual IP is the same on all machines.
* The network device name (for example, eth0 on Linux) is the same on all machines.
* The functional behavior of the platform-specific OS command used to add and remove the virtual IP (for example, ifconfig on Linux) is the same.
– WSM only supports migration of a single virtual IP address for each migratable server. Therefore, a migratable server cannot define any network channels that use a Listen Address different from the virtual IP address associated with the server. If you need your servers to use multiple network channels associated with multiple IP addresses, you cannot use the WSM framework.

– WSM assumes that any server-specific state is already shared through some highly available sharing mechanism. For example, the server’s default persistent store where it keeps its XA transaction logs must be accessible on all candidate machines using the exact same path.
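Some of the network requirements above can be checked before you start configuring anything. The following sketch uses Python's standard `ipaddress` module to confirm that the candidate machines share a netmask and device name and that the virtual IP is valid on each machine's subnet. The `CandidateMachine` class, the `wsm_precheck` function, and all addresses are hypothetical; this is a planning aid, not part of WebLogic Server.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateMachine:
    name: str
    interface: str  # network device name, e.g. "eth0"
    host_ip: str    # the machine's own address on that device
    netmask: str    # e.g. "255.255.255.0"


def wsm_precheck(virtual_ip: str, machines: List[CandidateMachine]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len({m.netmask for m in machines}) > 1:
        problems.append("netmask differs between candidate machines")
    if len({m.interface for m in machines}) > 1:
        problems.append("network device name differs between candidate machines")
    vip = ipaddress.ip_address(virtual_ip)
    for m in machines:
        # strict=False lets us pass a host address instead of a network address.
        subnet = ipaddress.ip_network(f"{m.host_ip}/{m.netmask}", strict=False)
        if vip not in subnet:
            problems.append(f"{virtual_ip} is not valid on {m.name} ({subnet})")
    return problems


machines = [
    CandidateMachine("machine1", "eth0", "10.0.0.11", "255.255.255.0"),
    CandidateMachine("machine2", "eth0", "10.0.0.12", "255.255.255.0"),
]
print(wsm_precheck("10.0.0.50", machines))
```

The sketch does not cover the remaining requirements (node manager setup, a single virtual IP per server, shared server state), which still need to be verified by hand.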

Now that you understand the requirements, let’s discuss the steps required to set up automatic whole server migration.

1. Create your domain. Make sure that you set each managed server’s Listen Address to its virtual IP address and assign the server to a machine.

2. Set up the node manager for each candidate machine. For each machine, edit the nodemanager.properties file to set the NetMask property to the netmask associated with the virtual IP addresses being used and the Interface property to the name of the network device with which to associate the virtual IP address. Typically, the nodemanager.properties file is created in the $NODEMGR_HOME directory (by default, $WL_HOME/common/nodemanager) the first time the node manager is started.

3. Verify the domain and node manager configuration. Before we proceed, start up the domain and each clustered managed server via its node manager. This not only ensures that the node managers and servers are properly configured but also initializes the node managers with the password files they need to accept commands from the admin server. Don’t forget to start managed servers on all candidate machines to ensure that the node manager and domain directory are properly initialized.

4. Choose and configure your leasing mechanism. Like automatic service migration, automatic whole server migration relies on leasing. Use the cluster’s Migration Configuration tab to select the appropriate Migration Basis.

5. If you choose database leasing, be sure to create and configure your nontransactional data source, create the leasing table, and use the cluster’s Migration Configuration tab to set the Data Source for Automatic Migration and Auto Migration Table Name appropriately.

6. Grant superuser privileges to the wlsifconfig script. Node managers use the $WL_HOME/common/bin/wlsifconfig.sh (on Windows, wlsifconfig.cmd) script to add and remove virtual IP addresses from the machines. By default, the script is set up to use sudo; sudo typically prompts you for your password the first time you run it and periodically after that.

To let the node manager run the script without a password prompt, add the NOPASSWD option to the relevant entry in your /etc/sudoers file, as in the following example entry. Don’t forget to add the directory containing the wlsifconfig script to your PATH so that the node managers can locate it.

weblogic machine1 = NOPASSWD: /oracle/middleware/wlserver_10.3/common/bin/wlsifconfig.sh

7. Enable automatic server migration. The last step is to use each managed server’s Migration Configuration tab to select the Automatic Server Migration Enabled checkbox and restart the servers.
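The per-machine settings from steps 2 and 6 are easy to mistype when you repeat them across several candidate machines, so it can help to generate them. The sketch below renders the two nodemanager.properties entries and the sudoers entry described above. The function names are hypothetical, and the netmask, device name, user, host, and path are example values to replace with your own.

```python
def nodemanager_entries(netmask: str, interface: str) -> str:
    """Render the two nodemanager.properties entries used for the virtual IP."""
    return f"NetMask={netmask}\nInterface={interface}\n"


def sudoers_entry(os_user: str, host: str, script_path: str) -> str:
    """Render a sudoers entry that lets os_user run the script without a password."""
    return f"{os_user} {host} = NOPASSWD: {script_path}"


print(nodemanager_entries("255.255.255.0", "eth0"))
print(sudoers_entry(
    "weblogic", "machine1",
    "/oracle/middleware/wlserver_10.3/common/bin/wlsifconfig.sh"))
```

The output is only text to review and paste; edit /etc/sudoers through visudo so that a syntax error cannot lock you out of sudo.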

At this point, automatic whole server migration configuration is complete and needs to be tested. Debugging problems with whole server migration can be tricky, so you will probably want to add -Dweblogic.debug.DebugServerMigration=true to the Java command line used to start your servers.

 

WebLogic Admin Server Migration

The admin server is not currently clusterable. This means that if the admin server goes down, you cannot administer your WebLogic Server domain until you bring it back up. In most cases, you may not be too concerned if the admin server goes down because all you need to do is restart it. If you use the node manager to start the admin server, the node manager can automatically restart a failed admin server just like it can any other server. What happens if the machine where the admin server runs fails in such a way that you cannot restart the admin server? The answer is simple if you prepare for this unlikely event.

Proper operation of the admin server relies on several configuration files and any application files it controls. Typically, the best thing to do is to store the admin server’s directory tree on a shared disk. As long as the configuration and application files are accessible, you can restart the admin server on another machine. It is up to you to make sure that you don’t have more than one admin server running at a time. If the new machine can assume the original admin server’s Listen Address (or if it was not set), you can simply start the admin server on the new machine without any configuration changes.

Otherwise, you will need to change the admin server’s Listen Address. Since the managed servers ping the admin server URL every 10 seconds until it comes back up, you need to devise a way for the admin server URL to allow the managed server to find the restarted admin server on the new IP address. The easiest way to achieve that is using a DNS name that maps to both IP addresses, or better yet that is dynamically updated to point to the correct location of the admin server. If this is a graceful shutdown and migration, use the WebLogic Console to change the Listen Address just before shutting down the admin server. If not, you will need to edit the config.xml file by hand to replace the old Listen Address with the new one. Typically, we recommend planning ahead so that everything you need is already in place to make admin server failover as painless as possible.
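To see why a DNS name that maps to both IP addresses helps, consider a simplified model of the reconnect loop. This is not how the managed servers are implemented; it is a hypothetical sketch in which `addresses` stands for the IP addresses the admin server's DNS name resolves to, `is_listening` is a caller-supplied probe, and the 10-second interval mirrors the ping interval mentioned above.

```python
import time
from typing import Callable, Iterable, Optional


def find_admin_server(addresses: Iterable[str],
                      is_listening: Callable[[str], bool]) -> Optional[str]:
    """Return the first address where the admin server answers, if any."""
    for address in addresses:
        if is_listening(address):
            return address
    return None


def wait_for_admin_server(addresses: Iterable[str],
                          is_listening: Callable[[str], bool],
                          max_attempts: int,
                          sleep: Callable[[float], None] = time.sleep,
                          interval_seconds: float = 10.0) -> Optional[str]:
    """Retry every interval_seconds until an address answers or attempts run out."""
    addresses = list(addresses)
    for _ in range(max_attempts):
        found = find_admin_server(addresses, is_listening)
        if found is not None:
            return found
        sleep(interval_seconds)
    return None
```

Because every retry walks all of the addresses behind the name, the loop finds the admin server on whichever machine it was restarted, without any change to the URL the managed servers were started with.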

 
