OpenSSO – HA Across Data Centers Configuration 2

This is the final installment of the discussion about deploying OpenSSO in a high availability configuration across two data centers.

In the previous installment, OpenSSO was configured as two separate sites, one for each data center.

Configuration with a Single Logical Grouping of OpenSSO Servers

The load balancer in each data center is configured to connect to both the local and remote OpenSSO servers.

Each load balancer is configured to send requests to its local OpenSSO servers by default and to route them to the remote servers only if both local OpenSSO instances are down. Session stickiness is configured in both load balancers across all OpenSSO servers in both data centers.
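The routing rule described above can be sketched roughly as follows. This is only an illustration, not load balancer configuration: the `route` function, the `LOCAL_SERVERS` table, and the assumption that Svr1 and Svr2 sit behind LB1 while Svr3 and Svr4 sit behind LB2 are hypothetical names and layout inferred from the use cases in this post.

```python
from typing import Optional, Set

# Assumed topology (hypothetical): Svr1/Svr2 are local to LB1, Svr3/Svr4 to LB2.
LOCAL_SERVERS = {"LB1": ["Svr1", "Svr2"], "LB2": ["Svr3", "Svr4"]}
ALL_SERVERS = ["Svr1", "Svr2", "Svr3", "Svr4"]


def route(lb: str, sticky_server: Optional[str], up: Set[str]) -> Optional[str]:
    """Return the server that load balancer `lb` would send a request to."""
    # 1. Honor the session stickiness cookie while that server is running,
    #    even if it lives in the other data center.
    if sticky_server in up:
        return sticky_server
    # 2. Otherwise prefer a running server in the local data center.
    for server in LOCAL_SERVERS[lb]:
        if server in up:
            return server
    # 3. Use a remote server only when both local servers are down.
    for server in ALL_SERVERS:
        if server in up:
            return server
    return None  # every OpenSSO server is down


if __name__ == "__main__":
    # Both local servers are down, so LB1 falls back to the remote site.
    print(route("LB1", "Svr1", {"Svr3", "Svr4"}))
```

A real load balancer may distribute requests among the surviving servers instead of always picking the first one; the sketch only captures the order of preference.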

The main benefit of this configuration is that it reduces the amount of back-channel server-to-server communication. Because every load balancer can connect to any OpenSSO server in either data center, traffic does not have to be routed through an OpenSSO server during fail-over, as it does in the previous configuration.


The OpenSSO configuration to implement this solution is:

Use Cases

The following use cases describe what happens during fail-over with this configuration.

These use cases assume that the user session is initially created on Svr1 via LB1 and that fail-over occurs when ClientA tries to validate the user session later on.

Case 1: LB1 is up, Svr1 is down and Svr2 is up

  1. The session validation request goes to Svr2 via LB1.
  2. Svr2 detects that the host server for this session (Svr1) is down and selects a backup OpenSSO server to serve the request. Based on the OpenSSO IRR logic, the user session could be recovered on any of the three running OpenSSO instances: Svr2, Svr3 or Svr4.
  3. If the backup server is Svr2:
    1. The user session is recovered on Svr2. The session stickiness cookie is now set to Svr2.
    2. All subsequent requests for this session will be routed to Svr2 via LB1.
  4. If the backup server is Svr3:
    1. The request is forwarded to Svr3.
    2. The user session is recovered on Svr3. The session stickiness cookie is now set to Svr3.
    3. All subsequent requests for this session will be routed to Svr3 via LB1.
  5. If the backup server is Svr4:
    1. The request is forwarded to Svr4.
    2. The user session is recovered on Svr4. The session stickiness cookie is now set to Svr4.
    3. All subsequent requests for this session will be routed to Svr4 via LB1.
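The recovery steps in this case follow one pattern: if the chosen backup is the server that received the request, the session is recovered locally; otherwise the request is forwarded first. A minimal sketch of that pattern, under stated assumptions, is shown below. The `Recovery` class and `recover` function are hypothetical and are not part of OpenSSO, and because the IRR selection logic is not described in this post, the chosen backup is simply passed in as an argument.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Recovery:
    recovered_on: str   # server that now holds the user session
    forwarded: bool     # True if the receiving server had to forward the request
    sticky_cookie: str  # new value of the session stickiness cookie


def recover(receiver: str, host: str, backup: str, up: Set[str]) -> Recovery:
    """Model one fail-over: `receiver` got the request for a session owned by `host`."""
    if host in up:
        raise ValueError("host server is up, no recovery needed")
    if backup not in up:
        raise ValueError("backup server must be a running instance")
    # The request is forwarded only when the backup is a different server.
    forwarded = backup != receiver
    # The session is recovered on the backup and the cookie follows it.
    return Recovery(recovered_on=backup, forwarded=forwarded, sticky_cookie=backup)


if __name__ == "__main__":
    running = {"Svr2", "Svr3", "Svr4"}
    for candidate in ("Svr2", "Svr3", "Svr4"):
        print(recover("Svr2", "Svr1", candidate, running))
```

Running the loop walks through the three branches of Case 1: only the Svr2 branch avoids a forward.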

Case 2: LB1 is up, both Svr1 and Svr2 are down

  1. Because LB1 is connected to all four OpenSSO servers, it can forward the request to either Svr3 or Svr4.
  2. If Svr3 receives the request:
    1. Svr3 detects that the primary server hosting this session (Svr1) is down.
    2. Svr3 selects a backup OpenSSO server to serve the request. Based on the OpenSSO IRR logic, the user session could be recovered on either Svr3 or Svr4.
    3. If the backup server is Svr3:
      1. The user session is recovered on Svr3. The session stickiness cookie is now set to Svr3.
      2. All subsequent requests for this session will be routed to Svr3 via LB1.
    4. If the backup server is Svr4:
      1. The request is forwarded to Svr4.
      2. The user session is recovered on Svr4. The session stickiness cookie is now set to Svr4.
      3. All subsequent requests for this session will be routed to Svr4 via LB1.
  3. If Svr4 receives the request:
    1. Svr4 detects that the primary server hosting this session (Svr1) is down.
    2. Svr4 selects a backup OpenSSO server to serve the request. Based on the OpenSSO IRR logic, the user session could be recovered on either Svr3 or Svr4.
    3. If the backup server is Svr3:
      1. The request is forwarded to Svr3.
      2. The user session is recovered on Svr3. The session stickiness cookie is now set to Svr3.
      3. All subsequent requests for this session will be routed to Svr3 via LB1.
    4. If the backup server is Svr4:
      1. The user session is recovered on Svr4. The session stickiness cookie is now set to Svr4.
      2. All subsequent requests for this session will be routed to Svr4 via LB1.

Case 3: LB1 is down and Svr1 is down

  1. The Site Monitor in the Client SDK detects that LB1 is unreachable, so the session validation request is sent to LB2.
  2. LB2 checks the session stickiness routing cookie and tries to send the request to Svr1. Because Svr1 is down, either Svr3 or Svr4 receives the request instead. The remaining steps assume it is Svr3.
  3. Svr3 detects that the primary server hosting this session (Svr1) is down.
  4. Svr3 selects a backup OpenSSO server to serve the request. Based on the OpenSSO IRR logic, the user session could be recovered on any of the three running OpenSSO instances: Svr2, Svr3 or Svr4.
  5. If the backup server is Svr2:
    1. The request is forwarded to Svr2.
    2. The user session is recovered on Svr2. The session stickiness cookie is now set to Svr2.
    3. All subsequent requests for this session will be routed to Svr2 via LB2.
  6. If the backup server is Svr3:
    1. The user session is recovered on Svr3. The session stickiness cookie is now set to Svr3.
    2. All subsequent requests for this session will be routed to Svr3 via LB2.
  7. If the backup server is Svr4:
    1. The request is forwarded to Svr4.
    2. The user session is recovered on Svr4. The session stickiness cookie is now set to Svr4.
    3. All subsequent requests for this session will be routed to Svr4 via LB2.

Case 4: LB1 is down and Svr1 is up

  1. The Site Monitor in the Client SDK detects that LB1 is unreachable, so the session validation request is sent to LB2.
  2. LB2 checks the session stickiness routing cookie and sends the request to Svr1.
  3. All subsequent requests for this session will be routed to Svr1 via LB2.
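The client-side switch from LB1 to LB2 in Cases 3 and 4 amounts to trying the configured load balancers in order and using the first one that responds. The sketch below illustrates that idea only; it is not the Client SDK's Site Monitor implementation, and `pick_site` and the `is_reachable` callback are hypothetical names introduced here.

```python
from typing import Callable, Optional, Sequence


def pick_site(
    load_balancers: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first load balancer, in configured order, that is reachable."""
    for lb in load_balancers:
        if is_reachable(lb):
            return lb
    return None  # no site can be reached


if __name__ == "__main__":
    # Simulate LB1 being down: the client falls back to LB2.
    down = {"LB1"}
    print(pick_site(["LB1", "LB2"], lambda lb: lb not in down))
```

In a real deployment the reachability check would be a periodic health probe rather than a per-request callback; it is injected here so the selection logic can be exercised without a network.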

Finally

Out of the box, OpenSSO provides service fail-over to address infrastructure redundancy within data centers, and session fail-over to achieve state redundancy within each deployment site. In most cases these HA features are sufficient. However, some deployments also require cross-site session fail-over. Although this feature was not added to OpenSSO before Oracle's purchase of Sun Microsystems, the product can be configured to meet the requirement with a special OpenSSO Site configuration, as described in these posts.

The potential performance impact of both solutions depends on the network bandwidth available in the WAN environment, because a large amount of cross-site communication is expected between the data centers.
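To get a rough feel for where cross-site traffic arises, the hops a request takes in each use case can be counted against the data center each component lives in. The sketch below is an illustration under assumptions: the `DATA_CENTER` mapping (LB1, Svr1 and Svr2 in one data center; LB2, Svr3 and Svr4 in the other) and the `wan_hops` helper are hypothetical, and the count ignores where the client itself is located.

```python
from typing import Sequence

# Assumed placement of each component (hypothetical labels DC1 and DC2).
DATA_CENTER = {
    "LB1": "DC1", "Svr1": "DC1", "Svr2": "DC1",
    "LB2": "DC2", "Svr3": "DC2", "Svr4": "DC2",
}


def wan_hops(path: Sequence[str]) -> int:
    """Count the hops in `path` that cross from one data center to the other."""
    return sum(
        1
        for src, dst in zip(path, path[1:])
        if DATA_CENTER[src] != DATA_CENTER[dst]
    )


if __name__ == "__main__":
    # Case 1, backup is Svr3: LB1 -> Svr2 stays local, Svr2 -> Svr3 crosses the WAN.
    print(wan_hops(["LB1", "Svr2", "Svr3"]))
    # Case 4: LB2 -> Svr1 crosses the WAN on every subsequent request.
    print(wan_hops(["LB2", "Svr1"]))
```

Paths such as Case 4, where the sticky server is in the other data center, cross the WAN on every request for the life of the session, which is why the available bandwidth matters.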
