Forum Discussion

Piotr_Lewandows
Jun 06, 2017

DNS (GTM) best practice for DR

Hi,

 

I need to set up DR based on the DNS module. After reading a few posts and docs, all I know is that there are plenty of approaches that could be implemented.

 

I have little experience with the DNS module, so I would appreciate any advice on what the optimal solution would be.

 

Scenario:

 

  • Two data centers: DC1 (main) and DR1 (used only when resources in DC1 are not available)
  • Each DC uses non-overlapping subnet ranges
  • The DCs are connected via an internal private L2/L3 link
  • All DNS queries will come only from devices inside the DCs
  • A single BIG-IP DNS device in each data center
  • In each DC, one host (let's call it Main) requiring DNS resolution for the resources it has to access
  • In each DC, eight hosts (let's call them Slaves), each with its own IP and FQDN - these are not LTMs but standard servers, defined as Generic Host type and monitored via an HTTP monitor (see the sketch after the DR rules below)
  • The DNS device should resolve the Slaves' FQDNs for the Main host

DR rules:

 

  • If any Slave in DC1 is down, the DNS request should be resolved to the IP of any working Slave
  • If all Slaves in DC1 are down, the DNS request should be resolved to the IP of any Slave in DR1
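
To illustrate the setup, each Slave would be defined on the DNS device as a Generic Host server with an HTTP monitor, roughly like below. This is an untested sketch; the object names and addresses are just examples, and the exact attributes should be checked against your TMOS version (newer versions may expect a "devices" block instead of "addresses"):

    # datacenter objects for the two sites
    create gtm datacenter DC1
    create gtm datacenter DR1

    # one Slave in DC1, defined as a Generic Host and monitored over HTTP
    # (example names/addresses only)
    create gtm server slave1-dc1 {
        datacenter DC1
        product generic-host
        addresses add { 192.168.1.1 { } }
        virtual-servers add {
            slave1-http {
                destination 192.168.1.1:80
                monitor http
            }
        }
    }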

What would be the best approach?

 

As far as I understand, the Global Availability method should be used, but at what level:

 

  • Pool
  • Wide IP

Is it better to create one pool with members from both DCs, or a separate pool per DC, each containing the members from the respective DC?

 

And how do I handle the requirement to return the IP of an active Slave inside one DC?

 

I guess I need to create as many Wide IPs as there are Slaves (8) - slave1.vip.site.com, slave2.vip.site.com, ..., slave8.vip.site.com - or rather one wildcard Wide IP: *.vip.site.com.

 

Then how do I return the IP of another active Slave when the Slave for which the DNS request was made is down (HA inside the DC)?

 

Piotr

 

3 Replies

  • After some testing, my idea for the scenario where each Slave has a separate FQDN and IP is:

     

    • Create as many Wide IPs as there are Slave FQDNs
    • Wide IP Load Balancing Method: Global Availability
    • Create a pair of pools for each Wide IP:
      • PoolDC1-1 to 8 for the Slaves in DC1
      • PoolDR1-1 to 8 for the Slaves in DR1
      • In each pool of the pair, the first member is the one that should be returned when a DNS query for the given Wide IP is made; only if it is down is the next member's IP returned, and so on until no active member is left (whole pool down)
    • LB method at Pool level set to:
      • Preferred: Global Availability
      • Alternate: Round Robin (could probably be None, or maybe it should be?)
      • Fallback: None (this is a must)
    • Assign the appropriate pool pair to each Wide IP, with this Member Order:
      • PoolDC1-1 to 8
      • PoolDR1-1 to 8

    The result: if, for a given Wide IP, the member that is the default for that FQDN is down (e.g. slave1.vip.site.com should return 192.168.1.1, slave2.vip.site.com 192.168.1.2, etc.), then the next IP is returned (for example 192.168.1.3). If all members of pools PoolDC1-1 to 8 are down, then members from pools PoolDR1-1 to 8 are returned.
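
    In tmsh terms, one such Wide IP would look roughly like the sketch below (untested, for slave1 only; the server and virtual server names are just examples):

        # pools for slave1: Global Availability inside each DC, ordered by member-order
        create gtm pool a PoolDC1-1 {
            load-balancing-mode global-availability
            alternate-mode round-robin
            fallback-mode none
            members add {
                slave1-dc1:slave1-http { member-order 0 }
                slave2-dc1:slave2-http { member-order 1 }
            }
        }
        create gtm pool a PoolDR1-1 {
            load-balancing-mode global-availability
            alternate-mode round-robin
            fallback-mode none
            members add {
                slave1-dr1:slave1-http { member-order 0 }
                slave2-dr1:slave2-http { member-order 1 }
            }
        }

        # Wide IP: Global Availability across the pools, DC1 pool first,
        # the DR1 pool is used only when PoolDC1-1 has no available members
        create gtm wideip a slave1.vip.site.com {
            pool-lb-mode global-availability
            pools add {
                PoolDC1-1 { order 0 }
                PoolDR1-1 { order 1 }
            }
        }

    The same pattern would be repeated for slave2 to slave8, with the member order rotated so that each Wide IP prefers its own Slave first.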

     

    Does that make sense? Is there a better/easier/more robust config that could be used?

     

    The last question (for the scenario with one FQDN for all Slaves, where BIG-IP DNS makes the LB decision): how should this be configured when there is only one host performing queries (Main) and it needs persistence?

     

    I know that persistence can be set at the Wide IP level, but how does it work - similar to source address affinity? So if the request comes from the same IP, the same answer will be returned for the period set in Persistence TTL?

     

    Is this Persistence TTL refreshed after each request from the same IP?

     

    And if there is no request within that period, will the next request from the same IP be load balanced according to the configured method?

     

    Is the LB method used with persistence the one set at the Wide IP level or at the Pool level?
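
    (For reference, I believe these are the relevant per-Wide-IP settings in tmsh; the attribute names are from memory and not verified:)

        # assumed attribute names: persistence on/off, persistence TTL in seconds,
        # and the source-prefix length used to match returning clients
        modify gtm wideip a slave1.vip.site.com persistence enabled ttl-persistence 3600 persist-cidr-ipv4 32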

     

    Piotr

     

  • popica

    Any update on the recommended / best practice GTM DR implementation? Please advise. Thanks!

     

  • Hi,

     

    Sorry, but no. This project is quite old and in the end was never really implemented.

     

    Piotr