Forum Discussion

manc_63343
Nimbostratus
Jan 25, 2011

GTM failure causing customer impact

We are being a bit paranoid and asking ourselves: if the GTMs had to stay down for several hours, what would our fallback be? We already have redundant GTMs, but we are still exploring what to do in case something happens to both of them.

The obvious option is to push out a DNS change that points the name directly at the LTM IP, but that only happens after the fact. Is there any other option? For example, can we set up a DNS server with static IPs and define a fallback order, something like:

GTM A, GTM B, DNS C

So if there is no response from GTM A or GTM B, then C is used, and C can have IPs pointing to the LTM. Is that possible?

Any other option?

10 Replies

  • If you're worried about losing both, why not buy a third? :-P

    With mail records, this is nice and easy since you get to configure priorities.

    With standard DNS, I'm pretty sure you'll need to make A-records pointing directly to the LTMs. Yes, this is a manual "fail-over" but with short TTLs, you can minimize the outage.
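
To put rough numbers on the manual fail-over described above: the longest a resolver can keep following the old record is approximately the time to notice the failure, plus the time to publish the replacement A record, plus the TTL of the record being replaced. The sketch below is only an illustration of that arithmetic; the function name and every figure in it are assumptions, not measurements from this thread.

```python
def worst_case_outage_seconds(detect_s: float, publish_s: float, ttl_s: float) -> float:
    """Rough upper bound on how long a resolver may keep using the old record.

    detect_s  -- time until someone notices the GTMs are down (assumed)
    publish_s -- time to push the replacement A record (assumed)
    ttl_s     -- TTL on the record that is being replaced
    """
    if min(detect_s, publish_s, ttl_s) < 0:
        raise ValueError("durations must be non-negative")
    return detect_s + publish_s + ttl_s


# Hypothetical figures: 5 minutes to detect, 10 minutes to publish the change.
for ttl in (30, 300, 3600):
    total = worst_case_outage_seconds(detect_s=300, publish_s=600, ttl_s=ttl)
    print(f"TTL {ttl:>4}s -> worst case {total / 60:.1f} min")
```

With figures like these, the TTL only dominates once it is large compared with the human steps, so shortening it helps most when detection and the DNS change are already fast.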
  • I am wondering if there is a way to do it automatically. Can't we have another DNS server that points the CNAMEs directly to the VIP on the LTM, and have the local DNS server query this standby DNS server when it can't reach the GTMs? Can the primary/secondary concept be utilized here?
  • Consider short TTLs carefully if you have a global audience and CNAMEs. The double DNS lookup every 30s can cause some serious user experience issues.
  • I wasn't suggesting double CNAME lookups. I was merely thinking of a secondary DNS server that becomes active when the GTMs fail. Is that possible?
  • Posted By Chris Miller on 01/25/2011 05:47 PM

    "If you're worried about losing both, why not buy a third? :-P

    With mail records, this is nice and easy since you get to configure priorities.

    With standard DNS, I'm pretty sure you'll need to make A-records pointing directly to the LTMs. Yes, this is a manual 'fail-over' but with short TTLs, you can minimize the outage."

    Sorry, manc. I was responding more to Chris's post here about minimizing outages. It's always a give and take. Yes, an outage can be recovered from in 30 seconds, but is that worth having a worse user experience 99.99% of the time? Perhaps; it depends on the business requirements.

    To answer your question, DNS isn't like an active/standby scenario with LTM. If you have 2, 3, or 4 (it shouldn't be more than 4) GTMs authoritative for that record, the LDNSs of the world will reach out to any of them, and will try the others if one or more of them fail.
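
The point that resolvers do not treat the authoritative servers as an ordered fail-over list can be illustrated with a small simulation. This is a deliberately simplified model: it assumes a resolver picks uniformly at random among the listed servers and retries another one if the first is down, whereas real resolvers typically favour servers with lower measured round-trip times. The server names `gtm-a`, `gtm-b`, and `dns-c` are hypothetical, borrowed from the "GTM A, GTM B, DNS C" idea in the question.

```python
import random
from collections import Counter


def simulate_queries(servers, healthy, n_queries, seed=0):
    """Count which authoritative server answers each of n_queries.

    Simplified model (an assumption, not real resolver behaviour): each
    query tries the servers in a random order and stops at the first one
    that is healthy.
    """
    rng = random.Random(seed)
    answered = Counter()
    for _ in range(n_queries):
        candidates = list(servers)
        rng.shuffle(candidates)
        for name in candidates:
            if name in healthy:
                answered[name] += 1
                break
        else:
            # No server in the set was healthy for this query.
            answered["<no answer>"] += 1
    return answered


servers = ["gtm-a", "gtm-b", "dns-c"]

# Everything healthy: the static server still answers a share of the queries.
print(simulate_queries(servers, healthy={"gtm-a", "gtm-b", "dns-c"}, n_queries=9000))

# Both GTMs down: only now does dns-c answer everything.
print(simulate_queries(servers, healthy={"dns-c"}, n_queries=9000))
```

Under this model a static server listed next to the GTMs serves roughly a third of lookups even while the GTMs are healthy, which is why such a setup does not behave like active/standby.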

     

  • Posted By Jason Rahm on 01/26/2011 10:56 AM

    "To answer your question, DNS isn't like an active/standby scenario with LTM. If you have 2, 3, or 4 (it shouldn't be more than 4) GTMs authoritative for that record, the LDNSs of the world will reach out to any of them, and will try the others if one or more of them fail."

    He's wondering about having the GTMs queried as long as at least one of them is available and, if none are, having an A record pointed right at the LTM VIP. It'd be like Priority Group Activation, but for name resolution. That works with MX records; I'm not sure anything can be done with A-records.
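
The contrast with mail is that every MX record carries a preference value that senders are expected to honour (lowest first), while a set of A records has no comparable field. A small sketch of the ordering a sending mail server applies, using made-up host names and preference values:

```python
from typing import NamedTuple


class MX(NamedTuple):
    preference: int  # lower value is tried first
    exchange: str


def delivery_order(records):
    """Return mail exchangers in the order a sender should try them.

    Ties keep their input order here; real senders may randomize among
    exchangers that share the same preference.
    """
    return [mx.exchange for mx in sorted(records, key=lambda mx: mx.preference)]


# Hypothetical zone data, for illustration only.
mx_records = [
    MX(20, "backup-mx.example.com"),
    MX(10, "primary-mx.example.com"),
]
print(delivery_order(mx_records))
```

Nothing equivalent exists for plain A records, so a resolver has no way to know that one address should be used only when the others are unavailable.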
  • I still think there has got to be some way to handle a complete GTM failure. We have had several instances of GTM failure where the same problem affected all GTMs, so we need a good solution. Manual reconfiguration is what we have in case of failure, but I think we can do better.
  • Posted By manc on 01/26/2011 12:57 PM

    "I still think there has got to be some way to handle a complete GTM failure. We have had several instances of GTM failure where the same problem affected all GTMs, so we need a good solution. Manual reconfiguration is what we have in case of failure, but I think we can do better."

    What was the GTM issue? Are your GTMs in an HA pair or are they stand-alone units?
  • With multiple servers, any of them can be asked at any time, depending on the LDNS doing the query; you can't rely on order. My advice while you are in a period of instability and troubleshooting would be to decrease your TTL so you can make a manual update if necessary. I wouldn't deploy a hybrid GSLB/standard DNS architecture. It's complicated enough with one solution.

    When you say only one is operational, does that mean the box is completely down, or are you having trouble with the GTM daemon? If the latter, you might be able to use an iRule to check for a response and, if it is not present or invalid, pass the query off to the local BIND instance.
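
The "check for a response, and if not present or invalid, pass off" logic can be sketched outside the BIG-IP to show what such a check involves. The following is plain Python, not an iRule, and does not use any F5 API; it builds a minimal DNS query in RFC 1035 wire format, sends it over UDP, and falls back to a second server when the reply is missing or unusable. The function names, the 2-second timeout, and the fixed query ID are all assumptions made for illustration.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def is_valid_answer(response: bytes, query_id: int = 0x1234) -> bool:
    """True if the reply matches our ID, is a response, has RCODE 0 and an answer."""
    if len(response) < 12:
        return False
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return rid == query_id and is_response and rcode == 0 and ancount > 0


def pick_server(name: str, primary: str, fallback: str, timeout: float = 2.0) -> str:
    """Return primary if it answers a test query correctly, else fallback."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (primary, 53))
            data, _addr = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return fallback
    return primary if is_valid_answer(data) else fallback
```

A call such as `pick_server("www.example.com", "192.0.2.10", "192.0.2.53")` (documentation-range addresses, purely hypothetical) would return the second address whenever the first does not produce a usable answer within the timeout.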