Users unable to log in

Since the cluster move of our forum, a lot of members are still getting "Maintenance" notices. Although I am able to log in and comment on my main computer, I can't on my tablet, and some pages are not working:

http://learn-english-forum.org/discussion/109/howe-does-it-work

I've told everyone to refresh their cache, but it's still happening.

Is there anything they should be doing?



  • I have met with the forum moderators. Everyone is experiencing this.

    The visitor and comment numbers from yesterday show that this has affected our forum for way longer than the hour promised, and it's still happening.

    I have opened a ticket too. We have to get this sorted out.

  • Tim Gunter Administrator, Operations, Staff
    Hi Lynne,

    The move required a DNS change. DNS is the global system that allows names like "learn-english-forum.org" to resolve to an IP address usable by computers.

    In order to support such a massive system as the internet, DNS is heavily cached, which means that changes to it are not "real time". The way this works is that each rule added to the DNS system has a "Time To Live" attached to it, which you can think of as a limit on the "maximum staleness" of a rule. Basically, the TTL advises other DNS servers how long they are allowed to cache a rule before they must check back at the source to see if it has changed.
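
    To make the caching behaviour concrete, here is a minimal sketch of a TTL cache in Python. It is a toy model, not how any real DNS server is implemented: the `TtlCache` class, the `lookup_at_source` callback and the injectable `clock` are all hypothetical names made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy model of a caching DNS resolver (illustrative only)."""

    def __init__(self,
                 lookup_at_source: Callable[[str], Tuple[str, int]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        # lookup_at_source(name) -> (address, ttl_seconds) is a hypothetical
        # stand-in for asking the authoritative name server.
        self._lookup = lookup_at_source
        self._clock = clock
        # name -> (address, time at which the cached entry expires)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            # Still within its TTL: answer from cache, even if the
            # source has changed in the meantime.
            return cached[0]
        # Unknown or expired: check back at the source and cache the result.
        address, ttl = self._lookup(name)
        self._entries[name] = (address, now + ttl)
        return address
```

    With a TTL of 86400, a cache like this keeps handing out the old address for up to a day after the source changes. With 3600, it would notice within the hour.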

    At Vanilla, we have tried to standardize on 1 hour (3600 second) TTLs for our records, because we feel this gives us a good balance between improved performance from caching, and the ability to make changes relatively quickly. Unfortunately, it looks like the administrator of your DNS server has chosen 1 DAY for your TTL, as evidenced by the following output:

    dig learn-english-forum.org
    ; <<>> DiG 9.8.3-P1 <<>> learn-english-forum.org
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40322
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;learn-english-forum.org.   IN  A
    learn-english-forum.org. 86386  IN  A

    On the last line, you can see the number 86386. This is almost 86400, which is the number of seconds in a day, and shows that your domain's TTL is 1 day. Unfortunately this means that users of your service can see up to 1 day of old DNS values (pointing to your old IP).
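
    The arithmetic behind that reading can be sketched as follows. This is only an illustration: `parse_answer_line` and `seconds_in_cache` are hypothetical helpers, and the calculation assumes the configured TTL is exactly one day (86400 seconds). The address column is treated as optional because it is not shown in the output above.

```python
from typing import NamedTuple, Optional

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


class AnswerLine(NamedTuple):
    name: str
    ttl: int
    record_class: str
    record_type: str
    value: Optional[str]  # None when the address column is missing


def parse_answer_line(line: str) -> AnswerLine:
    """Split one answer line from dig output into its whitespace-separated fields."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"not an answer line: {line!r}")
    value = fields[4] if len(fields) > 4 else None
    return AnswerLine(fields[0], int(fields[1]), fields[2], fields[3], value)


def seconds_in_cache(remaining_ttl: int, configured_ttl: int) -> int:
    """How long ago the answering resolver cached the record."""
    return configured_ttl - remaining_ttl


answer = parse_answer_line("learn-english-forum.org. 86386  IN  A")
print(answer.ttl)                                     # 86386
print(seconds_in_cache(answer.ttl, SECONDS_PER_DAY))  # 14
```

    In other words, under that assumption the resolver that answered had cached the record 14 seconds earlier and may keep serving it for another 86386 seconds.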

    That said, there are some things they can try to get their computer to receive the updated rule:
    * Restart their browser
    * Restart their computer
    * Restart their modem

    Unfortunately, beyond that, our hands are a little bit tied. I would advise requesting a reduction in your domain's TTL, down to 1 hour, as a precautionary measure for the future.

    Has this made sense?

  • I think so.

    The crazy thing was that I was happily posting away all day yesterday, and it was only today, during an online session in which everyone started complaining about how long the maintenance was taking, that I realised we had a problem. I have now managed to flush my local DNS cache, but the forum has travelled back in time to the 19th, and now I'm worried that that is still the old forum, and some people are still stuck.

    I guess it will all settle down eventually.

  • A lot of users have followed all the advice, and it's been over 24 hours, but they're still getting the "Forum down" message.

  • Tim Gunter Administrator, Operations, Staff
    I understand your frustration and concern, but at the moment there is not much we can do. I have ensured that the DNS rules in our system are correct:

    $ dig learnenglish.vanillaforums.com
    learnenglish.vanillaforums.com. 3599 IN CNAME   vip.cl300.vanilladev.com.
    vip.cl300.vanilladev.com. 299   IN  A

    learnenglish.vanillaforums.com is correctly pointing at your new cluster (cl300), and its TTL is 1 hour (this has been the case since the 19th).

    I have also checked that your domain is configured correctly:

    $ dig @NS-DE.1AND1-DNS.DE learn-english-forum.org
    ;learn-english-forum.org.   IN  A
    learn-english-forum.org. 86400  IN  A

    learn-english-forum.org is correctly pointing at your new IP: but your TTL is still set to a very high value. I had recommended that you reduce it to improve responsiveness. A value of 3600 (instead of 86400) is my suggestion.

    That said, your DNS seems to have propagated throughout most of the internet at this time:

    As you can see, most reference servers are returning the new IP, with the exception of Istanbul, Turkey; Karachi, Pakistan; Nanjing, China; and Melbourne, Australia. These, too, will update themselves over time.

    It is now up to the Internet Service Providers and your end users' modems and computers to detect the change and direct users to the new site. I apologize that there is nothing else I can do at this time. All I can tell you is that your forum is up and running on its new IP and is accessible on the named address that we provide you upon signup (learnenglish.vanillaforums.com).
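
    For anyone who wants to check what their own computer currently sees, one rough approach is to compare the addresses that both names resolve to locally. The sketch below is only that, a sketch: `resolve_ipv4` and `sees_new_site` are hypothetical helper names, and it assumes the custom domain's A record is meant to match the address behind learnenglish.vanillaforums.com, which may not hold for every setup.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def sees_new_site(custom_domain: str,
                  vanilla_name: str,
                  resolve: Callable[[str], Set[str]] = resolve_ipv4) -> bool:
    """True if both names share at least one address as seen from this machine.

    Assumption: the custom domain should point at the same address as the
    Vanilla-provided name. A mismatch suggests a stale cached record.
    """
    return bool(resolve(custom_domain) & resolve(vanilla_name))
```

    Calling `sees_new_site("learn-english-forum.org", "learnenglish.vanillaforums.com")` on an affected machine performs two real lookups. A result of `False` would suggest that machine, or a resolver upstream of it, is still holding the old record. A failed lookup raises `OSError`.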

