Nick Andrew News

2024-06-30 - I have a new GPG key

My old key, generated in the year 2000, is not up to today's security standards. So I've created a new key, and here are the details. You can verify the following text, because it is signed with both the old and new key.

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA256

    This is a public notice that I am migrating to a new GPG key to upgrade the digest algoritm away from the weak 1024 bit DSA.

    My old key is:

    pub dsa1024 2000-11-02 [SCA] [expires: 2024-09-14]
        B3ED68948E491770C24A67E362666EB9418487E7
    uid [ unknown] Nick Andrew

2016-05-10 - Installing a StartSSL identity certificate in Google Chrome

StartSSL provides free SSL/TLS certificates for web servers, and it also installs a Class 1 Client certificate into your browser to authenticate yourself to its own site. This article is all about how to install the Class 1 Client certificate into Google Chrome, especially when Chrome does not accept the certificate automatically, as happened to me when an old certificate expired. I'll be signing up under the dummy address website@nick-andrew.net to demonstrate the process.

Step 1, Sign Up
At this point, Chrome is supposed to import your client certificate and you can be on your way. Certainly if I were using Firefox, the certificate would be installed by now and a box would appear with a "Login Now" button. But that doesn't happen for me on Chrome version 50, so I'm going to set it up the hard way.

Step 2, Generate a private key

At a shell prompt, do this:

    $ openssl req -newkey rsa:2048 -keyout website.key -out website.csr
    Generating a 2048 bit RSA private key
    ..............+++
    ...............................................................+++
    writing new private key to 'website.key'
    Enter PEM pass phrase:
    Verifying - Enter PEM pass phrase:
    -----
    You are about to be asked to enter information that will be incorporated
    into your certificate request.
    What you are about to enter is what is called a Distinguished Name or a DN.
    There are quite a few fields but you can leave some blank
    For some fields there will be a default value,
    If you enter '.', the field will be left blank.
    -----
    Country Name (2 letter code) [AU]:
    State or Province Name (full name) [Some-State]:
    Locality Name (eg, city) []:
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:
    Organizational Unit Name (eg, section) []:
    Common Name (e.g. server FQDN or YOUR name) []:website@nick-andrew.net
    Email Address []:

    Please enter the following 'extra' attributes
    to be sent with your certificate request
    A challenge password []:
    An optional company name []:
    $ cat website.csr | xclip -in

Make sure you enter website@nick-andrew.net in the Common Name field.
    $ unzip website%40nick-andrew.net.zip
    Archive: website%40nick-andrew.net.zip
    inflating: 1_Intermediate.crt
    inflating: 2_website@nick-andrew.net.crt

To import a certificate, Chrome needs a file in PKCS12 format. The magic command to make that is:

    $ openssl pkcs12 -export -inkey website.key -in 2_website@nick-andrew.net.crt -out certificate.pkcs12 -name "StartSSL cert for website@nick-andrew.net" -certfile 1_Intermediate.crt
    Enter pass phrase for website.key:
    Enter Export Password:
    Verifying - Enter Export Password:

Finally, you can import the certificate.pkcs12 file into Google Chrome. Go to chrome://settings and click on "Show advanced settings". Then click on "Manage certificates..." In the "Your Certificates" tab, select "Import..." Select "certificate.pkcs12" and enter the Export password you used.

At this point, Google Chrome has successfully imported the certificate, and it can be used to authenticate to StartSSL the next time you visit the site.

2014-10-06 - Testing tsuru PaaS system

Installing my app:

    vagrant@tsuru:~/GIT/tsuru-demo$ git commit -m 'Add basic app'
    [master 57ffb34] Add basic app
    3 files changed, 29 insertions(+)
    create mode 100644 .buildpacks
    create mode 100644 Procfile
    create mode 100755 app.psgi
    vagrant@tsuru:~/GIT/tsuru-demo$ git remote -v
    vagrant@tsuru:~/GIT/tsuru-demo$ git remote add tsuru git@192.168.50.4.nip.io:helloworld.git
    vagrant@tsuru:~/GIT/tsuru-demo$ git push tsuru master
    Counting objects: 8, done.
    Compressing objects: 100% (6/6), done.
    Writing objects: 100% (8/8), 1.00 KiB | 0 bytes/s, done.
    Total 8 (delta 0), reused 0 (delta 0)
    remote: tar: Removing leading `/' from member names
    remote: BUILDPACK:
    remote: Multipack app detected
    remote: =====> Downloading Buildpack: https://github.com/miyagawa/heroku-buildpack-perl.git
    remote: =====> Detected Framework: Perl/PSGI
    remote: -----> Bootstrapping cpanm
    remote: Successfully installed App-cpanminus-1.7012
    remote: 1 distribution installed
    remote: -----> Installing dependencies
    remote: ! Configuring . failed. See /home/ubuntu/.cpanm/work/1412583687.168/build.log for details.
    remote: -----> Installing Starman
    remote: Successfully installed ExtUtils-Config-0.008
    remote: Successfully installed ExtUtils-InstallPaths-0.010
    remote: Successfully installed ExtUtils-Helpers-0.022
    remote: Successfully installed Test-Harness-3.33
    remote: Successfully installed Module-Build-Tiny-0.038
    remote: Successfully installed HTTP-Date-6.02
    remote: Successfully installed File-ShareDir-Install-0.09
    remote: Successfully installed Test-SharedFork-0.29
    remote: Successfully installed Test-TCP-2.06
    remote: Successfully installed URI-1.64
    remote: Successfully installed LWP-MediaTypes-6.02
    remote: Successfully installed Encode-Locale-1.03
    remote: Successfully installed IO-HTML-1.001
    remote: Successfully installed HTTP-Message-6.06
    remote: Successfully installed HTTP-Body-1.19
    remote: Successfully installed Try-Tiny-0.22
    remote: Successfully installed POSIX-strftime-Compiler-0.40
    remote: Successfully installed Apache-LogFormat-Compiler-0.32
    remote: Successfully installed Filesys-Notify-Simple-0.12
    remote: Successfully installed Class-Inspector-1.28
    remote: Successfully installed File-ShareDir-1.102
    remote: Successfully installed HTTP-Tiny-0.050 (upgraded from 0.025)
    remote: Successfully installed Hash-MultiValue-0.15
    remote: Successfully installed Devel-StackTrace-1.34
    remote: Successfully installed Devel-StackTrace-AsHTML-0.14
    remote: Successfully installed Stream-Buffered-0.03
    remote: Successfully installed Plack-1.0032
    remote: Successfully installed Net-Server-2.008
    remote: Successfully installed Data-Dump-1.22
    remote: Successfully installed HTTP-Parser-XS-0.16
    remote: Successfully installed Starman-0.4010
    remote: 31 distributions installed
    remote: Using release configuration from last framework (Perl/PSGI).
    remote: -----> Discovering process types
    remote: Procfile declares types -> web
    remote: Default process types for Multipack -> web
    remote:
    remote: ---- Building application image ----
    remote: ---> Sending image to repository (22.28MB)
    remote: ---> Cleaning up
    remote:
    remote: ---- Starting 1 new unit ----
    remote: ---> Started unit 63ba717632...
    remote:
    remote: ---- Adding routes to 1 new units ----
    remote: ---> Added route to unit 63ba717632
    remote:
    remote: OK
    To git@192.168.50.4.nip.io:helloworld.git
    * [new branch] master -> master

So far, so good!

    vagrant@tsuru:~/GIT/tsuru-demo$ tsuru app-list
    +-----------------+-------------------------+-------------------------------------+--------+
    | Application     | Units State Summary     | Address                             | Ready? |
    +-----------------+-------------------------+-------------------------------------+--------+
    | helloworld      | 0 of 1 units in-service | helloworld.192.168.50.4.nip.io      | Yes    |
    +-----------------+-------------------------+-------------------------------------+--------+
    | tsuru-dashboard | 1 of 1 units in-service | tsuru-dashboard.192.168.50.4.nip.io | Yes    |
    +-----------------+-------------------------+-------------------------------------+--------+

This looks very nice. At this point, 2 tsuru apps should be ready to run, at the hostnames shown, which both resolve to 192.168.50.4, the address of my cluster.

2014-10-05 - Running perl web apps in Flynn

Flynn is a new PaaS (Platform as a Service) project intended to manage a cluster of servers running applications inside containers. Applications are deployed using a simple 'git push' and behind the scenes the application is built in a container and the container is then started on one or more nodes within the cluster.

Current version is v20140817, classed as Pre-release. In other words, it's still bleeding edge. I tried it out under a VMWare ESXi client, an LXC container and inside a Vagrant Virtualbox VM (all running Ubuntu 14.04 Trusty).

First, the two failures.
Trying it in the LXC container failed utterly because the container wasn't itself able to run containers. I could have probably fixed that by messing with the LXC container configuration, but there are only so many hours in a day (and I use all of them already).

I couldn't get Flynn running under the ESXi client. I doubt it was an unfixable problem, i.e. there's nothing inherent in the ESXi client which would stop Flynn from working.

I did get it going under Vagrant Virtualbox, by following the steps listed under "Manual Ubuntu Deployment" very carefully(*). Those steps were (presented as a script):

    #!/bin/bash
    #
    # Install Flynn on this host

    apt-get update
    apt-get -y upgrade
    apt-get -y install apt-transport-https curl patch
    apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv BC79739C507A9B53BB1B0E7D820A5489998D827B
    echo deb https://dl.flynn.io/ubuntu flynn main > /etc/apt/sources.list.d/flynn.list
    apt-get update
    apt-get -y install linux-image-extra-$(uname -r) flynn-host

    if [ ! -f token ] ; then
        curl -k https://discovery.etcd.io/new > token
        patch <<EOF /etc/init/flynn-host.conf
    --- a 2014-10-04 08:15:09.380249566 +0000
    +++ b 2014-10-04 04:33:20.135872045 +0000
    @@ -1,5 +1,7 @@
     description "Flynn layer 0"
     
    +env ETCD_DISCOVERY=$(cat token)
    +
     #start on (started libvirt-bin and started networking)
     respawn
     respawn limit 1000 60
    EOF
    fi

    flynn-release download /etc/flynn/version.json
    start flynn-host

Don't forget to install the correct linux-image-extra package, as I did - else you won't have aufs and Flynn will emit unhelpful errors. Don't forget to run the In fact it's a good idea to change the After this it's necessary to run

When I did eventually get Flynn running, it more or less did what the sparse documentation indicated it should. I was able to get a simple perl web app running in containers.

An app is developed in a git repository and is deployed by pushing a branch to Flynn's git server. Flynn then builds the app into a container and deploys it accordingly. Building the app requires some machine-readable build and execution instructions. Documentation for doing this for perl was nonexistent so I had to figure it out. Here's my application in its repository.

    ==> .buildpacks <==
    https://github.com/miyagawa/heroku-buildpack-perl.git

    ==> cpanfile <==
    requires 'Mojolicious';
    requires 'Sys::Hostname';

    ==> app.psgi <==
    #!/usr/bin/env perl
    use Mojolicious::Lite;
    use Sys::Hostname qw();

    # Documentation browser under "/perldoc"
    plugin 'PODRenderer';

    get '/' => sub {
      my $c = shift;
      $c->stash('hostname', Sys::Hostname::hostname());
      $c->render('index');
    };

    app->start;
    __DATA__

    @@ index.html.ep
    % layout 'default';
    % title 'Welcome';
    Welcome to the Mojolicious real-time web framework!
    Hello, world! <%= $hostname %>

    @@ layouts/default.html.ep
    <!DOCTYPE html>
    <html>
      <head><title><%= title %></title></head>
      <body><%= content %></body>
    </html>

    ==> Procfile <==
    web: 'perl -Mlib=$PWD/local/lib/perl5 ./local/bin/starman --preload-app --port $PORT'

Did I mention there is no documentation for running a perl app?

Flynn apps can (must?) be defined in a Heroku buildpack compatible manner.
The included flynn/slugbuilder component has some preset buildpacks, and one of the buildpacks allows additional buildpacks (such as the perl one) to be specified in a

The cpanfile says what perl modules need to be installed. I used Mojolicious and generated a sample application with a very small amount of customisation. For Miyagawa's buildpack, the perl script has to be named

Finally the I could have omitted the

That's how far I got before something broke and I couldn't get Flynn working again. Bear in mind, Flynn is still bleeding edge but it shows some promise.

Thoughts
Next up I'll be testing Tsuru.

2010-04-25 - No sound after upgrading to Ubuntu 9.10 (Karmic Koala)

This affected my EEE 901. The symptoms were:
The problem turned out to be with my $HOME/.asoundrc and $HOME/.asoundrc.asoundconf files. I found this:
Solution:
Then sound worked. I assume that pulseaudio couldn't get the ALSA device information due to the problems with the config file(s). It was probably only .asoundrc.asoundconf with a problem but I had nothing special in the other file so I removed that too.

2010-04-16 - GPT, Grub and RAID

GPT is the GUID Partition Table, a newer standard for partitioning a hard disk. Its advantages over the MSDOS partition table used since the 1980s are:
Grub 2 (versions 1.96 and later) can boot disks which use GPT, but there are some considerations which are detailed below.

What's the MSDOS partition table?

The MSDOS partition table was invented when disk drive capacities were around 5-20 megabytes and sectors were addressed using a (C,H,S) 3-tuple: Cylinder, Head and Sector numbers. This addressing scheme has been obsolete for a very long time now, and has been kludged several times to cope.

Firstly, disk drive capacities increased enormously, and we started to use geometries with large numbers of heads (255) and sectors (63) to cope.

Secondly, disk drives no longer have a fixed geometry: although the number of heads is fixed (up to twice the number of physical platters the drive has), there is a varying number of sectors per track, as more data is stored on the outside of each platter. Disk drives store an approximately constant amount of data per square millimetre of disk surface.

Thirdly, modern drives provide internal error correction by relocating sectors away from disk areas with physical defects.

The MSDOS partition table uses CHS and also provides an LBA representation of each partition. LBA is a linear numbering of sectors, independent of the physical drive geometry, in which sectors are numbered starting from zero and going up to a very high number (depending on the device's listed capacity). We have relied on this for some years now: as the size of devices exceeded the capacity of CHS to describe, operating systems started ignoring the CHS values and using only LBA.

But the 32-bit LBA fields in the MSDOS partition table are about to run out; with 512-byte sectors they have a hard limit of 2 TB (1 terabyte = 2^40 bytes) and we will soon surpass that capacity.

What's GPT?

GPT uses an LBA system with a much larger capacity. It has provision for more partitions per device (although in my experience, with larger device sizes we make larger partitions, not more partitions).
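The 2 TB figure mentioned above can be checked with a little arithmetic. This is only a sketch: it assumes 512-byte sectors, the 32-bit sector numbers of the MSDOS partition table and the 64-bit sector numbers used by GPT. The function name is made up for illustration.

```python
# Sketch: largest capacity addressable by an LBA field of a given width.
# Assumptions: 512-byte sectors; 32-bit LBA (MSDOS table); 64-bit LBA (GPT).

TERABYTE = 2 ** 40  # the article's definition: 1 terabyte = 2^40 bytes


def max_addressable_bytes(lba_bits: int, sector_bytes: int = 512) -> int:
    """Number of bytes covered by 2**lba_bits sectors of sector_bytes each."""
    return (2 ** lba_bits) * sector_bytes


msdos_limit = max_addressable_bytes(32)
gpt_limit = max_addressable_bytes(64)

print("MSDOS limit:", msdos_limit // TERABYTE, "TB")
print("GPT limit:  ", gpt_limit // TERABYTE, "TB")
```

With 4096-byte sectors the same 32-bit field would cover 8 times as much, which is one reason sector size and partition table format interact.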
MSDOS was limited to 4 so-called "primary" partitions and one of these could be used as a pointer to further "extended" partitions. It was a kludge, and GPT eliminates this kludge.

MSDOS also used a single byte to describe the contents of each partition. This was troublesome as almost all of the 256 possibilities have been used at various times (the Linux fdisk utility contains a list). So GPT extends that with UUID based partition types - an almost limitless set.

Linux supports GPT partition tables (also called disk labels). A tool called 'gdisk' can create and edit them, and other partitioning tools have varying levels of support.

Grub 2 can boot disks partitioned with GPT. But there are some interactions between Grub, GPT and RAID, which is the reason for the existence of this article.

GPT, Grub2 and RAID considerations
4K sector disks

Storage manufacturers are now shipping 4K sector disks. Why 4K and not the previous standard of 512 bytes? 4K sectors are more efficient for such large devices: the drive can pick up more data with each read operation; the number of physical sectors on the device is reduced by 8 times (which reduces the size of relocation tables, for instance); and 4K matches the page size used in the x86 architecture. There are a lot of good reasons why.

The first Western Digital 4K drives were released with a compatibility mode enabled in which the drive simulates 512 byte sectors. This can affect performance quite a lot. The installer should ensure that all partitions start on a sector number which is a multiple of 8 (counted in 512 byte sectors) and are a multiple of 8 sectors in size. This affects GPT partitioning as well as MSDOS. It's a bit easier to ensure this in GPT because there's no geometry baggage, unlike MSDOS.

2010-04-09 - LS-30 GSM interface
The LS-30 can notify an alert via its internal PSTN interface. There is also an optional GSM interface (pictured), which takes a SIM card and can be used instead of or in addition to the PSTN notification. Alerts are notified via SMS message at first, and this is followed by a voice call. Five numbers can be configured for GSM.

The SMS message looks like this:

    Burglar 01-03 19:19 2010/04/02

The "01-03" refers to the device which triggered.

My monitoring software can detect a Burglary message and SMS me directly. I get the impression that it takes a couple of minutes for the LS-30 to get around to sending an SMS. It may be that the LS-30 takes time to check the PSTN line for a dial tone. The LS-30 seems to try to alert via PSTN first, and tries GSM second.

My monitoring system sends an SMS of this form:

    Burglary at Fri Apr 9 10:48:50 EST 2010 zone 500 group 00

The zone and group numbers would correspond (on a real burglary) to the "01-03" shown above.

2010-04-04 - Reverse-Engineering the LS-30
In February 2010, after long investigation, I purchased an LS-30 Alarm System from Securepro Security.

Having moved recently, I needed an alarm system. I'd had many years' experience with the Solution 16 (PDF) range of alarms, originally designed and manufactured by EDM (a Zeta Internet customer) and now owned by Bosch. But I thought perhaps technology had improved in the meantime, and I wanted something more powerful, configurable, ... more geeky. That's the LS-30.

The LS-30 is an alarm system designed for self-installation in a home or business. It communicates wirelessly with its devices, which include:
It is very configurable, and according to the vendor has "more features than any other system available". Except perhaps for very expensive commercial systems. What appealed to me though, was the ethernet interface - I can plug it into my network and (hopefully) configure it remotely.

I have begun to reverse-engineer the LS-30 communications protocol. Furthermore, I have released my code under the GNU General Public License (Version 3) so others in the community can benefit from this effort (and help me finish the job). See the links in the left column to my GitHub project.

2009-06-13 - dnscache fixed
I was able to solve the problem with dnscache. Basically, before sending each new request to a nameserver (for a particular query), dnscache would close the socket used for the last request. So any late response would not reach dnscache. The code flow was roughly this:

    +-> close socket
    |   open socket()
    |   bind() local end of socket to random port
    |   connect() socket to next destination
    |   send() request packet
    +-< poll with timeout for a response

The fix turned out to be fairly simple. Open a socket before sending the first request packet for a particular query, do not connect it to the destination, and use sendto() to specify each destination instead of send(). So the fixed code flow now looks like this:

        open socket()
        bind() local end of socket to random port
    +-> sendto() request packet to next destination
    |
    +-< poll with timeout for a response
        close() socket

The improvement in performance is extreme. I'm testing on a virtual machine using the 'netem' module to artificially create network latency of 5000ms. Before patching, dnscache took 310 seconds to look up 'A www.telstra.net' - much longer than I had calculated in the previous post, because in fact dnscache had to send more requests than expected due to missing nameserver glue, perhaps for the net domain.

After patching, dnscache was able to resolve 'www.telstra.net' in only 16 seconds. It sends 6 queries, then receives the response to the first, and closes the socket so subsequent responses are ignored (they're not needed anyway).

You might think this patch is not necessary because 5000ms network latency is an extreme test and most internet hosts will have much lower latency (it's usually a leaf-node problem), but I experienced it on a modern HSDPA network. I have also seen saturated dialup and ISDN connections with very high latencies (over 3000ms for ISDN). And dnscache's timeouts start at 1 second, so performance will start to degrade as soon as average request latency exceeds 1000ms.
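The fixed code flow described above can be sketched in Python. This is not the actual patch (which is C code inside dnscache), just a minimal illustration of the same idea: one unconnected UDP socket, sendto() for each destination, and a socket which stays open so that a late reply to an earlier request is still received. The function name, the `step_timeout` and `give_up` parameters and the final wait are my assumptions, and binding to port 0 lets the OS pick the local port rather than choosing a random one as dnscache does.

```python
import select
import socket
import time


def resolve(payload, destinations, step_timeout=1.0, give_up=10.0):
    """Send payload to each (host, port) in turn over ONE unconnected socket.

    Returns (data, sender) for the first reply from any destination,
    or None if nothing arrives within give_up seconds.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("", 0))  # assumption: let the OS choose the local port
        deadline = time.monotonic() + give_up
        for dest in destinations:
            sock.sendto(payload, dest)  # no connect(), so any server may answer
            wait = min(step_timeout, deadline - time.monotonic())
            if wait <= 0:
                break
            readable, _, _ = select.select([sock], [], [], wait)
            if readable:
                return sock.recvfrom(4096)
        # Every destination has been tried; keep listening for a late reply.
        wait = deadline - time.monotonic()
        if wait > 0:
            readable, _, _ = select.select([sock], [], [], wait)
            if readable:
                return sock.recvfrom(4096)
        return None
    finally:
        sock.close()  # closed only once the whole query is finished
```

A call such as `resolve(packet, [("192.0.2.1", 53), ("192.0.2.2", 53)])` would use the first answer from either address, even if it arrives after the second request has been sent. The addresses here are documentation placeholders, not real nameservers.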
The more nameservers a domain has, the longer it will take to look up because dnscache sends a request to every nameserver with a 1-second timeout before changing to 3 seconds (then 11, then 45).

I've made the patch available on github.com in a new repository I made called djbdns. The url is:

    http://github.com/nickandrew/djbdns/tree/high-latency-patch

This repository also contains all public releases of djbdns to date: version 1.01 through 1.05. DJB has put djbdns into the public domain so it is OK for me to do this. There are also some patches written by other people, linked from tinydns.org. Feel free to clone, fork and submit further patches to this djbdns repository on github.

2009-06-08 - dnscache fails spectacularly on high latency connections
Like most people I use dnscache extensively for name resolution within my network. All this worked fine until recently when my DSL link broke (with the telephone line) and I had to use HSDPA for external connectivity.

My HSDPA provider is supposed to provide a maximum speed of at least 1.8 Mbit/sec - of course this can be lower due to network congestion, poor signal strength, low capacity network links and so on. I don't think I have ever received the full 1800 kbit/sec from my provider, maybe nothing over 300 kbit/sec in fact.

Anyway this particular outage was particularly troublesome because I was getting only 50 kbit/sec through HSDPA ... slower than a dialup modem. It was slow in another respect too: the packet round-trip time was between 5 and 7 seconds. That's over 200 times more than a typical RTT on DSL of 25 ms. I don't know what caused the extremely high latency, but I do know what its effect was.

Dnscache failed almost completely. It would send out requests to nameservers on the internet, and not receiving any response in a reasonable time, would go on to try the next nameserver, and the next, and so on. The responses came back at some later time ... after dnscache had given up on the request (and dnscache would ignore the response). So the net effect was a storm of DNS packets sent and received, as well as ICMP port-unreachable packets when responses were received after dnscache had stopped listening.

Now that the DSL is working I am testing dnscache (from djbdns 1.05) to see the exact nature of this problem and whether it can be fixed. I am using Linux's Traffic Control subsystem (see lartc.org for documentation) and specifically the 'netem' queue discipline module to simulate various amounts of latency to a virtual host.
I set up the variable latency using this script:

    #!/bin/bash
    #
    # Setup network delay for testing dnscache on high latency links
    # Outbound traffic on eth0 is delayed by several seconds if
    # - ip dst is 192.168.1.75

    set -x

    tc qdisc del dev eth0 root 2>/dev/null

    # Setup root class and base rates
    tc qdisc add dev eth0 root handle 1: htb default 99
    tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit

    # all traffic
    tc class add dev eth0 parent 1:1 classid 1:10 htb rate 100mbit

    # delayed traffic
    tc class add dev eth0 parent 1:1 classid 1:11 htb rate 2mbit

    # default
    tc class add dev eth0 parent 1:1 classid 1:99 htb rate 3500kbit

    tc qdisc add dev eth0 parent 1:10 handle 10: sfq
    tc qdisc add dev eth0 parent 1:11 handle 11: netem delay 5000ms
    tc qdisc add dev eth0 parent 1:99 handle 99: sfq

    # Move selected traffic into 1:11
    tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 192.168.1.75 flowid 1:11

This script is run on my gateway and it uses the HTB queue discipline on device eth0 and some classes beneath that to filter outbound packets on that interface. Packets sent to 192.168.1.75 are delayed (by the 'netem' qdisc) by exactly 5000 msec, which is more than enough time to give dnscache a headache.

Next up I hacked dnscache to (a) run as an ordinary user, (b) listen on port 5300, and (c) show me what it is doing. Dnscache implements exponential timeouts on sent requests for 1, 3, 11 and 45 seconds. I don't know why djb chose those particular numbers.

I started the server and sent a single request to look up an 'A' record for telstra.net:

    dig -p 5300 @127.0.0.1 A telstra.net +tries=1

This is what dnscache output at first:

    $ ./dnscache
    starting
    query 1 7f000001:e4d2:6114 1 telstra.net.
    tx 0 1 telstra.net. . 803f0235 c03a801e c0249411 c6290004 c707532a 80080a5a c0702404 c0cbe60a ca0c1b21 c0e44fc9 c00505f1 c021040c c1000e81
    dns_transmit to 128.63.2.53, timeout set to 1
    dns_transmit to 192.58.128.30, timeout set to 1
    dns_transmit to 192.36.148.17, timeout set to 1
    dns_transmit to 198.41.0.4, timeout set to 1
    dns_transmit to 199.7.83.42, timeout set to 1
    dns_transmit to 128.8.10.90, timeout set to 1
    dns_transmit to 192.112.36.4, timeout set to 1
    dns_transmit to 192.203.230.10, timeout set to 1
    dns_transmit to 202.12.27.33, timeout set to 1
    dns_transmit to 192.228.79.201, timeout set to 1
    dns_transmit to 192.5.5.241, timeout set to 1
    dns_transmit to 192.33.4.12, timeout set to 1
    dns_transmit to 193.0.14.129, timeout set to 1

What seems to be happening here is that dnscache is looking up the root nameservers ('.') for 'telstra.net'. The hex numbers are the IPv4 addresses of each root nameserver. Dnscache tries them in order (at least, in the same order as they appear on the 'tx' line). There are 13 root nameservers and these requests appear to be issued once per second - so this process has taken 13 seconds so far.

With a 5000 msec delay on the interface, 8 of those 13 requests have been replied-to, but dnscache apparently stops listening for a response as soon as its timeout expires (1 second here) and it sends the next request.
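The hex addresses on the 'tx' line are easy to convert by hand, but a few lines of Python make the correspondence with the dns_transmit lines obvious. This is just an illustrative helper (the function name is mine, not part of dnscache); each 8-digit hex string is read as 4 bytes, most significant first.

```python
import ipaddress


def hex_to_ipv4(hex_addr: str) -> str:
    """Convert an 8-digit hex string such as '803f0235' to dotted-quad form."""
    return str(ipaddress.IPv4Address(bytes.fromhex(hex_addr)))


# The first three addresses from the 'tx' line shown above.
for hex_addr in ("803f0235", "c03a801e", "c0249411"):
    print(hex_addr, "->", hex_to_ipv4(hex_addr))
```

The same conversion applies to the client address in the 'query' line, where 7f000001 is 127.0.0.1.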
Continuing on:

    dns_transmit to 128.63.2.53, timeout set to 3
    dns_transmit to 192.58.128.30, timeout set to 3
    dns_transmit to 192.36.148.17, timeout set to 3
    dns_transmit to 198.41.0.4, timeout set to 3
    dns_transmit to 199.7.83.42, timeout set to 3
    dns_transmit to 128.8.10.90, timeout set to 3
    dns_transmit to 192.112.36.4, timeout set to 3
    dns_transmit to 192.203.230.10, timeout set to 3
    dns_transmit to 202.12.27.33, timeout set to 3
    dns_transmit to 192.228.79.201, timeout set to 3
    dns_transmit to 192.5.5.241, timeout set to 3
    dns_transmit to 192.33.4.12, timeout set to 3
    dns_transmit to 193.0.14.129, timeout set to 3

Dnscache sends to the same set of 13 nameservers, but with a 3 second timeout on each. That takes 39 seconds (for a total time spent so far of 52 seconds, and we still don't know what the nameservers for telstra.net are).

Continuing:

    dns_transmit to 128.63.2.53, timeout set to 11
    rr 803f0235 172800 1 a.gtld-servers.net. c005061e
    rr 803f0235 172800 1 b.gtld-servers.net. c0210e1e
    rr 803f0235 172800 1 c.gtld-servers.net. c01a5c1e
    rr 803f0235 172800 1 d.gtld-servers.net. c01f501e
    rr 803f0235 172800 1 e.gtld-servers.net. c00c5e1e
    rr 803f0235 172800 1 f.gtld-servers.net. c023331e
    rr 803f0235 172800 1 g.gtld-servers.net. c02a5d1e
    rr 803f0235 172800 1 h.gtld-servers.net. c036701e
    rr 803f0235 172800 1 i.gtld-servers.net. c02bac1e
    rr 803f0235 172800 1 j.gtld-servers.net. c0304f1e
    rr 803f0235 172800 1 k.gtld-servers.net. c034b21e
    rr 803f0235 172800 1 l.gtld-servers.net. c029a21e
    rr 803f0235 172800 1 m.gtld-servers.net. c037531e
    rr 803f0235 172800 ns net. a.gtld-servers.net.
    rr 803f0235 172800 ns net. b.gtld-servers.net.
    rr 803f0235 172800 ns net. c.gtld-servers.net.
    rr 803f0235 172800 ns net. d.gtld-servers.net.
    rr 803f0235 172800 ns net. e.gtld-servers.net.
    rr 803f0235 172800 ns net. f.gtld-servers.net.
    rr 803f0235 172800 ns net. g.gtld-servers.net.
    rr 803f0235 172800 ns net. h.gtld-servers.net.
    rr 803f0235 172800 ns net. i.gtld-servers.net.
    rr 803f0235 172800 ns net. j.gtld-servers.net.
    rr 803f0235 172800 ns net. k.gtld-servers.net.
    rr 803f0235 172800 ns net. l.gtld-servers.net.
    rr 803f0235 172800 ns net. m.gtld-servers.net.
    rr 803f0235 172800 28 a.gtld-servers.net. 20010503a83e00000000000000020030
    stats 1 945 1 0
    cached 1 a.gtld-servers.net.
    cached 1 b.gtld-servers.net.
    cached 1 c.gtld-servers.net.
    cached 1 d.gtld-servers.net.
    cached 1 e.gtld-servers.net.
    cached 1 f.gtld-servers.net.
    cached 1 g.gtld-servers.net.
    cached 1 h.gtld-servers.net.
    cached 1 i.gtld-servers.net.
    cached 1 j.gtld-servers.net.
    cached 1 k.gtld-servers.net.
    cached 1 l.gtld-servers.net.
    cached 1 m.gtld-servers.net.

Dnscache has finally increased its timeout to 11 and after another 5 seconds (total time elapsed now 57 seconds) it receives a response. Now it knows the nameservers for the 'net' top-level domain. There are 13 of them and so it's going to take another 57 seconds before it learns the 4 nameservers for 'telstra.net', and then another 21 seconds to learn that there is actually no 'A' record for 'telstra.net'. That's 135 seconds total time. I don't know how long clients typically wait for a response but it's a lot less than that.

Clearly dnscache should implement two timeouts per request: one for sending a second request (to another nameserver) for the same information, and one to give up waiting for a response from the first request. The 2nd timeout should be much longer than the first. If dnscache was modified to wait up to 10 seconds for a response but try each successive nameserver after 1 second, then it should be possible for dnscache to answer the query within 15 seconds, which is reasonable in this context.

In these calculations I'm assuming that dnscache has no existing cache (other than the list of root nameservers) because that makes dnscache's behaviour predictable, and it shows us the worst case performance.
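The arithmetic above can be reproduced with a small model. This is a simplification of what the log shows, not dnscache's real logic: it assumes every request takes exactly the configured latency to be answered, that unpatched dnscache only hears a reply when its current timeout is longer than that latency, and that the lookup has three stages (13 root servers, 13 'net' servers, 4 'telstra.net' servers). The function names are made up for the illustration.

```python
# Simplified timing model of the lookup described above (an assumption-laden
# sketch, not dnscache's actual algorithm).

TIMEOUTS = (1, 3, 11, 45)  # dnscache's per-request timeouts, in seconds


def unpatched_stage_seconds(n_servers, latency, timeouts=TIMEOUTS):
    """Seconds for one stage when a reply is only heard within its own timeout."""
    elapsed = 0
    for timeout in timeouts:
        for _ in range(n_servers):
            if timeout > latency:
                return elapsed + latency  # this request finally gets its answer
            elapsed += timeout            # timed out, move to the next server
    return None                           # never answered


def two_timeout_stage_seconds(latency, give_up=10):
    """Seconds for one stage if late replies are accepted for give_up seconds."""
    return latency if latency <= give_up else None


stages = (13, 13, 4)  # root servers, 'net' servers, 'telstra.net' servers
latency = 5

print([unpatched_stage_seconds(n, latency) for n in stages])
print(sum(unpatched_stage_seconds(n, latency) for n in stages))
print(sum(two_timeout_stage_seconds(latency) for _ in stages))
```

Under these assumptions the three stages take 57, 57 and 21 seconds (135 in total) without the second timeout, and 15 seconds with it, matching the figures in the text.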
The next step for me is to work out if dnscache can have multiple outstanding requests to nameservers for the same client request. Obviously dnscache can handle multiple concurrent client requests and must query many different nameservers concurrently, but the question is whether it can query multiple nameservers for the same information, at more or less the same time, and use the first response that is received.