How Apache's Digest authentication module generates nonces

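In the previous post I summarized how Digest authentication defends against various attacks. A one-time token called the nonce plays a central role in several of those defenses. How the nonce is generated is left to the implementer, so what do real web servers do? A scheme whose output a third party can predict seems like a bad idea, and embedding the generation time in the nonce seems useful for expiry checks. Speculating only gets you so far, so I looked at how Apache actually generates its nonces.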

I read the code for 2.4.20, the latest stable release as of May 2016. It can be downloaded here.

Where the nonce is generated

First, let's confirm where the nonce is generated. Apache's Digest authentication code lives in the mod_auth_digest module.

The source for mod_auth_digest is in modules/aaa/mod_auth_digest.c. When authentication fails, a function called note_digest_auth_failure() is called, and it builds the WWW-Authenticate header. WWW-Authenticate is the header attached to the 401 Unauthorized response in Digest authentication.

The nonce is generated around the middle of note_digest_auth_failure() (lines 1285 to 1287 of modules/aaa/mod_auth_digest.c), and the header is built at the end of the function (lines 1318 to 1326 of the same file).

    /* Setup nonce */

    nonce = gen_nonce(r->pool, r->request_time, opaque, r->server, conf);

    apr_table_mergen(r->err_headers_out,
                     (PROXYREQ_PROXY == r->proxyreq)
                         ? "Proxy-Authenticate" : "WWW-Authenticate",
                     apr_psprintf(r->pool, "Digest realm=\"%s\", "
                                  "nonce=\"%s\", algorithm=%s%s%s%s%s",
                                  ap_auth_name(r), nonce, conf->algorithm,
                                  opaque_param ? opaque_param : "",
                                  domain ? domain : "",
                                  stale ? ", stale=true" : "", qop));

As you can see, the nonce is generated by a function called gen_nonce().
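To make the header layout easier to see, here is a minimal Python sketch that mirrors the format string passed to apr_psprintf() in the excerpt. The function name build_challenge and its keyword arguments are hypothetical; the optional pieces (opaque_param, domain, qop) are assumed to arrive as ready-made suffix strings, as they do in the C code, and the qop value used in the demo is only an illustrative guess.

```python
def build_challenge(realm, nonce, algorithm="MD5", opaque_param="",
                    domain="", stale=False, qop="", proxy=False):
    """Return (header name, header value) using the same format string
    as the apr_psprintf() call in note_digest_auth_failure()."""
    # Proxy requests get Proxy-Authenticate, everything else WWW-Authenticate.
    name = "Proxy-Authenticate" if proxy else "WWW-Authenticate"
    value = 'Digest realm="%s", nonce="%s", algorithm=%s%s%s%s%s' % (
        realm, nonce, algorithm,
        opaque_param,                    # appended verbatim when present
        domain,                          # appended verbatim when present
        ", stale=true" if stale else "",
        qop,                             # appended verbatim when present
    )
    return name, value


if __name__ == "__main__":
    # The qop suffix below is an assumed example value, not taken from the source.
    print(build_challenge("secret", "NONCE-GOES-HERE", qop=', qop="auth"'))
```

The point of the sketch is only that the nonce is dropped into the header as an opaque quoted string; everything interesting happens inside gen_nonce().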

How the nonce is generated (first half)

gen_nonce() looks like this (lines 1067 to 1093 of modules/aaa/mod_auth_digest.c).

/* The nonce has the format b64(time)+hash .
 */
static const char *gen_nonce(apr_pool_t *p, apr_time_t now, const char *opaque,
                             const server_rec *server,
                             const digest_config_rec *conf)
{
    char *nonce = apr_palloc(p, NONCE_LEN+1);
    time_rec t;

    if (conf->nonce_lifetime != 0) {
        t.time = now;
    }
    else if (otn_counter) {
        /* this counter is not synch'd, because it doesn't really matter
         * if it counts exactly.
         */
        t.time = (*otn_counter)++;
    }
    else {
        /* XXX: WHAT IS THIS CONSTANT? */
        t.time = 42;
    }
    apr_base64_encode_binary(nonce, t.arr, sizeof(t.arr));
    gen_nonce_hash(nonce+NONCE_TIME_LEN, nonce, opaque, server, conf);

    return nonce;
}

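The important part is near the end: the nonce is the output of apr_base64_encode_binary() followed by the output of gen_nonce_hash(). apr_base64_encode_binary() writes the Base64 encoding of the timestamp (the request time when nonce_lifetime is nonzero) to the start of the buffer. gen_nonce_hash() then writes, starting at offset NONCE_TIME_LEN, the hex form of a SHA-1 hash computed over several values.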

An actual nonce looks like this, for example:

3QrlcEIyBQA=69dbf3121f8c6d36372601c08c7845ec9d59fa8c

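I had long wondered why an = suddenly appears in the middle; it is there because the first half of the nonce is Base64. Decoding that part yields a value of type apr_time_t. apr_time_t is not a struct but a 64-bit integer: UNIX time with microsecond precision.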
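As a quick check of this, the following Python sketch splits the example nonce and decodes its Base64 prefix. It assumes the server stored the 8-byte apr_time_t in little-endian order (true on x86) and that the prefix is 12 characters long, which is the Base64 length of 8 bytes; the function names are my own.

```python
import base64
import struct
from datetime import datetime, timezone

NONCE_TIME_LEN = 12  # assumed: Base64 length of an 8-byte apr_time_t


def split_nonce(nonce):
    """Split a nonce into its Base64 time part and its hex hash part."""
    return nonce[:NONCE_TIME_LEN], nonce[NONCE_TIME_LEN:]


def decode_nonce_time(nonce):
    """Return the embedded timestamp in microseconds since the UNIX epoch."""
    time_part, _ = split_nonce(nonce)
    raw = base64.b64decode(time_part)    # 8 raw bytes
    (usec,) = struct.unpack("<q", raw)   # assumes a little-endian server
    return usec


if __name__ == "__main__":
    nonce = "3QrlcEIyBQA=69dbf3121f8c6d36372601c08c7845ec9d59fa8c"
    usec = decode_nonce_time(nonce)
    when = datetime.fromtimestamp(usec / 1_000_000, tz=timezone.utc)
    print(usec, when.isoformat())
```

For the nonce shown above, the decoded time falls in May 2016, which fits when the request was made.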

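Why is only the timestamp Base64-encoded rather than hashed? Presumably so that it can be used to decide whether the nonce has expired. Indeed, the check_nonce() function makes its expiry decision from the timestamp recovered from the nonce. Base64 encoding is reversible, so the original value can be restored, and that is what makes this implementation possible.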
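The idea can be sketched in Python as follows. This is not a port of check_nonce(); it only illustrates an age comparison against a lifetime, under the assumptions that the timestamp is little-endian and that the lifetime is expressed in microseconds. The function names and the 300-second lifetime in the demo are hypothetical.

```python
import base64
import struct


def nonce_issued_usec(nonce):
    """Decode the 12-character Base64 prefix into microseconds since the epoch."""
    raw = base64.b64decode(nonce[:12])
    (usec,) = struct.unpack("<q", raw)   # assumes a little-endian server
    return usec


def is_nonce_expired(nonce, now_usec, lifetime_usec):
    """Return True when the nonce is older than the allowed lifetime."""
    age = now_usec - nonce_issued_usec(nonce)
    return age > lifetime_usec


if __name__ == "__main__":
    nonce = "3QrlcEIyBQA=69dbf3121f8c6d36372601c08c7845ec9d59fa8c"
    issued = nonce_issued_usec(nonce)
    lifetime = 300 * 1_000_000           # hypothetical 300-second lifetime
    print(is_nonce_expired(nonce, issued + 10 * 1_000_000, lifetime))
    print(is_nonce_expired(nonce, issued + 301 * 1_000_000, lifetime))
```

Because the timestamp also feeds the hash in the second half of the nonce, a client cannot simply rewrite the Base64 prefix to extend the lifetime without invalidating the hash.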

How the nonce is generated (second half)

Next, let's look at gen_nonce_hash() (lines 1039 to 1064 of modules/aaa/mod_auth_digest.c).

/* The hash part of the nonce is a SHA-1 hash of the time, realm, server host
 * and port, opaque, and our secret.
 */
static void gen_nonce_hash(char *hash, const char *timestr, const char *opaque,
                           const server_rec *server,
                           const digest_config_rec *conf)
{
    unsigned char sha1[APR_SHA1_DIGESTSIZE];
    apr_sha1_ctx_t ctx;

    memcpy(&ctx, &conf->nonce_ctx, sizeof(ctx));
    /*
    apr_sha1_update_binary(&ctx, (const unsigned char *) server->server_hostname,
                         strlen(server->server_hostname));
    apr_sha1_update_binary(&ctx, (const unsigned char *) &server->port,
                         sizeof(server->port));
     */
    apr_sha1_update_binary(&ctx, (const unsigned char *) timestr, strlen(timestr));
    if (opaque) {
        apr_sha1_update_binary(&ctx, (const unsigned char *) opaque,
                             strlen(opaque));
    }
    apr_sha1_final(sha1, &ctx);

    ap_bin2hex(sha1, APR_SHA1_DIGESTSIZE, hash);
}

Roughly, it works as follows.

Feed the data to be hashed into the variable ctx (memcpy() and apr_sha1_update_binary())
Compute the SHA-1 hash (apr_sha1_final())
Convert the hash from binary to hexadecimal (ap_bin2hex())
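The three steps above can be mirrored with Python's hashlib, as a rough sketch rather than a faithful port. Here base_ctx stands in for whatever state the C code copies with memcpy(), and bin2hex() is a hand-written stand-in for ap_bin2hex(); both names are my own.

```python
import hashlib


def bin2hex(data):
    """Convert raw bytes to lowercase hex, two characters per byte."""
    return "".join("%02x" % byte for byte in data)


def nonce_hash(base_ctx, timestr, opaque=None):
    # Step 1: start from a copy of the given context and feed in the data.
    ctx = base_ctx.copy()
    ctx.update(timestr.encode("ascii"))
    if opaque is not None:               # mirrors the NULL check in the C code
        ctx.update(opaque.encode("ascii"))
    # Step 2: finish the SHA-1 computation (20 raw bytes).
    digest = ctx.digest()
    # Step 3: convert the binary digest to a 40-character hex string.
    return bin2hex(digest)


if __name__ == "__main__":
    base = hashlib.sha1(b"demo-prefix")  # placeholder starting state
    print(nonce_hash(base, "ROZROU0yBQA="))
```

Note that an empty opaque string feeds zero bytes into the hash, so it yields the same result as no opaque at all.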

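The initial memcpy() is there because &conf->nonce_ctx already holds an intermediate hash state. That state is built in set_realm(), which is called when the Digest authentication configuration is first read, as follows (lines 474 to 481 of modules/aaa/mod_auth_digest.c).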

    /* we precompute the part of the nonce hash that is constant (well,
     * the host:port would be too, but that varies for .htaccess files
     * and directives outside a virtual host section)
     */
    apr_sha1_init(&conf->nonce_ctx);
    apr_sha1_update_binary(&conf->nonce_ctx, secret, sizeof(secret));
    apr_sha1_update_binary(&conf->nonce_ctx, (const unsigned char *) realm,
                           strlen(realm));

secret is a byte string generated when the Digest module is initialized. realm holds the string specified as AuthName in the configuration file. Both are fed into the context with apr_sha1_update_binary().
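The precompute-then-copy pattern has a direct analogue in Python's hashlib, where copy() plays the role of the memcpy(). The sketch below assumes a 20-byte random secret generated with os.urandom() as a stand-in for the module's own secret; precompute_ctx and finish are hypothetical names.

```python
import hashlib
import os

SECRET_LEN = 20  # assumed length of the module secret


def precompute_ctx(secret, realm):
    """Hash the per-configuration constants once (secret, then realm)."""
    ctx = hashlib.sha1()
    ctx.update(secret)
    ctx.update(realm.encode("utf-8"))
    return ctx


def finish(ctx_template, timestr):
    """Per request: copy the saved state, add the varying part, finalize."""
    ctx = ctx_template.copy()            # the template itself is left untouched
    ctx.update(timestr.encode("ascii"))
    return ctx.hexdigest()


if __name__ == "__main__":
    secret = os.urandom(SECRET_LEN)
    template = precompute_ctx(secret, "secret")
    one_shot = hashlib.sha1(secret + b"secret" + b"ROZROU0yBQA=").hexdigest()
    print(finish(template, "ROZROU0yBQA=") == one_shot)
```

Copying the saved state gives the same digest as hashing everything from scratch, while skipping the constant prefix on every request.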

Back in gen_nonce_hash(), the next thing fed in is the string timestr. This is the Base64-encoded timestamp that was written to the start of the nonce earlier.

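After that, if the opaque variable is not NULL, the string it points to is fed in. Judging from the actual behavior, though, opaque is either NULL or points to an empty string, so it does not seem to affect the hash value.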

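To summarize, the second half of the nonce is the SHA-1 hash of secret, realm, and the Base64-encoded timestamp. My guess is that secret keeps attackers from predicting the nonce, and the timestamp keeps a nonce from colliding with earlier values. I'm honestly not sure about realm; perhaps it keeps nonces from colliding when other areas of the same server are also protected by Digest authentication.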

Verifying the generation method

To verify this, let's compare against a value computed by hand. Suppose the actual nonce is ROZROU0yBQA=0d3d7d1e0589ebe3eef43bb3b5b6ec932ba9e1b4, secret is \x82\xd4\xe3\xe4\xb1t\xba\x9e\xf1\x1b\x9e\xdf\x805\x7fcs\xae\xd0\xd0, and realm is the string "secret".

Then SHA1("\x82\xd4\xe3\xe4\xb1t\xba\x9e\xf1\x1b\x9e\xdf\x805\x7fcs\xae\xd0\xd0" + "secret" + "ROZROU0yBQA=") comes out to 0d3d7d1e0589ebe3eef43bb3b5b6ec932ba9e1b4, which matches the second half of the actual nonce.
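For anyone who wants to repeat the comparison, here is a small Python sketch using the values quoted above. It assumes opaque contributed nothing to the hash, as observed earlier; rebuild_nonce is a hypothetical helper name. The script prints the recomputed nonce next to the observed one so they can be compared by eye.

```python
import hashlib

# Values quoted in the article; the secret is written as a Python bytes literal.
SECRET = b"\x82\xd4\xe3\xe4\xb1t\xba\x9e\xf1\x1b\x9e\xdf\x805\x7fcs\xae\xd0\xd0"
REALM = "secret"
OBSERVED_NONCE = "ROZROU0yBQA=0d3d7d1e0589ebe3eef43bb3b5b6ec932ba9e1b4"


def rebuild_nonce(secret, realm, timestr):
    """Recompute a nonce as timestr + hex(SHA-1(secret + realm + timestr))."""
    ctx = hashlib.sha1()
    ctx.update(secret)                   # precomputed part in Apache
    ctx.update(realm.encode("utf-8"))    # precomputed part in Apache
    ctx.update(timestr.encode("ascii"))  # per-request part
    return timestr + ctx.hexdigest()


if __name__ == "__main__":
    timestr = OBSERVED_NONCE[:12]        # the Base64 time prefix
    rebuilt = rebuild_nonce(SECRET, REALM, timestr)
    print("observed:", OBSERVED_NONCE)
    print("rebuilt: ", rebuilt)
```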