Diagnose "Http Challenge Server process unavailable" message

Hi,

I am having trouble renewing a certificate on one of two IIS servers (Server 2016) behind a load-balancer. When I look at the CertifyTheWeb log on the two servers one of the earliest differences is that on the failing server there is an entry:

Http Challenge Server process unavailable

How do I figure out what is causing this?

thanks

Martin

Hi, so there are two problems to consider:

  • If the Http Challenge Server process is unavailable, that probably means something non-Microsoft is using port 80 (typically Apache, nginx, or some other http server). The challenge server process uses http.sys to temporarily sit in front of IIS in the http pipeline, but it can only do so if the service using port 80 is itself built around http.sys. Alternatively, the http challenge server may just be switched off under Settings, or the process may be stuck (not usually the case); look for a Certify.exe process to kill in Task Manager.
  • We don’t officially support http challenges behind load balancers because that requires all the servers to present the same http challenge response. When Let’s Encrypt tries to validate, it will make multiple attempts to check the http response (not just one) and it will fail if one of the servers doesn’t give the same response. Ideally, use DNS validation instead of http; that way each server can control and acquire its own certificates. There are tricks you can do involving shared mappings for the /.well-known/acme-challenge/ root (and using IIS to handle the http challenge responses instead of using the challenge server), but these are not supported directly and don’t solve the issue of updating the cert bindings on each server.
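As a rough way to see what is answering on port 80, here is a minimal Python sketch that requests the site locally and looks at the Server response header. It assumes the header has not been removed or rewritten, and that values starting with "Microsoft-IIS" or "Microsoft-HTTPAPI" indicate an http.sys based listener; the function names are made up for illustration and this is only a heuristic, not something Certify does itself.

```python
import http.client


def classify_server_header(server_header):
    """Guess from a Server header whether the listener is http.sys based.

    Heuristic only: headers can be suppressed or changed by configuration.
    """
    if not server_header:
        return "unknown"
    value = server_header.lower()
    if value.startswith("microsoft-iis") or value.startswith("microsoft-httpapi"):
        return "http.sys"
    return "other"


def probe_port_80(host="localhost", timeout=5):
    """Return the Server header from a HEAD request to port 80, or None."""
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server")
    except (OSError, http.client.HTTPException):
        # Nothing listening, connection refused, or a malformed reply.
        return None
    finally:
        conn.close()
```

Running classify_server_header(probe_port_80()) on the failing server and getting "other" would suggest a non-http.sys server such as Apache or nginx holds port 80; "unknown" means nothing answered or the header was missing.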

Hi,

I am puzzled about the “process is unavailable” message - I can’t identify anything else that should be listening on port 80, but I can’t easily rule it out either. It’s definitely not switched off in Settings. Next time I do a refresh I will check for certify.exe in Task Manager. Is there any more detailed logging to check?

Re load-balancers - understood that this is tricky. Thanks!

If you look under C:\ProgramData\Certify\logs there should be an httpChallengeServer.log, which may have more information, although it won’t identify what the other process is (if there is one). I’m assuming the Certify background service is running under the default user (Local System), as other users would have trouble allocating a listener for port 80.

Hmm, don’t see anything real helpful in httpChallengeServer.log so far. Are timestamps in that file GMT or local time? And is it cumulative or just from the most recent execution?

thanks!

Martin

Hi, the challenge server logs are in local time and are per session (the challenge server spins up when there is validation to be done and stops after validation completes).

Typical content is:

4/26/2020 4:52:15 AM: Http Challenge Server Started: http://+:80/.well-known/acme-challenge/
4/26/2020 4:52:15 AM: Control Key: baef2cc2-fc90-4b7a-a44e-63501a507c11: Check Key: configcheck
4/26/2020 4:52:16 AM: Responded with Key: _x5iv6fepazk2apmb0cdidkbh4kooqawgxrg3fuvcbm Value:_X5Iv6FePaZk2aPMB0CDiDKbH4kOoQawgxrg3fuvcBM.14XTcPPRHADZYAPTqp1PSuE4aXiKZb7Iqa15iyDIHSE
4/26/2020 4:52:16 AM: Responded with Key: _x5iv6fepazk2apmb0cdidkbh4kooqawgxrg3fuvcbm Value:_X5Iv6FePaZk2aPMB0CDiDKbH4kOoQawgxrg3fuvcBM.14XTcPPRHADZYAPTqp1PSuE4aXiKZb7Iqa15iyDIHSE

If you have recent requests logged then validation is reaching the challenge server and therefore the challenge server is working OK.
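If you want to check this without reading the file by eye, a small Python sketch like the following could summarize the log. It assumes only the line formats shown in this thread ("Http Challenge Server Started", "Responded with Key:", "Stopping Server"); the function name is hypothetical and other Certify versions may log differently.

```python
def summarize_challenge_log(lines):
    """Summarize an httpChallengeServer.log session from its text lines.

    Relies only on the message fragments quoted in this thread.
    """
    return {
        # True if the listener reported starting up.
        "started": any("Http Challenge Server Started" in line for line in lines),
        # Number of challenge requests the server actually answered.
        "responses": sum(1 for line in lines if "Responded with Key:" in line),
        # True if the server shut itself down during this session.
        "stopped": any("Stopping Server" in line for line in lines),
    }
```

To use it, read the file into a list of lines (for example with open(path, encoding="utf-8") and readlines()) and pass that in. A "responses" count of zero means no validation request reached the challenge server during that session.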

A test is to run a request then immediately check http://<yourdomain>/.well-known/acme-challenge/configcheck in a browser. If you get a response then the challenge server is running OK and is serving http challenges.
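The same check can be scripted so it runs right after the request starts. This is a minimal Python sketch, assuming plain http on port 80 and the configcheck path mentioned above; the function names are made up, and what the response body contains is not something I'd rely on, so it only reports the status code and whatever text came back.

```python
import urllib.error
import urllib.request


def configcheck_url(domain):
    """Build the challenge server test URL for a domain."""
    return f"http://{domain}/.well-known/acme-challenge/configcheck"


def check_challenge_server(domain, timeout=10):
    """Return (status, body); status is None if nothing answered at all."""
    try:
        with urllib.request.urlopen(configcheck_url(domain), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code, ""
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, or timeout.
        return None, ""
```

A 200 status while a request is in progress suggests the challenge server is answering; a 404 more likely means IIS (or something else) handled the request instead.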

Assuming there is no stuck certify.exe running, port 80 is free for use (or is only used by IIS), and nothing else (such as another acme tool) has allocated the /.well-known/acme-challenge/ listener prefix, then it should all be working OK.

Note though, you will still have the issue of Let’s Encrypt attempting to validate over http - your load balanced servers will respond independently unless you can temporarily direct all /.well-known/ requests to the specific server. This is why DNS validation is currently better for load balanced sites.
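To see whether the load balanced servers would answer consistently, you could request the same challenge path from each backend directly and compare the results. The following Python sketch assumes each backend is reachable by its own address on port 80 and accepts the site's Host header; the function names, addresses, and token are placeholders for illustration only.

```python
import http.client


def fetch_from_backend(backend, domain, token, timeout=10):
    """Request a challenge path from one backend, bypassing the load balancer."""
    conn = http.client.HTTPConnection(backend, 80, timeout=timeout)
    try:
        # Send the site's host name so IIS picks the right binding.
        conn.request("GET", f"/.well-known/acme-challenge/{token}",
                     headers={"Host": domain})
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return None, ""
    finally:
        conn.close()


def responses_agree(results):
    """True only if every backend returned 200 with an identical body.

    `results` maps a backend name to its (status, body) tuple.
    """
    distinct = set(results.values())
    return len(distinct) == 1 and all(status == 200 for status, _ in distinct)
```

For example, collect results with {b: fetch_from_backend(b, "example.com", "some-token") for b in ["10.0.0.1", "10.0.0.2"]} and pass the dictionary to responses_agree. If it returns False, validation through the load balancer is likely to fail intermittently.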

In my case I got the first two lines but no “Responded with Key” lines; the entire content is

4/23/2020 10:06:52 PM: Control Key: 9a6a535a-25c6-431f-816c-22a9710640a6:    Check Key: configcheck
4/23/2020 10:07:02 PM: Checking for auto close.
4/23/2020 10:07:12 PM: Checking for auto close.
4/23/2020 10:07:22 PM: Checking for auto close.
4/23/2020 10:07:32 PM: Checking for auto close.
4/23/2020 10:07:42 PM: Checking for auto close.
4/23/2020 10:07:52 PM: Checking for auto close.
4/23/2020 10:07:52 PM: No requests recently, stopping server.
4/23/2020 10:07:52 PM: Stopping Server

Given that the log_<long_random_string>.log file contains

2020-04-23 22:06:52.226 -04:00 [INF] Http Challenge Server process unavailable.

but then proceeds to successfully validate


2020-04-23 22:06:58.518 -04:00 [INF] Domain validation completed: <my domain>

my interpretation is that the Challenge Server was not fully functional and so the CertifyTheWeb process fell back on using IIS, is that plausible?

thanks

Martin

Just to clarify - our DNS setup doesn’t allow for using DNS validation right now, so we are using the virtual directory and common share strategy.

thanks

Sounds plausible, although your log looks fine (it just looks like nothing is hitting the server). That said, I don’t understand the “process unavailable” message, so that needs more investigation.

For the purpose of load balanced validation, writing challenges to a common path would likely be better than using the challenge server anyway, as then either IIS server can respond with the http challenge. The challenge server can be switched off under Settings.