Re: s6 service 'really up' clarification from Olivier Brunel on 2015-01-13 (skaware)

From: Olivier Brunel <jjk_at_jjacky.com>
Date: Tue, 13 Jan 2015 01:35:30 +0100

On 01/13/15 00:38, Laurent Bercot wrote:
> On 12/01/2015 23:27, Patrick Mahoney wrote:
>
>> So, speculating, it seems that 's6-svwait -U' on a service that is
>> currently 'down' either waits forever, or returns when something sends a
>> 'U' event. But when the service is 'up' (but no U event ever happened),
>> 's6-svwait -U' returns immediately as if the service was 'really up'.
>
> Yes, and I agree that is counter-intuitive and problematic, but I
> haven't come up with a good solution yet.
>
> The issue is that s6-supervise stores the service state in supervise/status
> and has no notion of service readiness. For s6-supervise, either the service
> is up (pid nonzero) or it is down (pid zero), it cannot know whether the
> service is "really up" or not.
> And adding an understanding of service readiness to s6-supervise would:
> - make no sense: it's a process supervisor, no matter what the process does.
> I use it for cron-like jobs (do the job then sleep for a while), for
> instance, where there's no notion of service readiness.
> - be absolutely bloated and ugly, since there would have to be a feedback
> channel from the service to s6-supervise, and I really don't want to go there.

If I may... this is actually something I have been thinking about, and
was planning on suggesting: that supervise should be aware of that 'U'
state.

My idea was to make it aware of it by adding a new 'command' to its
control fifo: s6-notifywhenup would write 'U' there, and supervise would
then update its state and be the one emitting the event. (notifywhenup
would then take a servicedir rather than a fifodir.)

And for cases where one doesn't use notifywhenup, e.g. because there's
another way for the daemon to signal that it's ready, s6-svc could have
a -U option to do what notifywhenup does: tell supervise to update the
state & emit the event U.

That way s6-svwait or s6-svstat could also be aware of the U state
properly, since it would in fact be in the status file, so there's always
a proper/consistent state known to all s6-* tools.
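
A minimal sketch of the proposal above, written in Python purely for
illustration (s6 itself is written in C). The Supervisor class, its
method names and the "down"/"up"/"ready" strings are all hypothetical;
this is a toy model of the idea, not s6-supervise's code, its control
protocol, or its status file format.

```python
from dataclasses import dataclass, field


@dataclass
class Supervisor:
    """Toy model of the proposal; not the real s6-supervise."""
    pid: int = 0           # 0 means the service is down
    ready: bool = False    # the proposed extra piece of state
    events: list = field(default_factory=list)  # stands in for the event fifodir

    def service_started(self, pid):
        self.pid = pid
        self.ready = False
        self.events.append("u")

    def service_died(self):
        self.pid = 0
        self.ready = False  # the supervisor itself clears readiness on death
        self.events.append("d")

    def control(self, command):
        """Handle one command character written to the control fifo."""
        if command == "U" and self.pid != 0 and not self.ready:
            self.ready = True
            self.events.append("U")  # supervise is the one emitting the event

    def status(self):
        """What an s6-svstat-like tool would read back."""
        if self.pid == 0:
            return "down"
        return "ready" if self.ready else "up"


sv = Supervisor()
sv.service_started(1234)
print(sv.status())   # up
sv.control("U")      # what s6-notifywhenup or a hypothetical 's6-svc -U' would send
print(sv.status())   # ready
sv.service_died()
print(sv.status())   # down
print(sv.events)     # ['u', 'U', 'd']
```

The point of the sketch is only that a single process owns both the
state and the event, so readers of the state and listeners on the event
cannot disagree.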

This might not always be useful/used, but it's there when needed and it
feels to me like the correct way to have this information (as opposed to
having e.g. other tools maintain a "parallel" state on their own...).

> s6-svwait listens on the event/ fifodir then checks on supervise/status to
> know the initial state of the service before any notification arrives. But
> that initial state is only "down" or "up", as s6-supervise wrote it; it's
> never "ready". So for now, s6-svwait -U assumes that "up" means "ready".
> Which causes the behaviour you observed.
>
> To solve that, the up vs. ready state should be stored in another file, that
> s6-svwait -U should check. The logical place to do that operation is in
> s6-notifywhenup. But that raises a few other issues:
>
> - That would more or less enforce s6-notifywhenup as *the* readiness
> notification tool to use with s6-svwait. If daemons want to use another
> notification mechanism than the one supported by s6-notifywhenup, there will
> need to be another compatibility tool to convert notification mechanisms.
> This can become pretty ugly really fast.
>
> - That means s6-notifywhenup has to have write permissions to supervise/
> instead of simply read permissions. So, basically, s6-notifywhenup has to
> run as root. It's nothing complex, but it's still more root code that the
> admin has to trust.
>
> - s6-notifywhenup isn't around when the daemon dies, so it cannot update
> the new state file. Only s6-supervise is around, so it has to do the job.
> That is an ad-hoc hook that needs to be added to s6-supervise. Again,
> nothing complex, but by definition, ad-hoc is ugly.
>
> All this is probably still better than the current behaviour, so I may add
> it in a future release.
>
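
To make the 's6-svwait -U' behaviour described in the quoted text
concrete, here is a small sketch in Python (illustration only, s6 is
written in C). The function name wait_for_ready, its parameters and the
state strings are hypothetical, and where the real tool would keep
blocking, this model simply returns False when the event list runs out.

```python
def wait_for_ready(initial_state, events, up_means_ready):
    """Return True once the service counts as ready, else False.

    initial_state:  "down", "up" or "ready", read after subscribing to events
    events:         event characters received after that initial check
    up_means_ready: True models the behaviour reported in this thread
    """
    if initial_state == "ready":
        return True
    if initial_state == "up" and up_means_ready:
        # Returns at once, even though no U event was ever sent.
        return True
    for event in events:
        if event == "U":
            return True
    return False  # the real tool would still be waiting here


# Service already up, readiness never signalled:
print(wait_for_ready("up", [], up_means_ready=True))        # True
# Same situation once "up" and "ready" are stored separately:
print(wait_for_ready("up", [], up_means_ready=False))       # False
print(wait_for_ready("up", ["U"], up_means_ready=False))    # True
# Service down at first, then started and ready:
print(wait_for_ready("down", ["u", "U"], up_means_ready=True))  # True
```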
Received on Tue Jan 13 2015 - 00:35:30 UTC
