Re: Thoughts on "First Class Services" from Avery Payne on 2015-04-28 (supervision)

From: Avery Payne <avery.p.payne_at_gmail.com>
Date: Tue, 28 Apr 2015 11:28:42 -0700

On 4/28/2015 10:31 AM, Steve Litt wrote:
> Good! I was about to ask the definitions of parent and child, but the
> preceding makes it clear.

I'm taking it from the viewpoint that says "the service that the user
wishes to start is the parent of all other service dependencies that
must start".

> So what you're doing here is minimizing polling, right? Instead of
> saying "whoops, child not running yet, continue the runit loop", you
> actually start the child, the hope being that no service will ever be
> skipped and have to wait for the next iteration. Do I have that right?

Kinda. A failure of a single child still causes a run loop, but the
next time around, some of the children are already started. Starting an
already-running child returns success almost immediately, so the script
skips over those until it reaches the same problem child as last time.
The time lost is only on failed starts, and child starts typically
don't take that long. If they do, well, it's not the parent's fault...
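
To make the loop above concrete, here is a minimal sketch, not the
actual run.sh. It assumes each service keeps links to its dependencies
in a ./needs directory (an assumed layout) and that sv is invoked
through a hypothetical SV_CMD variable so it can be swapped out.

```shell
#!/bin/sh
# Sketch only: the ./needs layout and SV_CMD are assumptions for
# illustration, not taken from run.sh.
SV_CMD=${SV_CMD:-sv}

# start_children DIR: try to start every service linked under DIR/needs.
# Returns 0 if every child reports up, 1 at the first child that fails.
start_children() {
    svdir=$1
    for child in "$svdir"/needs/*; do
        # An empty or missing needs directory leaves the glob unexpanded.
        [ -e "$child" ] || continue
        # A child that is already running reports success quickly, so a
        # repeated pass only spends time on the child that failed before.
        if ! $SV_CMD start "$child" >/dev/null 2>&1; then
            echo "dependency $(basename "$child") not up yet" >&2
            return 1
        fi
    done
    return 0
}
```

A ./run could call "start_children . || exit 1" before exec'ing the
daemon. When ./run exits early, runsv runs it again, which is the run
loop described above.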

>
>> Here's the current version of run.sh, with dependency support baked
>> in:
>> https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default
>>
> That's a gnarly run script.

Yup. For the moment.

> If I'm not mistaken, everything inside the "if test
> $( cat ../.env/NEEDS_ENABLED ) -gt 0; then" block is boilerplate that
> could be put inside a shellscript callable from any ./run.

True, and that idea has merit.

> That would
> hack off 45 lines right there. I think you could do something similar
> with everything between lines 83 and 110. The person who is truly
> interested in the low level details could look at the called
> shellscripts (perhaps called with the dot operator). I'm thinking you
> could knock this ./run down to less than 35 lines of shellscript by
> putting boilerplate in shellscripts.

I've seen this done in other projects, and for the sake of simplicity
(and reducing subshell spawns) I've tried to avoid it. But that doesn't
mean I'm against the idea. Certainly, all of these are improvements
with merit, provided that they don't interfere with some of the other
project goals. If I can get the time to look at all of it, I'll
re-write it by segmenting out the various components.
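
As a rough sketch of what that segmenting might look like (none of this
exists in the project; HELPER_DIR, load_helper and the helper file
names are all hypothetical), the dot operator pulls a helper into the
current shell, so no subshell is spawned:

```shell
#!/bin/sh
# Sketch only: HELPER_DIR and the helper file names are hypothetical.
HELPER_DIR=${HELPER_DIR:-../.run/lib}

# load_helper NAME: source HELPER_DIR/NAME.sh into the current shell.
# The dot operator does not fork, and any functions the helper defines
# stay available to the rest of the calling script.
load_helper() {
    helper="$HELPER_DIR/$1.sh"
    if [ -r "$helper" ]; then
        . "$helper"
    else
        echo "missing helper: $helper" >&2
        return 1
    fi
}
```

A trimmed ./run would then be a handful of "load_helper something ||
exit 1" lines followed by the exec of the daemon, with the details
living in the helper files for anyone who wants to read them.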

In fact, you may have given me an idea to solve an existing problem I'm
having with certain daemons...

>
> You're doing more of a recursive start. No doubt, when there are two or
> three levels of dependency and services take a non-trivial amount of
> time to start (seconds), yours results in the quicker boot. But for
> typical stuff, I'd imagine the old "wait til next time if your ducks
> aren't in line" will be almost as fast, will be conceptually
> simpler, and more codeable by the end user. Not because your method is
> any harder, but because you're applying it against a program whose
> native behavior is "wait til next cycle".

Actually, I was looking for the lowest-cost solution to "how do I keep
track of dependency trees between multiple services". The result was a
self-organizing set of data and scripts. I don't manage *anything*
beyond "service A must have service B". It doesn't matter how deep that
dependency tree goes, or even if there are common "leaf nodes" at the
end of the tree, because it self-organizes. This reduces my cognitive
workload; as the project grows to hundreds of scripts, the number of
possible combinations reaches a point where it would be unmanageable
otherwise. Using this approach means I don't care how many there are, I
only care about what is needed for a specific service.
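
To illustrate why only direct needs have to be written down, here is a
small sketch. It is not how run.sh works (there, each service simply
starts its own children). It assumes a hypothetical layout of
$SVROOT/<service>/needs/<dep> and walks it to show the order that falls
out, with shared leaf nodes visited once.

```shell
#!/bin/sh
# Sketch only: SVROOT and the needs/ layout are assumptions.
# Service names containing spaces are not handled.
SVROOT=${SVROOT:-/etc/sv}

# Names already visited, space separated.
seen=""

# resolve NAME: print NAME's dependencies depth-first, then NAME itself.
# Each service only declares its direct needs; deeper levels are found
# by recursion, and a name already in "seen" is skipped.
resolve() {
    case " $seen " in
        *" $1 "*) return 0 ;;
    esac
    seen="$seen $1"
    for dep in "$SVROOT/$1"/needs/*; do
        # An empty or missing needs directory leaves the glob unexpanded.
        [ -e "$dep" ] || continue
        resolve "$(basename "$dep")"
    done
    echo "$1"
}
```

With service a needing b and c, and b needing c, "resolve a" lists c
first, then b, then a, even though nobody ever wrote that order down.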

> And, as you said in a past email, having a run-once capability without
> insane kludges would be nice, and as you said in another past email,
> it's not enough to test for the child service to be "up" according to
> runit, but it must pass a test to indicate the process itself is
> functional. I've been doing that ever since you mentioned it.

At some point I have to go back and start writing ./check scripts. :(
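
For reference, a ./check can be very small. sv runs ./check, if it
exists in the service directory, and treats exit status 0 as "up and
available". The sketch below assumes the daemon creates some file or
socket once it is ready; READY_PATH and its default value are
hypothetical.

```shell
#!/bin/sh
# Sketch only: READY_PATH is a hypothetical marker (file or socket)
# that the daemon is assumed to create once it can serve requests.
READY_PATH=${READY_PATH:-/var/run/mydaemon.sock}

# service_ready: succeed only if the readiness marker exists.
service_ready() {
    [ -e "$READY_PATH" ]
}
```

Ending the ./check file with a bare service_ready call makes that
function's status the script's exit status. A real check would more
likely poke the daemon itself, which depends on the daemon.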
Received on Tue Apr 28 2015 - 18:28:42 UTC
