Re: Linuxisms in s6

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Thu, 25 Aug 2016 13:30:44 +0200

(Replying to Jan here, but the "you" in this message is generic - if
anything, it's directed more at Adrian.)

On 25/08/2016 11:56, Jan Bramkamp wrote:
> The skalibs library used by s6 to calculate the deadlines
> should use clock_gettime(CLOCK_MONOTONIC) on FreeBSD and
> as such shouldn't be affected by changes to the wall clock.

  That's completely orthogonal to the OS (in other words, not a Linuxism).
Also, whether or not to use CLOCK_MONOTONIC is a difficult question, and
there are pros and cons for both answers. Not using CLOCK_MONOTONIC in the
general case is not an oversight; it's a carefully weighed decision, and
I'm aware it's not perfect - but using it would not be perfect either.

  The main argument in favor of CLOCK_REALTIME is that skalibs uses TAI
internally, so as long as the system clock is reasonably accurate, time
computations are right - and CLOCK_REALTIME maintains a view of the
current time that is consistent across the system, whereas CLOCK_MONOTONIC
is completely isolated. If the system clock is flailing around so much
that it has a noticeable impact on skarnet.org programs' operation, then
you likely have bigger problems.
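
  As an illustration of the two clocks discussed above, here is a minimal
sketch of reading either one through the POSIX clock_gettime() call. It is
not code from skalibs; the helper name clock_ns is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not part of skalibs: read a POSIX clock and
   store its value in nanoseconds. Returns 0 on success, -1 on failure. */
int clock_ns(clockid_t id, int64_t *out)
{
  struct timespec ts;
  if (clock_gettime(id, &ts) < 0) return -1;
  *out = (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
  return 0;
}
```

  clock_ns(CLOCK_REALTIME, &t) yields nanoseconds since the Unix Epoch and
follows every adjustment of the system clock. clock_ns(CLOCK_MONOTONIC, &t)
yields nanoseconds since an unspecified starting point and never goes
backwards, so the two values cannot be compared with each other.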

  The obvious exception is when you boot without a battery and have to
first set the time (most likely via NTP). In that case, yes, there will
be a big time jump. But the thing is: when the system clock does a
*forward* time jump, which is the case when you boot at the Unix Epoch and
get the correct time later, it should not affect programs too much. After
the time jump, previously computed deadlines expire, so programs in an
event loop get a timeout; most of the time, it's harmless - they just
wake up once needlessly.

  *Backwards* time jumps are much more of a problem, because they can keep
programs from timing out when you want them to, and I have run into trouble
because of that. But it was a bug in a script; there's no reason, ever, to
have a backwards time jump of more than one second - if your system clock
boots with an undefined value, you should set it to a fixed, arbitrary
value in the past early on in your init scripts, and that will avoid any
significant backwards time jumps later on.
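
  The effect of both kinds of jump on a previously computed deadline can be
sketched as follows. This is a simplified model, not skalibs code: the name
ms_until is hypothetical, and times are plain millisecond counts read from a
single clock.

```c
#include <stdint.h>

/* Hypothetical helper: how long an event loop should wait before an
   absolute deadline, given the current reading of the same clock.
   Both arguments are in milliseconds. Returns 0 once the deadline
   has been reached or passed. */
int64_t ms_until(int64_t deadline_ms, int64_t now_ms)
{
  return deadline_ms > now_ms ? deadline_ms - now_ms : 0;
}
```

  With a deadline set 5 seconds ahead, a forward jump of the clock past the
deadline makes ms_until() return 0, so the event loop times out at once. If
the clock instead jumps back by one hour, the same deadline is suddenly
1 hour and 5 seconds away, and the timeout does not fire when it was meant to.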

  There is a case where forward time jumps are a problem: if you are in an
init script that reports failure in case of a timeout, and the system clock
gets updated at that point, the init script may incorrectly report failure.
That can happen with some invocations of s6-rc, for instance. I am still
gathering data on the kind of things that can fail, in order to come up
with the best fix; so if you have precise examples where that initial
forward time jump is a real issue, please let me know.

  About the number of processes s6 uses: yes, it uses a lot of processes,
and that has never been a problem for anyone. Processes are not a scarce
resource, and are a fundamental part of the Unix API with very useful
isolation properties, so I use them liberally. I have run s6, with a lot of
services, on a 32 MB machine for years, and have never run out of memory.
I haven't tried with 16 MB, but chances are s6 still isn't the limiting
factor if you run an s6-based system on a 16 MB machine.

  The s6-supervise and s6-log programs, which are the ones you're likely to
find in abundance, are carefully designed to use as little private memory
as they can: an instance of s6-supervise should not be using more than
3 pages of private memory. That's 12 KB (at 4 KB per page), unshared, per
s6-supervise process, plus whatever overhead is caused by your libc and
kernel. s6-log is a bit hungrier, about 6 or 7 pages tops in most cases,
because it's more complex and has to use heap memory to store log lines
and compiled regular expressions. That's still in the realm of "definitely
not the process you can accuse of using excessive memory".
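
  The arithmetic behind those figures can be checked with a small sketch.
The helper names are hypothetical, and the page counts are the estimates
given above, not measurements; the real page size comes from the POSIX
sysconf() call.

```c
#include <unistd.h>

/* Hypothetical helper: bytes of private memory used by `nprocs`
   processes that each hold `pages_each` private pages of
   `page_size` bytes. */
long private_bytes(long nprocs, long pages_each, long page_size)
{
  return nprocs * pages_each * page_size;
}

/* Page size of the running system in bytes, or -1 on error. */
long system_page_size(void)
{
  return sysconf(_SC_PAGESIZE);
}
```

  With 4096-byte pages, private_bytes(1, 3, 4096) is 12288 bytes for one
s6-supervise instance, and private_bytes(50, 3, 4096) is 614400 bytes
(600 KB) for fifty of them. Pass system_page_size() as the last argument
to use the page size of the machine at hand.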

  If your system causes so much overhead per process that forking 50
processes for s6 (if you have that many services!) leaves a noticeable
dent in your available memory, then you should look at your system's
kernel and libc, which are clearly not lightweight enough to implement
a fully usable Unix API on resource-constrained devices.

-- 
  Laurent
Received on Thu Aug 25 2016 - 11:30:44 UTC
