Use POSIX::RT::Timer

A possible solution to the Problem with the clock.

POE roughly has 3 classes of events: due, delay and alarm events. Due events are to be delivered now, delay events after a number of seconds, and alarm events at a specific time. Due events can be considered as delay events with the delay set to 0 or less. POE currently converts due and delay events into alarm events by adding time(2) to their delay to get a delivery time.

A correct solution must deliver all 3 classes of events, in the right order and at the right time or after the right delay, with minimal or zero change to the loop API and as little computational overhead as possible. The solution must also fall back to the current behaviour if some necessary mechanism is absent.

There are fewer implementations of POE::Queue than there are of POE::Loop, so changing the API of the former is more attractive than changing the API of the latter.

Rocco has attempted to have 2 queues; one for due+delay, one for alarms. This attempt failed for reasons that he has forgotten. So we will stick with one queue for all events.

The POE queue must be able to distinguish the alarm events from due and delay events. This will allow POE to recalculate the delivery time if the clock is skewed. Delay events must keep the monotonic clock time when posted, and the delay from that time. A delay event's due time is calculated by converting the posted monotonic clock time to system clock time and adding the delay.

    event  posted  posted          due                 due time
    class   clock   mono   delta   time                changes on

    Due             XX     YY<=0   mono2real(XX) + YY  clock skew
    Delay           XX     YY>0    mono2real(XX) + YY  skew, alarm_adjust
    Alarm   XX             YY      XX + YY             alarm_adjust
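The due-time rules above can be sketched in a few lines of Perl. This is only an illustration, not POE code: the event hash layout (`class`, `posted`, `delta`) and the optional `$offset` argument are assumptions made for the example, and `mono2real` is implemented here as "add the current realtime minus monotonic offset", which is one plausible reading of the table. It relies on Time::HiRes exporting `clock_gettime` and `CLOCK_MONOTONIC`, which is not the case on every platform.

```perl
use strict;
use warnings;
use Time::HiRes qw(clock_gettime CLOCK_REALTIME CLOCK_MONOTONIC);

# Current difference between the realtime and monotonic clocks.
sub clock_offset {
    return clock_gettime(CLOCK_REALTIME) - clock_gettime(CLOCK_MONOTONIC);
}

# Convert a monotonic timestamp to the realtime scale.
# $offset is optional; passing it makes the result reproducible.
sub mono2real {
    my ($mono, $offset) = @_;
    $offset = clock_offset() unless defined $offset;
    return $mono + $offset;
}

# Hypothetical event layout: { class => ..., posted => XX, delta => YY }.
# Alarms store a realtime posting time; due and delay events store a
# monotonic one and are converted each time the due time is needed.
sub due_time {
    my ($event, $offset) = @_;
    return $event->{posted} + $event->{delta} if $event->{class} eq 'alarm';
    return mono2real($event->{posted}, $offset) + $event->{delta};
}

my $delay = { class => 'delay', posted => 100,  delta => 5 };
my $alarm = { class => 'alarm', posted => 1100, delta => 5 };

# With an offset of 1000 both events are due at 1105. If the system clock
# jumps forward by 60 seconds the offset becomes 1060: the delay event's
# due time moves with it, the alarm's does not.
printf "delay: %d -> %d\n", due_time($delay, 1000), due_time($delay, 1060);
printf "alarm: %d -> %d\n", due_time($alarm, 1000), due_time($alarm, 1060);
```

Keeping the posted monotonic time and the delta, rather than a precomputed due time, is what lets the queue recompute delivery times after a skew.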

POE must also implement 2 timers; one for the next alarm, one for the next delay or due.

Modern operating systems expose a monotonic clock as well as a realtime clock. On Linux the monotonic clock is the number of seconds since the computer booted. Perl has access to these clocks via [POSIX::RT::Clock] and [POSIX::RT::Timer].

At their hearts, most POE::Loop implementations eventually call select(2), and the timer is implemented as select's timeout. But the POE::Loop API specifies that the timer is set as an absolute time, not a delay calculated by the POE kernel. The POE::Loop converts this time to a delay by deducting time(2). While this might look like a timer for the alarm events, it will fail if a negative clock skew happens after time(2) is called. In essence this means that this portion of the loop API is broken and should not be used; converting from time(2) to a delay is inherently flawed.
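The arithmetic behind this failure can be shown without a real event loop. The sketch below is a simplified model, not code from any POE::Loop: `timeout_from_wallclock` is a hypothetical helper, and the wall clock readings are made-up numbers passed in by hand instead of coming from time(2).

```perl
use strict;
use warnings;

# Model of what a loop does: turn an absolute due time into a select(2)
# timeout by subtracting the current wall clock time.
sub timeout_from_wallclock {
    my ($due_time, $wallclock_now) = @_;
    my $timeout = $due_time - $wallclock_now;
    return $timeout < 0 ? 0 : $timeout;
}

# A 5 second delay posted when the wall clock reads 1000.
my $due_time = 1000 + 5;

# First pass: the loop sleeps for the expected 5 seconds.
my $first_timeout = timeout_from_wallclock($due_time, 1000);

# The wall clock is then stepped back by 3600 seconds. After the 5 second
# sleep it reads 1005 - 3600, so the event still looks far in the future.
my $wallclock_after_skew = 1005 - 3600;
my $second_timeout = timeout_from_wallclock($due_time, $wallclock_after_skew);

print "first timeout:  $first_timeout\n";
print "second timeout: $second_timeout\n";
```

In this model a 5 second delay turns into a wait of roughly an hour, which is one way the flaw can show up for delay events.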

The POE kernel will have 2 POSIX::RT::Timer objects: one on the realtime clock for alarm events, one on the monotonic clock for the other events. These objects send signals. sigaction(2) will be used to register signal handlers. The handlers will use the signal pipe to talk to the kernel. The signals will be intercepted in _data_sig_pipe_read, which will first check for clock skew (see below), recalculate delivery times if needed and then call _data_ev_dispatch_due.
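The signal-to-pipe path can be sketched with core modules only. This is a stand-in, not the proposed implementation: SIGUSR1 replaces the real-time signal, `kill` replaces the expiry of a POSIX::RT::Timer, and the one-byte tags and the `read_timer_pipe` helper are hypothetical names invented for the example.

```perl
use strict;
use warnings;
use POSIX qw(SIGUSR1);

# Self-pipe: the handler writes one byte, the main loop reads it later.
pipe(my $pipe_read, my $pipe_write) or die "pipe failed: $!";

# Hypothetical one-byte tags identifying which timer fired.
my %TAG = (alarm => 'A', delay => 'D');

# Register the handler with sigaction(2). safe(1) asks Perl to defer the
# handler to a safe point instead of running it inside the C handler.
my $action = POSIX::SigAction->new(
    sub { syswrite($pipe_write, $TAG{delay}) },
);
$action->safe(1);
POSIX::sigaction(SIGUSR1, $action) or die "sigaction failed: $!";

# Wait up to $timeout seconds for the pipe, then return the names of the
# timers whose tags were read. A real kernel would check for clock skew
# here before dispatching due events.
sub read_timer_pipe {
    my ($timeout) = @_;
    my $rin = '';
    vec($rin, fileno($pipe_read), 1) = 1;
    my $rout = $rin;
    my $ready = select($rout, undef, undef, $timeout);
    return () unless defined $ready && $ready > 0;
    sysread($pipe_read, my $buffer, 64) or return ();
    my %name_for = reverse %TAG;
    return map { $name_for{$_} } split //, $buffer;
}

# Simulate the delay timer expiring, then drain the pipe.
kill 'USR1', $$;
my @fired = read_timer_pipe(1);
print "timers fired: @fired\n";
```

The handler does as little as possible; everything that touches the queue happens after the pipe read, in normal program context.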

A special case arises when events are already due but we still want to check the file handles. Because setting a POSIX::RT::Timer interval to 0 deactivates it, we need to either set the next interval to POSIX::RT::Timer->get_resolution or "fake out" the loops by getting them to time out in 0 seconds.

Clock skew is detected by comparing the realtime clock to the monotonic clock. If the difference between the two clocks has changed from its previous value by more than a maximum epsilon, then the system clock has been changed.
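A minimal sketch of this check follows. The function name `make_skew_detector` and its closure-based shape are assumptions for illustration; the clock readings are passed in as plain numbers so the example does not depend on any clock module. The 0.25 second threshold is the SKEW_EPSILON default proposed in this document.

```perl
use strict;
use warnings;

# Returns a closure that is fed (realtime, monotonic) readings and returns
# the detected skew in seconds, or 0 if the change is within the epsilon.
sub make_skew_detector {
    my ($max_epsilon) = @_;
    my $previous_offset;    # last observed realtime - monotonic

    return sub {
        my ($realtime, $monotonic) = @_;
        my $offset = $realtime - $monotonic;
        my $change = defined $previous_offset ? $offset - $previous_offset : 0;
        $previous_offset = $offset;
        return abs($change) > $max_epsilon ? $change : 0;
    };
}

my $detect = make_skew_detector(0.25);

my $first  = $detect->(1001.0, 101);   # first reading, nothing to compare
my $second = $detect->(1002.1, 102);   # offset moved by 0.1s: within epsilon
my $third  = $detect->(1063.1, 103);   # offset moved by 60s: clock was changed

printf "first=%.1f second=%.1f third=%.1f\n", $first, $second, $third;
```

Because the previous offset is updated on every reading, slow drift below the epsilon is tolerated, while a single jump larger than the epsilon is reported.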

Finally, it is a common idiom to implement a clock tick by successive calls to Kernel->alarm() with a delay on a fixed starting time (http://poe.perl.org/?POE_Cookbook/Recurring_Alarms). In effect this converts a delay event into an alarm event, which is wrong. POE::Kernel should implement a tick API.

    my $tid = POE::Kernel->timer_set( $event, $base, $repeat_interval, @ARGS );
    POE::Kernel->timer_adjust( $tid, $repeat_interval );
    POE::Kernel->alarm_remove( $tid );

Additionally, a 'clock-skew' signal should be posted by the POE kernel when skew is detected. It would have the skew, in seconds, as ARG0.

Signals

Modern OSes have 32 signals between SIGRTMIN and SIGRTMAX. %SIG supports them as NUM32-NUM63, and $Config::Config{sig_num} and $Config::Config{sig_name} expose them by their proper names.
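Looking these signals up can be sketched with the core Config module. The `%signo` hash and the fallback message are illustrative choices; whether RTMIN and RTMAX appear at all depends on the platform and on how perl was built.

```perl
use strict;
use warnings;
use Config;

# Build a name => number map from perl's build-time signal tables.
my @names   = split ' ', $Config{sig_name};
my @numbers = split ' ', $Config{sig_num};
my %signo;
@signo{@names} = @numbers;

# Report the real-time signal range if this perl knows about it.
for my $name (qw(RTMIN RTMAX)) {
    if (defined $signo{$name}) {
        print "SIG$name is signal $signo{$name}\n";
    }
    else {
        print "SIG$name is not available on this platform\n";
    }
}

# Candidate timer signals, following the SIGRTMAX and SIGRTMAX-1 defaults
# proposed in this section; undef if there are no real-time signals.
my $alarm_sig = $signo{RTMAX};
my $delay_sig = defined $signo{RTMAX} ? $signo{RTMAX} - 1 : undef;
```

A missing RTMAX entry is a natural trigger for falling back to the current behaviour.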

POE::Kernel::USE_POSIXRT() and $ENV{POE_USE_POSIXRT} will control this new behaviour. Setting it to 1 will turn the new behaviour on (the default). Setting it to 0 will turn the new behaviour off.

POE::Kernel::ALARM_SIG() and $ENV{POE_ALARM_SIG} will be the signal number for the alarm timer (default SIGRTMAX).

POE::Kernel::DELAY_SIG() and $ENV{POE_DELAY_SIG} will be the signal number for the delay timer (default SIGRTMAX-1).

POE::Kernel::SKEW_EPSILON() and $ENV{POE_SKEW_EPSILON} will be the epsilon, in fractions of a second, that is allowed before POE declares clock skew and recalculates the queue. Defaults to 0.25.
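How these settings might be resolved at startup can be sketched as follows. The `setting` helper is hypothetical, only the environment variables are modelled (not the POE::Kernel constants), and the fallback rule "turn the feature off if POSIX::RT::Timer cannot be loaded" is an assumption drawn from the requirement to fall back to the current behaviour.

```perl
use strict;
use warnings;

# Read POE_<NAME> from the environment, falling back to a default.
sub setting {
    my ($name, $default) = @_;
    my $value = $ENV{"POE_$name"};
    return defined $value ? $value : $default;
}

# The module may be missing; require inside eval lets us find out safely.
my $have_timer_module = eval { require POSIX::RT::Timer; 1 } ? 1 : 0;

# On by default, but only usable if the module actually loaded.
my $use_posixrt  = (setting(USE_POSIXRT => 1) && $have_timer_module) ? 1 : 0;
my $skew_epsilon = setting(SKEW_EPSILON => 0.25);

print "USE_POSIXRT=$use_posixrt SKEW_EPSILON=$skew_epsilon\n";
```

Running with `POE_USE_POSIXRT=0` in the environment would switch the sketch to the old behaviour regardless of whether the module is installed.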

Problems

Unfortunately, this is a much longer code path than the current implementation: signal, pipe write, pipe read, 2 syscalls + math for the clock skew. If we used a separate timer pipe, we could move the pipe read until after _data_ev_dispatch_due. We might also be able to get away with only dispatching alarm events when the alarm timer is up. We could also potentially use only one signal for both timers.

Mac OS X does not currently support POSIX::RT::Clock.

Mac OS X has neither RTMIN in $Config{sig_name} nor $SIG{NUM32}.

Cygwin perl does not support POSIX::RT::Clock. This seems to only be a problem with timer_getoverrun() though; maybe it could be patched.

Cygwin perl does not have $SIG{NUM32} but does have RTMIN in $Config{sig_name} and $SIG{RTMIN}.

Win32 will have to use [SetTimer()]. This seems to only be available via [Win32::GUI::Timer].