package agent

import (
	"context"
	"time"

	"gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/module/modagent"
	"go.uber.org/zap"
)

// worker coordinates and executes full and partial syncs
// by managing the lifecycle of a reconciler.
type worker struct {
	log                 *zap.Logger
	api                 modagent.Api
	fullSyncInterval    time.Duration
	partialSyncInterval time.Duration
	reconcilerFactory   func(ctx context.Context) (remoteDevReconciler, error)
}

// Run starts a full sync immediately and then runs full and partial syncs on
// their respective intervals. It returns nil when ctx is cancelled, and an
// error if the agent ID or a new reconciler cannot be obtained.
func (w *worker) Run(ctx context.Context) error {
	agentId, err := w.api.GetAgentId(ctx)
	if err != nil {
		return err
	}

	// The full sync must start immediately upon module start/restart,
	// hence the zero duration.
	fullSyncTimer := time.NewTimer(0)
	defer fullSyncTimer.Stop()

	partialSyncTimer := time.NewTimer(w.partialSyncInterval)
	defer partialSyncTimer.Stop()

	var activeReconciler remoteDevReconciler
	defer func() {
		// The nil check covers the case where the context is cancelled before the
		// first full sync has been scheduled. The active reconciler is then still
		// nil and the call to Stop must be skipped.
		if activeReconciler != nil {
			activeReconciler.Stop()
		}
	}()

	done := ctx.Done()
	for {
		select {
		// Exit immediately if the context is cancelled while waiting
		// on either of the timers.
		case <-done:
			return nil
		case <-fullSyncTimer.C:
			// A full sync is implemented by creating a new reconciler
			// and discarding the state accrued in the previous one.
			w.log.Info("starting full sync")

			if activeReconciler != nil {
				activeReconciler.Stop()
			}

			// Full sync could alternatively have re-used the existing reconciler instead
			// of destroying it and creating a new one. The current approach was chosen
			// for the following reasons:
			// 1. Stopping and re-creating a reconciler is conceptually equivalent to a
			//    module restart and needs minimal changes to the reconciliation code.
			//    Re-using the reconciler would have required special handling in that
			//    code, so any future change to the reconciliation logic could affect both
			//    full and partial sync, increasing complexity and maintenance cost.
			// 2. Reconciler re-use cannot recover from problems caused by corrupted or
			//    mishandled internal state, for example memory leaks due to bugs in the
			//    reconciliation logic. The current approach handles these because it
			//    always tears down the active reconciler and creates a new one.
			activeReconciler, err = w.reconcilerFactory(ctx)
			if err != nil {
				return err
			}

			execError := activeReconciler.Run(ctx)
			if execError != nil {
				w.api.HandleProcessingError(
					ctx, w.log, agentId,
					"Remote Dev - full sync cycle ended with error", execError,
				)
			}

			// The timer is reset only after the work has completed. If it were reset
			// before reconciliation, a slow Run() could cause the next tick to fire
			// immediately after the reconciler finishes.
			fullSyncTimer.Reset(w.fullSyncInterval)
		case <-partialSyncTimer.C:
			// Guard against a partial sync being selected before the first full sync
			// has created a reconciler. This can happen when both timers are ready at
			// the same time (for example with a zero partialSyncInterval), because
			// select then picks one of the ready cases at random.
			if activeReconciler != nil {
				w.log.Info("starting partial update")

				execError := activeReconciler.Run(ctx)
				if execError != nil {
					w.api.HandleProcessingError(
						ctx, w.log, agentId,
						"Remote Dev - partial sync cycle ended with error", execError,
					)
				}
			}

			// The timer is reset only after the work has completed. If it were reset
			// before reconciliation, a slow Run() could cause the next tick to fire
			// immediately after the reconciler finishes.
			partialSyncTimer.Reset(w.partialSyncInterval)
		}
	}
}