Summary
Three findings in internal/controller/vm/vm.go discovered during code review of #2634. None are introduced by that PR; all are pre-existing in upstream/main.
1. Data race in Stats() on c.vmmemProcess (most important)
Location: internal/controller/vm/vm.go:464-471
c.mu.RLock()
defer c.mu.RUnlock()
...
// Initialization of vmmemProcess to calculate stats properly for VA-backed UVMs.
if c.vmmemProcess == 0 {
vmmemHandle, err := vmutils.LookupVMMEM(ctx, c.uvm.RuntimeID(), &iwin.WinAPI{})
if err != nil {
return nil, fmt.Errorf("cannot get stats: %w", err)
}
c.vmmemProcess = vmmemHandle // ← write under RLock
}
Stats() writes c.vmmemProcess while holding only c.mu.RLock(). The same field is mutated under exclusive c.mu.Lock() in TerminateVM (vm.go:541) and read under exclusive lock in other paths.
Stats() is called by the shim's SandboxMetrics RPC, which containerd polls on a timer for every running pod. Two concurrent metric scrapes on the same Controller can both:
- acquire the read lock (allowed concurrently),
- observe vmmemProcess == 0,
- call LookupVMMEM (which succeeds and returns a handle), and
- assign to c.vmmemProcess — race + handle leak (one of the two opened handles is overwritten without being closed; the other is "lucky" to be the kept value).
Reproduction: add a parallel-Stats test under -race and watch the data-race detector fire.
Fix sketch (any one of these):
- Take c.mu.Lock() (exclusive) for the duration of Stats(), OR
- Initialize vmmemProcess eagerly during StartVM (under the exclusive lock already held there), OR
- Use a sync.Once per Controller for the lookup.
2. windows.Close(0) when vmmemProcess was never initialized
Location: internal/controller/vm/vm.go:541
// Best effort attempt to clean up the open vmmem handle.
_ = windows.Close(c.vmmemProcess)
If Stats() was never called (e.g., a sandbox that was Created and Terminated without ever transitioning to Running long enough to be scraped), c.vmmemProcess is 0. windows.Close(0) returns ERROR_INVALID_HANDLE. The error is swallowed by _ = so there is no functional damage, but it is technically wrong and produces ETW noise.
Fix: add a nil-handle guard:
if c.vmmemProcess != 0 {
_ = windows.Close(c.vmmemProcess)
}
3. waitForVMExit overwrites StateInvalid to StateTerminated
Location: internal/controller/vm/vm.go:328-333
c.mu.Lock()
if c.vmState != StateTerminated {
c.vmState = StateTerminated
}
c.mu.Unlock()
Behaviour question rather than a clear bug: the guard if c.vmState != StateTerminated overwrites StateInvalid to StateTerminated. StateInvalid is set by TerminateVM when uvm.Close fails, indicating a leaked HCS handle. If the background waitForVMExit then runs and sees the VM has actually exited, it currently masks the Invalid signal.
The state-transition table in state.go documents StateInvalid → StateTerminated only via "TerminateVM called". The current code adds a second path through waitForVMExit.
Possible interpretations:
- Intentional: once the VM is actually gone, Terminated is the truthful state regardless of how the controller arrived there. StateInvalid was about the leaked handle, which windows.Close in TerminateVM already attempted.
- Unintentional: the recovery contract assumes StateInvalid persists until a successful TerminateVM retry. If waitForVMExit sneaks in first and clears it, callers cannot distinguish "cleanly terminated" from "terminated with a leaked handle".
A regression guard test pinning the current behaviour was added in #2634 (TestWaitForVMExit_OverwritesInvalidToTerminated); that test should be updated if the intent is changed.
Suggested fix (if unintentional): narrow the guard to if c.vmState == StateRunning.
Out of scope