Skip to content

Profiling

When a tick takes longer than expected, you need to know where the time went. golem/debug provides opt-in helpers for pprof, tick wall-time logs, pprof labels, and runtime trace regions. The core server still stays free of debug imports unless your binary explicitly imports this package.

Before you can collect CPU profiles or heap snapshots, you need Go’s pprof HTTP endpoints to be accessible. Import golem/debug in your main and call StartPprof behind whatever flag or environment check your project uses:

import "golem-engine/golem/debug"
// Call this from main before srv.Run, gated on a flag or env variable of your choice.
if err := debug.StartPprof("127.0.0.1:6060"); err != nil {
log.Fatal(err)
}

The debug server binds to its own port and uses a dedicated HTTP mux so it never touches your game’s WebSocket handler. Because golem/debug is a separate package, importing it is what registers the pprof handlers — if nothing imports it, the handlers don’t exist in the binary at all.

Keep this import out of production builds. The pprof endpoints expose unauthenticated access to live memory dumps, goroutine stacks, and CPU profiles, and a 30-second profile request adds measurable overhead to a live server.

The fastest way to check whether your ticks are running on time is the slow-tick logger. Attach it before Run:

debug.AttachSlowTickLogger(srv, debug.SlowTickLoggerOptions{
Budget: 40 * time.Millisecond, // e.g. budget for a 20-tick/s server
})

The logged wall time covers everything inside a tick: entity updates, your OnTick callback, the collision step, and the broadcast flush. You can still use OnTickEnd directly when you want to feed the values to a metrics backend.

This is the right first step — if wall looks healthy, you have no problem to solve.

If wall time is high and you need to know which code is responsible, take a 30-second CPU profile from the pprof server you started in step 1:

go tool pprof http://127.0.0.1:6060/debug/pprof/profile?seconds=30

This gives you a flamegraph of every function that consumed CPU time during those 30 seconds. For most games this is enough to find a hot path.

Once you have a profile, you can narrow it to specific ticks using pprof labels. Attach the tick profiler before Run:

debug.AttachTickProfiler(srv, debug.TickProfilerOptions{
Context: context.Background(),
})

Because OnTickStart, all game logic, and OnTickEnd run on the same goroutine, every CPU sample taken during the tick carries the label. Filter in the interactive pprof UI or on the command line:

go tool pprof -tags cpu.prof

Work dispatched to other goroutines — for example, a physics backend that runs its step in parallel — won’t carry these labels. Those goroutines need their own pprof.SetGoroutineLabels calls if you want them labeled.

Step 4 — capture an execution trace (optional)

Section titled “Step 4 — capture an execution trace (optional)”

A CPU profile tells you which functions are hot. A runtime/trace execution trace tells you when each tick ran, how long it took, and where goroutines blocked. It is the right tool when you suspect scheduling jitter or contention rather than CPU burn.

Start the trace capture once, before calling Run:

import "runtime/trace"
f, err := os.Create("trace.out")
if err != nil {
log.Fatal(err)
}
trace.Start(f)
defer trace.Stop()
srv.Run(ctx)

Then attach trace regions so the trace viewer shows per-tick boundaries:

debug.AttachTickProfiler(srv, debug.TickProfilerOptions{
Context: context.Background(),
Trace: true,
})

Open the result with:

go tool trace trace.out

Both hooks run on the tick goroutine, so the region spans all work between them: entity updates, your OnTick, the collision step, and the broadcast flush. You can open narrower nested regions inside OnTick to isolate individual subsystems like pathfinding or AI.

Goroutines spawned by a physics backend appear as separate goroutines in the trace rather than nested inside the tick region. The trace view still shows their contribution on the timeline; they just won’t be visually contained inside the tick region.

srv.Tick() returns the current 1-based tick counter. It is incremented before OnTickStart fires and is valid to read inside any of the tick hooks or inside OnTick:

srv.OnTick(func(dt float64, s *golem.Server) {
log.Printf("[tick %d] entities: %d", s.Tick(), s.Len())
})

It is also useful as a lightweight sequence number when logging events or recording replays.


See also: Game loop for the full tick phase order.