Skip to content

Add token refresh endpoints to agent server#2018

Open
ryans-posthog wants to merge 3 commits intomainfrom
posthog-code/agent-token-refresh
Open

Add token refresh endpoints to agent server#2018
ryans-posthog wants to merge 3 commits intomainfrom
posthog-code/agent-token-refresh

Conversation

@ryans-posthog
Copy link
Copy Markdown
Contributor

Summary

  • Adds a set_token command (also aliased as posthog/set_token) to the agent server that updates GH_TOKEN and GITHUB_TOKEN on the agent process so Claude-based agents pick up a refreshed token without restart.
  • Adds a localhost-only POST /gh route that runs an arbitrary gh CLI invocation using the agent's current env. Codex-based agents run in an isolated sandbox, so a shell-script wrapper hits this endpoint to apply the new token through gh.
  • Adds a small gh-exec helper (runGh + isLoopbackAddress) that spawns gh directly (no shell), inherits the agent's process.env, and supports timeouts. The /gh route reuses it; tests exercise both happy/error paths.

Test plan

  • pnpm --filter agent test — full agent server suite passes (149/149), including 15 new gh-exec tests and 6 new agent-server tests covering both set_token aliases, /gh body validation, and that /gh does not require JWT.
  • pnpm --filter agent typecheck — clean.
  • Biome — clean (lint-staged ran on commit).
  • Manual: send POST /command with {"command":"set_token","params":{"token":"…"}} — verify GH_TOKEN updates in the agent process.
  • Manual: from inside a Codex sandbox, hit POST /gh with the wrapper script — verify gh runs with the new token.

Adds a `set_token` command (also aliased as `posthog/set_token`) that
updates `GH_TOKEN` / `GITHUB_TOKEN` on the agent process, so Claude-based
agents pick up a refreshed token without restart. Adds a localhost-only
`POST /gh` route that runs an arbitrary `gh` CLI invocation with the
agent's current env, giving isolated Codex sandboxes a way to apply the
new token via a shell wrapper.

Generated-By: PostHog Code
Task-Id: 2cbfd9f1-a4ff-4d49-a526-1184b909a373
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 4, 2026

Security Review

  • Unrestricted gh command execution via /gh route (packages/agent/src/server/agent-server.ts): The loopback-only POST /gh endpoint accepts any args array and forwards it to gh with the agent's full credentials. Any co-located process can invoke destructive gh subcommands (gh secret set, gh auth logout, gh repo delete, etc.) without authentication. No subcommand allowlist is enforced.
  • Incomplete loopback range in isLoopbackAddress (packages/agent/src/server/gh-exec.ts): Only 4 specific addresses are checked; the full 127.0.0.0/8 IPv4 loopback range and IPv4-mapped variants beyond ::ffff:127.0.0.1 are not recognised, potentially causing unexpected 403s for valid loopback connections.

Comments Outside Diff (1)

  1. packages/agent/src/server/agent-server.test.ts, line 458-508 (link)

    P2 Prefer parameterised tests for /gh 400-validation cases

    All four POST /gh tests share identical scaffolding: start a server, POST a different bad payload, assert 400. Per the team's simplicity rules, these should be a single it.each block. Similarly, the two set_token tests each duplicate the env-backup-and-restore boilerplate and differ only in the command string and which token name is verified — they're a natural it.each candidate.

    Context Used: Do not attempt to comment on incorrect alphabetica... (source)

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: packages/agent/src/server/agent-server.test.ts
    Line: 458-508
    
    Comment:
    **Prefer parameterised tests for `/gh` 400-validation cases**
    
    All four `POST /gh` tests share identical scaffolding: start a server, POST a different bad payload, assert 400. Per the team's simplicity rules, these should be a single `it.each` block. Similarly, the two `set_token` tests each duplicate the env-backup-and-restore boilerplate and differ only in the command string and which token name is verified — they're a natural `it.each` candidate.
    
    **Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/agent/src/server/agent-server.test.ts:458-508
**Prefer parameterised tests for `/gh` 400-validation cases**

All four `POST /gh` tests share identical scaffolding: start a server, POST a different bad payload, assert 400. Per the team's simplicity rules, these should be a single `it.each` block. Similarly, the two `set_token` tests each duplicate the env-backup-and-restore boilerplate and differ only in the command string and which token name is verified — they're a natural `it.each` candidate.

### Issue 2 of 3
packages/agent/src/server/gh-exec.ts:88-100
**`isLoopbackAddress` only recognises a four-value fixed set, not the full loopback range**

IPv4 reserves the entire `127.0.0.0/8` subnet as loopback, and IPv4-mapped loopback in IPv6 covers `::ffff:127.0.0.0/104`. The set currently contains only `127.0.0.1` and `::ffff:127.0.0.1`, so a connection arriving on any other address in those ranges (e.g. `127.0.0.2`, `::ffff:127.1.2.3`) is silently rejected with 403. A safer check would be a proper prefix test for `127.` and `::ffff:127.`.

### Issue 3 of 3
packages/agent/src/server/agent-server.ts:453-494
**`/gh` executes any `gh` subcommand, not just token-refresh operations**

The `args` array is unconstrained, so any local process can invoke `gh secret set`, `gh auth logout`, `gh repo delete`, or other destructive subcommands with the agent's inherited credentials. The loopback restriction limits exposure to co-located processes, but it's broader than the stated use case requires. Consider restricting `args[0]` to an allowlist of safe subcommands (e.g. `auth`) or documenting why full `gh` pass-through is intentionally required.

Reviews (1): Last reviewed commit: "Add token refresh endpoints to agent ser..." | Re-trigger Greptile

Comment on lines +88 to +100
});
});
});
}

const LOOPBACK_ADDRESSES = new Set([
"127.0.0.1",
"::1",
"::ffff:127.0.0.1",
"localhost",
]);

export function isLoopbackAddress(address: string | undefined): boolean {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 isLoopbackAddress only recognises a four-value fixed set, not the full loopback range

IPv4 reserves the entire 127.0.0.0/8 subnet as loopback, and IPv4-mapped loopback in IPv6 covers ::ffff:127.0.0.0/104. The set currently contains only 127.0.0.1 and ::ffff:127.0.0.1, so a connection arriving on any other address in those ranges (e.g. 127.0.0.2, ::ffff:127.1.2.3) is silently rejected with 403. A safer check would be a proper prefix test for 127. and ::ffff:127..

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/server/gh-exec.ts
Line: 88-100

Comment:
**`isLoopbackAddress` only recognises a four-value fixed set, not the full loopback range**

IPv4 reserves the entire `127.0.0.0/8` subnet as loopback, and IPv4-mapped loopback in IPv6 covers `::ffff:127.0.0.0/104`. The set currently contains only `127.0.0.1` and `::ffff:127.0.0.1`, so a connection arriving on any other address in those ranges (e.g. `127.0.0.2`, `::ffff:127.1.2.3`) is silently rejected with 403. A safer check would be a proper prefix test for `127.` and `::ffff:127.`.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 453 to 494
}
});

// Sandbox-local exec for `gh`. Codex agents run in an isolated child process
// whose environment is captured at spawn time, so refreshing GH_TOKEN in the
// agent server doesn't reach them. Their gh-wrapper script calls this route
// to run gh against the agent server's freshly-refreshed env. Loopback-only;
// intentionally not JWT-protected so the wrapper has nothing to forward.
app.post("/gh", async (c) => {
const remote = getConnInfo(c).remote.address;
if (!isLoopbackAddress(remote)) {
this.logger.warn("Rejected non-loopback /gh request", { remote });
return c.json({ error: "Forbidden" }, 403);
}

const rawBody = await c.req.json().catch(() => null);
const parsed = ghRequestSchema.safeParse(rawBody);
if (!parsed.success) {
return c.json({ error: parsed.error.message }, 400);
}

const { args, cwd, timeoutMs } = parsed.data;
try {
const result = await runGh(args, {
cwd: cwd ?? this.config.repositoryPath ?? process.cwd(),
timeoutMs: timeoutMs ?? 60_000,
logger: this.logger,
});
return c.json(result);
} catch (error) {
this.logger.error("Failed to run gh", error);
return c.json(
{
error: error instanceof Error ? error.message : "Unknown error",
},
500,
);
}
});

app.notFound((c) => {
return c.json({ error: "Not found" }, 404);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security /gh executes any gh subcommand, not just token-refresh operations

The args array is unconstrained, so any local process can invoke gh secret set, gh auth logout, gh repo delete, or other destructive subcommands with the agent's inherited credentials. The loopback restriction limits exposure to co-located processes, but it's broader than the stated use case requires. Consider restricting args[0] to an allowlist of safe subcommands (e.g. auth) or documenting why full gh pass-through is intentionally required.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/server/agent-server.ts
Line: 453-494

Comment:
**`/gh` executes any `gh` subcommand, not just token-refresh operations**

The `args` array is unconstrained, so any local process can invoke `gh secret set`, `gh auth logout`, `gh repo delete`, or other destructive subcommands with the agent's inherited credentials. The loopback restriction limits exposure to co-located processes, but it's broader than the stated use case requires. Consider restricting `args[0]` to an allowlist of safe subcommands (e.g. `auth`) or documenting why full `gh` pass-through is intentionally required.

How can I resolve this? If you propose a fix, please make it concise.

- isLoopbackAddress: match the full 127.0.0.0/8 range (and the IPv4-mapped
  ::ffff:127.0.0.0/104 prefix) instead of a fixed four-value set, so legitimate
  loopback connections on 127.0.0.x are not silently rejected with 403.
- Parameterise the duplicated /gh 400-validation tests and the two set_token
  command tests with it.each to remove identical scaffolding.

Generated-By: PostHog Code
Task-Id: 2cbfd9f1-a4ff-4d49-a526-1184b909a373
@ryans-posthog
Copy link
Copy Markdown
Contributor Author

Make sure to not allow the caller to specify the binary.

Drops the binary option from runGh's public type so callers — including
the /gh HTTP route — cannot specify which binary is executed. The route
already only forwards args/cwd/timeoutMs from the body schema, but
removing the option makes that guarantee structural rather than
incidental. Tests now exercise the underlying spawnAndCollect helper
directly when they need to inject a stand-in binary.

Generated-By: PostHog Code
Task-Id: 2cbfd9f1-a4ff-4d49-a526-1184b909a373
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant