Build a cross-account, read-only diagnostics API for DeployHQ that enables autonomous support investigation via the MCP server. Currently, investigating customer issues requires human-run Rails console queries because the public API is account-scoped. This plan adds admin-level diagnostic endpoints to the Rails app (Tier 1) and documents a future read-only SQL tool (Tier 2).
Admin:: controllers at app/controllers/admin/, accessed via admin subdomain with IP whitelist (AdminConstraint in routes.rb:9-13)Admin::DeploymentsController already has recover, fix, requeue_deployment actions -- but all are HTML-only, session-authenticated, and don't return JSON diagnosticsInternalApiController uses Bearer token auth (callback_verification_token) for cross-service communication -- good pattern to follow. Test pattern at spec/requests/internal_api_controller_spec.rb uses request specs.Authentication concerndeployhq/deployhq-mcp-server, cloned locally at /Users/martakaras/Projects/DeployHQ/deployhq-mcp-server/createMCPServer() in src/mcp-server.ts (line 46), tool dispatch via switch/case (lines 81-364), Zod validation from src/tools.tssrc/api-client.ts -- DeployHQClient class with HTTP Basic Auth, request<T>() private method (line 220)src/config.ts -- env vars DEPLOYHQ_EMAIL, DEPLOYHQ_API_KEY, DEPLOYHQ_ACCOUNT + optional DEPLOYHQ_URL, DEPLOYHQ_READ_ONLYsrc/stdio.ts (primary, line 1-104), src/index.ts (Express hosted, line 1-110)src/__tests__/mcp-server.test.ts, helper invokeToolForTest() at line 375readOnlyMode config flag that blocks write operations (checked per-tool in mcp-server.ts)Deployment -- main deployment record, has_many :deployments_servers, has_many :deployment_stepsDeploymentsServer -- join table linking deployment to server, has status, file_operationsDeploymentStep -- granular step record with step type (e.g. transfer_files), server_id, statusServerGroup -- belongs_to :parent, polymorphic: true (server_group.rb:36). Has identifier (string UUID) and permalink fields. Parent is typically a Project but the association is polymorphic.Account -- has permalink field but NO identifier field. Lookup must use permalink or numeric id.deployments_servers count and transfer_files step countThe MCP server can look up any deployment by UUID (regardless of account) and return diagnostic data including server/step counts, mismatches, recent failures, and server group membership. All access is read-only, authenticated with a dedicated admin token, and logged.
admin_get_deployment returns deployment with server count, step counts, and mismatch detectionadmin_list_recent_failures returns recent failed deployments for a given account (looked up by permalink)admin_get_deployment_servers returns per-server step breakdownadmin_get_server_group returns server group details with current membershipAll admin API endpoints use the same identifiers that appear in support workflows:
identifier (UUID string, e.g. aab2065a-72ca-4263-a621-b4ddf59ca036)permalink (e.g. kicksite) -- Account has no identifier fieldidentifier (UUID string from server_group.identifier)No endpoints accept raw numeric database IDs. This matches how support tickets reference these entities.
A "server/step mismatch" is defined precisely as:
DeploymentsServer record, there MUST be exactly one DeploymentStep with step = 'transfer_files' and matching server_idDeploymentsServer exists but no transfer_files step for that server_id -- this is the bug we fixedtransfer_files step exists for a server_id that has no DeploymentsServer recordtransfer_files step for the same server_idDeploymentsServer records ARE included (they were part of the deployment)The diagnostics response reports all three categories separately.
/admin_api/ routes are intentionally outside AdminConstraint (no subdomain restriction)Authorization headerInternalApiController which is also outside AdminConstraintDEPLOYHQ_ADMIN_API_TOKEN env var, set in deployment configbefore_action if needed, using the same AuthorizedNetworks.valid_ip? pattern as AdminConstraintFollow the InternalApiController pattern: a new AdminApiController base class with Bearer token auth, under a /admin_api/ route namespace. This keeps it separate from both the session-based admin panel and the account-scoped public API. The MCP server gets a new DEPLOYHQ_ADMIN_TOKEN env var and additional admin tools alongside existing tools.
Create the base controller, authentication, routing, and diagnostic endpoints.
The token will be read from DEPLOYHQ_ADMIN_API_TOKEN environment variable. No database storage needed -- single static token is sufficient for machine-to-machine auth. Same pattern as AtechIdentity.callback_verification_token used by InternalApiController.
File: app/controllers/admin_api_controller.rb (new)
Pattern: Follow InternalApiController (app/controllers/internal_api_controller.rb:61-86)
# frozen_string_literal: true
class AdminApiController < ActionController::Base
skip_before_action :verify_authenticity_token
before_action :authenticate_admin_api
private
def authenticate_admin_api
auth_header = request.headers['Authorization']
unless auth_header&.start_with?('Bearer ')
Rails.logger.warn "[AdminAPI] Missing or invalid Authorization header from #{request.remote_ip}"
render json: { error: 'Unauthorized' }, status: :unauthorized
return
end
token = auth_header.sub('Bearer ', '')
expected_token = ENV['DEPLOYHQ_ADMIN_API_TOKEN']
if expected_token.blank?
Rails.logger.error '[AdminAPI] DEPLOYHQ_ADMIN_API_TOKEN not configured'
render json: { error: 'Internal server error' }, status: :internal_server_error
return
end
unless ActiveSupport::SecurityUtils.secure_compare(expected_token, token)
Rails.logger.warn "[AdminAPI] Invalid token from #{request.remote_ip}"
render json: { error: 'Unauthorized' }, status: :unauthorized
return
end
Rails.logger.info "[AdminAPI] Authenticated request from #{request.remote_ip} for #{request.path}"
end
end
File: app/controllers/admin_api/diagnostics_controller.rb (new)
# frozen_string_literal: true
class AdminApi::DiagnosticsController < AdminApiController
# GET /admin_api/deployments/:identifier
#
# Looks up a deployment by UUID across all accounts.
# Returns deployment metadata, server/step counts, and mismatch diagnostics.
def deployment
@deployment = Deployment.find_by!(identifier: params[:identifier])
project = @deployment.project
account = project.account
servers = @deployment.deployments_servers.includes(:server)
steps = @deployment.deployment_steps
server_ids = servers.pluck(:server_id)
transfer_steps = steps.where(step: 'transfer_files')
transfer_server_ids = transfer_steps.pluck(:server_id)
# Mismatch detection (skip for preview deployments)
if @deployment.preview?
diagnostics = { preview: true, mismatch_detection_skipped: true }
else
missing_steps = server_ids - transfer_server_ids
orphaned_steps = transfer_server_ids - server_ids
duplicate_server_ids = transfer_server_ids.group_by(&:itself).select { |_, v| v.size > 1 }.keys
diagnostics = {
server_step_mismatch: missing_steps.any? || orphaned_steps.any? || duplicate_server_ids.any?,
servers_missing_transfer_steps: missing_steps,
orphaned_transfer_steps: orphaned_steps,
duplicate_transfer_steps: duplicate_server_ids,
server_count: server_ids.size,
transfer_step_count: transfer_server_ids.size
}
end
render json: {
deployment: {
identifier: @deployment.identifier,
id: @deployment.id,
status: @deployment.status,
branch: @deployment.branch,
preview: @deployment.preview?,
created_at: @deployment.created_at,
started_at: @deployment.started_at,
completed_at: @deployment.completed_at,
failed_at: @deployment.failed_at
},
project: {
id: project.id,
name: project.name,
permalink: project.permalink
},
account: {
id: account.id,
name: account.name,
permalink: account.permalink
},
servers: {
count: server_ids.size,
ids: server_ids
},
steps: {
total: steps.count,
by_type: steps.group(:step).count,
failed: steps.where(status: 'failed').map { |s| { step: s.step, server_id: s.server_id, description: s.description } }
},
diagnostics: diagnostics
}
rescue ActiveRecord::RecordNotFound
render json: { error: 'Deployment not found' }, status: :not_found
end
# GET /admin_api/deployments/:identifier/servers
#
# Returns per-server step breakdown for a deployment.
def deployment_servers
@deployment = Deployment.find_by!(identifier: params[:identifier])
all_steps = @deployment.deployment_steps.to_a
data = @deployment.deployments_servers.includes(:server).map do |ds|
server_steps = all_steps.select { |s| s.server_id == ds.server_id }
{
server_id: ds.server_id,
server_name: ds.server&.name,
server_identifier: ds.server&.identifier,
status: ds.status,
steps: server_steps.map { |s| { step: s.step, status: s.status, identifier: s.identifier } },
has_transfer_step: server_steps.any? { |s| s.step == 'transfer_files' }
}
end
render json: { deployment_identifier: @deployment.identifier, servers: data }
rescue ActiveRecord::RecordNotFound
render json: { error: 'Deployment not found' }, status: :not_found
end
# GET /admin_api/accounts/:permalink/recent_failures
#
# Returns recent failed deployments for an account, looked up by permalink.
def recent_failures
account = Account.find_by!(permalink: params[:permalink])
limit = [(params[:limit] || 20).to_i, 100].min
deployments = Deployment
.joins(:project)
.where(projects: { account_id: account.id })
.where.not(failed_at: nil)
.order(failed_at: :desc)
.limit(limit)
.includes(:project)
render json: {
account: { id: account.id, name: account.name, permalink: account.permalink },
failures: deployments.map do |d|
{
identifier: d.identifier,
project: d.project.permalink,
branch: d.branch,
status: d.status,
failed_at: d.failed_at,
created_at: d.created_at
}
end
}
rescue ActiveRecord::RecordNotFound
render json: { error: 'Account not found' }, status: :not_found
end
# GET /admin_api/server_groups/:identifier
#
# Returns server group details with current membership, looked up by UUID identifier.
# Note: ServerGroup#parent is polymorphic but in practice is always a Project.
# We guard against non-Project parents to avoid NoMethodError.
def server_group
group = ServerGroup.find_by!(identifier: params[:identifier])
project = group.parent
group_data = {
id: group.id,
identifier: group.identifier,
name: group.name
}
# ServerGroup#parent is polymorphic (server_group.rb:36).
# In practice it's always a Project, but guard against other types.
if project.is_a?(Project)
group_data[:project] = project.permalink
group_data[:account] = project.account.permalink
else
group_data[:parent_type] = group.parent_type
group_data[:parent_id] = group.parent_id
end
render json: {
server_group: group_data,
servers: group.servers.map { |s|
{
id: s.id,
identifier: s.identifier,
name: s.name,
enabled: s.enabled?,
protocol_type: s.protocol_type
}
},
server_count: group.servers.count
}
rescue ActiveRecord::RecordNotFound
render json: { error: 'Server group not found' }, status: :not_found
end
end
File: config/routes.rb
Location: Near the existing internal_api route at line 31
# Admin API - token-authenticated diagnostics for MCP/support tooling
# Intentionally outside AdminConstraint: uses Bearer token auth, not session/subdomain.
# Same exposure model as InternalApiController.
namespace :admin_api, defaults: { format: :json } do
get 'deployments/:identifier', to: 'diagnostics#deployment', as: :deployment_diagnostics
get 'deployments/:identifier/servers', to: 'diagnostics#deployment_servers', as: :deployment_servers_diagnostics
get 'accounts/:permalink/recent_failures', to: 'diagnostics#recent_failures', as: :account_recent_failures
get 'server_groups/:identifier', to: 'diagnostics#server_group', as: :server_group_diagnostics
end
bin/rspec spec/requests/admin_api/ passesbin/rails routes | grep admin_api shows 4 GET routesDEPLOYHQ_ADMIN_API_TOKEN env var setImplementation Note: Complete this phase and all automated verification before proceeding to Phase 2.
Add admin tools to the existing MCP server that call the new Rails admin API endpoints.
Repository: deployhq/deployhq-mcp-server (local: /Users/martakaras/Projects/DeployHQ/deployhq-mcp-server/)
File: src/config.ts (lines 6-12 ServerConfig interface, lines 75-91 parseServerConfig)
Changes: Add adminToken?: string to ServerConfig. Read from DEPLOYHQ_ADMIN_TOKEN env var. No CLI flag needed (env-only is fine for a token).
// In ServerConfig interface
export interface ServerConfig {
readOnlyMode: boolean;
adminToken?: string; // Optional admin API token for cross-account diagnostics
}
// In parseServerConfig()
const adminToken = process.env.DEPLOYHQ_ADMIN_TOKEN || undefined;
return { readOnlyMode, adminToken };
File: src/api-client.ts
Changes: Add DeployHQAdminClient class. Follows the same pattern as DeployHQClient (line 192+) but uses Bearer token auth instead of Basic Auth, and targets the /admin_api/ path prefix.
export interface AdminClientConfig {
adminToken: string;
timeout?: number;
baseUrl: string; // Same base URL as the regular client
}
export class DeployHQAdminClient {
private readonly baseUrl: string;
private readonly authHeader: string;
private readonly timeout: number;
constructor(config: AdminClientConfig) {
this.baseUrl = config.baseUrl;
this.timeout = config.timeout || 30000;
this.authHeader = `Bearer ${config.adminToken}`;
}
// Same request() pattern as DeployHQClient (line 220)
// but with Bearer auth and /admin_api/ prefix
async getDeploymentDiagnostics(identifier: string): Promise<DeploymentDiagnostics> { ... }
async getDeploymentServers(identifier: string): Promise<DeploymentServersResponse> { ... }
async getRecentFailures(accountPermalink: string, limit?: number): Promise<RecentFailuresResponse> { ... }
async getServerGroup(identifier: string): Promise<ServerGroupResponse> { ... }
}
Type interfaces for each response shape to match the Rails JSON contracts defined in Phase 1.
File: src/tools.ts
Changes: Add 4 new Zod schemas and tool definitions at the end of the file.
// Schemas
export const AdminGetDeploymentSchema = z.object({
identifier: z.string().describe('Deployment UUID'),
});
export const AdminGetDeploymentServersSchema = z.object({
identifier: z.string().describe('Deployment UUID'),
});
export const AdminListRecentFailuresSchema = z.object({
account: z.string().describe('Account permalink'),
limit: z.number().optional().describe('Max results (default 20, max 100)'),
});
export const AdminGetServerGroupSchema = z.object({
identifier: z.string().describe('Server group UUID'),
});
// Tool definitions with readOnlyHint: true, destructiveHint: false
File: src/mcp-server.ts
Changes:
In createMCPServer() (line 46), accept adminToken parameter. The ListToolsRequestSchema handler (line 75) conditionally includes admin tools only when adminToken is set:
server.setRequestHandler(ListToolsRequestSchema, async () => {
const availableTools = adminToken ? [...tools, ...adminTools] : tools;
return { tools: availableTools };
});
In the CallToolRequestSchema handler (line 81), add cases for admin tools. If an admin tool is called but no adminToken is configured, return an error explaining admin tools are not available.
Files: src/stdio.ts (line 32-46), src/index.ts (line 23-36)
Changes: Read DEPLOYHQ_ADMIN_TOKEN from env. Pass config.adminToken to createMCPServer(). Log whether admin tools are enabled:
log.info(`Admin tools: ${config.adminToken ? 'enabled' : 'disabled (no DEPLOYHQ_ADMIN_TOKEN)'}`);
npm run build succeedsnpm run lint passesnpm run test passes with new admin tool testsDEPLOYHQ_ADMIN_TOKEN is not setDEPLOYHQ_ADMIN_TOKEN is setDEPLOYHQ_ADMIN_TOKEN in .claude.jsonadmin_get_deployment with a known deployment UUIDImplementation Note: Complete this phase and all automated verification before proceeding to Phase 3.
Write request specs for the Rails endpoints and Vitest tests for the MCP tools.
File: spec/requests/admin_api/diagnostics_spec.rb (new)
Pattern: Follow spec/requests/internal_api_controller_spec.rb (request specs with Bearer token auth)
# Test structure:
describe 'Admin API Diagnostics' do
let(:valid_token) { 'test-admin-api-token' }
before do
ENV['DEPLOYHQ_ADMIN_API_TOKEN'] = valid_token
end
after do
ENV.delete('DEPLOYHQ_ADMIN_API_TOKEN')
end
describe 'authentication' do
# 401 without token
# 401 with wrong token
# 500 when DEPLOYHQ_ADMIN_API_TOKEN not configured
# 200 with correct token
end
describe 'GET /admin_api/deployments/:identifier' do
# Returns deployment with correct metadata
# Returns correct server and step counts
# Detects missing transfer_files steps (mismatch)
# Detects orphaned transfer_files steps
# Detects duplicate transfer_files steps
# Skips mismatch detection for preview deployments
# Returns 404 for unknown identifier
end
describe 'GET /admin_api/deployments/:identifier/servers' do
# Returns per-server step breakdown
# has_transfer_step is true/false correctly
# Returns 404 for unknown identifier
end
describe 'GET /admin_api/accounts/:permalink/recent_failures' do
# Returns failures scoped to account
# Respects limit param
# Caps limit at 100
# Returns 404 for unknown permalink
end
describe 'GET /admin_api/server_groups/:identifier' do
# Returns group with server list
# Handles polymorphic parent gracefully
# Returns 404 for unknown identifier
end
end
Use factories to create deterministic test data:
File: src/__tests__/admin-tools.test.ts (new)
Pattern: Follow src/__tests__/mcp-server.test.ts (Vitest, invokeToolForTest() helper at line 375)
Tests:
adminToken in configadminToken is providedDeployHQAdminClient methodbin/rspec spec/requests/admin_api/ passesnpm run test passes in MCP serverDEPLOYHQ_ADMIN_API_TOKEN setDEPLOYHQ_ADMIN_API_TOKEN on stagingDEPLOYHQ_ADMIN_TOKEN matchingadmin_get_deployment with a known deployment UUIDincludes) to avoid N+1 queriesdeployment_servers endpoint loads all steps once then filters in Ruby (avoids N+1)recent_failures has a default limit of 20, hard cap at 100Authorization header, validated with ActiveSupport::SecurityUtils.secure_compare (prevents timing attacks)AdminConstraint (no subdomain restriction), same model as InternalApiController. Authentication relies solely on the token.DEPLOYHQ_ADMIN_API_TOKEN. Rotation = change env var + restart. No database records.before_action using AuthorizedNetworks.valid_ip? (same pattern as AdminConstraint in routes.rb:9-13).Tier 1 covers ~80% of support investigations with structured endpoints. The remaining 20% are ad-hoc questions that can't be predicted: "how many deployments hit this code path last week?", "is there a pattern in which server types fail?", "what's the distribution of server group sizes across accounts?".
A single admin API endpoint that accepts a SQL query and returns results:
POST /admin_api/query
{
"sql": "SELECT COUNT(*) FROM deployments WHERE failed_at > '2026-04-01'"
}
admin_sql_query:
sql: string (the SELECT query)
-> returns: { columns: string[], rows: any[][], row_count: number, truncated: boolean }
Build Tier 2 when there are 3+ support cases where Tier 1 endpoints were insufficient and a custom query would have solved the problem. Track these cases to justify the investment and inform what guardrails are most important.
app/controllers/internal_api_controller.rbspec/requests/internal_api_controller_spec.rbapp/controllers/admin/deployments_controller.rbapp/models/server_group.rb:36deployhq-mcp-server/src/stdio.tsdeployhq-mcp-server/src/mcp-server.ts:81-364deployhq-mcp-server/src/__tests__/mcp-server.test.ts:375fix/retry-deployment-server-step-mismatch branchthoughts/shared/handoffs/general/2026-04-14_10-38-26_retry-deployment-server-step-mismatch.md
comments (0)