DHQ-537: Progress Fields for Deployment Steps API — Alternatives Proposal
DHQ-537 asks for progress fields in the deployment steps API.
PR #719 adds started_at/finished_at timestamps and computed transfer_speed to the deployment steps API response. This requires an Lhm migration on deployment_steps (166M+ rows in production, ~30min downtime window, weekend-only).
Reviewers raised two concerns:
- @facundofarias: "Changing this table is tricky, we will have to do it during the weekend. Last time it took 30m." / "Also wondering if this is the only way to do it?"
- @thdurante: Suggested skipping detailed progress fields and showing simpler status (like "thinking"/"processing") instead.
This proposal evaluates alternatives.
# Option A: Expose completed_items only
Expose the existing completed_items column in the API. No new columns, no migration.
Changes: One line in to_hash. PR #719 already contains it: completed_items is returned there, it just was not called out as the key addition.
API response:

    {
      "step": "transfer_files",
      "status": "running",
      "total_items": 31344,
      "completed_items": 12400
    }

CLI can show: Transferring files: 12,400/31,344 (39%)
CLI cannot show: Speed, ETA, elapsed time.
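
As a rough illustration of the progress line above, a CLI could derive it from the two counters alone. This is a minimal sketch: `progress_line` and `with_commas` are hypothetical helper names, and the truncating percentage is an assumption chosen to match the 39% figure.

```ruby
# Hypothetical CLI helpers; names and output format are assumptions,
# not part of PR #719.

# Inserts thousands separators: 31344 -> "31,344".
def with_commas(n)
  n.to_s.reverse.scan(/\d{1,3}/).join(",").reverse
end

# Formats a progress line from the counters that Option A exposes.
def progress_line(label, completed_items, total_items)
  # Steps without a usable total cannot show a percentage.
  return "#{label}: #{completed_items} items" if total_items.nil? || total_items.zero?

  percent = completed_items * 100 / total_items # integer division truncates
  "#{label}: #{with_commas(completed_items)}/#{with_commas(total_items)} (#{percent}%)"
end

puts progress_line("Transferring files", 12_400, 31_344)
# prints "Transferring files: 12,400/31,344 (39%)"
```
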
| Pros | Cons |
|---|---|
| Zero migration risk | No speed/ETA |
| Ships immediately | Clients must compute their own rate by polling delta |
| Already populated in production | |
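
To make the "compute their own rate by polling delta" cost concrete, here is a minimal client-side sketch. `ProgressRate` is a hypothetical class, not something the API or CLI provides, and it assumes the client records a timestamp for each poll.

```ruby
# Hypothetical client-side tracker: derives items/second and an ETA from
# successive polls of completed_items.
class ProgressRate
  def initialize
    @last_count = nil
    @last_time = nil
  end

  # Records one poll. Returns items per second since the previous poll,
  # or nil when there is no earlier sample to compare against.
  def sample(completed_items, at)
    rate = nil
    if @last_time && at > @last_time
      rate = (completed_items - @last_count) / (at - @last_time).to_f
    end
    @last_count = completed_items
    @last_time = at
    rate
  end

  # Seconds remaining at the given rate, or nil if the rate is unusable.
  def self.eta_seconds(completed_items, total_items, rate)
    return nil if rate.nil? || rate <= 0

    (total_items - completed_items) / rate
  end
end

tracker = ProgressRate.new
t0 = Time.at(0)
tracker.sample(12_400, t0)            # first poll: no rate yet
rate = tracker.sample(12_900, t0 + 5) # 500 items in 5 seconds
puts rate                             # prints 100.0
```

A rate measured this way depends on the polling interval and is noisy for short intervals, which is the practical cost of leaving speed to clients.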
# Option B: Separate deployment_step_timings table
Create a new one-to-one side table instead of altering the 166M-row deployment_steps table.

    # New table: instant migration, no Lhm
    class CreateDeploymentStepTimings < ActiveRecord::Migration[6.1]
      def change
        create_table :deployment_step_timings do |t|
          # At most one timing row per step, enforced by the unique index
          t.references :deployment_step, null: false, index: { unique: true }
          t.datetime :started_at, precision: 6
          t.datetime :finished_at, precision: 6
        end
      end
    end

    # Models
    class DeploymentStep < ApplicationRecord
      has_one :timing, class_name: 'DeploymentStepTiming', dependent: :delete
    end

    # Counterpart model, needed for the has_one above to resolve
    class DeploymentStepTiming < ApplicationRecord
      belongs_to :deployment_step
    end

API response: Same as original PR (with started_at, finished_at, transfer_speed).
| Pros | Cons |
|---|---|
| No Lhm, instant migration | Extra join on API reads (N+1 risk, mitigated with includes) |
| Sub-second precision (DATETIME(6)) | Slightly more complex model |
| Only new deployments create rows | Two tables to reason about |
| Full speed/ETA support | |
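
With started_at and finished_at available, transfer_speed could be computed at read time rather than stored. The sketch below assumes transfer_speed means completed items per elapsed second, which may differ from the definition in PR #719. `StepTiming` is a hypothetical stand-in for the timing record.

```ruby
# Hypothetical value object standing in for a deployment_step_timings row.
StepTiming = Struct.new(:started_at, :finished_at, keyword_init: true)

# Assumed definition: completed items per elapsed second.
# Uses finished_at when present, otherwise the current time (running step).
# Returns nil when the step has no timing row or has not started.
def transfer_speed(completed_items, timing, now: Time.now)
  return nil if timing.nil? || timing.started_at.nil?

  elapsed = (timing.finished_at || now) - timing.started_at
  return nil unless elapsed.positive?

  (completed_items / elapsed).round(2)
end

finished = StepTiming.new(started_at: Time.at(100), finished_at: Time.at(162))
puts transfer_speed(12_400, finished) # prints 200.0
```

Steps created before the new table exists have no timing row, so the nil branch is the expected path for historical data.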
# Option C: Store timing in Redis
Track started_at/finished_at in Redis, keyed by deployment step identifier. No schema change at all.

    # Assumes Rails.cache is backed by Redis
    def set_started_at
      Rails.cache.write("step_timing:#{id}:started_at", Time.current.iso8601(6), expires_in: 24.hours)
    end

    # Returns nil when the key is missing (expired, evicted, or never written)
    def started_at
      @started_at ||= begin
        raw = Rails.cache.read("step_timing:#{id}:started_at")
        raw && Time.zone.parse(raw)
      end
    end

API response: Same as original PR (with started_at, finished_at, transfer_speed).
| Pros | Cons |
|---|---|
| No migration at all | Ephemeral — data lost on Redis restart/eviction |
| Very fast reads/writes | No historical queries |
| Zero DB impact | Adds Redis as a dependency for this feature |
| | Cache expiry edge cases |
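
The ephemeral-data and expiry cons mean any reader has to tolerate missing keys. A minimal sketch of that degradation, using a plain Hash as a stand-in for the cache (in the app this would presumably be Rails.cache); `read_timing` and `timing_fields` are hypothetical names.

```ruby
require "time"

# Reads one timing field for a step. The cache argument is a plain Hash
# here; the key format mirrors the one used in the Option C snippet.
def read_timing(cache, step_id, field)
  raw = cache["step_timing:#{step_id}:#{field}"]
  raw && Time.iso8601(raw) # nil when the key expired or was evicted
end

# Builds the timing part of an API response, leaving gaps as nil
# instead of raising when an entry is missing.
def timing_fields(cache, step_id)
  {
    started_at: read_timing(cache, step_id, "started_at"),
    finished_at: read_timing(cache, step_id, "finished_at")
  }
end

cache = { "step_timing:42:started_at" => "2024-01-01T00:00:00.123456Z" }
fields = timing_fields(cache, 42)
puts fields[:finished_at].inspect # prints nil
```
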

Discarded alternatives:

| Option | Why discarded |
|---|---|
| Derive from updated_at | updated_at is overwritten by finalise (which calls save after batch-inserting logs), so you can't recover when a step started running |
| Compute from deployment.started_at | Too coarse: a deployment runs many steps sequentially (preparing, building, transferring, finishing), so deployment start != step start |
| Original PR (Lhm on deployment_steps) | 30min production migration on 166M-row table, weekend-only window, risk flagged by reviewer |

Recommendation: start with Option A, then follow up with Option B.
- Now: Expose completed_items in the API (trivial, no migration). This unblocks CLI progress bars (39% complete) immediately.
- Follow-up: Add a deployment_step_timings table for started_at/finished_at/transfer_speed. This is a safe, instant migration that can ship on any day, with no weekend window needed.
This splits the PR into a small shippable piece and a low-risk follow-up, addressing both reviewers' concerns.
Open questions:
- Is progress percentage (completed_items / total_items) sufficient for the CLI MVP, or is speed/ETA a hard requirement?
- If we go with Option B, should we eager-load timings in the deployments API to avoid N+1?
- Should transfer_speed be computed server-side or left to clients?