# Updater Primary Spec
## 1. Introduction

CommandIT utilizes a suite of agent components (UI/App, Service, Probe) installed on managed endpoints across various operating systems (Windows, macOS, Linux). To ensure security, stability, and feature availability, these components must be kept up to date reliably and efficiently.

This document describes the requirements for a dedicated auto-update service (`commandit-updater`) responsible for managing the installation and updates of itself and all other CommandIT agent components, as well as the requirements for the supporting server-side infrastructure and processes. The updater must operate independently, handle initial installations, support release channels, provide defined rollback capabilities, support immediate update triggers, and follow industry best practices for robustness and security, similar to those employed by RMM and antivirus solutions.

## 2. Goals & Objectives

- **Reliable Updates:** Automatically deploy updates for all CommandIT agent components (UI/App, Service, Probe) and the updater service itself across all supported platforms, respecting assigned release channels.
- **Simplified Deployment:** Facilitate the initial installation of CommandIT agents by bootstrapping the download and setup of the latest component versions appropriate for the target release channel.
- **Self-Sufficiency:** Operate independently of other CommandIT components, ensuring it can function even if other services are stopped or corrupted.
- **Security:** Ensure the integrity and authenticity of all downloaded update packages, maintain secure communication channels, protect update processes from tampering, and authenticate client requests.
- **Resilience:** Gracefully handle network interruptions, failed updates (including defined rollback procedures), concurrency issues, and unexpected system states.
- **Efficiency:** Minimize resource consumption (CPU, memory, network bandwidth, disk I/O) on the endpoint, both when idle and during active updates.
- **Platform Coverage:** Support all specified target operating systems and architectures.
- **Controlled Rollouts:** Support distinct release channels (e.g., alpha, beta, stable) for phased updates.
- **Manageable Infrastructure:** Provide a reliable, scalable, secure, and auditable update server infrastructure and package management process.
- **On-Demand Updates:** Allow updates to be triggered immediately when required.

## 3. Scope

### 3.1 In Scope

- **Independent Execution:** Updater runs as a standalone process/service on the endpoint, ensuring concurrency control.
- **Update Check:** Updater periodically checks a designated CommandIT update server (`updates.commandit.com`) via defined API endpoints for available updates based on the agent's assigned release channel.
- **Forced Update Trigger ("Update Now"):** Support for an external trigger (e.g., via secure IPC from the main service or a specific CLI command) to initiate an immediate update check and application cycle, bypassing the regular schedule.
- **Secure Download:** Updater downloads update packages over secure channels (HTTPS).
- **Integrity & Authenticity Verification:** Updater verifies the cryptographic signature and integrity (e.g., SHA-256 hash) of downloaded packages before application, using a securely managed public key.
- **Self-Update:** Updater is capable of updating its own executable/service binary securely.
- **Component Update:** Updater manages the update process for UI/App, Service, and Probe components (stopping only the target component, backing up the previous version, replacing files, restarting the service, verifying health). UI/App updates receive specific handling.
- **Update Rollback:** If an update fails validation, file operations, health checks, or other defined critical steps, the updater automatically reverts the affected component(s) to the previously backed-up version.
- **Component Reinstall:** Updater provides the ability to perform a clean installation of any managed component.
- **Initial Installation/Bootstrap:** The updater executable facilitates the initial installation of CommandIT agents.
- **Configuration Management:** Updater manages its own configuration (update server URL, check interval, component manifest, release channel, public key info).
- **Detailed Logging:** Updater maintains detailed, structured operational logs for diagnostics and auditing, including defined error codes and rollback events.
- **Basic Status Reporting:** Updater provides a mechanism for status checking, including defined status codes.
- **Platform Adaptation:** Updater handles OS-specific requirements.
- **Release Channel Support:** Updater fetches updates corresponding to its configured channel.
- **Update Package Creation:** Building, signing (using secure key management), packaging (structure to be defined by the dev team), and publishing update packages to the update server, including channel management.
- **Update Server Infrastructure:** Hosting (DigitalOcean), management, scaling, security (including rate limiting and basic client auth), and auditing of the update distribution endpoint (`updates.commandit.com`) and its APIs.

### 3.2 Out of Scope

- **Centralized Management UI:** A user interface for fleet-wide management (interacts with the updater/server but is separate).
- **Advanced Rollback:** Multi-version rollback strategies.
- **User-Driven Updates:** Updates initiated via the CommandIT UI/App.
- **Delta Updates (v1.4):** See Section 9.
- **On-Network Caching (v1.4):** See Section 9.
- **Advanced Scheduling/Throttling (v1.4):** See Section 9 (e.g., update windows, user activity checks, metered network awareness).
- **Offline/Air-Gapped Environment Updates:** The updater requires connectivity to `updates.commandit.com`.

## 4. Functional Requirements

### 4.1 Core Update Process (Client-Side: `commandit-updater`)

- **FR-UPD-001: Update Check Trigger (Scheduled).** Periodic checks (configurable interval, e.g., 4h + jitter).
- **FR-UPD-001b: Update Check Trigger (Forced, "Update Now").** The updater service must expose a secure mechanism (e.g., a specific command-line argument such as `commandit-updater --checknow`, or a signal/IPC method callable by the main CommandIT service) to trigger an immediate update check cycle. This trigger bypasses the regular schedule interval (FR-UPD-001). The rest of the update process (check, download, verify, install, rollback) proceeds as normal. Appropriate security measures must prevent misuse of this trigger by unauthorized processes/users (e.g., CLI only usable by root/SYSTEM, IPC requires an authenticated caller).
- **FR-UPD-002: Update Server Communication.** Outbound HTTPS (TLS 1.2+) to defined API endpoints on `https://updates.commandit.com`. Include a simple authentication header (see NFR-SRV-010). Consider NFR-SEC-005 (certificate pinning).
- **FR-UPD-003: Update Manifest Request.** Send a request (defined format, e.g., JSON) including:
  - `updater_version`: string
  - `os`: string (e.g., "windows", "macos", "linux")
  - `arch`: string (e.g., "amd64", "arm64")
  - `release_channel`: string (e.g., "stable", "beta", "alpha")
  - `components`: array of objects, `[{ "name": string, "version": string }]`
- **FR-UPD-004: Update Manifest Response Handling.** Parse the response (defined format, e.g., JSON) from the server (see FR-SRV-001). Handle cases where no updates are available or errors occur.
- **FR-UPD-005: Secure Download.** Download packages via HTTPS from the URLs in the manifest. Implement retries with exponential backoff.
- **FR-UPD-006: Package Verification.** Before installation:
  - Verify the SHA-256 hash against the manifest.
  - Verify the package signature using the configured public key (see NFR-SEC-006).
  - Log verification success/failure clearly.
  - On failure, discard the package and log the error. Failure triggers rollback (FR-RBK-001).
- **FR-UPD-007: Update Sequencing & Transactional Rollback.** Apply updates in a safe order (guided by optional server manifest dependencies). If an update sequence involves multiple components (A, B, C) and component B fails to update (at any step that triggers rollback), the updater must attempt to roll back component B and component A (which previously succeeded in this sequence) to maintain a consistent state. Log the full sequence rollback attempt.
- **FR-UPD-008: Temporary Space Management.** Check available space before download/backup. Clean up temporary files/downloads post-operation (success, failure, rollback). Failure due to insufficient space during backup/extraction triggers rollback (FR-RBK-001).
- **FR-UPD-009: Concurrency Control.** Implement a mechanism (e.g., a system-wide named mutex or a lock file in a secure location) to prevent multiple updater instances from running simultaneously or interfering with an ongoing update process. Subsequent instances (including `--checknow` triggers) should exit gracefully or wait if a lock is held, logging the condition.

### 4.2 Self-Update (`commandit-updater`)

- **FR-SUP-001: Self-Update Mechanism.** Robust process (e.g., download new binary -> back up old -> launch new -> new replaces old -> cleanup -> restart service registration). Handle permissions and service registration correctly.
- **FR-SUP-002: Self-Update Resilience.** Recover from interruptions (e.g., reboot) during self-update.
- **FR-SUP-003: Self-Update Rollback.** If the new updater fails a basic health check (e.g., cannot initialize, fails a core function check, crashes quickly) immediately after launch, automatically attempt to restore the backed-up binary and restart the previous version. Log clearly. Failure triggers rollback (FR-RBK-001).

### 4.3 Component Update (UI/App, Service, Probe)

- **FR-CMP-001: Component State Management & Graceful Stop.** Before updating a component, stop only the target component's service/process. Use OS best practices for a graceful stop:
  - Send SIGTERM (Linux/macOS) or the equivalent service stop control / WM_CLOSE (Windows).
  - Wait for a defined period (e.g., 30 seconds) for the process to exit cleanly.
  - If the process has not exited, send SIGKILL (Linux/macOS) or call TerminateProcess (Windows).
- **FR-CMP-002: Backup.** Create a backup of the existing component installation directory/files before replacement. Retain only the single immediately preceding version.
- **FR-CMP-003: File Replacement.** Extract and replace files from the verified package. Preserve designated configuration/data files as specified by package metadata (structure TBD by devs). Handle file permissions correctly. Critical file operation failures (e.g., permission denied, disk full mid-copy, file corruption during extraction) trigger rollback (FR-RBK-001).
- **FR-CMP-004: Component Restart.** Restart the service/process after file replacement. Failure to initiate the restart (e.g., a service registration issue) triggers rollback (FR-RBK-001).
- **FR-CMP-005: Restart Verification & Health Check.** Verify successful restart. Perform a defined health check (mechanism TBD by devs for each component; minimum: process running). If verification or the health check fails within a timeout period, log the specific error and initiate rollback (FR-RBK-001).
- **FR-CMP-006: Post-Update Cleanup.** On successful verification, remove the backup created in FR-CMP-002.
- **FR-CMP-007: UI/App Update Handling.** Updates for the UI/App component (user session process) should ideally be attempted when the updater detects no active user sessions associated with the app (best-effort detection, OS dependent). If an update must proceed while the UI/App is running, the updater should attempt a graceful stop (FR-CMP-001) but will not stop unrelated components (e.g., the Service). On terminal servers, updates may proceed even with active sessions after attempting a graceful stop.

### 4.4 Update Rollback (Client-Side)

- **FR-RBK-001: Rollback Execution & Triggers.** Rollback must be triggered automatically upon detection of the following failures during an update attempt for a component (or self-update):
  - Package hash verification failure (FR-UPD-006).
  - Package signature verification failure (FR-UPD-006).
  - Insufficient disk space during backup or file extraction (FR-UPD-008).
  - Failure to gracefully stop the existing service/process within the timeout (FR-CMP-001).
  - Critical file operation failure during extraction/replacement (e.g., permissions, disk full, corruption) (FR-CMP-003).
  - Failure to initiate the restart of the service/process (FR-CMP-004).
  - Failure of the updated service/process to start or pass its health check within the timeout (FR-CMP-005).
  - Failure of the new updater binary to pass its health check during self-update (FR-SUP-003).

  Rollback procedure:
  - Ensure the component service/process is stopped.
  - Remove files installed during the failed attempt.
  - Restore files from the backup (FR-CMP-002). Handle potential errors during restore.
  - Attempt to restart the component using the restored (previous) version.
  - Verify the rollback: perform the same health check (FR-CMP-005) on the restored version.
  - Log the rollback attempt, the specific trigger reason (using a defined error code), and the success/failure status (including the rollback verification result).
- **FR-RBK-002: Rollback Failure.** If restoring the backup or restarting/verifying the previous version also fails, log a critical error with specific details (using a defined error code). Leave the component stopped. Report critical failure status (FR-STA-001). Requires manual intervention.

### 4.5 Component Reinstall (Client-Side)

- **FR-RIN-001: Reinstall Trigger.** Support a command/flag for a full reinstall using the latest version for the channel.
- **FR-RIN-002: Clean Slate.** Optionally remove the existing component directory (preserving designated files) before a fresh install. Rollback is not applicable.

### 4.6 Initial Installation / Bootstrap (Client-Side)

- **FR-BST-001: Bootstrap Execution.** The standalone updater executable serves as the installer payload.
- **FR-BST-002: First Run Logic.** On 'install' or when no manifest exists:
  - Determine the release channel (CLI arg, default 'stable').
  - Connect to `updates.commandit.com`.
  - Download the manifest for baseline components for the channel.
  - Download, verify, and install components.
  - Install self as a persistent service/daemon with correct permissions.
  - Create the initial config (including channel) and component manifest.
  - Securely provision the initial public key (e.g., embedded in the bootstrap binary or placed via a secure deployment script) (method TBD by devs).
- **FR-BST-003: Initial Config Security.** Ensure initial configuration file permissions are set correctly (FR-CFG-004).

### 4.7 Configuration (Client-Side)

- **FR-CFG-001: Configuration File.** Use a local config file (e.g., `/etc/commandit/updater.conf`, `%ProgramData%\commandit\updater.conf`; format JSON/YAML) for:
  - `update_server_url`: string (default `https://updates.commandit.com`)
  - `update_check_interval_seconds`: integer (default 14400)
  - `component_manifest_path`: string
  - `log_file_path`: string
  - `log_max_size_mb`: integer
  - `log_max_files`: integer
  - `public_key_path_or_value`: string (path preferred for easier rotation)
  - `release_channel`: string
  - `auth_secret`: string (for simple shared-secret auth, NFR-SRV-010)
- **FR-CFG-002: Configuration Override.** Allow override via CLI/env vars for initial deployment/testing.
- **FR-CFG-003: Component Manifest.** Local manifest (e.g., `manifest.json`) tracking installed components, versions, and paths. Updated atomically on success.
- **FR-CFG-004: Configuration Security.** Configuration files must have permissions restricting read/write access to privileged users (SYSTEM/root) only.

### 4.8 Logging and Status (Client-Side)

- **FR-LOG-001: Detailed Structured Logging.** Log key events with timestamps, severity, and specific error codes/messages. A defined list/enumeration of machine-parseable error codes will be created by the development team to represent distinct failure conditions. Use a standard format (e.g., JSON Lines).
- **FR-LOG-002: Log Rotation.** Rotate logs based on size (`log_max_size_mb`) and number of files (`log_max_files`).
- **FR-STA-001: Status Reporting.** Provide machine-readable status (file/registry/plist):
  - `last_check_time`: ISO 8601 timestamp
  - `last_update_attempt_time`: ISO 8601 timestamp
  - `last_update_success_time`: ISO 8601 timestamp
  - `installed_components`: array of objects, `[{ "name": string, "version": string }]`
  - `assigned_channel`: string
  - `last_operation_status_code`: integer/enum (a defined list/enumeration will be created by the development team)
  - `last_error_code`: integer/enum (corresponds to FR-LOG-001 codes; null/0 on success)
  - `last_error_message`: string (brief human-readable error summary on failure)

### 4.9 Update Server & Packaging Requirements (Server-Side)

- **FR-SRV-001: Manifest Generation.** Generate a JSON manifest response based on the client request (FR-UPD-003). It must accurately reflect the latest versions for the requested components/OS/arch/channel. Structure example:

      {
        "updates": [
          {
            "name": "commandit-service",
            "version": "1.2.3",
            "url": "https://updates.commandit.com/packages/...",
            "hash_sha256": "...",
            "signature": "...", // base64-encoded signature
            "release_notes_url": "...", // optional
            "dependencies": [] // optional: [{ "name": "...", "min_version": "..." }]
          }
          // ... other components
        ]
      }

- **FR-SRV-002: Package Hosting.** Securely host signed packages (HTTPS). Use DigitalOcean Spaces (CDN recommended) for scalability/availability.
- **FR-SRV-003: Channel Management.** Logic to manage versions across channels. Tool/API for promoting builds (e.g., beta -> stable).
- **FR-SRV-004: Package Signing.** Secure, automated process using dedicated signing keys (NFR-SEC-004). Sign the package archive itself or include a signed manifest within the archive (TBD by devs).
- **FR-SRV-005: Build & Packaging Pipeline.** Automated pipeline (CI/CD) to build, package (TR-PACK-001), sign, and upload to hosting, updating server metadata.
- **FR-SRV-006: Scalability.** Design infrastructure (DO Droplets/App Platform, load balancers, Spaces) for expected load.
- **FR-SRV-007: Security.** Secure server infrastructure (firewalls, access controls, regular updates). Protect signing keys rigorously.
- **FR-SRV-008: API Definition.** Define and document the specific API endpoints (e.g., `/v1/check`, `/v1/packages/...`) and their request/response formats (TR-API-001).

## 5. Non-Functional Requirements

- **NFR-SEC-001: Secure Communication.** HTTPS with TLS 1.2+ for all client-server traffic.
- **NFR-SEC-002: Update Integrity.** SHA-256 hash verification and digital signature verification (e.g., RSA 2048+, ECDSA P-256+) on the client for all packages.
- **NFR-SEC-003: Least Privilege (Client).** Updater service runs as SYSTEM (Windows) or root (Linux/macOS).
- **NFR-SEC-004: Secure Signing Process (Server).** Use secure key management (HSM preferred, otherwise a strictly controlled environment). Strict access control. Regular key rotation policy.
- **NFR-SEC-005: Certificate Pinning (Client Consideration).** Evaluate implementing certificate pinning for `updates.commandit.com` (adds complexity for certificate rotation). Defer the decision to detailed design.
- **NFR-SEC-006: Public Key Management (Client).** Initial public key securely provisioned (FR-BST-002). Define a secure mechanism for updating the public key itself via a signed update package (likely requires special handling/a multi-stage process; TBD by devs).
- **NFR-SEC-007: Local File Protection (Client).** The updater binary, config files (FR-CFG-004), logs, and temporary/backup directories must have restrictive file system permissions.
- **NFR-PER-001: Low Resource Idle (Client).** Minimal resource usage when idle.
- **NFR-PER-002: Efficient Download (Client/Server).** Server should support HTTP range requests.
- **NFR-PER-003: Server Performance.** Fast API responses and download speeds. Monitor performance.
- **NFR-REL-001: Network Resilience (Client).** Tolerate intermittent connectivity; use retries with exponential backoff.
- **NFR-REL-002: Installation Atomicity & Rollback (Client).** Strive for atomic updates per component. Reliable rollback to the previous state on defined failures (FR-RBK-001). Transactional rollback for sequences (FR-UPD-007).
- **NFR-REL-003: Crash Recovery (Client).** Recover gracefully from crashes/reboots by resuming, cleaning up partial operations, or rolling back as appropriate based on state.
- **NFR-REL-004: Server Availability.** High-availability design for the update server.
- **NFR-REL-005: Concurrency Safety (Client).** Updater must handle concurrent execution attempts safely (FR-UPD-009).
- **NFR-PLA-001/002/003: Windows/macOS/Linux Support.** As previously defined.
- **NFR-MNT-001: Code Quality.** Well-documented, modular, maintainable Go code following best practices.
- **NFR-TST-001: Testability (Client).** Design for testability (unit, integration). CLI flags for testing modes (including `--checknow`).
- **NFR-TST-002: Testability (Server).** Testable API and packaging process.
- **NFR-AUD-001: Server-Side Auditing.** Log key events on the update server for monitoring and security analysis. Required fields per request/event:
  - Timestamp (ISO 8601)
  - Source IP address
  - Request endpoint/operation (e.g., `/v1/check`, `packageupload`)
  - Authentication identifier (e.g., masked shared secret identifier, or agent ID if available later)
  - Authentication result (success/failure)
  - Request parameters (OS, arch, channel, requested components/versions)
  - Response status code (HTTP status)
  - Manifest content summary (e.g., components/versions offered) or package details (name, version, channel)
  - Error message (if applicable)
  - Request duration
- **NFR-SRV-009: Rate Limiting (Server).** Implement server-side rate limiting on API endpoints (e.g., per IP address, per auth identifier) to prevent accidental or malicious DoS attacks.
- **NFR-SRV-010: Client Authentication (Server/Client).** Implement simple client authentication. Recommendation: the client sends a pre-configured shared secret via a custom HTTP header (e.g., `X-CommandIT-Updater-Secret`) with each API request (FR-UPD-002). The server validates this secret. The secret is stored securely on the client (FR-CFG-001, FR-CFG-004).

## 6. Technical Requirements

- **TR-LANG-001:** Client must be Go. Server components likely Go, TBD.
- **TR-DEPS-001:** Minimize client external dependencies.
- **TR-BUILD-001:** Cross-compiling build pipeline.
- **TR-PACK-001:** Client: single executable. Update packages: archives (zip/tar.gz). Internal structure, metadata format (e.g., for config preservation rules), and pre/post script handling TBD by the dev team.
- **TR-INFRA-001:** Server hosted on DigitalOcean.
- **TR-API-001:** Formal API definition document (e.g., OpenAPI/Swagger) for server endpoints (FR-SRV-008).
- **TR-ERR-001:** Define standard error codes/enums for client logging and status reporting (FR-LOG-001, FR-STA-001) (responsibility: dev team).

## 7. Error Handling & Recovery

- **EH-CLI-001: Network Errors (Client).** Retry with backoff. Log persistent failures with specific error codes (TR-ERR-001).
- **EH-CLI-002: Verification Failure (Client).** Log the specific failure code (TR-ERR-001: hash mismatch, signature invalid). Discard the package. Trigger rollback (FR-RBK-001). Report status.
- **EH-CLI-003: Installation/File Op Failure (Client).** Log the specific error code (TR-ERR-001: permission, disk full, file corrupt). Initiate rollback (FR-RBK-001). Report status based on the rollback outcome.
- **EH-CLI-004: Service Control Failure (Client).** Log the specific error code (TR-ERR-001: stop failed, start failed, health check failed). Initiate rollback (FR-RBK-001). Report status.
- **EH-CLI-005: Rollback Failure (Client).** Log the critical error code (TR-ERR-001: restore error, start failed after restore). Report critical failure status. Requires manual intervention.
- **EH-SRV-001: Server Errors (Server).** Robust error handling/logging. Return appropriate HTTP error codes and structured JSON error messages where possible (e.g., `{ "error_code": "invalid_channel", "message": "..." }`).

## 8. Release Criteria / Success Metrics

(Previous criteria remain valid.)

- Rollback triggers correctly on all specified failure conditions (FR-RBK-001).
- Transactional rollback for multi-component sequences functions correctly (FR-UPD-007).
- Defined error codes (TR-ERR-001) are implemented and used consistently in logs/status.
- UI/App update handling respects active use (FR-CMP-007).
- Graceful stop procedure (FR-CMP-001) is implemented correctly.
- Server implements rate limiting (NFR-SRV-009) and client authentication (NFR-SRV-010).
- Server-side audit logs contain all required fields (NFR-AUD-001).
- Forced update trigger (FR-UPD-001b) successfully initiates an update cycle and functions correctly.

## 9. Open Issues / Future Considerations (v2+)

- **Delta Updates:** Binary diffing for bandwidth savings.
- **On-Network Cache:** Local peer-to-peer or designated cache server for packages.
- **Advanced Scheduling & Throttling:** Update windows, user activity checks, network awareness (metered connections).
- **Offline Installer Package:** Bundled installer with packages for offline setup.
- **Updater Configuration Updates:** Secure mechanism for updating `updater.conf` itself.
- **Advanced Rollback Strategy:** Multi-version rollback.
- **A/B Testing/Canary Releases:** Finer-grained rollouts.
- **Client Resource Throttling Implementation:** Define specific mechanisms/limits for CPU/IO usage during updates.
- **Certificate Pinning Implementation (NFR-SEC-005):** Decision and implementation plan.
- **Component Health Check Mechanisms (FR-CMP-005):** Final definition for each component.
- **Package Structure Definition (TR-PACK-001):** Final definition including metadata/scripting.
- **Initial Public Key Provisioning Method (FR-BST-002):** Final definition.
- **Public Key Rotation Mechanism (NFR-SEC-006):** Final definition.