DOMAIN:INNOVATION — TECH_RADAR_METHODOLOGY

OWNER: joshua
UPDATED: 2026-03-24
PURPOSE: define how Joshua operates GE's technology radar — evaluation framework, scoring criteria, changelog tracking, breaking change detection, briefing standards
MODEL: adapted from ThoughtWorks Technology Radar with GE-specific criteria for agentic multi-agent context


RADAR:QUADRANT_SYSTEM

ADOPT

DEFINITION: technology is proven, recommended for all new GE projects, actively used in production
CRITERIA: used in at least 2 GE client projects, no major open issues, team has deep expertise, cost is acceptable
EXAMPLES_IN_GE: TypeScript, Hono, Drizzle ORM, PostgreSQL, Redis, Tailwind, shadcn/ui, Vitest, Claude API
ACTION: new projects default to these technologies — deviation requires explicit justification through discussion
REVIEW_CYCLE: quarterly — confirm still best-in-class, check for emerging replacements

TRIAL

DEFINITION: technology has passed evaluation, approved for limited production use, being validated in real projects
CRITERIA: successful PoC, positive cost analysis, at least 1 agent has working knowledge, rollback plan exists
EXAMPLES_IN_GE: Stryker (mutation testing), Google Stitch (design-to-code), Remotion (video)
ACTION: use in 1-2 projects with monitoring — gather real-world data before promoting to Adopt
REVIEW_CYCLE: monthly — evaluate trial results, decide promote/hold/drop
DURATION_LIMIT: 3 months maximum in Trial — must move to Adopt or back to Assess/Hold

ASSESS

DEFINITION: technology looks promising, worth investigating, but not yet proven for GE's context
CRITERIA: solves a real GE problem, community momentum is positive, no showstopper risks identified
EXAMPLES: new LLM models (pre-evaluation), new frameworks (pre-PoC), emerging protocols (MCP servers)
ACTION: Joshua investigates, writes evaluation report, may recommend PoC to Dirk-Jan
REVIEW_CYCLE: bi-weekly — quick check on momentum signals, promote to Trial or demote to Hold

HOLD

DEFINITION: technology is known but NOT recommended — it has been replaced, carries unacceptable risk, or does not fit GE's architecture
CRITERIA: superseded by Adopt-tier alternative, security concerns, poor LLM compatibility, excessive cost, or architectural mismatch
EXAMPLES_IN_GE: Express (replaced by Hono), Prisma (replaced by Drizzle), chokidar (caused token burn), Docker Compose (k3s in production)
ACTION: do not adopt — if already in use, plan migration away
REVIEW_CYCLE: quarterly — check if conditions changed (new major version, acquisition, community shift)


RADAR:SCORING_CRITERIA

Each technology is evaluated on 8 dimensions, scored 1-5 (1=poor, 5=excellent):

DIMENSION_1: MATURITY

WHAT: production readiness, stability, API surface stability, backward compatibility track record
SIGNALS: semantic versioning adherence, breaking change frequency, time since v1.0, enterprise adoption
WEIGHT: high — GE delivers to paying clients, and immature tools put client deliverables at risk
SCORING:
- 5: stable for 2+ years, strong semver, rare breaking changes (e.g., PostgreSQL, React)
- 4: stable for 1+ year, occasional breaking changes with migration guides (e.g., Hono, Drizzle)
- 3: post-v1.0 but actively evolving, some API churn (e.g., newer frameworks)
- 2: pre-v1.0 or frequent breaking changes, rapid iteration phase
- 1: alpha/beta, API surface unstable, not recommended for production

DIMENSION_2: COMMUNITY_AND_ECOSYSTEM

WHAT: contributor base, documentation quality, third-party integrations, Stack Overflow/GitHub activity
SIGNALS: GitHub stars trajectory (not absolute), monthly npm downloads, number of active maintainers, corporate backing
WEIGHT: high — agents rely on documentation and examples for code generation
SCORING:
- 5: massive ecosystem, excellent docs, corporate-backed, abundant learning resources
- 4: strong ecosystem, good docs, active community, regular releases
- 3: growing community, adequate docs, some gaps in advanced topics
- 2: small community, sparse docs, limited ecosystem
- 1: minimal community, poor docs, bus-factor risk (1-2 maintainers)

DIMENSION_3: LLM_COMPATIBILITY

WHAT: how well LLMs (Claude, GPT, Gemini) can generate correct code for this technology
SIGNALS: training data coverage (pre-cutoff presence), API surface simplicity, TypeScript type inference quality, error message clarity
WEIGHT: critical — GE's agents generate code via LLMs, poor LLM compatibility means high error rates and token waste
SCORING:
- 5: LLMs generate correct code consistently, well-represented in training data, strong types (e.g., React, Express, PostgreSQL)
- 4: LLMs generate mostly correct code, minor hallucinations on edge cases (e.g., Hono, Drizzle)
- 3: LLMs know the basics but hallucinate on newer APIs, needs system prompt guidance
- 2: LLMs frequently hallucinate, tool is too new or niche for training data
- 1: LLMs cannot reliably generate code, requires extensive manual correction

DIMENSION_4: GE_STACK_FIT

WHAT: compatibility with GE's existing architecture, deployment model, and operational patterns
SIGNALS: TypeScript-first, works with k3s, compatible with existing CI/CD, doesn't require new infrastructure
WEIGHT: high — integration cost is often higher than the tool itself
SCORING:
- 5: native TypeScript, works with all GE infrastructure, zero integration friction
- 4: TypeScript support, minor configuration for GE infra, well-documented integration path
- 3: usable with TypeScript (wrappers/types available), moderate integration effort
- 2: requires significant adaptation, new runtime or infrastructure dependency
- 1: fundamentally incompatible with GE's architecture (different runtime, conflicting patterns)

DIMENSION_5: COST

WHAT: direct costs (licensing, API pricing, hosting) and indirect costs (learning curve, migration effort, maintenance overhead)
SIGNALS: pricing model transparency, free tier availability, cost-per-unit predictability, total cost of ownership
WEIGHT: high — GE operates at SME price point (10% of human developer cost), margins are tight
SCORING:
- 5: free/OSS, minimal operational cost, low maintenance overhead
- 4: affordable SaaS or metered pricing within GE's margins, predictable costs
- 3: moderate cost, pricing model is clear, fits within project budgets
- 2: expensive, cost scales unpredictably, requires careful usage monitoring
- 1: prohibitively expensive for GE's model, or pricing is opaque/enterprise-sales-only

DIMENSION_6: SECURITY

WHAT: security posture, vulnerability history, supply chain risk, data handling practices
SIGNALS: CVE history, security audit reports, dependency count, data residency options, SOC2/ISO27001 compliance
WEIGHT: critical — GE targets ISO 27001, SOC 2 Type II compliance
SCORING:
- 5: excellent security track record, audited, minimal dependencies, strong data controls
- 4: good security posture, occasional CVEs handled quickly, reasonable dependency tree
- 3: adequate security, some concerns but manageable with configuration
- 2: security concerns exist, large dependency tree, limited audit history
- 1: known vulnerabilities, poor security practices, unacceptable for GE

DIMENSION_7: PERFORMANCE

WHAT: runtime performance, build performance, developer experience speed
SIGNALS: benchmarks, bundle size, startup time, memory footprint, concurrency model
WEIGHT: medium — GE agents need fast feedback loops, but raw performance is secondary to correctness
SCORING:
- 5: best-in-class performance, minimal resource usage (e.g., Hono, Pino)
- 4: strong performance, competitive with alternatives
- 3: adequate performance, not a bottleneck in practice
- 2: noticeable performance issues, may need optimization for production
- 1: performance is a blocking concern for GE's use cases

DIMENSION_8: MIGRATION_PATH

WHAT: ease of migrating to AND from this technology — GE must never be locked in
SIGNALS: standard protocols, data export capabilities, code portability, ecosystem alternatives
WEIGHT: medium — GE's multi-provider strategy requires optionality
SCORING:
- 5: standard interfaces, trivial to swap, data fully portable (e.g., PostgreSQL, standard REST)
- 4: reasonable migration path, some adaptation needed, data exportable
- 3: moderate lock-in, migration is feasible but significant effort
- 2: high lock-in, proprietary interfaces, data migration is complex
- 1: extreme lock-in, proprietary everything, migration is a rewrite


RADAR:COMPOSITE_SCORING

FORMULA: weighted sum of all 8 dimensions
WEIGHTS: LLM_compatibility (0.20) + security (0.15) + maturity (0.15) + GE_stack_fit (0.15) + cost (0.15) + community (0.10) + performance (0.05) + migration_path (0.05)
THRESHOLD_ADOPT: >= 4.0 composite score AND no dimension below 3
THRESHOLD_TRIAL: >= 3.5 composite score AND no dimension below 2
THRESHOLD_ASSESS: >= 2.5 composite score
THRESHOLD_HOLD: < 2.5 composite score OR any critical dimension (security, LLM_compatibility) below 2

NOTE: thresholds are guidelines — Joshua uses judgment, especially for emerging categories where scoring is imprecise
NOTE: LLM_compatibility has highest weight because GE's entire execution model depends on LLMs generating correct code
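The formula and thresholds above can be sketched in TypeScript. This is an illustrative sketch, not an existing GE module — the type and function names are hypothetical; the weights and cutoffs mirror RADAR:COMPOSITE_SCORING exactly.

```typescript
// Composite scoring sketch: weighted sum over 8 dimensions, then
// threshold checks in Hold → Adopt → Trial → Assess order.

type Dimension =
  | "maturity" | "community" | "llmCompatibility" | "geStackFit"
  | "cost" | "security" | "performance" | "migrationPath";

type Scores = Record<Dimension, number>; // each dimension scored 1-5

const WEIGHTS: Record<Dimension, number> = {
  llmCompatibility: 0.20,
  security: 0.15,
  maturity: 0.15,
  geStackFit: 0.15,
  cost: 0.15,
  community: 0.10,
  performance: 0.05,
  migrationPath: 0.05,
};

// Critical dimensions: a score below 2 on either forces Hold.
const CRITICAL: Dimension[] = ["security", "llmCompatibility"];

function composite(s: Scores): number {
  return (Object.keys(WEIGHTS) as Dimension[])
    .reduce((sum, d) => sum + WEIGHTS[d] * s[d], 0);
}

function quadrant(s: Scores): "adopt" | "trial" | "assess" | "hold" {
  const c = composite(s);
  const min = Math.min(...Object.values(s));
  if (CRITICAL.some((d) => s[d] < 2)) return "hold";
  if (c >= 4.0 && min >= 3) return "adopt";
  if (c >= 3.5 && min >= 2) return "trial";
  if (c >= 2.5) return "assess";
  return "hold";
}
```

Note the ordering: the critical-dimension check runs first, so a tool with a perfect composite but security below 2 still lands on Hold, matching THRESHOLD_HOLD.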


RADAR:CHANGELOG_TRACKING

WHAT_TO_TRACK

  • every tool in GE's Adopt and Trial quadrants (see index.md for full list)
  • LLM provider API changes (Claude, OpenAI, Gemini) — model deprecations, API version changes, pricing changes
  • infrastructure tools (k3s, Terraform, GitHub Actions) — breaking changes, security patches
  • direct dependencies in package.json and requirements.txt across GE projects

HOW_TO_TRACK

SOURCE_1: GitHub release feeds — subscribe to releases for all tracked repositories
SOURCE_2: npm advisories — security vulnerability notifications for JavaScript ecosystem
SOURCE_3: official blogs — Anthropic, OpenAI, Google AI, Vercel, Drizzle team, Hono team
SOURCE_4: changelog files — many projects maintain CHANGELOG.md or RELEASES.md in repo
SOURCE_5: Twitter/X and Bluesky — maintainers often announce breaking changes on social media before formal release notes
SOURCE_6: Discord/Slack communities — early warnings from other users hitting issues
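For SOURCE_1 and SOURCE_4, a small filter can pre-screen release notes for breaking-change language before an agent reads them. A minimal sketch: the `Release` shape mirrors the fields returned by GitHub's REST "list releases" endpoint, and the keyword pattern is a heuristic starting point, not an exhaustive detector.

```typescript
// Pre-screen release notes for likely breaking changes, for the
// weekly changelog review. Prereleases are skipped here; they are
// covered separately by SIGNAL_3 in breaking change detection.

interface Release {
  tag_name: string;
  name: string | null;
  body: string | null; // release notes markdown
  prerelease: boolean;
}

const BREAKING_MARKERS =
  /breaking|deprecat|removed support|no longer supported|migration required/i;

function flagForReview(releases: Release[]): string[] {
  return releases
    .filter((r) => !r.prerelease && BREAKING_MARKERS.test(r.body ?? ""))
    .map((r) => r.tag_name);
}
```

Keyword matching only narrows the reading list — flagged releases still need a human or agent read before filling the analysis template below.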

TRACKING_CADENCE

DAILY: check LLM provider announcements (model releases, deprecations, pricing changes)
DAILY: check security advisories for Adopt-tier tools
WEEKLY: review changelogs for all Adopt and Trial tier tools
WEEKLY: scan for new major versions of Hold-tier tools (may warrant re-evaluation)
MONTHLY: deep review of dependency trees for supply chain changes

CHANGELOG_ANALYSIS_TEMPLATE

TOOL: [name]
VERSION: [old] → [new]
RELEASE_DATE: [date]
CATEGORY: [security-patch | bugfix | minor-feature | major-release | deprecation | breaking-change]

CHANGES_RELEVANT_TO_GE:
- [change 1]: [impact on GE]
- [change 2]: [impact on GE]

BREAKING_CHANGES:
- [breaking change]: [affected GE components] [migration steps]

ACTION_REQUIRED: [none | update-at-convenience | update-within-week | update-immediately | escalate-to-dirk-jan]
ASSIGNED_TO: [agent name if action needed]
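If agents fill this template programmatically, the same structure can be expressed as a TypeScript shape so entries stay machine-checkable. This is a hypothetical modeling of the template above, not an existing GE type; field names follow the template keys, and the sample values reuse the Drizzle example from the briefing standards.

```typescript
// The changelog analysis template as a typed record. The union types
// pin CATEGORY and ACTION_REQUIRED to the exact values the template allows.

type ChangeCategory =
  | "security-patch" | "bugfix" | "minor-feature"
  | "major-release" | "deprecation" | "breaking-change";

type ActionRequired =
  | "none" | "update-at-convenience" | "update-within-week"
  | "update-immediately" | "escalate-to-dirk-jan";

interface ChangelogAnalysis {
  tool: string;
  version: { old: string; new: string };
  releaseDate: string; // ISO date
  category: ChangeCategory;
  changesRelevantToGE: { change: string; impact: string }[];
  breakingChanges: { change: string; affected: string[]; migration: string }[];
  actionRequired: ActionRequired;
  assignedTo?: string; // agent name, only if action needed
}
```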


RADAR:BREAKING_CHANGE_DETECTION

EARLY_WARNING_SIGNALS

SIGNAL_1: deprecation warnings in console output — agents running code see these first
SIGNAL_2: RFC or proposal in GitHub repository — breaking changes are usually discussed months ahead
SIGNAL_3: pre-release/beta version with breaking changes — test in isolation before stable release
SIGNAL_4: community chatter — others hit the issue before GE does
SIGNAL_5: LLM training data lag — when LLMs consistently generate the old version of an API, the surface has likely changed after the training cutoff

SEVERITY_CLASSIFICATION

CRITICAL: security vulnerability in Adopt-tier tool, or breaking change that affects running production systems
- action: immediate escalation to Dirk-Jan, notify affected team leads (faye, sytske)
- timeline: patch within 24 hours

HIGH: breaking change in next major version of Adopt-tier tool, deprecation with sunset date
- action: briefing to Dirk-Jan, create evolution entry, assign migration planning
- timeline: migration plan within 1 week, execution within deprecation timeline

MEDIUM: breaking change in Trial-tier tool, or non-breaking deprecation in Adopt-tier
- action: include in daily briefing, monitor
- timeline: address in next relevant project sprint

LOW: breaking change in Assess-tier tool, or minor deprecation with long sunset
- action: note in tech radar, update evaluation score
- timeline: address when relevant
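The classification rules above can be encoded as a decision function. A sketch under stated assumptions: the tier and change-kind enums are illustrative, and edge cases not covered by the rules (for example, a security vulnerability in a Trial-tier tool) still fall through to Joshua's judgment.

```typescript
// Severity classifier mirroring the four rules above, checked in
// descending order of severity so the first matching rule wins.

type Tier = "adopt" | "trial" | "assess" | "hold";
type ChangeKind =
  | "security-vulnerability" | "breaking-change" | "deprecation" | "minor";

function severity(
  tier: Tier,
  kind: ChangeKind,
  opts: { inProduction?: boolean; hasSunsetDate?: boolean } = {},
): "critical" | "high" | "medium" | "low" {
  // CRITICAL: security vuln in Adopt tier, or breakage hitting running production
  if (kind === "security-vulnerability" && tier === "adopt") return "critical";
  if (kind === "breaking-change" && opts.inProduction) return "critical";
  // HIGH: breaking change or sunset-dated deprecation in Adopt tier
  if (tier === "adopt" &&
      (kind === "breaking-change" ||
       (kind === "deprecation" && opts.hasSunsetDate))) return "high";
  // MEDIUM: breaking change in Trial, or non-breaking deprecation in Adopt
  if (tier === "trial" && kind === "breaking-change") return "medium";
  if (tier === "adopt" && kind === "deprecation") return "medium";
  // LOW: everything else (Assess-tier breakage, minor deprecations)
  return "low";
}
```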

BREAKING_CHANGE_RESPONSE_TEMPLATE

ALERT: BREAKING CHANGE DETECTED
TOOL: [name]
VERSION: [affected version]
SEVERITY: [critical | high | medium | low]

WHAT_BROKE: [description]
GE_IMPACT: [which projects/agents/systems affected]
WORKAROUND: [temporary fix if available]
MIGRATION_PATH: [steps to resolve permanently]
ESTIMATED_EFFORT: [hours/story points]
DEADLINE: [when must be resolved by]
OWNER: [assigned agent or team]


RADAR:BRIEFING_STANDARDS

ACTIONABLE_VS_NOISE

ACTIONABLE: "Drizzle ORM 0.35 drops support for the $inferSelect helper — GE uses this in 12 files across admin-ui. Migration to InferSelectModel needed before upgrading."
NOISE: "Drizzle ORM 0.35 released with performance improvements."

ACTIONABLE: "Anthropic announced Claude 4 Opus with 2M context window at $12/$36 per MTok — 50% cost reduction from Opus 3.5. Joshua recommends updating provider config for cost savings."
NOISE: "Anthropic announced a new model."

ACTIONABLE: "Hono v5 RFC proposes dropping CommonJS support — GE's build system is ESM-only so no impact, but worth noting for future reference."
NOISE: "Hono v5 is being discussed."

BRIEFING_STRUCTURE

# DAILY INTELLIGENCE BRIEFING — [DATE]
## JOSHUA → DIRK-JAN

### BREAKING (immediate action required)
- [item]: [impact] → [recommended action]

### UPDATES (this week)
- [tool] [version]: [what changed] → [GE relevance]

### LANDSCAPE (strategic awareness)
- [development]: [what it means for GE]

### OPPORTUNITIES (worth evaluating)
- [technology]: [why interesting] → [recommended next step]

### COMPETITOR MOVES (market awareness)
- [competitor]: [what they did] → [GE positioning implication]

ITEMS: [count] | CRITICAL: [count] | ACTION_REQUIRED: [count]

QUALITY_RULES

RULE_1: every item must answer "so what for GE?" — if you cannot articulate the GE impact, do not include it
RULE_2: breaking items must include affected GE components, not just the abstract change
RULE_3: opportunities must include estimated evaluation effort (hours) and potential GE benefit
RULE_4: maximum 10 items per briefing — if more than 10, prioritize by GE impact and defer the rest
RULE_5: competitor moves are strategic context, not action items — do not recommend reactive changes
RULE_6: use specific version numbers, dates, and links — not vague references
RULE_7: if nothing significant happened, the briefing says "NO NOTABLE CHANGES" — do not pad with noise
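RULE_1 and RULE_4 are mechanical enough to lint before a briefing goes out. A minimal validator sketch — the `BriefingItem` shape is hypothetical, and the other rules (specific versions, links, no reactive competitor recommendations) still need human or agent review.

```typescript
// Briefing lint: flags items with no articulated GE impact (RULE_1)
// and briefings over the 10-item cap (RULE_4).

interface BriefingItem {
  section: "breaking" | "updates" | "landscape" | "opportunities" | "competitors";
  summary: string;
  geImpact: string; // the "so what for GE?" — must be non-empty
}

function validateBriefing(items: BriefingItem[]): string[] {
  const errors: string[] = [];
  if (items.length > 10) {
    errors.push(`RULE_4: ${items.length} items exceeds the cap of 10 — prioritize by GE impact`);
  }
  for (const item of items) {
    if (item.geImpact.trim() === "") {
      errors.push(`RULE_1: "${item.summary}" has no articulated GE impact — drop it`);
    }
  }
  return errors;
}
```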


RADAR:VISUALIZATION

RADAR_FORMAT

The technology radar is visualized as a document (not a graphic) organized by quadrant:

# GE TECHNOLOGY RADAR — [QUARTER] [YEAR]

## ADOPT (use for all new projects)
| Technology | Category | Score | Last Reviewed | Notes |
|-----------|----------|-------|---------------|-------|
| TypeScript | Language | 4.8 | 2026-Q1 | strict mode mandatory |
| Hono | Framework | 4.5 | 2026-Q1 | primary web framework |
| ... | ... | ... | ... | ... |

## TRIAL (approved for limited use)
| Technology | Category | Score | Trial Since | Trial Projects | Notes |
|-----------|----------|-------|-------------|----------------|-------|
| Stryker | Testing | 3.8 | 2025-Q4 | admin-ui | mutation testing |
| ... | ... | ... | ... | ... | ... |

## ASSESS (investigating)
| Technology | Category | Score | Since | Evaluator | Notes |
|-----------|----------|-------|-------|-----------|-------|
| ... | ... | ... | ... | ... | ... |

## HOLD (do not adopt)
| Technology | Category | Score | Reason | Alternative |
|-----------|----------|-------|--------|-------------|
| Express | Framework | 2.1 | slow, no native TS | Hono |
| Prisma | ORM | 2.4 | heavy runtime, custom schema | Drizzle |
| ... | ... | ... | ... | ... |

UPDATE_CADENCE

FULL_RADAR_REVIEW: quarterly (Q1, Q2, Q3, Q4)
POSITION_CHANGES: as needed, triggered by evaluation results or breaking changes
HISTORY: previous radar snapshots archived for trend tracking
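The per-quadrant review cycles can drive an automated staleness check so no entry silently misses its cadence. A sketch under assumptions: the day thresholds approximate the REVIEW_CYCLE values defined per quadrant (quarterly, monthly, bi-weekly, quarterly), and the `RadarEntry` shape is hypothetical.

```typescript
// Flag radar entries whose last review exceeds their quadrant's cycle.

type RadarTier = "adopt" | "trial" | "assess" | "hold";

const MAX_DAYS_SINCE_REVIEW: Record<RadarTier, number> = {
  adopt: 92,  // quarterly
  trial: 31,  // monthly
  assess: 14, // bi-weekly
  hold: 92,   // quarterly
};

interface RadarEntry {
  name: string;
  tier: RadarTier;
  lastReviewed: Date;
}

function overdue(entries: RadarEntry[], now: Date): RadarEntry[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  return entries.filter((e) =>
    (now.getTime() - e.lastReviewed.getTime()) / msPerDay >
      MAX_DAYS_SINCE_REVIEW[e.tier]
  );
}
```

Running this on the radar tables before each quarterly review surfaces exactly the entries the STALE_RADAR anti-pattern warns about.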


RADAR:ANTI_PATTERNS

HYPE_CHASING: adopting technology because it is trending, not because it solves a GE problem
- guard: every Assess entry must have a specific GE use case, not "it looks cool"

PREMATURE_ADOPTION: moving from Assess to Adopt without Trial phase
- guard: Trial phase is mandatory for any new tool touching production code (exception: drop-in replacements with identical API)

STALE_RADAR: not reviewing Hold/Assess items, missing when tools mature or die
- guard: quarterly full review with documented decisions for every item

OVER_MONITORING: tracking 200 tools and generating noise
- guard: Adopt + Trial items only get active monitoring — Assess gets bi-weekly scan — Hold gets quarterly review

EVALUATION_PARALYSIS: endless assessment without decision
- guard: 3-month maximum in Assess — must move to Trial or Hold with documented rationale

SINGLE_SOURCE_BIAS: evaluating based on one blog post or benchmark
- guard: minimum 3 independent sources for any recommendation, including at least 1 hands-on test


RADAR:GE_SPECIFIC_CONSIDERATIONS

LLM_TRAINING_DATA_LAG: new tools released after LLM training cutoffs will have poor code generation quality — this is a real cost for GE, factor it into scoring
MULTI_PROVIDER_IMPACT: if a tool works well with Claude but poorly with GPT/Gemini, it limits agent flexibility — GE uses multiple providers
AGENT_LEARNING_COST: every new tool requires updating agent identities, system prompts, and wiki pages — this is non-trivial with 59 agents
TOKEN_EFFICIENCY: tools with verbose APIs consume more tokens per interaction — prefer concise, well-typed APIs
INFRASTRUCTURE_SIMPLICITY: GE runs on single-node k3s — tools requiring complex clustering or multi-node setups are penalized
COST_AT_SCALE: GE operates at SME price point — tools must remain cost-effective at 100k client scale