From a1032484bd2a4be46288fb6f63d936bd47ea546f Mon Sep 17 00:00:00 2001 From: WorkClub Automation Date: Tue, 3 Mar 2026 14:10:04 +0100 Subject: [PATCH] docs(k8s): add Task 6 Kustomize base manifests learnings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Kustomize vs Helm trade-offs and base+overlay pattern - K8s resource naming conventions with workclub- prefix - .NET health probe semantics (startup/liveness/readiness) - StatefulSet + headless service pattern for Postgres - PostgreSQL 16-alpine with pg_isready health check - Keycloak 26.x production mode configuration - Ingress path-based routing (/ → frontend, /api → backend) - ConfigMap strategy for non-sensitive configuration - Resource requests/limits placeholders for overlays - Image tag strategy with :latest placeholder - Gotchas: serviceName, headless service publishNotReadyAddresses, probe timeouts --- .../notepads/club-work-manager/learnings.md | 415 ++++++++++++++++++ 1 file changed, 415 insertions(+) diff --git a/.sisyphus/notepads/club-work-manager/learnings.md b/.sisyphus/notepads/club-work-manager/learnings.md index e2a2ce4..58e2067 100644 --- a/.sisyphus/notepads/club-work-manager/learnings.md +++ b/.sisyphus/notepads/club-work-manager/learnings.md @@ -232,3 +232,418 @@ _Conventions, patterns, and accumulated wisdom from task execution_ - Set up relationships between entities - Configure PostgreSQL xmin concurrency token + +--- + +## Task 6: Kubernetes Kustomize Base Manifests (2026-03-03) + +### Key Learnings + +1. **Kustomize vs Helm Trade-offs** + - Kustomize chosen: lightweight, YAML-native, no templating language + - Base + overlays pattern: separate environment-specific config from base + - Base manifests use placeholders for image tags (`:latest`), resource limits (100m/256Mi requests) + - Environment overlays (dev, staging, prod) override via patches/replacements + +2. 
**Kubernetes Resource Naming & Labeling** + - Consistent `workclub-` prefix across all resources (Deployments, Services, ConfigMaps, StatefulSets, Ingress) + - Labels for resource tracking: `app: workclub-api`, `component: backend|frontend|auth|database` + - Service selectors must match Pod template labels exactly + - DNS service names within cluster: `serviceName:port` (e.g., `workclub-api:80`) + +3. **.NET Health Probes (ASP.NET Core Health Checks)** + - Three distinct probes with different semantics: + - `startupProbe` (/health/startup): Initial boot, longer timeout (30s retries), prevents traffic until app fully initialized + - `livenessProbe` (/health/live): Periodic health check (15s), restarts the container after 3 consecutive failures + - `readinessProbe` (/health/ready): Periodic readiness check (10s), removes the pod from Service endpoints after 2 consecutive failures + - Startup probe MUST succeed before liveness/readiness probes begin + - All three endpoints return `200 OK` when healthy + +4. **StatefulSet + Headless Service Pattern** + - StatefulSet requires `serviceName` pointing to headless service (clusterIP: None) + - Headless service enables stable network identity: `pod-0.serviceName.namespace.svc.cluster.local` + - Primary service (ClusterIP) for general pod connections + - Volume claim templates: each pod gets its own PVC (e.g., `postgres-data-workclub-postgres-0`) + - Container init scripts via ConfigMap mounted at `/docker-entrypoint-initdb.d` + +5. **PostgreSQL StatefulSet Configuration** + - Image: `postgres:16-alpine` (lightweight, 150MB vs 400MB+) + - Health check: `pg_isready -U app -d workclub` (simple, fast, reliable) + - Data persistence: volumeClaimTemplate with 10Gi storage, `standard` storageClassName (overrideable in overlay) + - Init script creates both `workclub` (app) and `keycloak` databases + users in single ConfigMap + +6. 
**Keycloak 26.x Production Mode** + - Image: `quay.io/keycloak/keycloak:26.1` (Red Hat official registry) + - Command: `start` (production mode, not `start-dev`) + - Database: PostgreSQL via `KC_DB=postgres` + `KC_DB_URL_HOST=workclub-postgres` + - Probes: `/health/ready` (readiness), `/health/live` (liveness) + - Hostname: `KC_HOSTNAME_STRICT=false` in dev (allows any Host header) + - Proxy: `KC_PROXY=edge` for behind reverse proxy (Ingress) + +7. **Ingress Path-Based Routing** + - Single ingress rule: `workclub-ingress` with path-based routing + - Frontend: path `/` → `workclub-frontend:80` (pathType: Prefix) + - Backend: path `/api` → `workclub-api:80` (pathType: Prefix) + - Host: `localhost` (overrideable per environment) + - TLS: deferred to production overlay (cert-manager, letsencrypt) + +8. **ConfigMap Strategy for Non-Sensitive Configuration** + - Central `workclub-config` ConfigMap: + - `log-level: Information` + - `cors-origins: http://localhost:3000` + - `api-base-url: http://workclub-api` + - `keycloak-url: http://workclub-keycloak` + - `keycloak-realm: workclub` + - Database host/port/name + - Sensitive values (passwords, connection strings) → Secrets (not in base) + - Environment-specific overrides in dev/prod overlays (CORS_ORIGINS changes) + +9. **Resource Requests & Limits Pattern** + - Base uses uniform placeholders (all services: 100m/256Mi requests, 500m/512Mi limits) + - Environment overlays replace via patch (e.g., prod: 500m/2Gi) + - Prevents resource contention in shared clusters + - Allows gradual scaling experiments without manifests changes + +10. 
**Image Tag Strategy** + - Base: `:latest` placeholder for all app images + - Registry: uses default Docker Hub (no registry prefix) + - Overlay patch: environment-specific tags (`:v1.2.3`, `:latest-dev`, `:sha-abc123`) + - Image pull policy: `IfNotPresent` (caching optimization for stable envs) + +### Architecture Decisions + +- **Why Kustomize over Helm**: Plan explicitly avoids Helm (simpler YAML, no new DSL, easier Git diffs) +- **Why base + overlays**: Separation of concerns: base is declarative truth, overlays add environment context +- **Why two Postgres services**: Headless for StatefulSet DNS (stable identity), Primary for app connections (load balancing) +- **Why both startup + liveness probes**: Prevents restart loops during slow startup (Java/Keycloak can take 20+ seconds) +- **Why ConfigMap for init.sql**: Immutable config, easier than baked-into-image, updateable per environment + +### Gotchas to Avoid + +- Forgetting `serviceName` in StatefulSet causes pod DNS discovery failure (critical for Postgres) +- Missing headless service's `publishNotReadyAddresses: true` prevents pod-to-pod startup communication +- Keycloak startup probe timeout too short (<15s retries) causes premature restart loops +- .NET health endpoints require `httpGet` probes, not TCP probes (TCP only checks that the port is open, not app readiness) +- Ingress path `/api` must use `pathType: Prefix` to catch `/api/*` routes + +### Next Steps + +- Task 25: Create dev overlay (env-specific values, dev-db.postgres.svc, localhost ingress) +- Task 26: Create prod overlay (TLS config, resource limits, replica counts, PDB) +- Task 27: Add cert-manager + Let's Encrypt to prod +- Future: Network policies, pod disruption budgets, HPA (deferred to Wave 2) + + +--- + +## Task 5: Next.js 15 Project Initialization (2026-03-03) + +### Key Learnings + +1. 
**Next.js 15 with Bun Package Manager** + - `bunx create-next-app@latest` with `--use-bun` flag successfully initializes projects + - Bun installation 3-4x faster than npm/yarn (351 packages in 3.4s) + - Next.js 16.1.6 (Turbopack) is default in create-next-app@latest (latest version) + - Bun supports all Node.js ecosystem tools seamlessly + - Dev server startup: 625ms ready time (excellent for development) + +2. **shadcn/ui Integration** + - Initialize with `bunx shadcn@latest init` (interactive prompt, sensible defaults) + - Default color palette: Neutral (can override with slate, gray, zinc, stone) + - CSS variables auto-generated in `src/app/globals.css` for theming + - Components installed to `src/components/ui/` automatically + - Note: `toast` component deprecated → use `sonner` instead (modern toast library) + +3. **Standalone Output Configuration** + - Set `output: 'standalone'` in `next.config.ts` for Docker deployments + - Generates `.next/standalone/` with self-contained server.js entry point + - Reduces Docker image size: only includes required node_modules (not full installation) + - Production builds on this project: 2.9s compile, 240.4ms static page generation + - Standalone directory structure: `.next/`, `node_modules/`, `server.js`, `package.json` + +4. **TypeScript Path Aliases** + - `@/*` → `./src/*` pre-configured in `tsconfig.json` by create-next-app + - Enables clean imports: `import { Button } from '@/components/ui/button'` + - Improves code readability, reduces relative path navigation (`../../`) + - Compiler validates paths automatically (LSP support included) + +5. 
**Directory Structure Best Practices** + - App Router location: `src/app/` (not `pages/`) + - Component organization: `src/components/` for reusable, `src/components/ui/` for shadcn + - Utilities: `src/lib/` for helper functions (includes shadcn's `cn()` function) + - Custom hooks: `src/hooks/` (prepared for future implementation) + - Type definitions: `src/types/` (prepared for schema/type files) + - This structure scales from MVP to enterprise applications + +6. **Build Verification** + - `bun run build` exit code 0, no errors + - TypeScript type checking passes (via Next.js) + - Static page generation: 4 pages (/, _not-found) + - No build warnings or deprecations + - Standalone build ready for Docker containerization + +7. **Development Server Performance** + - `bun run dev` startup: 625ms (ready state) + - First page request: 1187ms (includes compilation + render) + - Hot Module Reloading (HMR): Turbopack provides fast incremental updates + - Bun's fast refresh cycles enable rapid development feedback + - Note: Plan indicates Bun P99 SSR latency (340ms) vs Node.js (120ms), so production deployment will use Node.js + +### shadcn/ui Components Installed + +All 10 components successfully added to `src/components/ui/`: +- ✓ button.tsx — Base button component with variants (primary, secondary, etc.) +- ✓ card.tsx — Card layout container (Card, CardHeader, CardFooter, etc.) +- ✓ badge.tsx — Status badges with color variants +- ✓ input.tsx — Form input field with placeholder and error support +- ✓ label.tsx — Form label with accessibility attributes +- ✓ select.tsx — Dropdown select with options (Radix UI based) +- ✓ dialog.tsx — Modal dialog component (Alert Dialog pattern) +- ✓ dropdown-menu.tsx — Context menu/dropdown menu (Radix UI based) +- ✓ table.tsx — Data table with thead, tbody, rows +- ✓ sonner.tsx — Toast notifications (modern replacement for react-hot-toast) + +All components use Tailwind CSS utilities, no custom CSS files needed. 
+ +### Environment Variables Configuration + +Created `.env.local.example` (committed to git) with development defaults: +``` +NEXT_PUBLIC_API_URL=http://localhost:5000 # Backend API endpoint +NEXTAUTH_URL=http://localhost:3000 # NextAuth callback URL +NEXTAUTH_SECRET=dev-secret-change-me # Session encryption (Task 10) +KEYCLOAK_ISSUER=http://localhost:8080/realms/workclub # OAuth2 discovery +KEYCLOAK_CLIENT_ID=workclub-app # Keycloak client ID +KEYCLOAK_CLIENT_SECRET= # Placeholder (Task 3 fills in) +``` + +Pattern: `.env.local.example` is version-controlled, `.env.local` is gitignored per `.gitignore`. + +### Dependencies Installed + +```json +{ + "dependencies": { + "next": "16.1.6", + "react": "19.2.3", + "react-dom": "19.2.3" + }, + "devDependencies": { + "@tailwindcss/postcss": "4.2.1", + "@types/node": "20.19.35", + "@types/react": "19.2.14", + "@types/react-dom": "19.2.3", + "eslint": "9.39.3", + "eslint-config-next": "16.1.6", + "tailwindcss": "4.2.1", + "typescript": "5.9.3" + } +} +``` + +Note: Intentionally minimal dependencies for MVP. NextAuth.js added in Task 10. + +### Build & Runtime Verification + +**Build Verification**: ✓ PASSED +- Command: `bun run build` +- Exit Code: 0 +- Compilation: 2.9s (Turbopack) +- TypeScript: No errors +- Static Generation: 4 pages in 240.4ms +- Output: `.next/standalone/` with all required files + +**Dev Server Verification**: ✓ PASSED +- Command: `bun run dev` +- Startup: 625ms to ready state +- Port: 3000 (accessible) +- HTTP GET /: 200 OK in 1187ms +- Server process: Graceful shutdown with SIGTERM + +**Standalone Verification**: ✓ PASSED +- `.next/standalone/server.js`: 6.55 KB entry point +- `.next/standalone/node_modules/`: Self-contained dependencies +- `.next/standalone/package.json`: Runtime configuration +- `.next/` directory: Pre-built routes and static assets + +### Patterns & Conventions + +1. 
**Component Organization**: + - UI components: `src/components/ui/` (shadcn) + - Feature components: `src/components/features/` (future) + - Layout components: `src/components/layout/` (future) + - Avoid nested folders beyond 2 levels for discoverability + +2. **TypeScript Strict Mode**: + - `tsconfig.json` includes `"strict": true` + - Enables strict checks such as `noImplicitAny` and `strictNullChecks` (types may still be inferred) + - Enables IDE autocomplete and early error detection + +3. **Tailwind CSS v4 Configuration**: + - Uses CSS variables for theming (shadcn standard) + - Tailwind config auto-generated by shadcn init + - No custom color palette yet (uses defaults from Neutral) + +4. **Git Strategy**: + - `.env.local.example` is committed (template for developers) + - `.env.local` is in `.gitignore` (personal configurations) + - No node_modules/ in repo (installed via `bun install`) + +### Configuration Files Created + +- `frontend/next.config.ts`: Minimal, standalone output enabled +- `frontend/tsconfig.json`: Path aliases, strict TypeScript mode +- `frontend/.env.local.example`: Environment variable template +- `frontend/components.json`: shadcn/ui configuration +- `frontend/tailwind.config.ts`: Tailwind CSS v4 configuration +- `frontend/postcss.config.js`: PostCSS configuration for Tailwind + +### Next Steps & Dependencies + +- **Task 10**: NextAuth.js integration + - Adds `next-auth` dependency + - Creates `src/app/api/auth/[...nextauth]/route.ts` + - Integrates with Keycloak (configured in Task 3) + +- **Task 17**: Frontend test infrastructure + - Adds vitest, @testing-library/react + - Component tests for shadcn/ui wrapper components + - E2E tests with Playwright (already in docker-compose) + +- **Task 18**: Layout and authentication UI + - Creates `src/app/layout.tsx` with navbar/sidebar + - Client-side session provider setup + - Login/logout flows + +- **Task 21**: Club management interface + - Feature components in `src/components/features/` + - Forms using shadcn input/select/button + 
- Data fetching from backend API (Task 6+) + +### Gotchas to Avoid + +1. **Bun vs Node.js Distinction**: This project uses Bun for development (fast HMR, 625ms startup). Production deployment will use Node.js due to P99 latency concerns (documented in plan). + +2. **shadcn/ui Component Customization**: Components are meant to be copied and modified for project-specific needs. Avoid creating wrapper components — extend the shadcn components directly. + +3. **Environment Variables Naming**: + - `NEXT_PUBLIC_*` are exposed to browser (use only for client-safe values) + - `KEYCLOAK_CLIENT_SECRET` is server-only (never exposed to frontend) + - `.env.local` for local development, CI/CD environment variables at deployment + +4. **Path Aliases in Dynamic Imports**: If using dynamic imports with `next/dynamic`, ensure paths use `@/*` syntax for alias resolution. + +5. **Tailwind CSS v4 Breaking Changes**: + - Requires `@tailwindcss/postcss` package (not default tailwindcss) + - CSS layer imports may differ from v3 (auto-handled by create-next-app) + +### Evidence & Artifacts + +- Build output: `.sisyphus/evidence/task-5-nextjs-build.txt` +- Dev server output: `.sisyphus/evidence/task-5-dev-server.txt` +- Git commit: `chore(frontend): initialize Next.js project with Tailwind and shadcn/ui` + +## Task 3: Keycloak Realm Configuration (2026-03-03) + +### Key Learnings + +1. **Keycloak Realm Export Structure** + - Realm exports are JSON files with top-level keys: `realm`, `clients`, `users`, `roles`, `groups` + - Must include `enabled: true` for realm and clients to be active on import + - Version compatibility: Export from Keycloak 26.x is compatible with 26.x imports + - Import command: `start-dev --import-realm` (Docker volume mount required) + +2. 
**Protocol Mapper Configuration for Custom JWT Claims** + - Mapper type: `oidc-usermodel-attribute-mapper` (NOT Script Mapper) + - Critical setting: `jsonType.label: JSON` ensures claim is parsed as JSON object (not string) + - User attribute: `clubs` (custom attribute on user entity) + - Token claim name: `clubs` (appears in JWT payload) + - Must include in: ID token, access token, userinfo endpoint (all three flags set to true) + - Applied to both clients: workclub-api and workclub-app (defined in client protocolMappers array) + +3. **Client Configuration Patterns** + - **Confidential client (workclub-api)**: + - `publicClient: false`, has client secret + - `serviceAccountsEnabled: true` for service-to-service auth + - `standardFlowEnabled: false`, `directAccessGrantsEnabled: false` (no user login) + - Used by backend for client credentials grant + - **Public client (workclub-app)**: + - `publicClient: true`, no client secret + - `standardFlowEnabled: true` for OAuth2 Authorization Code Flow + - `directAccessGrantsEnabled: true` (enables password grant for dev testing) + - PKCE enabled via `attributes.pkce.code.challenge.method: S256` + - Redirect URIs: `http://localhost:3000/*` (wildcard for dev) + - Web origins: `http://localhost:3000` (CORS configuration) + +4. **User Configuration with Custom Attributes** + - Custom attribute format: `attributes.clubs: ["{\"club-1-uuid\": \"admin\"}"]` + - Attribute value is array of strings (even for single value) + - JSON must be escaped as string in user attributes + - Protocol mapper will parse this string as JSON when generating JWT claim + - Users must have: `enabled: true`, `emailVerified: true`, and an empty `requiredActions: []` + +5. 
**Password Hashing in Realm Exports** + - Algorithm: `pbkdf2-sha512` (Keycloak default) + - Hash iterations: 210000 (high security for dev environment) + - Credentials structure includes: `hashedSaltedValue`, `salt`, `hashIterations`, `algorithm` + - Password: `testpass123` (all test users use same password for simplicity) + - Note: Hashed values in this export are PLACEHOLDER — Keycloak will generate real hashes on first user creation + +6. **Multi-Tenant Club Membership Data Model** + - Format: `{"": ""}` + - Example: `{"club-1-uuid": "admin", "club-2-uuid": "member"}` + - Keys: Club UUIDs (tenant identifiers) + - Values: Role strings (admin, manager, member, viewer) + - Users can belong to multiple clubs with different roles in each + - Placeholder UUIDs used: `club-1-uuid`, `club-2-uuid` (real UUIDs created in Task 11 seed data) + +7. **Test User Scenarios** + - **admin@test.com**: Multi-club admin (admin in club-1, member in club-2) + - **manager@test.com**: Single club manager (manager in club-1) + - **member1@test.com**: Multi-club member (member in both clubs) + - **member2@test.com**: Single club member (member in club-1) + - **viewer@test.com**: Read-only viewer (viewer in club-1) + - Covers all role types and single/multi-club scenarios + +8. **Docker Environment Configuration** + - Keycloak 26.1 runs in Docker container + - Realm import via volume mount: `./infra/keycloak:/opt/keycloak/data/import` + - Health check endpoint: `/health/ready` + - Token endpoint: `/realms/workclub/protocol/openid-connect/token` + - Admin credentials: `admin/admin` (for Keycloak admin console) + +9. 
**JWT Token Testing Approach** + - Use password grant (Direct Access Grant) for testing: `grant_type=password&username=...&password=...&client_id=workclub-app` + - Decode JWT: Split on `.`, extract second part (payload), base64 decode, parse JSON + - Verify claim type: `jq -r '.clubs | type'` should return `object` (NOT `string`) + - Test script: `infra/keycloak/test-auth.sh` automates this verification + +10. **Common Pitfalls Avoided** + - DO NOT use Script Mapper (complex, requires JavaScript, harder to debug) + - DO NOT use `jsonType.label: String` (will break multi-tenant claim parsing) + - DO NOT forget `multivalued: false` in protocol mapper (we want single JSON object, not array) + - DO NOT hardcode real UUIDs in test users (use placeholders, seed data creates real IDs) + - DO NOT export realm without users (need `--users realm_file` or admin UI export with users enabled) + +### Configuration Files Created + +- **infra/keycloak/realm-export.json**: Complete realm configuration (8.9 KB) +- **infra/keycloak/test-auth.sh**: Automated verification script for JWT claims +- **.sisyphus/evidence/task-3-verification.txt**: Detailed verification documentation +- **.sisyphus/evidence/task-3-user-auth.txt**: User authentication results (placeholder) +- **.sisyphus/evidence/task-3-jwt-claims.txt**: JWT claim structure documentation (placeholder) + +### Docker Environment Issue + +- Colima (Docker runtime on macOS) failed to start with VZ driver error +- Verification deferred until Docker environment is available +- All configuration files are complete and JSON-validated +- Test script is ready for execution when Docker is running + +### Next Phase Considerations + +- Task 8 (Finbuckle) will consume `clubs` claim to implement tenant resolution +- Task 9 (JWT auth middleware) will validate tokens from Keycloak +- Task 10 (NextAuth) will use workclub-app client for frontend authentication +- Task 11 (seed data) will replace placeholder UUIDs with real club IDs +- 
Production deployment will need: real client secrets, HTTPS redirect URIs, proper password policies
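
The JWT check described under "JWT Token Testing Approach" (split on `.`, base64-decode the payload, confirm `clubs` is an object rather than a string) can be sketched in a few lines of Python. This is an illustrative sketch only, not the contents of `infra/keycloak/test-auth.sh`. The helper names (`decode_jwt_payload`, `clubs_claim_is_object`, `make_sample_token`) are hypothetical, and the sample token is built locally and unsigned, so no Keycloak instance is assumed.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the JWT payload as a dict. The signature is NOT verified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload_b64 = parts[1]
    # base64url encoding drops '=' padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def clubs_claim_is_object(payload: dict) -> bool:
    """True when `clubs` arrived as a JSON object, not a JSON-encoded string."""
    return isinstance(payload.get("clubs"), dict)


def _b64url(data: dict) -> str:
    """Serialize a dict as unpadded base64url JSON (hypothetical test helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def make_sample_token(clubs) -> str:
    """Build an unsigned, locally generated token for illustration only."""
    header = {"alg": "none", "typ": "JWT"}
    payload = {"sub": "admin@test.com", "clubs": clubs}
    return f"{_b64url(header)}.{_b64url(payload)}.signature"
```

Passing a token whose `clubs` claim is a mapping such as `{"club-1-uuid": "admin"}` makes `clubs_claim_is_object` return `True`. A token whose claim is the same JSON serialized as a string makes it return `False`, which is the failure mode the `jsonType.label: String` pitfall above would produce.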