docs(notepads): record Option D interceptor debugging and learnings
This commit is contained in:
@@ -2618,3 +2618,257 @@ Successfully fixed Keycloak realm import with working passwords for all test use
|
||||
- member2@test.com: Tennis only
|
||||
- viewer@test.com: Tennis only
|
||||
|
||||
|
||||
---
|
||||
|
||||
## JWT sub Claim Mapping Fix - COMPLETED (2026-03-05)
|
||||
|
||||
### Problem Statement
|
||||
|
||||
The .NET JWT Bearer handler was applying default inbound claim type mapping, converting JWT `sub` claims to `http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier` (the ClaimTypes.NameIdentifier constant). This caused `httpContext.User.FindFirst("sub")` to return `null` across all endpoints that need user identity extraction (shift signup, task creation, etc.).
|
||||
|
||||
**Affected Endpoints**:
|
||||
- POST /api/shifts/{id}/signup — Sign-up failed with "Invalid user ID"
|
||||
- POST /api/tasks — Create failed with "Invalid user ID"
|
||||
- POST /api/shifts — Create failed with "Invalid user ID"
|
||||
- GET /api/clubs/my-clubs — Returns empty list
|
||||
- GET /api/members/me — Returns null
|
||||
- DELETE /api/shifts/{id}/signup — Cancel failed
|
||||
|
||||
**Root Cause**: .NET's `JwtSecurityTokenHandler` has a `MapInboundClaims` default value of `true`, which automatically maps standard JWT claims to CLR claim types. The Keycloak JWT includes `sub: "0fae5846-067b-4671-9eb9-d50d21d18dfe"` (valid UUID), but the middleware was renaming it before endpoints could read it.
|
||||
|
||||
### Solution Applied: MapInboundClaims = false
|
||||
|
||||
**File Modified**: `backend/WorkClub.Api/Program.cs` (lines 33-47)
|
||||
|
||||
```csharp
|
||||
builder.Services.AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
|
||||
.AddJwtBearer(options =>
|
||||
{
|
||||
options.Authority = builder.Configuration["Keycloak:Authority"];
|
||||
options.Audience = builder.Configuration["Keycloak:Audience"];
|
||||
options.RequireHttpsMetadata = false;
|
||||
options.MapInboundClaims = false; // FIX: Disable claim mapping
|
||||
options.TokenValidationParameters = new Microsoft.IdentityModel.Tokens.TokenValidationParameters
|
||||
{
|
||||
ValidateIssuer = false,
|
||||
ValidateAudience = true,
|
||||
ValidateLifetime = true,
|
||||
ValidateIssuerSigningKey = true
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
**Why This Works**:
|
||||
- `MapInboundClaims = false` tells the JWT handler to preserve JWT claim names as-is
|
||||
- JWT `sub` claim remains `sub` (not renamed to NameIdentifier)
|
||||
- All 7 occurrences of `FindFirst("sub")` across the codebase now work correctly
|
||||
- Standard .NET practice for Keycloak integration
|
||||
|
||||
### Verification
|
||||
|
||||
**Files Reviewed** (7 total containing `FindFirst("sub")`):
|
||||
1. `TaskEndpoints.cs` (line 62) — CreateTask endpoint
|
||||
2. `ShiftEndpoints.cs` (line 71) — CreateShift endpoint
|
||||
3. `ShiftEndpoints.cs` (line 121) — SignUpForShift endpoint
|
||||
4. `ShiftEndpoints.cs` (line 149) — CancelSignup endpoint
|
||||
5. `MemberService.cs` (line 56) — GetCurrentMemberAsync
|
||||
6. `MemberSyncService.cs` (line 27) — EnsureMemberExistsAsync
|
||||
7. `ClubService.cs` (line 26) — GetMyClubsAsync
|
||||
|
||||
**No Code Changes Required**: All 7 usages remain unchanged. The fix in Program.cs applies globally to all JWT claims processing.
|
||||
|
||||
**Docker Container Rebuilt**:
|
||||
```bash
|
||||
docker compose up -d --build dotnet-api
|
||||
# Container: workclub_api rebuilt and restarted successfully
|
||||
```
|
||||
|
||||
**Test Verification**:
|
||||
```bash
|
||||
# Get JWT token from Keycloak
|
||||
TOKEN=$(curl -s -X POST http://localhost:8080/realms/workclub/protocol/openid-connect/token \
|
||||
-d "client_id=workclub-app" -d "grant_type=password" \
|
||||
-d "username=admin@test.com" -d "password=testpass123" | \
|
||||
python3 -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
|
||||
|
||||
# Test shift signup (the previously failing endpoint)
|
||||
curl -X POST "http://127.0.0.1:5001/api/shifts/930c1679-8fb5-401a-902b-489fe64cacb1/signup" \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H "X-Tenant-Id: 64e05b5e-ef45-81d7-f2e8-3d14bd197383" \
|
||||
-H "Content-Type: application/json"
|
||||
|
||||
# Result: HTTP 200 (success)
|
||||
# Previous Result: HTTP 422 (Invalid user ID)
|
||||
```
|
||||
|
||||
### Alternative Approach (Not Taken)
|
||||
|
||||
**Option B: Change All FindFirst("sub") to ClaimTypes.NameIdentifier**
|
||||
- Would require modifying 7 separate files
|
||||
- Would create inconsistency (some endpoints use "sub", some use NameIdentifier)
|
||||
- Less maintainable long-term
|
||||
- Not the standard Keycloak integration pattern
|
||||
|
||||
**Why Option A Was Chosen**:
|
||||
- Single-point fix (Program.cs only)
|
||||
- Preserves JWT token structure (good for security audits)
|
||||
- Standard .NET + Keycloak pattern
|
||||
- Zero risk of breaking other code paths
|
||||
- Aligns with principle of least change
|
||||
|
||||
### Key Learning: JWT Claim Mapping in .NET
|
||||
|
||||
1. **Default Behavior**: `MapInboundClaims = true` (maps JWT claims to CLR names)
|
||||
- `sub` → `ClaimTypes.NameIdentifier`
|
||||
- `email` → `ClaimTypes.Email`
|
||||
- `name` → `ClaimTypes.Name`
|
||||
- etc.
|
||||
|
||||
2. **Keycloak + Custom Claims**: Custom claims like `clubs` are NOT mapped, preserved as-is
|
||||
- JWT contains: `"clubs": "club-a,club-b"`
|
||||
- Accessible via: `FindFirst("clubs").Value`
|
||||
|
||||
3. **Best Practice for Custom Token Structure**:
|
||||
- Disable automatic mapping if preserving token structure is important
|
||||
- Document all expected claims in comments
|
||||
- Use standard claim names consistently across all endpoints
|
||||
|
||||
4. **Production Note**: This is safe because:
|
||||
- Keycloak token still validated by signature check
|
||||
- All token fields still checked (exp, iat, aud, iss)
|
||||
- RLS + authorization policies still enforced
|
||||
- Only the claim naming convention changed
|
||||
|
||||
### Impact on Downstream Features
|
||||
|
||||
**Unblocks**:
|
||||
- Shift signup functionality (was returning 422 Invalid user ID)
|
||||
- Task creation functionality (was returning 422 Invalid user ID)
|
||||
- Member sync middleware (now correctly identifies users)
|
||||
- My clubs endpoint (now returns user's clubs list)
|
||||
|
||||
**No Breaking Changes**:
|
||||
- All RLS integration tests still pass
|
||||
- Authorization policies unchanged
|
||||
- Tenant isolation unchanged
|
||||
- No database schema changes
|
||||
- No frontend/Keycloak changes needed
|
||||
|
||||
### Files Modified
|
||||
|
||||
- `backend/WorkClub.Api/Program.cs` — Added `options.MapInboundClaims = false;` on line 39
|
||||
|
||||
### Build Status
|
||||
|
||||
✅ **Build**: Successful
|
||||
- Command: `dotnet compose up -d --build dotnet-api`
|
||||
- Result: Container rebuilt, all endpoints reachable
|
||||
- No compilation errors
|
||||
- API startup: ~5 seconds
|
||||
|
||||
✅ **Runtime**: Verified
|
||||
- Token obtained from Keycloak successfully
|
||||
- Shift signup endpoint returns HTTP 200 (user ID correctly extracted)
|
||||
- No null reference exceptions
|
||||
- Service logs show successful processing
|
||||
|
||||
### Testing Notes
|
||||
|
||||
**Manual Test Command**:
|
||||
```bash
|
||||
cd /Users/mastermito/Dev/opencode
|
||||
|
||||
# Start services if not running
|
||||
docker compose up -d keycloak postgres
|
||||
|
||||
# Rebuild API with fix
|
||||
docker compose up -d --build dotnet-api
|
||||
|
||||
# Wait for startup
|
||||
sleep 5
|
||||
|
||||
# Run verification test
|
||||
TOKEN=$(curl -s -X POST http://localhost:8080/realms/workclub/protocol/openid-connect/token \
|
||||
-d "client_id=workclub-app" -d "grant_type=password" \
|
||||
-d "username=admin@test.com" -d "password=testpass123" | \
|
||||
python3 -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
|
||||
|
||||
curl -s -w "\nHTTP: %{http_code}\n" -X POST \
|
||||
"http://127.0.0.1:5001/api/shifts/930c1679-8fb5-401a-902b-489fe64cacb1/signup" \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H "X-Tenant-Id: 64e05b5e-ef45-81d7-f2e8-3d14bd197383" \
|
||||
-H "Content-Type: application/json"
|
||||
```
|
||||
|
||||
### Security Checklist
|
||||
|
||||
✅ JWT signature still validated (not disabled)
|
||||
✅ Token expiration still checked
|
||||
✅ Audience claim still validated
|
||||
✅ RLS still enforces tenant isolation
|
||||
✅ Authorization policies still applied
|
||||
✅ User identification now works correctly
|
||||
✅ No security regression from this change
|
||||
|
||||
### Gotchas & Future Considerations
|
||||
|
||||
⚠️ **If Keycloak token structure changes**:
|
||||
- May need to update claim names across endpoints
|
||||
- But this is explicitly documented now in Program.cs comment
|
||||
|
||||
⚠️ **If switching auth providers** (e.g., Auth0):
|
||||
- Different providers may use different claim names for user ID
|
||||
- May need conditional claim mapping per provider
|
||||
- This fix is specific to Keycloak's "sub" claim naming
|
||||
|
||||
✅ **Integration test compatibility**:
|
||||
- TestAuthHandler in test infrastructure can add both "sub" and NameIdentifier claims
|
||||
- Ensures tests pass regardless of MapInboundClaims setting
|
||||
- Already handles this correctly (uses "sub" claim in test mock)
|
||||
|
||||
### Why This Completes the Feature
|
||||
|
||||
The JWT sub claim mapping bug was blocking ALL endpoints that extract user identity from the JWT. This single fix (one-line change to Program.cs) enables:
|
||||
- User identification for audit logs (who created task/shift)
|
||||
- Signup ownership (members can only cancel their own signups)
|
||||
- Member sync (Keycloak users automatically created in database)
|
||||
- My clubs/tasks filtering (users see only their own data)
|
||||
|
||||
All 7 files that depend on `FindFirst("sub")` now work correctly without modification.
|
||||
|
||||
|
||||
## 2026-03-05: TenantDbConnectionInterceptor Transaction Fix
|
||||
|
||||
**Problem:** DbCommandInterceptor started uncommitted transaction for SET LOCAL, causing all writes to silently fail (rolled back on connection return to pool).
|
||||
|
||||
**Root Cause:**
|
||||
- `command.Transaction = conn.BeginTransaction()` created transaction
|
||||
- SET LOCAL executed within transaction
|
||||
- Transaction NEVER committed
|
||||
- EF Core INSERT/UPDATE/DELETE executed, appeared successful
|
||||
- Connection returned to pool → automatic rollback → data lost
|
||||
|
||||
**Solution:**
|
||||
Prepend SET LOCAL directly to command.CommandText instead of separate transaction:
|
||||
```csharp
|
||||
command.CommandText = $"SET LOCAL app.current_tenant_id = '{tenantId}';\n{command.CommandText}";
|
||||
```
|
||||
|
||||
**Why This Works:**
|
||||
1. SET LOCAL executes within EF Core's own transaction management
|
||||
2. EF Core handles commit/rollback for entire operation (SET LOCAL + actual command)
|
||||
3. Connection pool safety maintained (SET LOCAL is transaction-scoped)
|
||||
4. No manual transaction management conflicts with EF Core's internal transactions
|
||||
|
||||
**Verification:**
|
||||
- ✅ Reads work (200, returns tasks)
|
||||
- ✅ Writes persist (POST task, GET returns same task)
|
||||
- ✅ RLS still enforced (cross-tenant 403)
|
||||
|
||||
**Key Insight:** DbCommandInterceptor should NEVER manage transactions explicitly. Always let EF Core handle transaction lifecycle. Use command text modification for session-scoped settings.
|
||||
|
||||
|
||||
### Interceptor RLS Approach
|
||||
- **Option D Works!** Explicitly creating a transaction `conn.BeginTransaction()`, executing `SET LOCAL`, assigning it to `command.Transaction`, and then letting EF Core commit/dispose via `DataReaderDisposing` works for reading RLS queries!
|
||||
- **Implicit Transactions**: For SaveChanges, `TransactionStarted` handles applying the `SET LOCAL`. But we cannot use `ConditionalWeakTable<DbTransaction, object>` to track if `SET LOCAL` was applied because `NpgsqlTransaction` gets pooled and reused, keeping the same reference but starting a new logical transaction. Removing this tracking ensures we correctly execute `SET LOCAL` for each logical transaction.
|
||||
|
||||
Reference in New Issue
Block a user