Phase 2: Email Content Enrichment — Notes + Scraped Website Content

Status: ✅ IMPLEMENTED
Prerequisites: Phase 1 complete (email intent framing must be in place)
Objective: Make every email generation pull in the user's actual notes and the latest scraped website content, so email copy references specific products, features, and updates rather than generic company descriptions.

Background

The current email generator (replaceTagsWithContent(), which calls generateContentForRoles()) has access only to:

company_name, website, description — static, rarely updated
The prompt passed at generation time

Two data sources already exist and are not being used:

Table	Column	What it contains	Current use
`customer_notes`	`content`	User-written inspiration notes per client	Auto-generate only (`converted: false`)
`client_content_asset`	`content`, `url`	Markdown of scraped website pages	Stored, never injected into prompts

This phase wires both into every email generation path — manual, bot, and auto-generate.

8.1 — New Service: `email-context.service.ts`

File to create: api/src/services/email-context.service.ts

Fetches email-specific context for a customer by running the notes query and the scraped-page query in parallel.

import { createSupabaseClient } from '../utils/supabase.js'

export interface EmailContext {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

export async function getEmailContext(customerId: number): Promise<EmailContext> {
  const supabase = createSupabaseClient()

  const [notesResult, assetsResult] = await Promise.all([
    supabase
      .from('customer_notes')
      .select('content')
      .eq('customer_id', customerId)
      .eq('converted', false)
      .order('created_at', { ascending: false })
      .limit(10),

    supabase
      .from('client_content_asset')
      .select('url, content')
      .eq('customer_id', customerId)
      .eq('asset_type', 'scraped_page')
      .not('content', 'is', null)
      .order('created_at', { ascending: false })
      .limit(5),
  ])

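  // supabase-js reports query failures on `error` instead of throwing, so log
  // them here; otherwise a failed query looks identical to "no rows found".
  if (notesResult.error || assetsResult.error) {
    console.warn('getEmailContext: query failed', {
      customerId,
      notesError: notesResult.error?.message,
      assetsError: assetsResult.error?.message,
    })
  }

  const notes = (notesResult.data ?? [])
    .map((r) => (r.content as string | null)?.trim())
    .filter((s): s is string => !!s)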

  const scrapedPages = (assetsResult.data ?? [])
    .map((r) => ({
      url: r.url as string,
      // Trim to first 500 chars per page — enough context without overwhelming the prompt
      excerpt: ((r.content as string | null) ?? '').slice(0, 500).trim(),
    }))
    .filter((p) => p.excerpt.length > 0)

  return { notes, scrapedPages }
}
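
The row-to-context mapping above can be separated from the Supabase queries so it can be tested without a database. A minimal sketch, assuming hypothetical names NoteRow, AssetRow, and toEmailContext (none of them appear in this plan) and the same 500-character excerpt limit:

```typescript
// Hypothetical row shapes; the real Supabase rows may carry more columns.
interface NoteRow { content: string | null }
interface AssetRow { url: string; content: string | null }

interface EmailContext {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

const EXCERPT_CHARS = 500 // same per-page limit as getEmailContext()

// Pure function: trims notes, builds excerpts, and drops anything empty.
function toEmailContext(noteRows: NoteRow[], assetRows: AssetRow[]): EmailContext {
  const notes = noteRows
    .map((r) => r.content?.trim() ?? '')
    .filter((s) => s.length > 0)

  const scrapedPages = assetRows
    .map((r) => ({
      url: r.url,
      excerpt: (r.content ?? '').slice(0, EXCERPT_CHARS).trim(),
    }))
    .filter((p) => p.excerpt.length > 0)

  return { notes, scrapedPages }
}
```

With this split, getEmailContext() would only run the two queries and hand their data arrays to the helper.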

Tasks

[ ] Create api/src/services/email-context.service.ts with getEmailContext() as above

8.2 — Extend `generateContentForRoles()` in `smart-mapper.ts`

Target file: api/src/utils/smart-mapper.ts

Interface change

Add emailContext to the context parameter accepted by generateContentForRoles():

// Before
export async function generateContentForRoles(
  mappings: RoleMapping[],
  context: {
    company_name: string
    website: string
    description: string
    prompt: string
  }
): Promise<Record<string, string>>

// After
export async function generateContentForRoles(
  mappings: RoleMapping[],
  context: {
    company_name: string
    website: string
    description: string
    prompt: string
    emailContext?: {
      notes: string[]
      scrapedPages: Array<{ url: string; excerpt: string }>
    }
  }
): Promise<Record<string, string>>
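
The inline shape in the "After" signature duplicates EmailContext from 8.1. One option is to reuse the interface so the two definitions cannot drift apart. A minimal sketch, assuming a hypothetical GenerationContext alias that is not part of this plan; EmailContext is redeclared here only so the snippet stands alone:

```typescript
// Redeclared for the sketch. In the codebase this would presumably be:
//   import type { EmailContext } from '../services/email-context.service.js'
interface EmailContext {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

// Hypothetical alias for the context parameter of generateContentForRoles().
type GenerationContext = {
  company_name: string
  website: string
  description: string
  prompt: string
  emailContext?: EmailContext // optional, so existing callers are unaffected
}

// An existing-style call site (no emailContext) still type-checks.
const legacyContext: GenerationContext = {
  company_name: 'Acme',
  website: 'https://example.com',
  description: 'Example company',
  prompt: 'Announce the spring sale',
}

// A new-style call site adds the enrichment data.
const enrichedContext: GenerationContext = {
  ...legacyContext,
  emailContext: { notes: ['Mention free shipping'], scrapedPages: [] },
}
```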

Prompt injection

When emailContext is provided and non-empty, inject it as the first block in each role's generation prompt, before the company description:

${notesBlock}
${scrapedPagesBlock}

Company: {company_name} ({website})
Description: {description}

Brief: {prompt}

Where:

const notesBlock = context.emailContext?.notes.length
  ? `User notes — use these as specific inspiration for this email:\n${context.emailContext.notes.map((n) => `- ${n}`).join('\n')}`
  : ''

const scrapedPagesBlock = context.emailContext?.scrapedPages.length
  ? `Recent website/product content — reference specific details from here:\n${context.emailContext.scrapedPages
      .map((p) => `[${p.url}]\n${p.excerpt}`)
      .join('\n\n')}`
  : ''

An empty block is omitted entirely rather than left as a blank line, so the prompt carries no noise when data is unavailable.
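
Interpolating the two blocks directly into a template string would leave empty lines behind whenever a block is ''. One way to honor the omission rule is to collect the sections in an array and drop the empty ones before joining. A minimal sketch, assuming a hypothetical buildRolePrompt helper and PromptContext type; the actual prompt layout inside generateContentForRoles() is not shown in this plan and may differ:

```typescript
interface EmailContext {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

interface PromptContext {
  company_name: string
  website: string
  description: string
  prompt: string
  emailContext?: EmailContext
}

// Hypothetical helper: assembles one role's prompt, skipping empty sections.
function buildRolePrompt(context: PromptContext): string {
  const notes = context.emailContext?.notes ?? []
  const pages = context.emailContext?.scrapedPages ?? []

  const notesBlock = notes.length > 0
    ? `User notes (use these as specific inspiration for this email):\n${notes
        .map((n) => `- ${n}`)
        .join('\n')}`
    : ''

  const scrapedPagesBlock = pages.length > 0
    ? `Recent website/product content (reference specific details from here):\n${pages
        .map((p) => `[${p.url}]\n${p.excerpt}`)
        .join('\n\n')}`
    : ''

  const sections = [
    notesBlock,
    scrapedPagesBlock,
    `Company: ${context.company_name} (${context.website})\nDescription: ${context.description}`,
    `Brief: ${context.prompt}`,
  ]

  // Empty blocks are removed entirely, so no stray blank lines reach the model.
  return sections.filter((s) => s.length > 0).join('\n\n')
}
```

Because the filter runs on the whole section list, a customer with notes but no scraped pages (or the reverse) gets exactly one enrichment block followed by the company section.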

Tasks

[ ] Add emailContext to the context type in generateContentForRoles()
[ ] Build notesBlock and scrapedPagesBlock strings inside the function
[ ] Inject them at the top of each role's generation prompt
[ ] Verify existing call sites still compile (they pass no emailContext; the field is optional, so this is not a breaking change)

8.3 — Thread `emailContext` through `replaceTagsWithContent()`

Target file: api/src/agents/createPosts.ts

replaceTagsWithContent() is the active email generator in contentMap. It calls generateContentForRoles() internally.

Change

Add emailContext to the PostContent input object and forward it to generateContentForRoles():

// In PostContent type (or inline in the function signature)
emailContext?: {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

// Inside replaceTagsWithContent():
const content = await generateContentForRoles(mappings, {
  company_name: customer.name,
  website: customer.url ?? '',
  description: customer.description ?? '',
  prompt,
  emailContext: postContent.emailContext,  // ← forwarded
})

Tasks

[ ] Add emailContext to the PostContent type / replaceTagsWithContent() parameter
[ ] Forward emailContext to generateContentForRoles() call

8.4 — Inject `emailContext` in `getPlatformContent()`

Target file: api/src/modules/posts.ts

getPlatformContent() is the quality-gate loop that calls each generator. It already has access to the customer object (which has customer.id).

Change

Before the retry loop, check whether the platform is email and, if so, fetch the context once:

// Near the top of getPlatformContent(), before the quality-gate loop:
let emailContext:
  | { notes: string[]; scrapedPages: Array<{ url: string; excerpt: string }> }
  | undefined

if (platform === 'email') {
  try {
    emailContext = await getEmailContext(customer.id)
  } catch (err) {
    // Non-blocking: generation continues without enrichment
    console.warn('getPlatformContent: failed to fetch email context', {
      customerId: customer.id,
      err,
    })
  }
}

// Then pass emailContext into the contentMap call:
const result = await contentMap[platform]({
  campaign,
  customer,
  prompt,
  brandVoice,
  toneExamples,
  emailContext,
})

Import

import { getEmailContext } from '../services/email-context.service.js'
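
The fetch-once, fail-open behavior in this section can be isolated in a small helper, which also makes it easy to exercise with a fake fetcher. A minimal sketch, assuming a hypothetical loadEmailContextSafely wrapper, an injected fetchContext function standing in for getEmailContext(), and an optional warn callback; none of these names are part of the plan:

```typescript
interface EmailContext {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

// Stand-in for getEmailContext(); injected so tests need no database.
type FetchContext = (customerId: number) => Promise<EmailContext>

// Hypothetical wrapper: returns undefined for non-email platforms and on any
// thrown error, so generation can continue without enrichment.
async function loadEmailContextSafely(
  platform: string,
  customerId: number,
  fetchContext: FetchContext,
  warn: (message: string, details: unknown) => void = () => {},
): Promise<EmailContext | undefined> {
  if (platform !== 'email') return undefined
  try {
    return await fetchContext(customerId)
  } catch (err) {
    warn('getPlatformContent: failed to fetch email context', { customerId, err })
    return undefined
  }
}
```

Calling the helper once before the quality-gate loop and reusing its result on every retry keeps the number of database round trips constant regardless of how many attempts the loop makes.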

Tasks

[ ] Import getEmailContext in posts.ts
[ ] Add email context fetch block before the quality-gate retry loop in getPlatformContent()
[ ] Pass emailContext to the contentMap[platform]() call
[ ] Wrap in try/catch — generation must not fail if notes/assets are unavailable

8.5 — Apply to Manual Generation Path

The manual generation path (POST /generate_post → generateAdHocPosts() → getPlatformContent()) already goes through getPlatformContent(), so changes from 8.4 apply automatically. No additional wiring needed.

Verify: The change in 8.4 must apply to all callers of getPlatformContent():

generatePosts() (campaign-based) ✅
generateAdHocPosts() (ad-hoc) ✅
generateContent() (single platform regen) ✅
runSingleGeneration() (auto-generate) ✅

Testing Checklist

Setup

Create at least 2 unconverted notes for a test customer via the Notepad UI
Ensure the customer has a recent website crawl (check client_content_asset for asset_type = 'scraped_page' rows)

Verification

[ ] Generate an email via POST /generate_post with platforms: ["email"] — inspect the saved customer_platform_post.content for references to note content or website-specific details
[ ] Generate an email for a customer with no notes and no scraped pages — verify generation succeeds normally with no errors
[ ] Regenerate an existing email post via POST /regenerate_post — verify enriched context is applied
[ ] Check that the server logs confirm getEmailContext was called (add a brief log line for this)
[ ] Run type check: cd api && pnpm build — no TypeScript errors
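
For the log-line check above, one possible shape is a one-line summary that records counts only, so note text and scraped content stay out of the logs. A minimal sketch, assuming a hypothetical describeEmailContext helper that is not part of this plan:

```typescript
interface EmailContext {
  notes: string[]
  scrapedPages: Array<{ url: string; excerpt: string }>
}

// Hypothetical helper: builds a compact, content-free summary for logging.
function describeEmailContext(customerId: number, ctx: EmailContext): string {
  return `getEmailContext: customerId=${customerId} notes=${ctx.notes.length} scrapedPages=${ctx.scrapedPages.length}`
}
```

The returned string could be passed to whatever logger the API already uses, right after the context is fetched.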

Edge Cases

[ ] Customer with 10 notes and 5 scraped pages — verify prompt length doesn't cause LLM errors (monitor token counts)
[ ] client_content_asset rows with null content — verify they are filtered out cleanly
[ ] Notes with very long content (>2,000 chars) — the getEmailContext service does not truncate notes; if token limits are a concern, add a per-note .slice(0, 400) cap
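
The per-note cap mentioned in the last edge case, together with the limits already in getEmailContext(), gives a simple upper bound on enrichment size. A minimal sketch, assuming a hypothetical capNotes helper and the 400-character cap suggested above; the bound counts content characters only and ignores labels, URLs, and separators:

```typescript
const MAX_NOTES = 10        // .limit(10) on customer_notes
const MAX_PAGES = 5         // .limit(5) on client_content_asset
const EXCERPT_CHARS = 500   // .slice(0, 500) per scraped page
const MAX_NOTE_CHARS = 400  // suggested per-note cap (an assumption, not yet implemented)

// Hypothetical helper: truncates each note and drops any that end up empty.
function capNotes(notes: string[], maxChars: number = MAX_NOTE_CHARS): string[] {
  return notes
    .map((n) => n.slice(0, maxChars).trim())
    .filter((n) => n.length > 0)
}

// Worst case: 10 * 400 + 5 * 500 = 6500 characters of enrichment content.
const worstCaseChars = MAX_NOTES * MAX_NOTE_CHARS + MAX_PAGES * EXCERPT_CHARS
```

Without the note cap the scraped pages are still bounded at 2,500 characters, but notes are unbounded. How many tokens 6,500 characters represents depends on the model's tokenizer and is not estimated here.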

Rollback

Remove the emailContext fetch block from getPlatformContent() in posts.ts
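Revert the generateContentForRoles() signature (remove emailContext param), and remove the emailContext field and its forwarding in replaceTagsWithContent() in createPosts.ts, since that call would no longer match the reverted type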
No database changes to revert — this phase makes no schema changes