Overview

Last resort. Transcripts are token-heavy (a 1-hour meeting can exceed 30K tokens). Try search_notes first — it returns curated documents (one-pager, custom) that are typically 10× more compact and equally informative. Use this tool only when you need exact transcribed words for a direct quote.
Breaking change (2026-05-06). The response shape was unified around speaker-attributed segments. The fields transcription, paragraphs[].text, hasDiarization, and totalParagraphs have been removed. Every paragraph now exposes a segments[] array of {content, speaker} objects; speaker is {label, name} when diarization data is available, otherwise null. See the Migration section below to reconstruct the old fields from the new shape.
get_note_transcript returns the full transcript of a note as paragraphs of speaker-attributed segments, plus note metadata. The shape is uniform whether or not the note has diarization data: every paragraph carries a segments[] array, and each segment has a content string and a speaker object (or null). Both content and speaker.name are HTML-stripped at the MCP layer.
Primary Use Cases:
  • Quoting exact words spoken in the meeting.
  • Knowing who said what — the per-segment speaker field is the primary use case for this tool.
  • Slicing the transcript by time range using the structured paragraphs array.
  • Analyzing conversation context when documents don’t capture the wording you need.
Key Features:
  • Uniform shape: every paragraph contains segments[] regardless of diarization availability.
  • Speaker info: segment.speaker is {label, name} when diarization exists, null otherwise.
  • HTML-stripped at the MCP layer (content and speaker.name are both sanitized).
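The time-range use case listed above can be sketched as a small filter over the paragraphs array. This is an illustrative helper, not part of the tool: the name sliceByTimeRange is hypothetical, and it assumes timeFrom and timeTo are ISO 8601 strings (or null) as described under Response Format.

```javascript
// Hypothetical helper: keep the paragraphs that overlap the range [from, to].
// Assumes timeFrom/timeTo are ISO 8601 strings or null, as documented.
function sliceByTimeRange(paragraphs, from, to) {
  const start = Date.parse(from);
  const end = Date.parse(to);
  return paragraphs.filter((p) => {
    // Paragraphs without timing information cannot be placed in a range.
    if (!p.timeFrom || !p.timeTo) return false;
    // Overlap test: starts before the range ends, ends after the range starts.
    return Date.parse(p.timeFrom) <= end && Date.parse(p.timeTo) >= start;
  });
}

const sample = [
  {
    timeFrom: "2024-01-15T10:30:05Z",
    timeTo: "2024-01-15T10:30:18Z",
    segments: [{ content: "Let's start.", speaker: null }],
  },
  {
    timeFrom: "2024-01-15T10:45:00Z",
    timeTo: "2024-01-15T10:45:20Z",
    segments: [{ content: "Budget next.", speaker: null }],
  },
  { timeFrom: null, timeTo: null, segments: [{ content: "Untimed.", speaker: null }] },
];

// Only the first paragraph falls inside the first ten minutes.
const firstTenMinutes = sliceByTimeRange(
  sample,
  "2024-01-15T10:30:00Z",
  "2024-01-15T10:40:00Z"
);
```

Dropping paragraphs with null timestamps is a design choice for this sketch; a caller that prefers to keep them could return true in that branch instead.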

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| noteGuid | string | Yes | Unique note identifier |

noteGuid (required)

The unique identifier of the note to retrieve. Example:
{
  "noteGuid": "abc-123-def"
}
How to Get Note GUID:
  1. Use list_notes (lightweight) to find candidates by folder, date, or keyword.
  2. Use the noteGuid from the result. If you also need document content for context, fetch it via search_notes first; only fall back to this tool when verbatim spoken words are required.

Response Format

Success Response — Diarized Note

When the note has diarization data, paragraphs contain one or more segments and each segment’s speaker is an object with label (engine-assigned, stable within the note) and name (resolved person name, or null when the speaker is unmapped):
{
  "noteGuid": "abc-123-def",
  "title": "Weekly Team Meeting",
  "participants": ["Alice Kim", "Bob Park"],
  "createdAt": "2024-01-15T10:30:00Z",
  "recordingDurationSeconds": 3625,
  "paragraphs": [
    {
      "timeFrom": "2024-01-15T10:30:05Z",
      "timeTo":   "2024-01-15T10:30:18Z",
      "segments": [
        {
          "content": "Let's start with the project update.",
          "speaker": { "label": "SPEAKER_0", "name": "Alice Kim" }
        },
        {
          "content": "Sure, I've completed the migration.",
          "speaker": { "label": "SPEAKER_1", "name": "Bob Park" }
        }
      ]
    }
  ]
}

Success Response — Non-Diarized Note

When the note has no diarization data, each paragraph is reduced to a single segment whose speaker is null. The shape is identical otherwise — callers can iterate over paragraphs[].segments[] without branching:
{
  "noteGuid": "abc-123-def",
  "title": "Solo Voice Memo",
  "participants": ["Alice Kim"],
  "createdAt": "2024-01-15T10:30:00Z",
  "recordingDurationSeconds": 120,
  "paragraphs": [
    {
      "timeFrom": "2024-01-15T10:30:05Z",
      "timeTo":   "2024-01-15T10:30:18Z",
      "segments": [
        {
          "content": "Reminder to send the contract by Friday.",
          "speaker": null
        }
      ]
    }
  ]
}
Field Descriptions:
| Field | Type | Description |
| --- | --- | --- |
| noteGuid | string | Unique note identifier. |
| title | string | Note title. |
| participants | string[] | Participant names (or emails when names are unavailable). |
| createdAt | string | ISO 8601 creation timestamp. |
| recordingDurationSeconds | number | Recording length in seconds. |
| paragraphs[] | array | Structured paragraphs. Each paragraph is a time-bounded chunk containing one or more speaker-attributed segments. |
| paragraphs[].timeFrom | string or null | ISO 8601 start time of the paragraph. |
| paragraphs[].timeTo | string or null | ISO 8601 end time of the paragraph. |
| paragraphs[].segments[] | array | Speaker-attributed slices of the paragraph. Always non-empty when the paragraph is returned (empty segments and empty paragraphs are filtered out). |
| paragraphs[].segments[].content | string | HTML-stripped plain-text content of the segment. |
| paragraphs[].segments[].speaker | object or null | {label, name} when diarization data is available; null otherwise. |
| paragraphs[].segments[].speaker.label | string | Diarization-engine label (e.g. SPEAKER_0). Stable within a single note. |
| paragraphs[].segments[].speaker.name | string or null | Resolved person name when the user has mapped the speaker; null otherwise. |
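Because the shape is uniform, a "who said what" view needs only one loop over paragraphs[].segments[]. The sketch below is illustrative: formatSegments is a hypothetical helper, and the fallback order (name, then label, then the literal "Unknown") is an assumption about what a caller might want, not behavior of the tool.

```javascript
// Hypothetical formatter: one "Speaker: text" line per segment.
// Fallback order is an assumption: speaker.name, then speaker.label,
// then "Unknown" when speaker is null (non-diarized note).
function formatSegments(response) {
  const lines = [];
  for (const paragraph of response.paragraphs) {
    for (const segment of paragraph.segments) {
      const who = segment.speaker
        ? (segment.speaker.name ?? segment.speaker.label)
        : "Unknown";
      lines.push(`${who}: ${segment.content}`);
    }
  }
  return lines;
}

// Diarized sample: the second speaker has not been mapped to a person.
const diarized = {
  paragraphs: [
    {
      timeFrom: "2024-01-15T10:30:05Z",
      timeTo: "2024-01-15T10:30:18Z",
      segments: [
        {
          content: "Let's start with the project update.",
          speaker: { label: "SPEAKER_0", name: "Alice Kim" },
        },
        {
          content: "Sure, I've completed the migration.",
          speaker: { label: "SPEAKER_1", name: null },
        },
      ],
    },
  ],
};

// Non-diarized sample: speaker is null.
const memo = {
  paragraphs: [
    {
      timeFrom: "2024-01-15T10:30:05Z",
      timeTo: "2024-01-15T10:30:18Z",
      segments: [{ content: "Reminder to send the contract by Friday.", speaker: null }],
    },
  ],
};

console.log(formatSegments(diarized).join("\n"));
console.log(formatSegments(memo).join("\n"));
```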

Migration

The pre-2026-05-06 response carried transcription (a joined string), paragraphs[].text (per-paragraph plain text), hasDiarization, and totalParagraphs. All four are gone. They can be reconstructed from the new shape:
// Reconstruct the old `paragraphs[].text` field.
const text = paragraph.segments.map(s => s.content).join(" ");

// Reconstruct the old `transcription` joined string.
const transcription = response.paragraphs
  .map(p => {
    const ts = p.timeFrom ? `[${p.timeFrom.slice(11, 19)}] ` : "";
    return ts + p.segments.map(s => s.content).join(" ");
  })
  .join("\n");

// Reconstruct the old `hasDiarization` flag.
const hasDiarization = response.paragraphs.some(p =>
  p.segments.some(s => s.speaker !== null)
);

// Reconstruct the old `totalParagraphs` count.
const totalParagraphs = response.paragraphs.length;
If you need “who said what”, read segments[].speaker directly — that’s the new primary use case for this tool.

Usage Examples

Example 1: Basic Usage

AI Request:
Show me the full transcript of meeting abc-123-def
MCP Call:
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "get_note_transcript",
    "arguments": {
      "noteGuid": "abc-123-def"
    }
  },
  "id": 1
}
Response:
{
  "jsonrpc": "2.0",
  "result": {
    "noteGuid": "abc-123-def",
    "title": "Q4 Marketing Strategy Meeting",
    "participants": ["Sarah Kim", "John Park"],
    "createdAt": "2025-11-10T10:00:00Z",
    "recordingDurationSeconds": 3625,
    "paragraphs": [
      {
        "timeFrom": "2025-11-10T10:00:05Z",
        "timeTo":   "2025-11-10T10:00:18Z",
        "segments": [
          {
            "content": "Welcome to the Q4 marketing strategy meeting.",
            "speaker": { "label": "SPEAKER_0", "name": "Sarah Kim" }
          }
        ]
      },
      {
        "timeFrom": "2025-11-10T10:00:18Z",
        "timeTo":   "2025-11-10T10:00:42Z",
        "segments": [
          {
            "content": "Thanks. Let me present our website analytics.",
            "speaker": { "label": "SPEAKER_1", "name": "John Park" }
          }
        ]
      }
    ]
  },
  "id": 1
}
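Once a result like the one above is in hand, pulling a direct quote is a matter of scanning the segments. The following is a minimal sketch under stated assumptions: findQuotes is a hypothetical helper, the match is a simple case-insensitive substring test, and the input is the object under "result" in the response above.

```javascript
// Hypothetical helper: find segments whose text contains a keyword.
// Returns the verbatim quote, the speaker (name, else label, else null),
// and the start time of the enclosing paragraph.
function findQuotes(result, keyword) {
  const needle = keyword.toLowerCase();
  const hits = [];
  for (const paragraph of result.paragraphs) {
    for (const segment of paragraph.segments) {
      if (segment.content.toLowerCase().includes(needle)) {
        hits.push({
          quote: segment.content,
          speaker: segment.speaker
            ? (segment.speaker.name ?? segment.speaker.label)
            : null,
          at: paragraph.timeFrom,
        });
      }
    }
  }
  return hits;
}

// Trimmed copy of the example result above.
const result = {
  paragraphs: [
    {
      timeFrom: "2025-11-10T10:00:05Z",
      timeTo: "2025-11-10T10:00:18Z",
      segments: [
        {
          content: "Welcome to the Q4 marketing strategy meeting.",
          speaker: { label: "SPEAKER_0", name: "Sarah Kim" },
        },
      ],
    },
    {
      timeFrom: "2025-11-10T10:00:18Z",
      timeTo: "2025-11-10T10:00:42Z",
      segments: [
        {
          content: "Thanks. Let me present our website analytics.",
          speaker: { label: "SPEAKER_1", name: "John Park" },
        },
      ],
    },
  ],
};

const quotes = findQuotes(result, "analytics");
```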

Example 2: Progressive Disclosure Pattern

Step 1: Search
Find recent marketing meetings
Uses search_notes - Fast (~1 second, ~100 tokens)
Step 2: Summary
What was discussed in meeting abc-123-def?
Uses get_note(include: ['summary']) - Medium (~2 seconds, ~500 tokens)
Step 3: Full Transcript (Only if Needed)
Show me the exact conversation about budget allocation
Uses get_note_transcript - Slow (~5 seconds, ~4,000 tokens)
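The three steps can be sketched as a loop that escalates only while the cheaper result is not enough. Everything client-side here is an assumption: callTool(name, args) stands in for whatever async call your MCP client provides, isEnough is a caller-supplied check, and the argument names query (for search_notes) and noteGuid (for get_note) are guesses that this page does not confirm.

```javascript
// Sketch of progressive disclosure. `callTool` and `isEnough` are
// hypothetical and supplied by the caller; the argument names for
// search_notes and get_note are assumptions.
async function answerProgressively(callTool, noteGuid, query, isEnough) {
  const steps = [
    { tool: "search_notes", args: { query } },
    { tool: "get_note", args: { noteGuid, include: ["summary"] } },
    { tool: "get_note_transcript", args: { noteGuid } },
  ];
  let last = null;
  for (const { tool, args } of steps) {
    const result = await callTool(tool, args);
    last = { tool, result };
    // Stop at the cheapest tool whose result satisfies the caller.
    if (isEnough(result)) break;
  }
  return last;
}
```

With this structure the transcript is fetched only when both earlier results fail the isEnough check, which matches the "Only if Needed" wording of step 3.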

Token Usage

Typical Meeting (1 hour): ~3,000-5,000 tokens

Compared to the pre-2026-05-06 shape, single-speaker notes see ~30% reduction because the redundant transcription joined string is no longer emitted. Multi-speaker (diarized) notes are roughly equivalent: the per-segment speaker metadata replaces the old transcription payload, and the segment content is the same data the old text field carried.

Comparison (3-tier discovery):
| Tool | Token Usage | When to Use |
| --- | --- | --- |
| list_notes | ~50 tokens per note | Finding which notes exist (metadata only) |
| search_notes | ~1,500 tokens per note | Reading the content behind a topic (returns documents) |
| get_note_transcript | ~3,000-5,000 tokens per hour | Last resort: exact spoken words for quoting |
| get_note(include: ['summary']) | ~200-800 tokens | Quick AI summary of one note |
| get_note(include: ['documents']) | ~300-2,000 tokens | Specific generated document |

Token Savings: Try search_notes first. Its document output is typically 10× more compact than the raw transcript and answers most “what did the team decide” questions. Only reach for get_note_transcript when you genuinely need verbatim wording.
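The per-hour figure in the table can be turned into a rough budget check before fetching a transcript. This is a back-of-the-envelope sketch: it assumes token count scales linearly with recording length at ~3,000-5,000 tokens per hour, and it assumes the caller already knows the recording length in seconds (this page does not say which other tools expose it). Both function names are hypothetical.

```javascript
// Rough estimate based on the ~3,000-5,000 tokens/hour figure above.
// Linear scaling with duration is an assumption.
const TOKENS_PER_HOUR = { low: 3000, high: 5000 };

function estimateTranscriptTokens(recordingDurationSeconds) {
  const hours = recordingDurationSeconds / 3600;
  return {
    low: Math.round(hours * TOKENS_PER_HOUR.low),
    high: Math.round(hours * TOKENS_PER_HOUR.high),
  };
}

// Conservative check: compare the upper estimate against the budget.
function fitsBudget(recordingDurationSeconds, budgetTokens) {
  return estimateTranscriptTokens(recordingDurationSeconds).high <= budgetTokens;
}

const oneHour = estimateTranscriptTokens(3600); // { low: 3000, high: 5000 }
const twoHoursFits = fitsBudget(7200, 8000);    // false: upper estimate is 10000
```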

Best Practices

Always follow this order:
  1. list_notes — find which notes exist (~50 tokens per note)
  2. search_notes — read the documents for a matched topic (~1,500 tokens per note)
  3. get_note_transcript — only if exact wording is required (~4,000 tokens per hour)
Most “what did we decide?” questions stop at step 2. This saves up to 90% of tokens compared to jumping straight to transcripts.
Use full transcripts when you need to:
  • Find exact quotes or statements
  • Analyze detailed conversation flow
  • Understand complete context
  • Reference specific discussions
  • Know who said what — the segments[].speaker field makes this a primary use case for this tool.
Don’t use transcripts for:
  • Quick overview (use get_note(include: ['summary']) instead)
  • Action items (use get_note(include: ['documents']))
  • General understanding (use get_note(include: ['summary']))
For very long meetings (2+ hours):
  • Expect 5,000-10,000 tokens
  • May cause timeout (60 seconds)
  • Consider requesting specific sections via summary first
  • Use summary to identify relevant parts before loading full transcript

Common Errors

Missing Note GUID

Solution: Provide the noteGuid parameter.

Note Not Found

Possible Causes:
  • Note doesn’t exist
  • Note has been deleted
  • You don’t have access to this note
Solution: Verify the note GUID using search_notes.

Request Timeout

Causes:
  • Very long meeting transcript (2+ hours)
  • Server load
Solutions:
  1. Use get_note(include: ['summary']) instead
  2. Retry after a short delay
  3. Contact support if problem persists
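The first two solutions can be combined into one small wrapper: retry the transcript after a short delay, then fall back to the summary. This is a sketch under stated assumptions, not a prescribed client: callTool(name, args) is a hypothetical async function from your MCP client that rejects on timeout, the noteGuid argument for get_note is assumed, and for brevity every rejection is treated as retryable rather than timeouts only.

```javascript
// Sketch: retry get_note_transcript, then fall back to the summary.
// `callTool` is hypothetical; any rejection is treated as retryable here.
async function getTranscriptOrSummary(callTool, noteGuid, options = {}) {
  const { retries = 1, delayMs = 1000 } = options;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const result = await callTool("get_note_transcript", { noteGuid });
      return { source: "transcript", result };
    } catch (err) {
      // Wait before the next attempt; skip the wait after the final failure.
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // All transcript attempts failed: fall back to the lighter summary.
  const result = await callTool("get_note", { noteGuid, include: ["summary"] });
  return { source: "summary", result };
}
```

The returned source field lets the caller tell the user that a summary was substituted for verbatim text, which matters when the original request was for an exact quote.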

Performance

Response Time:
| Meeting Length | Tokens | Response Time |
| --- | --- | --- |
| 30 minutes | ~1,500 | ~2 seconds |
| 1 hour | ~3,000 | ~3 seconds |
| 2 hours | ~6,000 | ~5 seconds |
| 3+ hours | ~9,000+ | ~8 seconds (may time out) |

Cache Behavior: Transcripts are cached for 15 minutes. Subsequent requests for the same note will be faster.