@tpmjs/tools-table-extract
Extract HTML tables as structured data from web pages
Module not found "https://esm.sh/node:sqlite?target=denonext".
at https://esm.sh/undici@7.16.0?target=denonext:2:8
Last checked: 1/1/2026, 1:05:53 AM
Install this tool and use it with the AI SDK
npm install @tpmjs/tools-table-extract
pnpm add @tpmjs/tools-table-extract
yarn add @tpmjs/tools-table-extract
bun add @tpmjs/tools-table-extract
deno add npm:@tpmjs/tools-table-extract

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { tableExtractTool } from '@tpmjs/tools-table-extract';

const result = await generateText({
  model: openai('gpt-4o'),
  tools: { tableExtractTool },
  prompt: 'Your prompt here...',
});
console.log(result.text);

Available configuration options
url (string): The URL to fetch and extract tables from
tableIndex (number): Which table to extract (0-based index, default: 0 for first table)
Extract HTML tables from web pages and convert them to structured data.
npm install @tpmjs/tools-table-extract
import { tableExtractTool } from '@tpmjs/tools-table-extract';

// Extract all tables from a page
const result = await tableExtractTool.execute({
  url: 'https://example.com/data'
});

console.log(result);
// {
//   url: 'https://example.com/data',
//   tables: [
//     {
//       headers: ['Name', 'Price', 'Stock'],
//       rows: [
//         { name: 'Widget', price: '$10', stock: '50' },
//         { name: 'Gadget', price: '$20', stock: '30' }
//       ],
//       rowCount: 2,
//       columnCount: 3,
//       caption: 'Product Inventory'
//     }
//   ],
//   tableCount: 1,
//   metadata: {
//     fetchedAt: '2025-12-31T12:00:00.000Z',
//     domain: 'example.com'
//   }
// }

// Extract only the first table (index 0)
const firstTable = await tableExtractTool.execute({
  url: 'https://example.com/data',
  tableIndex: 0
});
Smart Header Detection: Automatically finds headers from:
  <thead> elements
  <th> cells in first row
Structured Output: Converts tables to arrays of objects using headers as keys
Caption Extraction: Captures table captions when present
Flexible Extraction: Extract all tables or a specific table by index
Normalized Keys: Header text is normalized for use as object keys
Empty Row Filtering: Skips rows with no data
Comprehensive Error Handling: Detailed error messages for network issues
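The error-handling feature above suggests that a failed fetch surfaces as a thrown error. A minimal sketch of catching it follows, assuming `execute` rejects with an `Error` on failure (this page does not confirm that). `TableTool`, `Outcome`, and `safeExtract` are hypothetical names introduced only for illustration.

```typescript
// Minimal shape of the tool as it is called on this page;
// the real export may carry more fields (assumption).
interface TableTool<I, O> {
  execute(input: I): Promise<O>;
}

// Hypothetical result wrapper: either a value or an error message.
interface Outcome<O> {
  ok: boolean;
  value?: O;
  message?: string;
}

// Hypothetical helper: runs the tool and turns a thrown error into a value,
// so callers can branch on `ok` instead of wrapping every call in try/catch.
async function safeExtract<I, O>(
  tool: TableTool<I, O>,
  input: I
): Promise<Outcome<O>> {
  try {
    return { ok: true, value: await tool.execute(input) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message };
  }
}
```

A call might look like `const outcome = await safeExtract(tableExtractTool, { url: 'https://example.com/data' });`, after which `outcome.message` would hold the tool's error text when `outcome.ok` is false.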
{
  url: string;          // The URL to fetch and extract tables from
  tableIndex?: number;  // Optional: Which table to extract (0-based index)
}
{
  url: string;                // The fetched URL
  tables: StructuredTable[];  // Array of extracted tables
  tableCount: number;         // Total number of tables found on page
  metadata: {
    fetchedAt: string;        // ISO timestamp of fetch
    domain: string;           // Domain name extracted from URL
  };
}

interface StructuredTable {
  headers: string[];                     // Original header text
  rows: Array<Record<string, string>>;   // Data rows as objects
  rowCount: number;                      // Number of data rows
  columnCount: number;                   // Number of columns
  caption?: string;                      // Table caption if present
}
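Because `caption` is optional in the output shape, selecting a table by caption needs a guard for tables without one. The sketch below copies the `StructuredTable` interface from the schema above; `findTableByCaption` is a hypothetical helper, not part of the package.

```typescript
// Copied from the output schema documented above.
interface StructuredTable {
  headers: string[];
  rows: Array<Record<string, string>>;
  rowCount: number;
  columnCount: number;
  caption?: string;
}

// Hypothetical helper: returns the first table whose caption contains
// `needle` (case-insensitive), or undefined when no caption matches.
function findTableByCaption(
  tables: StructuredTable[],
  needle: string
): StructuredTable | undefined {
  const wanted = needle.toLowerCase();
  return tables.find((t) => (t.caption ?? '').toLowerCase().includes(wanted));
}
```

This can be more robust than a fixed `tableIndex` when a page reorders its tables, though it only helps on pages that actually set captions.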
const result = await tableExtractTool.execute({
  url: 'https://example.com/pricing'
});

// Find the pricing table (assuming it's the first one)
const pricingTable = result.tables[0];
console.log(`Found ${pricingTable.rowCount} pricing tiers`);

// Access individual rows
for (const row of pricingTable.rows) {
  console.log(`${row.plan}: ${row.price}/month`);
}

// Convert to CSV
// Note: this key lookup only matches headers made of letters, digits, and spaces;
// headers with punctuation (e.g. "Price ($)") normalize differently.
const csv = [
  pricingTable.headers.join(','),
  ...pricingTable.rows.map(row =>
    pricingTable.headers.map(h => row[h.toLowerCase().replace(/\s+/g, '_')]).join(',')
  )
].join('\n');
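The CSV conversion above joins values with plain commas, which breaks when a cell itself contains a comma (for example a price such as "$1,000"). A minimal sketch of RFC 4180 style quoting follows; `toCsvField` and `toCsvLine` are hypothetical helpers, not part of the package.

```typescript
// Quote a field when it contains a comma, double quote, or line break,
// doubling any embedded double quotes (RFC 4180 style).
function toCsvField(value: string): string {
  return /[",\n\r]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Join already-extracted cell values into one CSV line.
function toCsvLine(fields: string[]): string {
  return fields.map(toCsvField).join(',');
}

console.log(toCsvLine(['Widget', '$1,000', '50']));
// prints: Widget,"$1,000",50
```

In the example above, `pricingTable.headers.join(',')` could become `toCsvLine(pricingTable.headers)`, and each mapped row could be passed through `toCsvLine` in the same way.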
// If you know there are multiple tables and want the third one
const result = await tableExtractTool.execute({
  url: 'https://example.com/reports',
  tableIndex: 2  // Third table (0-based index)
});

const table = result.tables[0];  // Only contains the requested table
console.log(`Extracted table with ${table.rowCount} rows`);
Headers are normalized for use as object keys:
Examples:
"Product Name" → "product_name"
"Price ($)" → "price"
"Stock Level" → "stock_level"

This ensures consistent, JavaScript-friendly property names.
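One way to reproduce the three examples above is sketched below. The package's actual normalization rules are not shown on this page, so `normalizeHeader` is a hypothetical function that merely matches the listed cases: lowercase the text, collapse every run of non-alphanumeric characters into a single underscore, then trim underscores from both ends.

```typescript
// Hypothetical re-implementation that matches the documented examples;
// the package's real rules may differ for other inputs.
function normalizeHeader(header: string): string {
  return header
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_') // runs of spaces/punctuation become one "_"
    .replace(/^_+|_+$/g, '');    // drop leading and trailing underscores
}

console.log(normalizeHeader('Product Name')); // prints: product_name
console.log(normalizeHeader('Price ($)'));    // prints: price
console.log(normalizeHeader('Stock Level'));  // prints: stock_level
```

A helper like this is useful when mapping the original `headers` array back to row keys, since `headers` holds the original text while `rows` uses the normalized form.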
If a table has no <thead> or <th> elements, so that its <td> cells have no clear headers, the tool generates generic headers: column_1, column_2, etc.

The tool provides detailed error messages.
fetch API)
@tpmjs/tools-page-brief - Extract main content and create summaries
@tpmjs/tools-extract-json-ld - Extract JSON-LD structured data
@tpmjs/tools-links-catalog - Extract and categorize all links
MIT