sitemapReadTool

@tpmjs/tools-sitemap-read

Parse XML sitemaps and extract URLs from sitemap.xml files

Official
web
v0.2.0
MIT

Installation & Usage

Install this tool and use it with the AI SDK

1. Install the package

npm install @tpmjs/tools-sitemap-read
pnpm add @tpmjs/tools-sitemap-read
yarn add @tpmjs/tools-sitemap-read
bun add @tpmjs/tools-sitemap-read
deno add npm:@tpmjs/tools-sitemap-read

2. Import the tool

import { sitemapReadTool } from '@tpmjs/tools-sitemap-read';

3. Use with AI SDK

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { sitemapReadTool } from '@tpmjs/tools-sitemap-read';

const result = await generateText({
  model: openai('gpt-4o'),
  tools: { sitemapReadTool },
  prompt: 'Your prompt here...',
});

console.log(result.text);

Signature

(url: string) => Promise<unknown>

Tags

crawler
extract
files
parse
read
seo
sitemap
sitemaps
tpmjs
urls
web
xml

Parameters

Available configuration options

  • url (string, required): The sitemap.xml URL to parse

Schema extracted: 3/1/2026, 4:25:34 AM

README

@tpmjs/tools-sitemap-read

Parse XML sitemaps and extract URLs from sitemap.xml files.

Installation

npm install @tpmjs/tools-sitemap-read

Usage

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { sitemapReadTool } from '@tpmjs/tools-sitemap-read';

const result = await generateText({
  model: openai('gpt-4o'), // any AI SDK model can be substituted here
  tools: { sitemapReadTool },
  prompt: 'Get all URLs from https://example.com/sitemap.xml',
});

Tool Parameters

  • url (string, required): The sitemap.xml URL to parse

Returns

{
  urls: Array<{
    loc: string;
    lastmod?: string;
    changefreq?: string;
    priority?: string;
  }>;
  isSitemapIndex: boolean;
  urlCount: number;
  sitemapIndexUrls?: Array<{
    loc: string;
    lastmod?: string;
  }>;
  metadata: {
    fetchedAt: string;
    sourceUrl: string;
    type: 'urlset' | 'sitemapindex';
  };
}
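
To make the return shape easier to consume, the sketch below restates it as TypeScript interfaces and adds two small helpers. The names SitemapReadResult, SitemapUrl, nextLocations, and highPriority are hypothetical and are not exported by the package; only the field names come from the shape documented above.

```typescript
// Mirrors the documented return shape; the type names are assumptions.
interface SitemapUrl {
  loc: string;
  lastmod?: string;
  changefreq?: string;
  priority?: string;
}

interface SitemapReadResult {
  urls: SitemapUrl[];
  isSitemapIndex: boolean;
  urlCount: number;
  sitemapIndexUrls?: Array<{ loc: string; lastmod?: string }>;
  metadata: {
    fetchedAt: string;
    sourceUrl: string;
    type: 'urlset' | 'sitemapindex';
  };
}

// Hypothetical helper: the locations a caller would visit next.
// For an index these are child sitemaps, otherwise page URLs.
function nextLocations(result: SitemapReadResult): string[] {
  if (result.isSitemapIndex) {
    return (result.sitemapIndexUrls ?? []).map((s) => s.loc);
  }
  return result.urls.map((u) => u.loc);
}

// priority is documented as a string, so convert it before comparing.
function highPriority(result: SitemapReadResult, min = 0.5): SitemapUrl[] {
  return result.urls.filter(
    (u) => u.priority !== undefined && Number(u.priority) >= min,
  );
}
```

Branching on isSitemapIndex first keeps callers from treating child sitemap locations as page URLs.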

Features

  • Supports both regular sitemaps (urlset) and sitemap indexes (sitemapindex)
  • Extracts URL locations with optional metadata (lastmod, changefreq, priority)
  • Handles sitemap index files that reference other sitemaps
  • Comprehensive error handling
  • 30-second timeout protection
  • Validates XML structure
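
The timeout feature listed above could be approximated as follows. This is a minimal sketch of one common pattern (AbortController plus a timer), not the package's actual implementation; fetchTextWithTimeout and its parameters are hypothetical names, and the 30-second default simply echoes the figure in the list.

```typescript
// Assumed helper, not part of @tpmjs/tools-sitemap-read.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchTextWithTimeout(
  url: string,
  timeoutMs = 30_000,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const controller = new AbortController();
  // Abort the request if it has not completed within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status} for ${url}`);
    }
    return await response.text();
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

With the native fetch in Node.js 18+, an aborted request rejects with an error whose name is AbortError, which a caller can map to a timeout message.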

Sitemap Types

Regular Sitemap (urlset)

Contains direct page URLs with optional metadata:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page</loc>
    <lastmod>2024-01-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
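
To illustrate how the fields in a urlset map onto the documented urls array, here is a minimal regex-based sketch. It is an assumption for illustration only and not how the package parses XML; the names childText and extractUrlEntries are hypothetical, and the sketch ignores CDATA sections, entity decoding, and namespace prefixes.

```typescript
interface UrlEntry {
  loc: string;
  lastmod?: string;
  changefreq?: string;
  priority?: string;
}

// Return the trimmed text of the first <tag>...</tag> in a fragment, if any.
function childText(fragment: string, tag: string): string | undefined {
  const match = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`).exec(fragment);
  return match ? match[1].trim() : undefined;
}

// Collect one entry per <url> element; entries without <loc> are skipped.
function extractUrlEntries(xml: string): UrlEntry[] {
  const entries: UrlEntry[] = [];
  const urlBlock = /<url>([\s\S]*?)<\/url>/g;
  let match: RegExpExecArray | null;
  while ((match = urlBlock.exec(xml)) !== null) {
    const body = match[1];
    const loc = childText(body, 'loc');
    if (!loc) continue;
    const entry: UrlEntry = { loc };
    const lastmod = childText(body, 'lastmod');
    if (lastmod !== undefined) entry.lastmod = lastmod;
    const changefreq = childText(body, 'changefreq');
    if (changefreq !== undefined) entry.changefreq = changefreq;
    const priority = childText(body, 'priority');
    if (priority !== undefined) entry.priority = priority;
    entries.push(entry);
  }
  return entries;
}
```

Values are kept as strings, matching the documented return type where priority is a string rather than a number.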

Sitemap Index (sitemapindex)

Contains references to other sitemap files:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap1.xml</loc>
    <lastmod>2024-01-01</lastmod>
  </sitemap>
</sitemapindex>
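
The two sitemap types differ in their root element, which is what the type field in the returned metadata reports. The sketch below shows one way that distinction could be made from raw XML; detectSitemapType is a hypothetical name and the approach is an assumption, not the package's own logic.

```typescript
type SitemapType = 'urlset' | 'sitemapindex';

// Identify the root element after skipping the XML declaration and comments.
// Returns null when the root is neither <urlset> nor <sitemapindex>.
function detectSitemapType(xml: string): SitemapType | null {
  const withoutProlog = xml
    .replace(/<\?xml[\s\S]*?\?>/, '')
    .replace(/<!--[\s\S]*?-->/g, '')
    .trim();
  const root = /^<([A-Za-z_][\w.-]*)/.exec(withoutProlog);
  if (!root) return null;
  const name = root[1];
  if (name === 'urlset' || name === 'sitemapindex') return name;
  return null;
}
```

A null result is a natural place to report an invalid sitemap, for example when a server returns an HTML error page instead of XML.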

Requirements

  • Node.js 18+ (uses native fetch API)

License

MIT

Statistics

Downloads/month

13

GitHub Stars

0

Quality Score

75%

NPM Keywords

tpmjs
sitemap
xml
seo
crawler
web

Maintainers

thomasdavis (thomasalwyndavis@gmail.com)

Frameworks

vercel-ai