Introduction
web-meta-scraper is a TypeScript library that extracts metadata from web pages using a plugin-based architecture.
Why web-meta-scraper?
| | web-meta-scraper | metascraper | open-graph-scraper |
|---|---|---|---|
| Dependencies | 1 (cheerio) | 10+ | 4+ |
| Bundle size | ~5KB min+gzip | ~50KB+ | ~15KB+ |
| Plugin system | Composable plugins | Rule-based | Monolithic |
| Custom plugins | Simple function | Complex rules | Not supported |
| TypeScript | First-class | Partial | Partial |
| oEmbed | Built-in plugin | Separate package | Not supported |
| Custom rules | Configurable | Fixed | Fixed |
| HTTP client | Native fetch() | got | undici |
How It Works
- Create a scraper with createScraper() (or use the scrape() shorthand).
- The scraper fetches HTML from a URL or accepts raw HTML.
- Each plugin extracts metadata from the HTML via the ScrapeContext.
- The resolver merges results using priority rules; for example, og:title wins over <title>.
- You get a ScraperResult with merged metadata and raw sources.
import { scrape } from 'web-meta-scraper';
const result = await scrape('https://example.com');
console.log(result.metadata.title); // Best title from available sources
console.log(result.metadata.description); // Best description
console.log(result.metadata.image); // Best image URL
// See what each plugin extracted
console.log(result.sources['open-graph']); // { title: "...", image: "..." }
console.log(result.sources['meta-tags']); // { title: "...", keywords: [...] }
Priority Merging
When the same field exists in multiple sources, the highest-priority value wins:
| Field | Priority (high → low) |
|---|---|
| title | Open Graph → Meta Tags → Twitter |
| description | Open Graph → Meta Tags → Twitter |
| image | Open Graph → Twitter |
| url | Open Graph → Meta Tags (canonical) |
Source-specific fields (twitterCard, siteName, locale, jsonLd, oembed) are always included directly.
You can customize these rules with the rules option in createScraper().
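To make the priority table concrete, here is a minimal, self-contained sketch of how a priority-based merge could work. It is not the library's actual resolver: the types, the priorities map, and mergeByPriority are hypothetical names introduced for illustration, and the 'twitter' source key is an assumption (only 'open-graph' and 'meta-tags' appear in the example above).

```typescript
// Illustrative sketch only: these types and names are hypothetical and are
// not taken from web-meta-scraper's source.
type SourceName = 'open-graph' | 'meta-tags' | 'twitter'; // 'twitter' key is assumed
type Field = 'title' | 'description' | 'image' | 'url';
type Sources = Partial<Record<SourceName, Partial<Record<Field, string>>>>;

// Priority order per field, highest first, mirroring the table above.
const priorities: Record<Field, SourceName[]> = {
  title: ['open-graph', 'meta-tags', 'twitter'],
  description: ['open-graph', 'meta-tags', 'twitter'],
  image: ['open-graph', 'twitter'],
  url: ['open-graph', 'meta-tags'],
};

// For each field, take the first non-empty value in priority order.
function mergeByPriority(sources: Sources): Partial<Record<Field, string>> {
  const merged: Partial<Record<Field, string>> = {};
  for (const field of Object.keys(priorities) as Field[]) {
    for (const source of priorities[field]) {
      const value = sources[source]?.[field];
      if (value) {
        merged[field] = value;
        break; // a higher-priority source already supplied this field
      }
    }
  }
  return merged;
}

const merged = mergeByPriority({
  'meta-tags': { title: 'Page title', description: 'From meta tags' },
  'open-graph': { title: 'OG title', image: 'https://example.com/og.png' },
  twitter: { image: 'https://example.com/tw.png' },
});
console.log(merged);
```

In this sketch, title and image come from Open Graph, description falls back to the meta tags because Open Graph does not provide one, and url stays unset because no source supplies it.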
Next Steps
- Installation — Add web-meta-scraper to your project
- Quick Start — See it in action
- Plugins — Learn about built-in plugins