Error Handling
web-meta-scraper throws ScraperError when something goes wrong during fetching or parsing.
ScraperError
ScraperError extends the native Error class with an additional cause property containing the original error.
import { scrape, ScraperError } from 'web-meta-scraper';
try {
  const result = await scrape('https://example.com');
} catch (error) {
  if (error instanceof ScraperError) {
    console.error(error.message); // Human-readable message
    console.error(error.cause); // Original error, if available
  }
}
Error Scenarios
| Scenario | Error Message |
|---|---|
| Request timeout | Request timeout after {timeout}ms |
| HTTP error (4xx/5xx) | Failed to fetch URL: {url} |
| Non-HTML response | Failed to fetch URL: {url} (cause: Invalid content type: {type}) |
| Response too large | Failed to fetch URL: {url} (cause: Content too large: {bytes} bytes) |
| Invalid URL protocol | Only HTTP(S) protocols are supported |
| Parsing failure | Failed to scrape metadata |
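Because every failure arrives as a ScraperError, code that needs to branch on the scenario has to look at the message and the cause. The following is a minimal sketch, assuming the message formats in the table above stay stable. `FailureKind`, `ErrorLike`, `collectMessages`, and `classifyScrapeFailure` are hypothetical names for illustration, not part of web-meta-scraper, and matching on message text is inherently brittle.

```typescript
// Hypothetical categories derived from the message patterns in the table above.
type FailureKind =
  | 'timeout'
  | 'invalid-content-type'
  | 'too-large'
  | 'unsupported-protocol'
  | 'fetch-failed'
  | 'parse-failed'
  | 'unknown';

// Structural type so the helper works with any error-shaped value.
interface ErrorLike {
  message: string;
  cause?: unknown;
}

// Joins the message of the error and of every nested cause into one string.
function collectMessages(error: ErrorLike): string {
  const parts: string[] = [];
  let current: unknown = error;
  // Cap the depth so a cyclic cause chain cannot loop forever.
  for (let depth = 0; depth < 10 && current != null; depth++) {
    if (typeof current === 'object' && 'message' in (current as object)) {
      const e = current as ErrorLike;
      parts.push(String(e.message));
      current = e.cause;
    } else {
      parts.push(String(current));
      break;
    }
  }
  return parts.join(' | ');
}

// Checks the more specific patterns first, then the generic fetch failure.
function classifyScrapeFailure(error: ErrorLike): FailureKind {
  const text = collectMessages(error);
  if (text.includes('Request timeout')) return 'timeout';
  if (text.includes('Invalid content type')) return 'invalid-content-type';
  if (text.includes('Content too large')) return 'too-large';
  if (text.includes('Only HTTP(S) protocols')) return 'unsupported-protocol';
  if (text.includes('Failed to fetch URL')) return 'fetch-failed';
  if (text.includes('Failed to scrape metadata')) return 'parse-failed';
  return 'unknown';
}
```

Inside a catch block, this could be called as `classifyScrapeFailure(error)` after the `instanceof ScraperError` check, for example to retry only on `'timeout'`.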
Timeout
Set a custom timeout to avoid long waits:
import { createScraper, openGraph } from 'web-meta-scraper';
const scraper = createScraper({
  plugins: [openGraph],
  fetch: { timeout: 5000 }, // 5 seconds
});
try {
  await scraper.scrapeUrl('https://slow-site.example.com');
} catch (error) {
  // ScraperError: Request timeout after 5000ms
}
Content Size Limit
Prevent downloading excessively large pages:
const scraper = createScraper({
  plugins: [openGraph],
  fetch: { maxContentLength: 1024 * 1024 }, // 1 MB
});
URL Validation
scrapeUrl() checks that the input is an HTTP or HTTPS URL before fetching:
// Throws immediately, before any network request is made
await scraper.scrapeUrl('ftp://example.com');
// ScraperError: Only HTTP(S) protocols are supported
For raw HTML, use scraper.scrape(), which doesn't validate URLs.
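When URLs come from user input, it can be convenient to filter them before calling scrapeUrl(), so that obviously unsupported inputs never reach the try/catch. Below is a minimal sketch built on the standard `URL` class. `isHttpUrl` is a hypothetical helper, and the library's own validation may differ in its details.

```typescript
// Hypothetical pre-check: true only for absolute http: or https: URLs.
function isHttpUrl(input: string): boolean {
  try {
    // The URL parser lowercases the scheme, so 'HTTP://...' also passes.
    const { protocol } = new URL(input);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    // new URL() throws for strings it cannot parse as an absolute URL.
    return false;
  }
}

const candidates = ['https://example.com', 'ftp://example.com', 'not a url'];
const scrapable = candidates.filter(isHttpUrl);
console.log(scrapable); // [ 'https://example.com' ]
```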
JSON-LD Parse Errors
Invalid JSON in <script type="application/ld+json"> tags is silently skipped: the plugin ignores malformed entries instead of throwing. This is by design, since many websites ship broken JSON-LD markup.
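The skip-on-failure behavior can be pictured with a small sketch. This is an illustration of the idea only, not the plugin's actual code. `parseJsonLdBlocks` is a hypothetical function that takes the text content of each JSON-LD script tag.

```typescript
// Illustration only: parses each JSON-LD script body and drops the ones that fail.
function parseJsonLdBlocks(blocks: string[]): unknown[] {
  const parsed: unknown[] = [];
  for (const block of blocks) {
    try {
      parsed.push(JSON.parse(block));
    } catch {
      // Malformed JSON: skip this entry and keep going.
    }
  }
  return parsed;
}

const results = parseJsonLdBlocks([
  '{"@type": "Article", "headline": "Hello"}',
  '{"@type": "Article", headline: }', // broken on purpose
]);
console.log(results.length); // 1
```

One consequence of this design is that a page with only malformed JSON-LD looks the same as a page with none, so a missing field is not proof that the markup was absent.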
oEmbed Fetch Errors
The oEmbed plugin silently returns empty data if the oEmbed endpoint is unreachable or returns an error. This prevents one failing endpoint from breaking the entire scrape.
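The same fail-soft pattern can be applied to any optional lookup in calling code. The sketch below is not the plugin's implementation. `orFallback` and `failingLookup` are hypothetical names, and the failing request is simulated rather than sent to a real endpoint.

```typescript
// Hypothetical wrapper: resolves to `fallback` when `task` throws or rejects.
async function orFallback<T>(task: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await task();
  } catch {
    return fallback;
  }
}

// Stand-in for an oEmbed request whose endpoint is unreachable.
const failingLookup = async (): Promise<Record<string, unknown>> => {
  throw new Error('endpoint unreachable');
};

// Resolves to an empty object instead of rejecting.
const oembedData: Promise<Record<string, unknown>> = orFallback(failingLookup, {});
```

The trade-off is that the caller cannot distinguish "no oEmbed data" from "lookup failed", which is acceptable when the data is supplementary.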