Batch Scraping
The `batchScrape()` function lets you scrape metadata from multiple URLs concurrently with built-in error isolation and order preservation.
Basic Usage
```ts
import { batchScrape } from 'web-meta-scraper';

const results = await batchScrape([
  'https://github.com',
  'https://nodejs.org',
  'https://example.com',
]);

for (const r of results) {
  if (r.success) {
    console.log(r.url, r.result.metadata.title);
  } else {
    console.error(r.url, r.error);
  }
}
```

Options
```ts
interface BatchScrapeOptions {
  concurrency?: number;    // Max parallel requests (default: 5)
  scraper?: ScraperConfig; // Forwarded to each scrape() call
}
```

Concurrency
Control how many URLs are fetched in parallel:
```ts
// Conservative — 2 at a time
const conservative = await batchScrape(urls, { concurrency: 2 });

// Aggressive — 10 at a time
const aggressive = await batchScrape(urls, { concurrency: 10 });
```

Custom Scraper Config
Pass plugins, fetch options, or post-processing settings:
```ts
import { batchScrape, metaTags, openGraph, favicons, feeds } from 'web-meta-scraper';

const results = await batchScrape(urls, {
  concurrency: 5,
  scraper: {
    plugins: [metaTags, openGraph, favicons, feeds],
    fetch: { timeout: 10000 },
    postProcess: { secureImages: true },
  },
});
```

Result Structure
Each element in the returned array is a BatchScrapeResult:
```ts
interface BatchScrapeResult {
  url: string;            // The URL that was scraped
  success: boolean;       // Whether the scrape completed without errors
  result?: ScraperResult; // Full scraper output (on success)
  error?: string;         // Error message (on failure)
}
```

Results are always returned in the same order as the input URLs, regardless of the order in which individual requests complete.
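Order preservation falls out of writing each result back to the slot of its input index rather than appending in completion order. A minimal, self-contained illustration of the idea (plain promises, no library code; the latencies are made up for the example):

```typescript
// Simulated tasks finish out of order (the 10ms one first), but each result
// is written to the slot of its *input* index, so output order matches input.
async function demoOrder(): Promise<number[]> {
  const latencies = [30, 10, 20]; // hypothetical per-URL response times in ms
  const out: number[] = new Array(latencies.length);
  await Promise.all(
    latencies.map(async (ms, i) => {
      await new Promise((resolve) => setTimeout(resolve, ms));
      out[i] = ms; // slot chosen by input index, not by completion order
    })
  );
  return out; // [30, 10, 20]
}
```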
Error Isolation
Each URL is processed independently. A failure in one URL does not affect the others:
```ts
const results = await batchScrape([
  'https://valid-site.com',     // success
  'https://does-not-exist.xyz', // fails — others continue
  'https://another-valid.com',  // success
]);

// results[0].success === true
// results[1].success === false, results[1].error === "..."
// results[2].success === true
```

How It Works
`batchScrape` uses a promise-based worker pool with no external dependencies. It spawns up to `concurrency` workers, each of which repeatedly claims the next URL from a shared index counter until the list is exhausted. Because JavaScript executes synchronously between `await` points, reading and incrementing that counter happens in one uninterrupted step, so two workers can never claim the same URL.
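The pool described above can be sketched in a few lines. This is an illustrative stand-in, not the library's actual source; `runPool` and its signature are invented for the example:

```typescript
// Simplified promise-based worker pool. `runPool` is a hypothetical name,
// not part of the web-meta-scraper API.
async function runPool<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  concurrency: number
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared index: safe because `next++` has no await between
                // its read and increment, so workers never claim the same item

  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;                    // claim the next slot
      results[i] = await worker(items[i]); // write back by input index
    }
  }

  // Spawn up to `concurrency` workers that drain the shared list.
  await Promise.all(
    Array.from({ length: Math.min(concurrency, items.length) }, () => run())
  );
  return results;
}
```

Presumably the real implementation also wraps each `worker` call in a try/catch, turning a rejection into a `success: false` entry instead of failing the whole batch; that is what produces the error isolation described above.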