Link Conversion
After downloading a website, all the internal links still point to absolute URLs (https://example.com/style.css). The LinkConverter rewrites these to relative paths (./style.css) so the mirrored site works when opened from the local filesystem.
This implements wget’s -k / --convert-links functionality.
Quick Start
Enable link conversion with the fluent API:
import { Wget } from 'rezo/wget';
await new Wget()
  .pageRequisites()
  .convertLinks()
  .outputDir('./offline-site')
  .get('https://example.com/');
// All HTML and CSS files now use relative paths
// Open offline-site/index.html in your browser - it just works
How It Works
Link conversion runs as a post-processing step after all downloads complete:
- The Downloader builds a URL map: Map<originalUrl, localFilePath>
- LinkConverter reads each downloaded HTML and CSS file
- For every URL found in the file, it checks the URL map
- If the URL was downloaded, it’s replaced with a relative path from the current file to the target file
- If the URL was not downloaded, it remains as an absolute URL
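The lookup-and-replace steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not rezo's actual implementation: the function names convertLink and relativePath are hypothetical, local paths are assumed to use forward slashes, and the URL is assumed to be already resolved to absolute form.

```typescript
// Original URL -> local file path, as built by the Downloader.
type UrlMap = Map<string, string>;

// Hypothetical helper: relative path from the directory of `fromFile` to `toFile`.
function relativePath(fromFile: string, toFile: string): string {
  const fromDir = fromFile.split('/').slice(0, -1); // drop the file name
  const to = toFile.split('/');
  // Count leading directory segments shared by both paths.
  let common = 0;
  while (
    common < fromDir.length &&
    common < to.length - 1 &&
    fromDir[common] === to[common]
  ) {
    common++;
  }
  const up = fromDir.slice(common).map(() => '..'); // climb out of unshared dirs
  const down = to.slice(common);                    // descend to the target
  const joined = [...up, ...down].join('/');
  return up.length === 0 ? `./${joined}` : joined;
}

// Hypothetical helper: rewrite one URL found in `currentFile`.
function convertLink(url: string, currentFile: string, urlMap: UrlMap): string {
  const target = urlMap.get(url);
  // Not downloaded: keep the absolute URL unchanged.
  if (target === undefined) return url;
  // Downloaded: point at the local copy, relative to the current file.
  return relativePath(currentFile, target);
}
```

With a map containing https://example.com/css/main.css -> output/css/main.css, calling convertLink from output/example.com/docs/guide.html yields ../../css/main.css, while a URL missing from the map is returned unchanged.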
HTML Conversion
The converter processes these elements and attributes in HTML:
| Element | Attributes |
|---|---|
| a, area, link | href |
| script, img, source, video, audio, track | src |
| img, source | srcset |
| video | poster |
| iframe, frame, embed | src |
| object | data |
| form | action |
| input | src |
It also converts:
- Inline styles (style="background: url(...)")
- <style> tag contents (CSS url() and @import)
- srcset descriptors (multi-resolution images)
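A srcset value holds several comma-separated candidates, each a URL followed by an optional descriptor such as 2x or 480w, so every URL has to be rewritten while its descriptor is preserved. The sketch below shows one way this could be done. The function name rewriteSrcset and the rewrite callback are hypothetical, and splitting on commas is a simplification that would break on URLs that themselves contain commas.

```typescript
// Hypothetical helper: apply `rewrite` to each URL in a srcset value,
// keeping width/density descriptors (e.g. "2x", "480w") untouched.
function rewriteSrcset(srcset: string, rewrite: (url: string) => string): string {
  return srcset
    .split(',')                                  // one entry per image candidate
    .map((candidate) => candidate.trim())
    .filter((candidate) => candidate.length > 0) // ignore trailing commas
    .map((candidate) => {
      const [url, ...descriptor] = candidate.split(/\s+/);
      return [rewrite(url), ...descriptor].join(' ');
    })
    .join(', ');
}
```

The rewrite callback is where the URL-map lookup and relative-path calculation would plug in.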
CSS Conversion
In standalone CSS files, the converter rewrites:
- @import url('...') and @import '...' rules
- url() functions in property values (backgrounds, fonts, cursors, etc.)
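As an illustration of the two CSS patterns listed above, the following regex-based sketch rewrites url() functions and string-form @import rules. It is an assumption about how such a pass could look, not the converter's real code: rewriteCss and its rewrite callback are hypothetical names, and the patterns ignore CSS comments and escaped characters.

```typescript
// Hypothetical helper: apply `rewrite` to every URL referenced from a stylesheet.
function rewriteCss(css: string, rewrite: (url: string) => string): string {
  // url(...) with optional single or double quotes; also covers @import url(...)
  const urlFn = /url\(\s*(['"]?)([^'")]+)\1\s*\)/g;
  // @import '...' or @import "..." (string form, without url())
  const importStr = /@import\s+(['"])([^'"]+)\1/g;

  return css
    .replace(urlFn, (_match, quote: string, url: string) =>
      `url(${quote}${rewrite(url.trim())}${quote})`)
    .replace(importStr, (_match, quote: string, url: string) =>
      `@import ${quote}${rewrite(url)}${quote}`);
}
```

The original quoting style is kept so that the rewritten stylesheet differs from the source only in its URLs.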
Relative Path Calculation
Paths are calculated relative to the file being modified:
File: output/example.com/docs/guide.html
Asset: output/css/main.css
Result: ../../css/main.css
File: output/example.com/index.html
Asset: output/example.com/images/logo.png
Result: ./images/logo.png
Content-Type Awareness
The converter determines which files to process using the MIME type recorded during download, not file extensions. This is important because many web applications serve HTML from paths like /login, /dashboard.aspx, or /api/page — none of which have .html extensions.
Files with these MIME types are processed:
- text/html, application/xhtml+xml: treated as HTML
- text/css: treated as CSS
If MIME type information is unavailable, the converter falls back to file extensions (.html, .htm, .css).
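The MIME-first, extension-second decision can be sketched like this. The function name classify and the FileKind type are hypothetical. The sketch also assumes that a file with a recorded but unrelated MIME type (for example image/png) is skipped rather than re-checked by extension, which the text above does not state explicitly.

```typescript
type FileKind = 'html' | 'css' | 'skip';

// Hypothetical helper: decide how a downloaded file should be processed.
function classify(filePath: string, mimeType?: string): FileKind {
  if (mimeType) {
    // Drop parameters such as "; charset=utf-8" before comparing.
    const essence = mimeType.split(';')[0].trim().toLowerCase();
    if (essence === 'text/html' || essence === 'application/xhtml+xml') return 'html';
    if (essence === 'text/css') return 'css';
    return 'skip'; // assumption: a known, non-matching MIME type wins
  }
  // No MIME type recorded: fall back to the file extension.
  const lower = filePath.toLowerCase();
  if (lower.endsWith('.html') || lower.endsWith('.htm')) return 'html';
  if (lower.endsWith('.css')) return 'css';
  return 'skip';
}
```

Under this sketch a page saved from /login with Content-Type text/html; charset=utf-8 is still treated as HTML despite having no extension.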
URL Resolution
The converter handles multiple URL formats:
Absolute URLs
<!-- Before -->
<link href="https://example.com/css/style.css">
<!-- After (if downloaded) -->
<link href="./css/style.css">
<!-- After (if NOT downloaded) -->
<link href="https://example.com/css/style.css">
Site-Root Relative URLs
<!-- Before -->
<img src="/images/logo.png">
<!-- After (resolved against page URL, then converted) -->
<img src="./images/logo.png">
Already-Relative URLs
<!-- Before -->
<a href="../other-page.html">
<!-- Stays relative, may be adjusted for organized assets -->
<a href="../other-page.html">
Query Strings and Fragments
The converter strips query strings and fragments when looking up URLs in the download map, ensuring assets like /style.css?v=123 match the downloaded file.
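One way to produce a consistent lookup key for all of the URL forms above is to resolve the reference against the page URL with the standard URL class and then clear the query and fragment. The helper name lookupKey is hypothetical, and the sketch assumes the download map is keyed by absolute URLs without query strings.

```typescript
// Hypothetical helper: normalize a reference found in a page into a map key.
function lookupKey(rawUrl: string, pageUrl: string): string {
  // Resolves absolute, site-root relative, and relative references alike.
  const resolved = new URL(rawUrl, pageUrl);
  resolved.search = ''; // drop "?v=123"
  resolved.hash = '';   // drop "#section"
  return resolved.href;
}
```

For a page at https://example.com/docs/guide.html, both /style.css?v=123 and ../style.css#top resolve to the key https://example.com/style.css.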
Organized Asset Paths
When organizeAssets is enabled, assets are moved to categorized folders. The converter accounts for this when computing relative paths. For example, if a site-root image /core/misc/tree.png was organized into images/tree.png, the converter predicts the organized path and generates the correct relative reference.
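The prediction step might look something like the sketch below. Everything here is illustrative: predictOrganizedPath is a hypothetical name, and the extension-to-folder table is invented for the example. Only the images folder is confirmed by the text above; the real category rules may differ.

```typescript
// Hypothetical extension-to-folder table; only "images" appears in the docs.
const CATEGORY_BY_EXTENSION: Record<string, string> = {
  png: 'images',
  jpg: 'images',
  gif: 'images',
  svg: 'images',
  css: 'css', // assumed folder name
  js: 'js',   // assumed folder name
};

// Hypothetical helper: guess where an asset ends up after organization.
function predictOrganizedPath(urlPath: string): string {
  const fileName = urlPath.split('/').pop() ?? '';
  const dot = fileName.lastIndexOf('.');
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : '';
  const folder = CATEGORY_BY_EXTENSION[ext];
  // Unknown types keep their original location (minus the leading slash).
  return folder ? `${folder}/${fileName}` : urlPath.replace(/^\//, '');
}
```

With this table, /core/misc/tree.png maps to images/tree.png, matching the example in the paragraph above; the relative reference is then computed against that predicted location.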
Backup Original Files
Enable backups to preserve the original HTML/CSS before conversion:
import { Wget } from 'rezo/wget';
await new Wget({
  recursive: { convertLinks: true },
  download: { backupConverted: true },
}).get('https://example.com/');
// Creates index.html.orig alongside the modified index.html
Conversion Statistics
The conversion phase returns detailed statistics:
interface ConversionStats {
  filesProcessed: number;  // Total files examined
  filesModified: number;   // Files that were actually changed
  linksConverted: number;  // Total URLs rewritten
  linksToRelative: number; // URLs converted to relative paths
  linksToAbsolute: number; // URLs left as absolute (not downloaded)
}
Special URLs
These URL schemes and fragment-only references are never converted:
- data: (data URIs)
- javascript: (JavaScript URIs)
- mailto: (email links)
- tel: (phone links)
- blob: (blob URIs)
- # (fragment-only anchors)
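A filter for the cases above could be as simple as the following sketch. The names SKIPPED_SCHEMES and isConvertible are hypothetical. The case-insensitive comparison is an assumption, made because URL schemes are case-insensitive.

```typescript
// Schemes listed above that should be left untouched.
const SKIPPED_SCHEMES = ['data:', 'javascript:', 'mailto:', 'tel:', 'blob:'];

// Hypothetical helper: true if the URL is a candidate for conversion.
function isConvertible(url: string): boolean {
  const trimmed = url.trim().toLowerCase();
  // Fragment-only anchors point inside the current document.
  if (trimmed.startsWith('#')) return false;
  return !SKIPPED_SCHEMES.some((scheme) => trimmed.startsWith(scheme));
}
```

Running this check before the URL-map lookup avoids resolving values that can never correspond to a downloaded file.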
Complete Example
Mirror a site with link conversion and organized assets:
import { Wget } from 'rezo/wget';
const stats = await new Wget({
  organizeAssets: true,
  download: { adjustExtension: true },
})
  .concurrency(5)
  .convertLinks()
  .pageRequisites()
  .domains('docs.example.com')
  .outputDir('./docs-mirror')
  .on('link-conversion', (event) => {
    if (event.phase === 'complete') {
      console.log(`Converted ${event.linksConverted} links in ${event.convertedFiles} files`);
    }
  })
  .get('https://docs.example.com/');