Proxy Services

The Rezo crawler integrates with two managed proxy services: Oxylabs and Decodo. Both provide residential and datacenter proxy networks with geo-targeting, browser emulation, and session management. Configure either service through the CrawlerOptions builder or directly in ICrawlerOptions.

Oxylabs Integration

Oxylabs provides a web scraping API that routes requests through residential proxies with browser-type emulation, geo-location targeting, and locale selection.

import { CrawlerOptions } from 'rezo/crawler';

Configuration via Builder

const options = new CrawlerOptions({
  baseUrl: 'https://example.com',
  concurrency: 20
});

options.addOxylabs({
  domain: 'example.com',   // Apply to this domain
  isGlobal: false,          // Or set true for all domains
  options: {
    username: 'customer-user_12345',
    password: 'your_password',
    browserType: 'desktop_chrome',
    locale: 'en-us',
    geoLocation: 'United States'
  },
  queueOptions: {
    concurrency: 5,
    interval: 1000,
    intervalCap: 3
  }
});
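
The queueOptions fields are not described further in this section. If they follow the common interval-cap convention (at most intervalCap requests started per interval milliseconds, with concurrency bounding the number in flight), the pacing can be sketched as below. That reading is an assumption, and scheduleStarts is a hypothetical helper for illustration, not part of Rezo. It models the interval cap only and ignores concurrency.

```typescript
// Hypothetical helper, not part of Rezo: models interval-cap pacing only.
// Assumes at most `intervalCap` requests may start per `interval` ms window.
interface QueuePacing {
  interval: number;     // window length in ms
  intervalCap: number;  // maximum requests started per window
}

function scheduleStarts(requestCount: number, pacing: QueuePacing): number[] {
  if (pacing.intervalCap < 1 || pacing.interval < 0) {
    throw new RangeError('intervalCap must be >= 1 and interval must be >= 0');
  }
  const starts: number[] = [];
  for (let i = 0; i < requestCount; i++) {
    // Request i falls into window number floor(i / intervalCap).
    starts.push(Math.floor(i / pacing.intervalCap) * pacing.interval);
  }
  return starts;
}

// Same pacing values as the builder example: 3 requests per 1000 ms.
const starts = scheduleStarts(7, { interval: 1000, intervalCap: 3 });
console.log(starts); // [ 0, 0, 0, 1000, 1000, 1000, 2000 ]
```

Under this reading, a lower intervalCap or a longer interval spreads requests to the proxied domain more thinly, independent of the crawler-wide concurrency setting.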

OxylabsOptions

interface OxylabsOptions {
  /** Oxylabs API username (required) */
  username: string;

  /** Oxylabs API password (required) */
  password: string;

  /** Browser type for request emulation */
  browserType?: BrowserType | string;

  /** Locale for content localization (e.g., 'en-us', 'de-de') */
  locale?: Locale | string;

  /** Geographic location for IP targeting */
  geoLocation?: GeoLocation | string;

  /** HTTP method override */
  http_method?: 'get' | 'post';

  /** Base64-encoded request body (for POST) */
  base64Body?: string;

  /** Return response as base64 */
  returnAsBase64?: boolean;

  /** Status codes considered successful */
  successful_status_codes?: number[];

  /** Sticky session ID for IP persistence */
  session_id?: string;

  /** Follow redirects (default: true) */
  follow_redirects?: boolean;

  /** Enable JavaScript rendering */
  javascript_rendering?: boolean;

  /** Custom headers to include */
  headers?: OutgoingHttpHeaders;

  /** Cookies to include */
  cookies?: { key: string; value: string }[];

  /** Enable rendering mode */
  render?: boolean;

  /** Custom context parameters */
  context?: Record<string, any>;

  /** Request timeout in ms */
  timeout?: number;
}
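
The base64Body field expects the POST body already encoded. A minimal sketch of preparing it in Node.js follows. PostFields is a local subset of the interface above, toPostFields is a hypothetical helper, and the choice of a JSON payload is an assumption about what the target endpoint accepts.

```typescript
// Local subset of the OxylabsOptions fields used here (names copied from the interface).
interface PostFields {
  http_method?: 'get' | 'post';
  base64Body?: string;
}

// Hypothetical helper: serialises a payload as JSON and Base64-encodes it.
// Whether the target expects JSON is an assumption; adjust the serialisation as needed.
function toPostFields(payload: unknown): PostFields {
  const json = JSON.stringify(payload);
  return {
    http_method: 'post',
    base64Body: Buffer.from(json, 'utf8').toString('base64')
  };
}

const fields = toPostFields({ query: 'laptops', page: 1 });

// Decoding the field recovers the original JSON text.
const decoded = Buffer.from(fields.base64Body ?? '', 'base64').toString('utf8');
console.log(decoded); // {"query":"laptops","page":1}
```

The resulting fields can be spread into the options object passed to addOxylabs or to a per-request override.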

Browser Types

Oxylabs supports the following browser type emulations:

Label             Value
Desktop           desktop
Desktop Chrome    desktop_chrome
Desktop Edge      desktop_edge
Desktop Firefox   desktop_firefox
Desktop Opera     desktop_opera
Desktop Safari    desktop_safari
Mobile            mobile
Mobile Android    mobile_android
Mobile iOS        mobile_ios
Tablet            tablet
Tablet Android    tablet_android
Tablet iOS        tablet_ios
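
Because browserType is typed as BrowserType | string, a misspelled value is not rejected at compile time. One way to catch typos early is a small runtime guard over the values listed above. The names OXYLABS_BROWSER_TYPES and isOxylabsBrowserType are hypothetical and not exported by Rezo.

```typescript
// Values copied from the browser type list in this section.
const OXYLABS_BROWSER_TYPES = [
  'desktop', 'desktop_chrome', 'desktop_edge', 'desktop_firefox',
  'desktop_opera', 'desktop_safari',
  'mobile', 'mobile_android', 'mobile_ios',
  'tablet', 'tablet_android', 'tablet_ios'
] as const;

type OxylabsBrowserType = (typeof OXYLABS_BROWSER_TYPES)[number];

// Hypothetical guard: narrows an arbitrary string to one of the listed values.
function isOxylabsBrowserType(value: string): value is OxylabsBrowserType {
  return (OXYLABS_BROWSER_TYPES as readonly string[]).indexOf(value) !== -1;
}

console.log(isOxylabsBrowserType('desktop_chrome')); // true
console.log(isOxylabsBrowserType('desktop_crome'));  // false
```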

Per-Request Overrides

When visiting through Oxylabs, you can override options per request:

await crawler.visitOxylabs('https://example.com/us-page', {
  geoLocation: 'United States',
  locale: 'en-us',
  browserType: 'desktop_chrome'
});

await crawler.visitOxylabs('https://example.com/de-page', {
  geoLocation: 'Germany',
  locale: 'de-de',
  browserType: 'desktop_firefox'
});
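
How Rezo combines per-request values with the domain-level configuration is not spelled out here. A plausible reading is a shallow merge in which defined per-request values win. The sketch below illustrates that assumption with a hypothetical applyOverrides helper and a local RequestOptions type; neither is part of the Rezo API.

```typescript
// Local subset of the Oxylabs options used in the examples above.
interface RequestOptions {
  browserType?: string;
  locale?: string;
  geoLocation?: string;
}

// Hypothetical helper: shallow merge where defined override values take precedence.
// Keys that are absent or undefined in `overrides` keep the domain-level default.
function applyOverrides(defaults: RequestOptions, overrides: RequestOptions): RequestOptions {
  const merged: RequestOptions = { ...defaults };
  const keys = Object.keys(overrides) as (keyof RequestOptions)[];
  for (const key of keys) {
    const value = overrides[key];
    if (value !== undefined) {
      merged[key] = value;
    }
  }
  return merged;
}

const domainDefaults: RequestOptions = {
  browserType: 'desktop_chrome',
  locale: 'en-us',
  geoLocation: 'United States'
};

const effective = applyOverrides(domainDefaults, { geoLocation: 'Germany', locale: 'de-de' });
console.log(effective);
// { browserType: 'desktop_chrome', locale: 'de-de', geoLocation: 'Germany' }
```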

Decodo Integration

Decodo is a proxy service with support for device types, geographic targeting (country, state, city), and headless rendering modes.

Configuration via Builder

const options = new CrawlerOptions({
  baseUrl: 'https://example.com',
  concurrency: 20
});

options.addDecodo({
  domain: 'example.com',
  options: {
    username: 'user_12345',
    password: 'your_password',
    deviceType: 'desktop',
    country: 'United States',
    state: 'California',
    city: 'San Francisco',
    headless: 'html'
  },
  queueOptions: {
    concurrency: 3,
    interval: 2000,
    intervalCap: 2
  }
});

DecodoOptions

interface DecodoOptions {
  /** Decodo username (required) */
  username: string;

  /** Decodo password (required) */
  password: string;

  /** Device type emulation */
  deviceType?: BrowserType | string;

  /** Locale for content localization */
  locale?: Locale | string;

  /** Target country */
  country?: GeoLocation | string;

  /** Target state (US states or regional) */
  state?: string;

  /** Target city */
  city?: string;

  /** Headless rendering mode: 'html', 'png', or 'pdf' */
  headless?: 'html' | 'png' | 'pdf';

  /** Sticky session ID */
  sessionId?: string;

  /** Session duration in seconds */
  sessionDuration?: number;

  /** JavaScript to execute on page */
  javascript?: string;

  /** Wait time for JavaScript execution in ms */
  javascriptWait?: number;

  /** CSS selector to wait for before returning */
  waitForCss?: string;

  /** HTTP method override */
  http_method?: 'get' | 'post';

  /** Base64-encoded request body */
  base64Body?: string;

  /** Status codes considered successful */
  successful_status_codes?: number[];

  /** Sticky session ID (legacy) */
  session_id?: string;

  /** Enable JavaScript rendering */
  javascript_rendering?: boolean;

  /** Cookies to include */
  cookies?: { key: string; value: string }[];

  /** Request timeout in ms */
  timeout?: number;
}
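
The sessionId and sessionDuration fields work together to keep the same exit IP for a period of time. A minimal sketch of building that pair follows. createStickySession is a hypothetical local helper (the library's own generateSessionId is listed under Random Value Helpers, but its signature is not shown in this section), and the hex ID format is an assumption about what Decodo accepts.

```typescript
import { randomBytes } from 'node:crypto';

// Local subset of the DecodoOptions fields used here.
interface StickySession {
  sessionId: string;
  sessionDuration: number; // seconds, per the interface comment
}

// Hypothetical helper: pairs a random session ID with a duration in seconds.
// The 16-character hex format is an assumption, not a documented requirement.
function createStickySession(durationSeconds: number): StickySession {
  if (!Number.isInteger(durationSeconds) || durationSeconds <= 0) {
    throw new RangeError('durationSeconds must be a positive integer');
  }
  return {
    sessionId: randomBytes(8).toString('hex'),
    sessionDuration: durationSeconds
  };
}

// Reuse the same object across requests that should share one IP.
const session = createStickySession(600);
console.log(session.sessionDuration); // 600
```

Spreading the same session object into several visitDecodo calls is what would keep them on one session; creating a new object starts a new one.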

Headless Modes

Decodo supports three headless rendering modes:

Mode    Description
html    Returns rendered HTML after JavaScript execution
png     Returns a PNG screenshot of the page
pdf     Returns a PDF rendering of the page
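
Since html yields text while png and pdf yield binary data, code that stores results usually needs to branch on the mode. The sketch below maps each mode to a file extension and MIME type. OUTPUT_BY_MODE and outputFileName are hypothetical names, and how Rezo exposes the response body for each mode is not covered here.

```typescript
type HeadlessMode = 'html' | 'png' | 'pdf';

interface OutputKind {
  extension: string;
  contentType: string;
  binary: boolean; // true when the payload should be written as bytes, not text
}

// Hypothetical lookup table; the MIME types are the standard ones for each format.
const OUTPUT_BY_MODE: Record<HeadlessMode, OutputKind> = {
  html: { extension: 'html', contentType: 'text/html', binary: false },
  png: { extension: 'png', contentType: 'image/png', binary: true },
  pdf: { extension: 'pdf', contentType: 'application/pdf', binary: true }
};

// Builds a file name such as "report.pdf" for a given mode.
function outputFileName(baseName: string, mode: HeadlessMode): string {
  return `${baseName}.${OUTPUT_BY_MODE[mode].extension}`;
}

console.log(outputFileName('report', 'pdf'));  // report.pdf
console.log(OUTPUT_BY_MODE.png.contentType);   // image/png
```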

Per-Request Overrides

As with Oxylabs, Decodo options can be overridden per request:

await crawler.visitDecodo('https://example.com/page', {
  country: 'Germany',
  city: 'Berlin',
  headless: 'html',
  javascriptWait: 3000,
  waitForCss: '.content-loaded'
});

Geographic Targeting

Both services support extensive geographic targeting. The available locations are defined in each service's options file.

Countries

Both Oxylabs and Decodo support more than 200 countries. Pass the country name as geoLocation for Oxylabs or as country for Decodo:

// Oxylabs
options.addOxylabs({
  domain: 'example.com',
  options: {
    username: 'user',
    password: 'pass',
    geoLocation: 'Japan'
  }
});

// Decodo
options.addDecodo({
  domain: 'example.com',
  options: {
    username: 'user',
    password: 'pass',
    country: 'Japan',
    city: 'Tokyo'
  }
});

US State Targeting

Both services support targeting all 50 US states. With Decodo, pass the state alongside the country:

options.addDecodo({
  domain: 'example.com',
  options: {
    username: 'user',
    password: 'pass',
    country: 'United States',
    state: 'California',
    city: 'Los Angeles'
  }
});

Combining Services

You can use different proxy services for different domains:

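// Crawler is assumed to be exported from the same module as CrawlerOptions
import { Crawler, CrawlerOptions } from 'rezo/crawler';

const options = new CrawlerOptions({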
  baseUrl: 'https://example.com',
  concurrency: 30
});

// Use Oxylabs for the main target
options.addOxylabs({
  domain: 'example.com',
  options: {
    username: 'oxy-user',
    password: 'oxy-pass',
    geoLocation: 'United States',
    browserType: 'desktop_chrome'
  },
  queueOptions: { concurrency: 5 }
});

// Use Decodo for a secondary target
options.addDecodo({
  domain: 'api.partner.com',
  options: {
    username: 'decodo-user',
    password: 'decodo-pass',
    country: 'Germany',
    headless: 'html'
  },
  queueOptions: { concurrency: 3 }
});

// Domains without a proxy entry are accessed directly (no proxy)
const crawler = new Crawler(options);
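
The routing implied by this setup can be sketched as a lookup from a URL's hostname to a service. The version below assumes exact hostname matching, with an isGlobal rule as the fallback and direct access otherwise. Rezo's actual matching rules (for example subdomain or pattern handling) are not described in this section, and ProxyRule and resolveService are hypothetical names.

```typescript
type Service = 'oxylabs' | 'decodo';

// Hypothetical rule shape mirroring the domain / isGlobal fields used with the builder.
interface ProxyRule {
  service: Service;
  domain?: string;
  isGlobal?: boolean;
}

// Returns the service for a URL, or 'direct' when no rule applies.
// Assumption: exact hostname match first, then the first global rule.
function resolveService(url: string, rules: ProxyRule[]): Service | 'direct' {
  const host = new URL(url).hostname;
  for (const rule of rules) {
    if (rule.domain === host) {
      return rule.service;
    }
  }
  for (const rule of rules) {
    if (rule.isGlobal) {
      return rule.service;
    }
  }
  return 'direct';
}

const rules: ProxyRule[] = [
  { service: 'oxylabs', domain: 'example.com' },
  { service: 'decodo', domain: 'api.partner.com' }
];

console.log(resolveService('https://example.com/products', rules)); // oxylabs
console.log(resolveService('https://api.partner.com/v1', rules));   // decodo
console.log(resolveService('https://other.org/', rules));           // direct
```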

Random Value Helpers

Both integrations provide helper functions for randomized parameters:

// A single import avoids declaring getRandomLocale twice in one file
import {
  // Oxylabs helpers
  getRandomBrowserType,
  getRandomGeoLocation,
  // Decodo helpers
  getRandomDeviceType,
  getRandomCountry,
  getRandomCity,
  generateSessionId,
  // Shared by both integrations
  getRandomLocale
} from 'rezo/crawler';
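
The signatures of these helpers are not shown in this section. As a rough illustration of what a randomizing helper does, the sketch below picks a uniformly random entry from a list. pickRandom is a hypothetical stand-in, not one of the exported helpers, and it accepts an injectable random source so the behaviour can be tested deterministically.

```typescript
// Hypothetical stand-in for a "get random value" helper.
// `random` must return a number in [0, 1), like Math.random.
function pickRandom<T>(values: readonly T[], random: () => number = Math.random): T {
  if (values.length === 0) {
    throw new RangeError('values must not be empty');
  }
  return values[Math.floor(random() * values.length)];
}

const locales = ['en-us', 'de-de'] as const;

// A different locale may be chosen on each call.
const locale = pickRandom(locales);
console.log(locale === 'en-us' || locale === 'de-de'); // true
```

A value chosen this way can be passed wherever the options accept a plain string, such as locale or browserType.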