Proxy

Apify Proxy provides access to Apify's pool of IP addresses to crawlers, actors, or any other application that supports HTTP proxies. The proxy rotates IP addresses intelligently during web scraping to avoid being blocked by target websites. It supports HTTP as well as other protocols like HTTPS and FTP. You can view your Apify Proxy settings on the Proxy page in the app.

Overview

Apify Proxy automatically rotates IP addresses. For each HTTP or HTTPS request, the proxy takes the list of all IP addresses available to the user and selects the one that was used least recently for the specific hostname. This minimizes the chance of the proxy being blocked.
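The selection rule above can be sketched as follows. This is only an illustration of least-recently-used rotation per hostname, not Apify's actual implementation; the class name LeastRecentlyUsedRotator, its methods, and the logical clock are hypothetical.

```javascript
// Illustrative sketch only: per-hostname least-recently-used IP rotation.
// All names here are hypothetical and not part of any Apify API.
class LeastRecentlyUsedRotator {
    constructor(ipAddresses) {
        if (ipAddresses.length === 0) throw new Error('At least one IP address is required.');
        this.ipAddresses = [...ipAddresses];
        this.lastUsed = new Map(); // hostname -> Map(ip -> logical time of last use)
        this.clock = 0; // logical clock, incremented on every pick
    }

    // Returns the IP address used least recently for the given hostname.
    pick(hostname) {
        if (!this.lastUsed.has(hostname)) this.lastUsed.set(hostname, new Map());
        const usage = this.lastUsed.get(hostname);

        let best = null;
        let bestTime = Infinity;
        for (const ip of this.ipAddresses) {
            // Addresses never used for this hostname sort before all others.
            const time = usage.has(ip) ? usage.get(ip) : -1;
            if (time < bestTime) {
                best = ip;
                bestTime = time;
            }
        }

        this.clock += 1;
        usage.set(best, this.clock);
        return best;
    }
}
```

Because usage is tracked per hostname, an address that was just used for one website can still be the first choice for another.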

Note that by default, each proxied HTTP request may be sent via a different target proxy server. This can add overhead and may be problematic for certain websites. If you want to force the proxy to pick an IP address and then pass all subsequent connections via the same IP address, you can use the session parameter. See Username parameters for more details.

Here's the full list of Apify Proxy features:

  • Performs periodic health checks of proxies in the pool to ensure requests are not forwarded via dead proxies.
  • Rotates IP addresses intelligently so that target hosts are accessed via the proxies that accessed them least recently, in order to reduce the chance of blocking.
  • Periodically checks whether proxies are banned by selected target websites, and if they are, stops forwarding traffic to them in order to get the proxies unbanned as soon as possible.
  • Ensures proxies are located in specific countries using IP geolocation.
  • Allows selection of groups of proxy servers with specific characteristics.
  • Supports persistent sessions that enable you to keep the same IP address for certain parts of your crawls.
  • Measures statistics of traffic for specific users and hostnames.

Connection settings

The following table shows HTTP proxy connection settings for the Apify Proxy.

Proxy type: HTTP
Hostname: proxy.apify.com
Port: 8000
Username: Specifies parameters of the proxy connection. For the default behavior use auto. See Username parameters below for details.
Password: Proxy password. Your password is displayed on the Proxy page in the app. Also, in Apify actors, it is passed as the APIFY_PROXY_PASSWORD environment variable. See actor documentation for more details.
URL: http://<username>:<password>@proxy.apify.com:8000

WARNING: All usage of Apify Proxy with your password will be charged towards your account. Do not share the password with untrusted parties or use it from insecure networks, because the password is sent unencrypted due to the limitations of the HTTP protocol.
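As a minimal sketch, the connection URL can be assembled from the settings above with Node's built-in URL class, which percent-encodes reserved characters such as @ or : if they appear in the password. The function name buildProxyUrl is hypothetical. Note that the URL class appends a trailing slash to the result.

```javascript
// Hypothetical helper: assembles the Apify Proxy connection URL.
// Hostname and port come from the connection settings above.
function buildProxyUrl(username, password) {
    const url = new URL('http://proxy.apify.com:8000');
    // The setters percent-encode characters that are reserved in the
    // userinfo part of a URL (for example "@" or ":").
    url.username = username;
    url.password = password;
    return url.href;
}

// In an Apify actor, the password is available as an environment variable.
const exampleUrl = buildProxyUrl('auto', process.env.APIFY_PROXY_PASSWORD || '<password>');
```

Avoid logging the resulting URL, since it contains your proxy password.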

Username parameters

The HTTP proxy username is used to pass various parameters for the proxy connection. For example, the username can look as follows:

groups-SHADER+DEFAULT,session-rand123456

The following table describes the available parameters:

groups: Specifies which groups of IP addresses should be used. For example: groups-SHADER+DEFAULT. By default, the proxy uses all available proxy servers from all groups the user has access to.
session: If specified, all proxied requests with the same session identifier are routed through the same IP address. For example: session-rand123456. If the IP address is no longer available, another one will be picked. By default, each proxied request is assigned a randomly picked least used IP address. The session string can contain only numbers (0-9), letters (a-z or A-Z), dots (.), underscores (_), and tildes (~), and its maximum length is 50 characters.

Both groups and session parameters are optional. For the default behavior, simply use the following username:

auto
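The username format described above can be sketched as a small helper that joins the optional parts and checks the session string against the documented character set and length limit. The function name buildProxyUsername and the constant SESSION_PATTERN are hypothetical; the validation only mirrors the rules stated in this section and does not check whether the groups exist.

```javascript
// Allowed session characters per the description above: digits, letters,
// dot, underscore, and tilde, with a maximum length of 50 characters.
const SESSION_PATTERN = /^[A-Za-z0-9._~]{1,50}$/;

// Hypothetical helper: builds the proxy username from optional parameters.
function buildProxyUsername({ groups = [], session } = {}) {
    const parts = [];
    if (groups.length > 0) {
        // Multiple groups are joined with "+", e.g. groups-SHADER+DEFAULT.
        parts.push(`groups-${groups.join('+')}`);
    }
    if (session !== undefined) {
        if (!SESSION_PATTERN.test(session)) {
            throw new Error(`Invalid session identifier: ${session}`);
        }
        parts.push(`session-${session}`);
    }
    // With no parameters, fall back to the default username.
    return parts.length > 0 ? parts.join(',') : 'auto';
}
```

For example, calling buildProxyUsername({ groups: ['SHADER', 'DEFAULT'], session: 'rand123456' }) yields the sample username shown earlier in this section.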

Exclusive proxy groups

Note that certain proxy groups have a special behavior and therefore they cannot be selected together with "normal" proxy groups. Such proxy groups are marked as Exclusive in the Proxy groups section on the Proxy page in the application.

If you attempt to use an exclusive proxy group in combination with other groups, Apify Proxy returns an HTTP error and aborts the connection.

Troubleshooting

To view the status of the connection to Apify Proxy, open the following URL in the browser that uses the proxy:

http://proxy.apify.com/

If the proxy connection works well, the web page shows the status of your connection.

To test that your requests are proxied and that the IP addresses rotate correctly, you can open the following API endpoint via the proxy. It shows information about the client IP address:

https://api.apify.com/v2/browser-info/
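If no browser is at hand, the status page can also be requested from a script. The following is a rough sketch using only Node's built-in http module. It assumes that the proxy accepts Basic authentication in the Proxy-Authorization header, which is how HTTP clients usually transmit credentials given in a proxy URL. The function names are hypothetical, and the sketch only covers plain http:// targets such as the status page, because https:// targets require a CONNECT tunnel.

```javascript
const http = require('http');

// Hypothetical helper: builds http.request() options that send a plain
// HTTP GET for `targetUrl` through Apify Proxy.
function buildProxiedRequestOptions(targetUrl, username, password) {
    const credentials = Buffer.from(`${username}:${password}`).toString('base64');
    return {
        host: 'proxy.apify.com',
        port: 8000,
        method: 'GET',
        // HTTP proxies expect the absolute target URL in the request line.
        path: targetUrl,
        headers: {
            Host: new URL(targetUrl).host,
            // Assumption: the proxy uses Basic proxy authentication.
            'Proxy-Authorization': `Basic ${credentials}`,
        },
    };
}

// Hypothetical helper: resolves with the status code and body of the response.
// It is only defined here, not called.
function fetchViaProxy(targetUrl, username, password) {
    return new Promise((resolve, reject) => {
        const options = buildProxiedRequestOptions(targetUrl, username, password);
        const req = http.request(options, (res) => {
            let body = '';
            res.setEncoding('utf8');
            res.on('data', (chunk) => { body += chunk; });
            res.on('end', () => resolve({ statusCode: res.statusCode, body }));
        });
        req.on('error', reject);
        req.end();
    });
}
```

A call such as fetchViaProxy('http://proxy.apify.com/', 'auto', process.env.APIFY_PROXY_PASSWORD) would then return the status page for inspection.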

Examples

The following sections contain several examples of how to use Apify Proxy in actors.

Usage in PuppeteerCrawler

const Apify = require('apify');

Apify.main(async () => {
    const requestList = new Apify.RequestList({
        sources: [{ url: 'http://www.example.com', method: 'GET', headers: {} }],
    });
    await requestList.initialize(); // Load requests.

    const crawler = new Apify.PuppeteerCrawler({
        requestList,
        launchPuppeteerOptions: {
            useApifyProxy: true,
            apifyProxyGroups: ['SHADER'],
            apifyProxySession: 'my_session_1',
        },
        handlePageFunction: async ({ page, request }) => {
            await Apify.pushData({
                title: await page.title(),
                url: request.url,
                succeeded: true,
            });
        },
        handleFailedRequestFunction: async ({ request }) => {
            await Apify.pushData({
                url: request.url,
                succeeded: false,
                errors: request.errorMessages,
            });
        },
    });

    await crawler.run();
});

Usage in Apify.launchPuppeteer()

const Apify = require('apify');

Apify.main(async () => {
    const password = process.env.APIFY_PROXY_PASSWORD;
    const browser = await Apify.launchPuppeteer({
        proxyUrl: `http://auto:${password}@proxy.apify.com:8000`
    });

    const page = await browser.newPage();

    await page.goto('http://www.example.com');

    const html = await page.content();

    console.log('HTML:');
    console.log(html);
});
    

Usage with request NPM package

const request = require('request');

const password = process.env.APIFY_PROXY_PASSWORD;
request(
    {
        url: 'http://www.example.com',
        method: 'GET',
        // The request package expects the proxy URL in the `proxy` option.
        proxy: `http://groups-SHADER:${password}@proxy.apify.com:8000`
    },
    (err, response, HTML) => {
        if (err) {
            console.error(err);
            return;
        }
        console.log(HTML);
    }
);