Actor

mtrunkat/crawl-url-list-1by1

  • Builds
  • latest 0.0.37 / 2018-02-05
  • Created 2017-08-11
  • Last modified 2018-09-05
  • grade 11

Description

Crawls a given list of URLs, with one crawler execution per URL.


API

To run the actor, send an HTTP POST request to:

https://api.apify.com/v2/acts/mtrunkat~crawl-url-list-1by1/runs?token=<YOUR_API_TOKEN>

The POST payload will be passed as input for the actor. For more information, read the docs.
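The request described above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names build_run_request and start_run are hypothetical, the token value is a placeholder, and the shape of the JSON response is not described on this page, so the code returns it undecoded beyond JSON parsing.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2/acts"
ACT_ID = "mtrunkat~crawl-url-list-1by1"


def build_run_request(token, act_input):
    """Build (but do not send) the POST request that starts a run of the actor."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/{ACT_ID}/runs?{query}"
    # The POST payload is passed to the actor as its input.
    body = json.dumps(act_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(token, act_input, timeout=30):
    """Send the request and return the parsed JSON response (shape assumed, not documented here)."""
    request = build_run_request(token, act_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values; replace them with a real token and crawler ID before calling start_run.
    example_input = {
        "urlList": ["http://example.com"],
        "crawlerId": "ytXL3jaRKwrfWC9tz",
        "concurrency": 2,
    }
    request = build_run_request("my-secret-token", example_input)
    print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it possible to inspect the URL and payload before any network call is made.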


Example input

Content type: application/json

{
    "urlListFile": "@TODO",
    "crawlerId": "@TODO",
    "concurrency": 2
}

Readme

Crawl Url List 1 by 1

An Apify.com act that takes a list of URLs and starts the given crawler once for each URL.

The act is published on Apify.com as mtrunkat/crawl-url-list-1by1.

You can start this act by sending a POST request to the following URL, with its input as the JSON payload:

https://api.apify.com/v2/acts/mtrunkat~crawl-url-list-1by1/runs?token=[YOUR_API_TOKEN]

Example input:

You can either send the URL of a publicly hosted file containing your URL list (one URL per line):

{
    "urlListFile": "http://example.com/urllist.txt",
    "crawlerId": "ytXL3jaRKwrfWC9tz",
    "concurrency": 2
}

Or you can pass the URLs directly:

{
    "urlList": ["http://example.com", "http://google.com"],
    "crawlerId": "ytXL3jaRKwrfWC9tz",
    "concurrency": 2
}
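If the URL list lives in a local file rather than at a public URL, it can be turned into the urlList form shown above. The following is a minimal sketch; parse_url_list and build_input are hypothetical helper names, and skipping blank lines is an assumption made for this example, not documented behaviour of the act.

```python
import json


def parse_url_list(text):
    """Split text with one URL per line into a list, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_input(crawler_id, concurrency, urls):
    """Assemble the input object that passes URLs directly via urlList."""
    return {
        "urlList": list(urls),
        "crawlerId": crawler_id,
        "concurrency": concurrency,
    }


if __name__ == "__main__":
    # Inline text stands in for the contents of a local URL list file.
    text = "http://example.com\n\nhttp://google.com\n"
    payload = build_input("ytXL3jaRKwrfWC9tz", 2, parse_url_list(text))
    print(json.dumps(payload, indent=4))
```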

Possible options:

The options crawlerId, concurrency, and one of urlListFile or urlList are required.

Option           Type    Description
urlListFile      String  URL of a text file containing the URLs to crawl, one URL per line.
urlList          Array   Array of URLs to crawl.
crawlerId        String  ID of the crawler to start for each URL.
concurrency      Number  Number of crawler executions to run concurrently.
crawlerSettings  Object  Overrides of crawler settings passed to the startExecution call.
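The required options and types listed above can be checked on the client side before a run is started. This is a sketch under stated assumptions: validate_input is a hypothetical helper, the mapping from the listed types to Python types is my own, and the act may apply different or stricter checks. Whether urlListFile and urlList may be supplied together is not stated, so the sketch only requires that at least one is present.

```python
# Python types assumed to correspond to the types in the options list.
OPTION_TYPES = {
    "urlListFile": str,
    "urlList": list,
    "crawlerId": str,
    "concurrency": (int, float),
    "crawlerSettings": dict,
}


def validate_input(act_input):
    """Return a list of problems found in act_input; an empty list means none were found."""
    problems = []
    for name in ("crawlerId", "concurrency"):
        if name not in act_input:
            problems.append(f"missing required option: {name}")
    if "urlListFile" not in act_input and "urlList" not in act_input:
        problems.append("one of urlListFile or urlList is required")
    for name, value in act_input.items():
        expected = OPTION_TYPES.get(name)
        if expected is None:
            continue  # Options not in the list are ignored by this sketch.
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for option: {name}")
    return problems


if __name__ == "__main__":
    print(validate_input({"crawlerId": "ytXL3jaRKwrfWC9tz"}))
```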