Crawler Cheerio is a ready-made solution for crawling the web using plain HTTP requests to retrieve HTML pages and then parsing and inspecting the HTML using the Cheerio NPM package. Cheerio is a server-side version of the popula...
Example showing how to use headless Chromium with Puppeteer to open a web page, determine its dimensions, save a screenshot and print the page to PDF. For more information about Puppeteer, please see https://github.com/GoogleChro...
Act that sends email.
Crawler is a ready-made solution for crawling the web using the Chrome browser. It takes away all the work necessary to set up a browser for crawling, controls the browser automatically and produces machine-readable results in sev...
Act that takes a URL and an array of strings to search for and returns a crawler definition.
Act that uploads results from an Apify crawler to AWS S3. It is designed to be run from the crawler's finish webhook.
Contains a basic boilerplate of an Apify actor with a custom Dockerfile, hosted in a Git repository. Its purpose is to help you get started quickly with creating your own actors.
Example act using PHP as the main language.
Example of loading a web page in headless Chrome using Selenium WebDriver.
Example act that iterates through all results from a crawler run and counts them. The act should be called from the crawler's finish webhook. To do so, simply add the following URL to the finish webhook of your crawler: https://ap...
Crawler Puppeteer is the most powerful crawler tool in our arsenal (aside from developing custom actors). It uses the Puppeteer library to programmatically control a headless Chrome browser and it can make it do almost anything. I...
Development version of apify/crawler. Unstable, untested.
Hello world act demonstrating simple usage of Apify Actor.
Simple example showing how to call another act. It doesn't accept any input and doesn't generate any output.
Example act that opens a webpage with a Golden Gate webcam stream. It takes a screenshot from the stream and saves it as output to the key-value store. You can easily use it as an API that returns a screenshot with: https://api.apify.com/v...
This actor serves as an example of a crawling run using the Live View feature. It crawls through Hacker News page by page, and you can inspect the screenshot or HTML of any page in the Live View panel.
Generates an HTTP Archive (HAR) file for web pages specified in a list of URLs. Optionally, the pages can be loaded using proxies from a specific country; to use this feature, you'll need access to Apify Proxy. On input, the act...
This actor simply tests a given array of URLs against the selected proxy URLs or Apify Proxy groups.
Example of an Apify Actor act stored in a GitHub Gist. This is useful for small projects that have multiple source code files, where creating a full Git repository does not make sense.
This act simply counts up from one. In each run, it prints one number. Its state (the counter position) is stored in a named key-value store. The store is named example-counter, and you can find it in the Apify app under Storages.
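The counting pattern described above can be sketched roughly as follows. This is a minimal illustration in Python, not the act's actual source: a local directory of JSON files stands in for the named key-value store, and the function name `run_counter`, the key `counter` and the field `position` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_counter(store_dir: Path, key: str = "counter") -> int:
    """Simulate one run: load the saved position, advance it, save it back."""
    # The directory is an assumed stand-in for a named key-value store.
    store_dir.mkdir(parents=True, exist_ok=True)
    record = store_dir / f"{key}.json"
    # On the very first run no record exists yet, so counting starts from zero.
    position = json.loads(record.read_text())["position"] if record.exists() else 0
    position += 1
    # Persist the new position so the next run continues where this one stopped.
    record.write_text(json.dumps({"position": position}))
    print(position)
    return position


if __name__ == "__main__":
    # Three consecutive "runs" against the same store print three consecutive numbers.
    store = Path(tempfile.mkdtemp()) / "example-counter"
    for _ in range(3):
        run_counter(store)
```

Because the position lives in the store and not in the process, each run can be short-lived and still pick up where the previous one stopped.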
Returns the diff of two given images as a JPEG or PNG image.
Actor serving as an example of Input Schema. Takes the URL of a website and screenshot configuration parameters as input and outputs a screenshot of the website to the key-value store.
This example demonstrates how to use a web server in an actor as a communication channel with the outside world. Read the article about this crawler in the Apify knowledge base: https://kb.apify.com/actor/running-a-web-server
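As a rough illustration of the idea, and not the actor's actual code, the sketch below runs a small HTTP server in a background thread so that a long-running process can report its progress to callers. It uses only the Python standard library. The `state` dictionary, its `pagesProcessed` field and the helper names are hypothetical, and the server binds to the loopback address on an ephemeral port purely for demonstration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import ProxyHandler, build_opener

# Shared state that the main job updates and the web server exposes (hypothetical field).
state = {"pagesProcessed": 0}


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the current state as JSON.
        body = json.dumps(state).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_status_server(port: int = 0) -> ThreadingHTTPServer:
    """Start the server in a daemon thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(server: ThreadingHTTPServer) -> dict:
    """Query the running server the way an outside caller would."""
    opener = build_opener(ProxyHandler({}))  # bypass any configured HTTP proxy
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    with opener.open(url) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    server = start_status_server()
    state["pagesProcessed"] = 3  # the main job would update this as it works
    print(fetch_status(server))
    server.shutdown()
    server.server_close()
```

In a hosted setting the server would need to listen on whatever address and port the platform exposes; those details are left out here because they depend on the environment.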