Jaroslav Hejlek (jaroslavhejlek)

JS for the win!

Actor

kickstarter-search

jaroslavhejlek/kickstarter-search

Wrapper around the Kickstarter search page. Takes a configuration with search filters and outputs a list of the Kickstarter projects that match them. Does not require a proxy.

jaroslavhejlek
5 stars
FEATURED
Actor

act-page-analyzer

jaroslavhejlek/act-page-analyzer

Act that takes a URL and an array of strings to search for and returns a crawler definition.

jaroslavhejlek
7 stars
Actor

bf-upload-to-mysql

jaroslavhejlek/bf-upload-to-mysql

Loads data from crawler executions and uploads it directly to MySQL.

jaroslavhejlek
7 stars
Actor

analyse-pages

jaroslavhejlek/analyse-pages

Analyses the pages provided in the input and outputs XHR requests with their HTML/JSON responses, window variables with their content, and metadata from <meta> tags, schema.org markup, and JSON-LD.

jaroslavhejlek
5 stars
Actor

mongo-import

jaroslavhejlek/mongo-import

jaroslavhejlek
4 stars
Actor

measure-downloaded-bytes

jaroslavhejlek/measure-downloaded-bytes

Example of how to measure downloaded data from network requests made by a webpage.

jaroslavhejlek
2 stars
Actor

kickstarter-location-to-ids

jaroslavhejlek/kickstarter-location-to-ids

Helper actor that queries Kickstarter places and outputs a list of 10 locations with their Kickstarter IDs (needed when querying Kickstarter by location).

jaroslavhejlek
2 stars
Actor

zip-key-value-store

jaroslavhejlek/zip-key-value-store

Takes the ID of a key-value store as the parameter "keyValueStoreId" and archives all keys in that store into a zip file, which is then saved into the act's own key-value store. If there are more than 1000 keys in the store, multiple z...

jaroslavhejlek
2 stars
Actor

bf-mysql-deduplicate

jaroslavhejlek/bf-mysql-deduplicate

Loads data from a MySQL table and creates a new table without duplicates.

jaroslavhejlek
1 star
Actor

cnn-top-stories

jaroslavhejlek/cnn-top-stories

Measures the data traffic generated when CNN top stories are crawled. Can optionally cache responses in memory based on the Cache-Control max-age header, and can optionally block some tracking and analytics requests.

jaroslavhejlek
1 star