Change log

Stay up-to-date and see what's new on the Apify platform

July 2018

  • Actor: The memory option for actor runs now supports only values that are powers of 2 (i.e. 128MB, 256MB, 512MB, 1024MB, 2048MB, ...)!
  • Crawler: The proxy configuration of the crawler now offers an "automatic" mode that rotates all the proxies available to a user.
  • Actor: Each actor run can now start a web server accessible at a certain unique URL. This enables you to run a web server inside the actor to provide real-time snapshots or receive tasks on the fly. See documentation for more details.
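
The power-of-2 memory rule from the first entry above can be illustrated with a short sketch. The function names are hypothetical, and the 128MB lower bound is taken from the June 2018 entry and assumed to still apply; this is not the platform's own validation code.

```python
MIN_MEMORY_MB = 128  # assumption: minimum from the June 2018 entry still applies


def is_valid_memory_mb(memory_mb: int) -> bool:
    """Return True if memory_mb is a power of 2 and at least MIN_MEMORY_MB."""
    # A positive integer is a power of 2 exactly when it has a single set bit.
    is_power_of_two = memory_mb > 0 and (memory_mb & (memory_mb - 1)) == 0
    return is_power_of_two and memory_mb >= MIN_MEMORY_MB


def round_up_memory_mb(memory_mb: int) -> int:
    """Round a requested amount up to the nearest value accepted by the rule."""
    value = MIN_MEMORY_MB
    while value < memory_mb:
        value *= 2
    return value
```

For example, a request for 1000MB would not pass the check, and rounding it up gives 1024MB.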

June 2018

  • API: Added API endpoints to abort an Actor run and build.
  • Proxy: New Apify Proxy service launched!
  • Keboola integration: Added support for running Actors. Check the knowledge base article for more information.
  • Actor: Minimum memory for actor runs is now 128MB.

May 2018

  • SDK: A bunch of improvements and new features. Check the changelog.
  • Crawler: It is no longer possible to combine custom proxies and Apify proxy groups.
  • Actor: Run console now shows information about current/max/average CPU and memory.
  • Actor: Actors are now notified 120s before migration to another worker machine. Check the documentation for more information.

April 2018

  • API: Added a new API endpoint to obtain information about a user account.
  • API: Storage API now also supports use of [username]~[storage-name] instead of Dataset ID and Key-value store ID.
  • CLI: We have just released an Apify CLI (command line tool) to simplify local development, debugging and deployment to Apify.
  • Request queue: New storage type for the Actor platform that helps manage a dynamic queue of URLs to be processed. Check the storage documentation for more information.
  • SDK: The apify NPM package contains a lot of new features. Check its changelog for details.
  • Actor: The limit on the number of processes per actor run was increased to 2 × [memory megabytes], so with 2 GB of memory your limit is 4000 processes.
  • Actor: The host machine now sends a migrating event to the actor process in case of an upcoming restart or shutdown. Check the documentation.
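
The process limit rule above (2 × [memory megabytes]) is simple arithmetic. Here is a minimal sketch, where the function name max_processes is made up for illustration:

```python
def max_processes(memory_mb: int) -> int:
    """Process limit per actor run under the rule 2 x [memory megabytes]."""
    return 2 * memory_mb


# Print the limit for a few memory sizes.
for memory_mb in (128, 512, 2000, 2048):
    print(f"{memory_mb} MB -> {max_processes(memory_mb)} processes")
```

The 4000 figure quoted in the entry matches 2 GB counted as 2000 MB. If 2 GB is counted as 2048 MB, the same rule gives 4096.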

March 2018

  • Actor: actor runs now have a fixed amount of CPU capacity reserved, so each run should take about the same time. We also added a new checkbox "Use spare CPU capacity" in actor settings that allows actors to use spare CPU capacity on the host machine as a free boost.
  • Community: we released a new version of our open source apify npm package containing a lot of new features to help you with your web scraping and automation projects. Check its npm page, the source code in the GitHub repository and the documentation.
  • Actor: apify/actor-node-puppeteer Docker image is now deprecated. Use apify/actor-node-chrome image instead.
  • Actor: we have added apify/actor-node-chrome-xvfb image that supports non-headless Chrome. If you choose this image then Apify.launchPuppeteer() opens Puppeteer with non-headless Chrome by default.
  • API client for JavaScript v0.2.0: The datasets.getItems() method now returns a PaginationList object with the items wrapped inside instead of a plain items array. This helps to iterate through all the items using pagination. This change is not backward compatible!
  • Actor: we improved our infrastructure to speed up actor starts and overall performance.
  • Actor: logs are now rate-limited. Each actor run and build has a log credit of 10 000 lines, with 10 lines added each second. Log lines over the limit won't be available in either the UI or the API.
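
The log rate limit described in the last entry above behaves like a credit counter. The sketch below models it under stated assumptions: the class and method names are hypothetical, lines over the limit are simply dropped, and no upper cap on accumulated credit is modelled because the entry does not mention one.

```python
class LogCredit:
    """Simple model of the log limit: 10 000 lines of credit, 10 more per second."""

    def __init__(self, initial: int = 10_000, refill_per_sec: int = 10):
        self.credit = initial
        self.refill_per_sec = refill_per_sec

    def tick(self, seconds: int = 1) -> None:
        """Add the credit earned over the given number of seconds."""
        # Assumption: credit accumulates without a cap.
        self.credit += self.refill_per_sec * seconds

    def write(self, lines: int) -> int:
        """Try to log `lines` lines; return how many stay within the limit."""
        kept = min(lines, self.credit)
        self.credit -= kept
        return kept


# Simulate a run that logs 100 lines every second for two minutes.
bucket = LogCredit()
kept_per_second = []
for _ in range(120):
    kept_per_second.append(bucket.write(100))
    bucket.tick()
```

In this model, a run that logs 100 lines per second drains the credit at a net 90 lines per second, so it would start losing lines after roughly 111 seconds.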

February 2018

  • Web: launched the Page Analyzer tool to enable setting up crawlers with fewer manual steps. Read more on the Apify blog.
  • Infrastructure: Major improvements to our Linux server configuration to improve stability and performance of the system
  • Actor: actors can now run with 16GB memory (available for users with Medium and Large plans, see https://www.apify.com/docs/actor#limits)
  • Actor: actor runs and their default key-value stores and datasets are now being deleted after data retention period.

January 2018

  • App: We've added support for PayPal payments for all subscription plans
  • Actor: the actor source code can now come from a GitHub Gist, which is much simpler than having a full Git repository (read the docs)
  • Support: We have re-launched the Knowledge base with a new design and much better search options.
  • API: Added API endpoint to run an actor and get its output in a single HTTP request.
  • Actor: We've added a new storage type Dataset. This enables you to store results in a way similar to Apify Crawler.
  • Actor: Actor usage statistics are now available in user account.

December 2017

  • Community: Released the proxy-chain NPM package as open source
  • Actor: Smarter allocation of tasks to servers to improve performance
  • Actor: Environment variables can now also be passed to actor builds (as docker --build-arg parameter)
  • Actor: Added option to automatically restart actor runs on error
  • Crawler: Fixed the URL in the link element of the RSS-formatted last crawler execution result. This bug caused some RSS readers to never refresh the data

November 2017

  • Crawler: Added support for automatic rotation of user agents
  • Open source: Released a new NPM package called proxy-chain to support usage of proxies with password from headless Chrome
  • API: Added support for XLSX output format for crawler results
  • App: Upgraded the web app to Meteor 1.6 and thus greatly improved the speed of the app
  • Internal: Improved internal notifications, performance and infrastructure improvements
  • Actor: Added a feature that allows an actor to be run anonymously

October 2017

  • Apifier is dead, long live Apify! On 9th October we launched our biggest upgrade yet.
  • The old website at www.apifier.com was replaced with public static website www.apify.com and the app running at my.apify.com
  • A new product called Actor was introduced. Read more in our blog
  • Added actor support to scheduler.
  • Git and Zip file source type added to actor.

August 2017

July 2017

  • API: The API endpoint providing results in XML format now allows setting XML tag names.
  • API: Added support for JSONL output format
  • Web: Created Crawler request form to help customers specify the crawlers they would like to have built

June 2017

  • Crawler: Added finish webhook data feature that enables sending of additional info in webhook request payload. (see docs)
  • Web: Added a feature to delete user account

May 2017

  • Internal: Improvements in logging system
  • General: Officially launched Zapier integration
  • Crawler: Added a new context.actId property that enables users to fetch information about their crawler. (see docs)
  • Internal: Consolidated logging in the web application, improvements in Admin interface

April 2017

  • Crawler: Added proxy groups crawler setting to simplify usage of proxy servers (see docs).
  • Web: Added Schedule button to the crawler details page to simplify scheduling of the crawlers
  • Internal: Improvements in administration interface
  • Web: Performance optimizations in UI
  • Web: Added a tool to test the crawler on a single URL only (see Run console on the crawler details page)
  • Internal: Improved reports in admin section
  • Web: Changed Twitter handle from @ApifierInfo to @apifier.
  • Crawler: Bugfix - cookies set in the last page function were not persisted
  • Internal: Deployed some upgrades in data storage infrastructure to improve performance and reduce costs

March 2017

  • Web: Added sorting to Community crawlers.
  • Web: Bugfixes, performance and cosmetic improvements.
  • Internal: improvements in administration interface.
  • Web: Extended public user profile pages in Community crawlers.
  • API: Bugfix in exports of results in XML format.
  • Crawler: Added a new context.actExecutionId property that enables users to stop the crawler during its execution, fetch results etc. (see docs).
  • Web: Improvements in internal administration interface.

February 2017

  • Web: Launched an external Apifier Status page to keep our users informed about system status and potential outages.
  • Web: Numerous improvements on Community crawlers page, added user profile page, enabled anonymous sharing
  • API: Improved sorting of columns in CSV/HTML results table - values are now sorted according to numerical indexes (e.g. "val/0", ..., "val/9", "val/10")
  • Web: Launched Apifier community page
  • General: Invoices are now in the PDF format and are sent to customers by email
  • We didn't launch anything today, just wishing you a happy Valentine's Day
  • Web: New testimonials from ePojisteni.cz and Finbox.io published on our Customers page. Thanks Dušan and Andy!
  • Web: Released a major upgrade of billing and invoicing infrastructure to support European value-added tax (VAT)
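
The column sorting described in the CSV/HTML results entry above (numeric indexes ordered as "val/0", ..., "val/9", "val/10") can be sketched with a sort key that compares digit runs as integers. The helper name column_sort_key is hypothetical, and this is an illustration of the idea rather than the code used by the API.

```python
import re


def column_sort_key(name: str):
    """Split a column name into text and integer parts so "val/10" sorts after "val/9"."""
    parts = [p for p in re.split(r"(\d+)", name) if p]
    # Tag each part so integers and strings are never compared with each other.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]


columns = ["val/10", "val/2", "val/0", "val/9"]
plain_order = sorted(columns)
numeric_order = sorted(columns, key=column_sort_key)
```

A plain string sort places "val/10" right after "val/0", while the numeric key keeps the indexes in counting order.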

January 2017

  • Web: Added a new Video tutorials page
  • Crawler: Improved normalization of URLs which is used by the crawler to determine whether a page has already been visited (see Request.uniqueKey property in docs for more details)
  • Infrastructure: changed CDN provider from CloudFlare to AWS CloudFront to improve performance of web and API
  • API: Bugfix in the start execution API endpoint - synchronous wait would sometimes time out after 60 seconds
  • Internal: further improvements in administration interface
  • Web: improved aggregation of usage statistics, now it refreshes automatically
  • Crawler: Request.proxy is now available even inside of the page function
  • Web: improved Invoices page
  • Internal: improvements in administration interface
  • Web: displaying snapshot of the crawling queue in the Run console

December 2016

  • API: all paginated API endpoints now support desc=1 query parameter to sort records in descending order
  • API: added support for XML attributes in results
  • General: added support for RSS output format to enable creating RSS feeds for any website
  • General: launched a new discussion forum
  • Crawler: custom proxy used by a particular request is now saved in Request.proxy field (see Custom proxies in docs)
  • Crawler: performance improvements
  • API: enabled rate limiting
  • Major API upgrades:
    • added new endpoints to update and delete crawlers
    • support for synchronous execution of crawlers
    • all endpoints that return lists now support pagination
    • API Reference was greatly improved
  • Web: Added new Tag and Do not start crawler if previous still running settings to schedules
  • General: Added new Initial cookies setting to enable users to edit cookies used by their crawlers

November 2016

  • Web: Added a list of invoices to Account page
  • Web: Added a new usage stats chart to Account page
  • Internal: Large improvements in the deployment system completed
  • General: Increased the length limit for Start URLs to 2000 characters
  • Web: Showing more relevant statistics in crawler progress bar
  • Web: Released a new shiny API reference
  • Internal: Performance and usability improvements in admin interface
  • Internal: Migrated our main database to MongoDB 3.2, deployed new integration test suite, new metrics in admin interface

October 2016

  • Web: Showing current service limits on the Account page, various internal improvements in user handling code
  • Web: Added new example crawlers to demonstrate how to use a page's internal JavaScript variables and AJAX calls
  • New feature: Released Schedules that enable you to run crawlers automatically at certain times.
  • Web: Switched to Intercom to manage communication with our users

September 2016

  • Web: Added functionality to test finish webhooks
  • Web: Security fix - added rel="noopener" to all external links in order to avoid exploitation of the window.opener
  • Web: Displaying Internal ID field on crawler details page, and User ID and API token on the Account page to simplify setup of integrations
  • Web: Added a new Jobs page, because we're hiring!
  • Web: Deployed various performance optimizations and bugfixes
  • Internal: Updated our Meteor application to use ES2015 modules
  • Web: Published a new testimonial from Shopwings on our Customers page. Thanks Guillaume!
  • Crawler: queuePosition can now also be overridden in interceptRequest function (see docs)
  • Web: Performance improvements of results exports
  • Web: Added new example crawler to demonstrate a basic SEO analysis tool
  • Internal: Upgraded Meteor platform from version 1.3 to 1.4
  • Docs: Added API property name and type next to each crawler settings (see docs)
  • Crawler: Added a new context.stats property to pass statistics from the current crawler to user code (see docs).
  • Crawler: Added a new signature for the context.enqueuePage() function that enables placing new pages at the beginning of the crawling queue and overriding the uniqueKey and label fields (see docs).
  • Crawler: Enabled users to define a custom User-Agent HTTP header, updated the default value to resemble the latest Chrome on Windows.
  • Web: Implemented an optimization that enables users to export even large result sets to CSV/HTML format.
  • Web: Created this wonderful page to keep our users up-to-date with new features