- Actor: Smarter allocation of tasks to servers to improve performance
- Actor: Environment variables can now also be passed to act builds
- Actor: Added option to automatically restart act runs on error
- Crawler: added support for automatic rotation of user agents
- Open source: released a new NPM package called proxy-chain to support using password-protected proxies from headless Chrome
- API: added support for XLSX output format for crawler results
- App: Upgraded the web app to Meteor 1.6 and thus greatly improved the speed of the app
- Internal: improved internal notifications, performance and infrastructure improvements
- Apifier is dead, long live Apify! On 9th October we launched our biggest upgrade yet.
- The old website at www.apifier.com was replaced with the public static website www.apify.com and the app running at my.apify.com
- A new product called Actor was introduced. Read more in our blog
- Added actor support to scheduler.
- Added Git and Zip file source types to actor.
- API: The endpoint providing results in XML format now allows setting XML tag names.
- API: Added support for JSONL output format
- Web: Created Crawler request form to help customers specify the crawlers they would like to have built
- Crawler: Added a finish webhook data feature that enables sending additional info in the webhook request payload (see docs).
- Web: Added a feature to delete user account
- Internal: Improvements in logging system
- General: Officially launched Zapier integration
- Crawler: Added a new context.actId property that enables users to fetch information about their crawler (see docs).
- Internal: Consolidated logging in the web application, improvements in Admin interface
- Crawler: Added proxy groups crawler setting to simplify usage of proxy servers (see docs).
- Web: Added Schedule button to the crawler details page to simplify scheduling of the crawlers
- Internal: Improvements in administration interface
- Web: Performance optimizations in UI
- Web: Added a tool to test the crawler on a single URL only (see Run console on the crawler details page)
- Internal: Improved reports in admin section
- Web: Changed Twitter handle from @ApifierInfo to @apifier.
- Crawler: Bugfix - cookies set in the last page function were not persisted
- Internal: Deployed some upgrades in data storage infrastructure to improve performance and reduce costs
- Web: Added sorting to Community crawlers.
- Web: Bugfixes, performance and cosmetic improvements.
- Internal: improvements in administration interface.
- Web: Extended public user profile pages in Community crawlers.
- API: Bugfix in exports of results in XML format.
- Crawler: Added a new context.actExecutionId property that enables users to stop the crawler during its execution, fetch results, etc. (see docs).
- Web: Improvements in internal administration interface.
- Web: Launched an external Apifier Status page to keep our users informed about system status and potential outages.
- Web: Numerous improvements on Community crawlers page, added user profile page, enabled anonymous sharing
- API: Improved sorting of columns in CSV/HTML results table - values are now sorted according to numerical indexes (e.g. "val/0", ..., "val/9", "val/10")
- Web: Launched Apifier community page
- General: Invoices are now in the PDF format and are sent to customers by email
- We didn't launch anything today, just wishing you a happy Valentine's Day
- Web: New testimonials from ePojisteni.cz and Finbox.io published on our Customers page. Thanks Dušan and Andy!
- Web: Released a major upgrade of billing and invoicing infrastructure to support European value-added tax (VAT)
- Web: Added a new Video tutorials page
- Crawler: Improved normalization of URLs which is used by the crawler to determine whether a page has already been visited (see Request.uniqueKey property in docs for more details)
- Infrastructure: changed CDN provider from CloudFlare to AWS CloudFront to improve performance of web and API
- API: Bugfix in the start execution API endpoint - synchronous wait would sometimes time out after 60 seconds
- Internal: further improvements in administration interface
- Web: improved aggregation of usage statistics, which now refresh automatically
- Crawler: Request.proxy is now available even inside of the page function
- Web: improved Invoices page
- Internal: improvements in administration interface
- Web: displaying snapshot of the crawling queue in the Run console
- API: all paginated API endpoints now support the desc=1 query parameter to sort records in descending order
- API: added support for XML attributes in results
- General: added support for RSS output format to enable creating RSS feeds for any website
- General: launched a new discussion forum
- Crawler: custom proxy used by a particular request is now saved in the Request.proxy field (see Custom proxies in docs)
- Crawler: performance improvements
- API: enabled rate limiting
- Major API upgrades:
  - added new endpoints to update and delete crawlers
  - support for synchronous execution of crawlers
  - all endpoints that return lists now support pagination
  - API Reference was greatly improved
- Web: Added new Tag and Do not start crawler if previous still running settings to schedules
- General: Added new Initial cookies setting to enable users to edit cookies used by their crawlers
- Web: Added a list of invoices to Account page
- Web: Added a new usage stats chart to Account page
- Internal: Large improvements in the deployment system completed
- General: Increased the length limit for Start URLs to 2000 characters
- Web: Showing more relevant statistics in crawler progress bar
- Web: Released a new shiny API reference
- Internal: Performance and usability improvements in admin interface
- Internal: Migrated our main database to MongoDB 3.2, deployed new integration test suite, new metrics in admin interface
- Web: Showing current service limits on the Account page, various internal improvements in user handling code
- New feature: Released Schedules, which enable crawlers to be run automatically at certain times.
- Web: Switched to Intercom to manage communication with our users
- Web: Added functionality to test finish webhooks
- Web: Security fix - added rel="noopener" to all external links in order to avoid exploitation of the
- Web: Displaying Internal ID field on crawler details page, and User ID and Manage Acts token on the Account page to simplify setup of integrations
- Web: Added a new Jobs page, because we're hiring!
- Web: Deployed various performance optimizations and bugfixes
- Internal: Updated our Meteor application to use ES2015 modules
- Web: Published a new testimonial from Shopwings on our Customers page. Thanks Guillaume!
- Crawler: queuePosition can now also be overridden in the interceptRequest function (see docs)
- Web: Performance improvements of results exports
- Web: Added new example crawler to demonstrate a basic SEO analysis tool
- Internal: Upgraded Meteor platform from version 1.3 to 1.4
- Docs: Added API property name and type next to each crawler settings (see docs)
- Crawler: Added a new context.stats property to pass statistics from the current crawler to user code (see docs).
- Crawler: Added a new signature for the context.enqueuePage() function that enables placing new pages at the beginning of the crawling queue and overriding label fields (see docs).
- Crawler: Enabled users to define a custom User-Agent HTTP header; updated the default value to resemble the latest Chrome on Windows.
- Web: Implemented an optimization that enables users to export even large result sets to CSV/HTML format.
- Web: Created this wonderful page to keep our users up-to-date with new features
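
As an illustration of the column ordering mentioned in the CSV/HTML results entry above ("val/0", ..., "val/9", "val/10"), here is a minimal Python sketch of sorting column names by numerical index. The function name column_sort_key is hypothetical, and this is not the service's actual implementation; it only shows the general idea of comparing runs of digits as numbers instead of as text.

```python
import re


def column_sort_key(name):
    """Build a sort key in which digit runs compare numerically.

    Hypothetical helper for illustration only.
    """
    # Split the name into alternating text and digit runs, e.g.
    # "val/10" -> ["val/", "10", ""]; empty pieces are dropped.
    parts = [p for p in re.split(r"(\d+)", name) if p != ""]
    # Tag each piece so that numbers are only ever compared with numbers
    # and text with text (a number sorts before text at the same position).
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]


columns = ["val/10", "val/9", "val/0", "val/2"]
print(sorted(columns, key=column_sort_key))
# prints ['val/0', 'val/2', 'val/9', 'val/10']
```

A plain string sort would place "val/10" before "val/2", which is the ordering problem that entry describes.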