Article Text Extractor

mtrunkat/article-text-extractor

Extracts the article text and other metadata from a given URL. Uses https://github.com/ageitgey/node-unfluff, which is a Node.js implementation of https://github.com/grangier/python-goose. See also lukaskrivka/article-extractor-smart.

The output is saved to the default key-value store under the OUTPUT key. The HTML of the given page is stored under the page.html key.
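As a rough illustration of reading those two records back, here is a minimal Python sketch. It assumes the Apify public API v2 endpoint for key-value store records; the store ID is a placeholder you would take from your own run, and the helper names (record_url, fetch_record, fetch_article) are hypothetical, not part of this actor.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: records are served by the Apify public API v2.
API_BASE = "https://api.apify.com/v2"


def record_url(store_id: str, key: str) -> str:
    """Build the URL of a single record in a key-value store."""
    return (
        f"{API_BASE}/key-value-stores/"
        f"{urllib.parse.quote(store_id, safe='')}/records/"
        f"{urllib.parse.quote(key, safe='')}"
    )


def fetch_record(store_id: str, key: str, token: Optional[str] = None) -> bytes:
    """Download the raw bytes of one record (performs a network request)."""
    request = urllib.request.Request(record_url(store_id, key))
    if token:
        # An API token is typically only needed when the store is not public.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return response.read()


def fetch_article(store_id: str, token: Optional[str] = None) -> dict:
    """Fetch and decode the JSON stored under the OUTPUT key."""
    return json.loads(fetch_record(store_id, "OUTPUT", token))
```

With a real store ID, fetch_article(store_id) would return the parsed OUTPUT record, and fetch_record(store_id, "page.html") the stored page HTML as bytes.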

Example output:

{
  "title": "Sánchez no logra extender su poder territorial pese al triunfo del 26-M",
  "softTitle": "Sánchez no logra extender su poder territorial pese al triunfo del 26-M",
  "date": "16/06/2019 22:03",
  "author": [
    "Madrid"
  ],
  "publisher": "La Vanguardia",
  "copyright": "La Vanguardia Ediciones Todos los derechos reservados",
  "favicon": "https://www.lavanguardia.com/rsc/images/ico/favicon.ico",
  "description": "El PSOE ganó el pasado 26 de mayo las elecciones municipales y autonómicas de manera 'clara y rotunda', según celebró el propio Pedro Sánchez aquella misma noche. Aunque la victoria socialista se tiñó...",
  "lang": "es",
  "canonicalLink": "https://www.lavanguardia.com/politica/20190617/462906149711/psoe-pedro-sanchez-elecciones-26m-alcaldias-gobiernos-espana.html",
  "tags": [],
  "image": "https://www.lavanguardia.com/r/GODO/LV/p6/WebSite/2019/06/17/Recortada/20190614-636961455890161857_20190614215051428-kvhE-U462903686315FDE-992x558@LaVanguardia-Web.jpg",
  "videos": [],
  "links": [],
  "text": "..."
}
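A record shaped like the example above could be consumed along these lines. This is only a sketch: the summarize helper is hypothetical, the sample is a trimmed copy of the example output, and the date format is assumed from that single example (other sites may yield a different format, in which case the parsed date is left as None).

```python
import json
from datetime import datetime

# Trimmed copy of the example output above, used only for illustration.
SAMPLE = """{
  "title": "Sánchez no logra extender su poder territorial pese al triunfo del 26-M",
  "date": "16/06/2019 22:03",
  "author": ["Madrid"],
  "publisher": "La Vanguardia",
  "lang": "es",
  "tags": [],
  "text": "..."
}"""


def summarize(raw: str) -> dict:
    """Pick a few fields from an OUTPUT record and normalize them."""
    article = json.loads(raw)
    try:
        # Assumption: day/month/year with 24-hour time, as in the example.
        published = datetime.strptime(article.get("date", ""), "%d/%m/%Y %H:%M")
    except ValueError:
        published = None
    return {
        "title": article.get("title"),
        "authors": ", ".join(article.get("author", [])),
        "publisher": article.get("publisher"),
        "lang": article.get("lang"),
        "published": published,
        "text_length": len(article.get("text", "")),
    }


summary = summarize(SAMPLE)
print(summary["title"], "|", summary["publisher"], "|", summary["published"])
```

Using .get with defaults keeps the sketch tolerant of missing fields, since which keys are present for a given page is not guaranteed here.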

Developer

Marek Trunkát

Actor metrics

23 monthly users
98.2% runs succeeded
0.0 days response time
Created in Mar 2018
Modified 7 months ago

Categories

News

Twitter Scraper

quacker/twitter-scraper

Scrape tweets from any Twitter user profile. Top Twitter API alternative to scrape Twitter hashtags, threads, replies, followers, images, videos, statistics, and Twitter history. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Quacker

23.8k

Google Trends Scraper

emastra/google-trends-scraper

Scrape data from Google Trends by search terms or URLs. Specify locations, define time ranges, select categories to get interest by subregion and over time, related queries and topics, and more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Emiliano Mastragostino

3.1k

Twitter URL Scraper

quacker/twitter-url-scraper

Copy any Twitter URL and extract Twitter usernames, profile photos, follower count, tweets, hashtags, favorite count, and more. Export scraped datasets, run the scraper via API, schedule and monitor runs or integrate with other tools.

Quacker

4.2k

Smart Article Extractor

lukaskrivka/article-extractor-smart

📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as HTML table, JSON, Excel, RSS feed, and more.

Lukáš Křivka

3.2k

Reddit Scraper Lite

trudax/reddit-scraper-lite

Pay Per Result, unlimited Reddit web scraper to crawl posts, comments, communities, and users without login. Limit web scraping by number of posts or items and extract all data in a dataset in multiple formats.

Gustavo Rudiger

2.8k

Transfermarkt Scraper

curious_coder/transfermarkt

⚽ Use this free tool as an API for the Transfermarkt website. Scrape and extract data from competition, club or player pages, or almost any Transfermarkt page. Download your data as HTML table, JSON, CSV, Excel, XML, and RSS feed.

Curious Coder

2.4k

Dun & Bradstreet Scraper

epctex/dnb-scraper

Effortlessly extract valuable company information, financial projections, industry insights, and more from the extensive Dun & Bradstreet commercial database. Dive deep into the D&B Data Cloud, Business Directory, articles, companies, and industries with customized search terms.

epctex

1.5k

Dark Web Scraper

epctex/darkweb-scraper

Uncover valuable insights with our Dark Web Scraper. Extract sensitive data, including crypto wallets, API keys, emails, phone numbers, and more, from the depths of the Dark Web. You can specify search terms, and customize and retrieve OSINT data out of the box.

epctex

852