Full-Text RSS
=============
### NOTE
This is our public version of Full-Text RSS, available to download for free from <https://bitbucket.org/fivefilters>.

For best extraction results, and to help us sustain the project, you can purchase the most up-to-date version at <http://fivefilters.org/content-only/#download>. If you like this free version, please consider supporting us by purchasing the latest release.

If you have no need for the latest release, but would still like to contribute something, you can donate via [Gittip](https://www.gittip.com/fivefilters/) or [Flattr](https://flattr.com/profile/k1m).
### About
See <http://fivefilters.org/content-only/> for a description of the code.
### Installation
1. Extract the files in this ZIP archive to a folder on your computer.
2. Upload the files to your server, for example via FTP.
3. Access index.php through your browser, e.g. <http://example.org/full-text-rss/index.php>.
4. Enter a URL in the form field to test the code.
5. If you get an RSS feed with full-text content, all is working well. :)
### Configuration (optional)
1. Save a copy of config.php as custom_config.php and edit custom_config.php.
2. If you decide to enable caching, make sure the cache folder and its 2 subfolders are writable. (You might need to change the permissions of these folders to 777 through your FTP client.)
### Site-specific extraction rules
This free version does not contain the site config files we include with purchased copies, but these are now all available [online](https://github.com/fivefilters/ftr-site-config). If you'd like to keep yours up to date using Git, follow the steps below:
1. Change into the site_config/standard/ folder
2. Delete everything in there
3. Using the command line, enter: `git clone https://github.com/fivefilters/ftr-site-config.git .`
4. Git should now download the latest site config files for you.
5. To update the site config files again, you can simply run `git pull` from the directory.
### Code example
If you're developing an application that requires content extraction, you can call Full-Text RSS as a web service from within your application. Here's how to do it in PHP:

    <?php
    // $ftr should be the URL where you installed this application
    $ftr = 'http://example.org/full-text-rss/';
    $article = 'http://www.bbc.co.uk/news/world-europe-21936308';
    $request = $ftr.'makefulltextfeed.php?format=json&url='.urlencode($article);

    // Send HTTP request and get response
    $result = @file_get_contents($request);
    if (!$result) die('Failed to fetch content');
    $json = @json_decode($result);
    if (!$json) die('Failed to parse JSON');

    // What do we have?
    // var_dump($json);

    // Items?
    // var_dump($json->rss->channel->item);

    $title = $json->rss->channel->item->title;
    // Note: this works when you're processing an article.
    // If the input URL is a feed, ->item will be an array.
    echo $title;

### Different language?
Although we don't have examples in other programming languages, the essential steps should be:
1. Construct the request URL using the URL where you installed Full-Text RSS and the article or feed URL (see `$ftr`, `$article` and `$request` in the example above).
2. Fetch the resulting URL using an HTTP GET request.
3. Parse the HTTP response body as JSON and grab what you need.
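
As an unofficial illustration, the three steps above might look roughly like this in Python, using only the standard library. The function names (`build_request_url`, `extract_titles`, `fetch_titles`), the 30-second timeout and the UTF-8 decoding are assumptions made for this sketch. The JSON layout (`rss` → `channel` → `item`) is taken from the PHP example above.

```python
import json
from urllib.parse import quote_plus
from urllib.request import urlopen


def build_request_url(ftr_base, target_url):
    """Step 1: build the request URL (mirrors $ftr, $article and $request)."""
    # ftr_base is assumed to end with a slash, as $ftr does in the PHP example.
    return ftr_base + "makefulltextfeed.php?format=json&url=" + quote_plus(target_url)


def extract_titles(payload):
    """Step 3: pull the item titles out of the decoded JSON."""
    items = payload["rss"]["channel"]["item"]
    # An article URL yields a single item object; a feed URL yields a list.
    if isinstance(items, dict):
        items = [items]
    return [item["title"] for item in items]


def fetch_titles(ftr_base, target_url, timeout=30):
    """Steps 1 to 3 combined: request the JSON output and return the titles."""
    request_url = build_request_url(ftr_base, target_url)
    # Step 2: a plain HTTP GET request (timeout value is an assumption).
    with urlopen(request_url, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return extract_titles(payload)
```

Calling `fetch_titles("http://example.org/full-text-rss/", "http://www.bbc.co.uk/news/world-europe-21936308")` should return a list of titles, with one entry for an article and several for a feed. Replace the first argument with the URL where you installed Full-Text RSS.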