Don't know if anybody here already knows flexget (http://flexget.com): it's a multi-purpose download automation tool that can take input from RSS feeds, Twitter timelines, etc.
It's also open source (MIT licensed). As flexget is Python-based, it runs on Linux, Windows, and Mac OS X.
I just created a new configuration file for automated downloads of recently added items belonging to specific collections on the LMA (Live Music Archive) and wanted to share this recipe:
The trick here is that archive.org appears to use some sort of download redirection (possibly for load balancing). Because of that, you can't use flexget's own download plug-in to actually fetch the newly found items; you need a command-line tool like curl or wget instead.
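In case it helps to see what such a redirect looks like from the client side, here is a rough sketch in plain Python. It starts a throwaway local server that answers with "302" and a Location header, then fetches the file with a client that follows the redirect, which is what wget does by default. The paths `/compress/item.zip` and `/mirror/item.zip`, the handler class and the function names are all made up for illustration; this is not how archive.org is actually set up.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Toy server: the first (hypothetical) path redirects to the second one."""

    def do_GET(self):
        if self.path == "/compress/item.zip":
            # Answer with a temporary redirect instead of the file itself
            self.send_response(302)
            self.send_header("Location", "/mirror/item.zip")
            self.end_headers()
        elif self.path == "/mirror/item.zip":
            body = b"fake archive bytes"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_following_redirects(url):
    """Fetch url, following redirects; return (final_url, body)."""
    # An empty ProxyHandler bypasses any proxy configured in the environment
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as resp:
        return resp.geturl(), resp.read()


def demo():
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_following_redirects(f"http://127.0.0.1:{port}/compress/item.zip")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    final_url, body = demo()
    print(final_url, len(body))
```

The URL you end up at differs from the one you asked for, so whatever does the download has to follow the Location header rather than stop at the first response.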
N.B.: As this config file (config.yml) is written in a data format called YAML, please make sure to get the syntax (especially the indentation) right when editing it (https://docs.ansible.com/ansible/YAMLSyntax.html). Otherwise, flexget won't load the config file properly...
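One classic way to break the indentation is a stray tab, since YAML only allows spaces for indenting. Here is a small sketch in plain Python that lists the lines of a config whose leading whitespace contains a tab. The function name is my own invention, and it only catches this one kind of mistake; it is no substitute for a real YAML parser.

```python
def find_tab_indented_lines(text):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


if __name__ == "__main__":
    # Assumes config.yml sits in the current directory
    with open("config.yml", encoding="utf-8") as handle:
        print(find_tab_indented_lines(handle.read()))
```

As far as I know, running `flexget check` also validates the config file and reports errors, so that is worth a try after every edit.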
tasks:
  lma:
    rss:
      url: http://archive.org/services/collection-rss.php?collection=etree&query=%28%28collection%3Aetree%20OR%20mediatype%3Aetree%29%20AND%20-collection%3AGratefulDead%29%20AND%20-mediatype%3Acollection
      link: guid
    # Copy the following details from the .archive.org cookie settings in your web browser
    headers:
      Cookie: "ui3=<your_id>; visited=<your_timestamp>"
    # Define filters for acceptable items (artist name, date, bit depth, recording gear...)
    regexp:
      accept:
        - \b(mogwai)\b.*((20(?:1[5-9]|[2-9]\d)){1}-(0?[1-9]|1[0-2]){1}-(0{1}[1-9]|[1-2][0-9]|3[01]){1}):
            from: title
        - \b(bardo)[\s]+(pond)\b.*((20(?:1[5-9]|[2-9]\d)){1}-(0?[1-9]|1[0-2]){1}-(0{1}[1-9]|[1-2][0-9]|3[01]){1}):
            from: title
      reject:
        - \b(24bit)\b\s\b(flac)\b:
            from: description
    content_filter:
      reject:
        - '*.mp3'
    urlrewrite:
      lma:
        regexp: 'http://archive.org/details/(?P<id>.*)'
        format: 'http://archive.org/compress/\g<id>/formats=FLAC&file=/\g<id>.zip'
    # Use wget instead of the internal download plugin because of download redirection ("302 Moved Temporarily")!
    exec: nohup wget "{{url}}" -a /var/log/flexget/lma.log -P /home/<username>/LMA/DOWNLOADS/ &
    # Don't download the new item, if the file already exists
    exists:
      - /home/<username>/LMA/DOWNLOADS
    # Don't download the item, if there's less than 500 MB left (this specific plug-in works on Linux only!)
    free_space:
      path: /home
      space: 500
    # Send a mail notification when a new item has been downloaded
    email:
      active: True
      from: <sender_address@isp.tld>
      to: <recipient_address@isp.tld>
      smtp_host: <smtp_server_name>
      smtp_port: <smtp_server_port>
      smtp_username: <your_mail_server_login>
      smtp_password: <your_smtp_password>
      smtp_tls: yes
      template: accepted
# Polling intervals of the individual tasks
schedules:
  - tasks: lma
    schedule:
      day_of_week: mon-fri
      minute: "*/15"
      hour: 0-5,7-23
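If you'd like to sanity-check the accept filters and the URL rewrite before letting flexget loose, here is a quick sketch in plain Python. The patterns are copied from the config above; the sample titles and the item id `example_item` are made up. I'm also assuming the matching is case-insensitive, which is how I understand flexget's regexp plug-in to behave, so treat this as an approximation of what flexget does rather than the real thing.

```python
import re

# Date part shared by both accept patterns (years 2015-2099, month, day)
DATE = r"((20(?:1[5-9]|[2-9]\d)){1}-(0?[1-9]|1[0-2]){1}-(0{1}[1-9]|[1-2][0-9]|3[01]){1})"

# Accept patterns from the regexp section of the config
ACCEPT = [
    r"\b(mogwai)\b.*" + DATE,
    r"\b(bardo)[\s]+(pond)\b.*" + DATE,
]

# Rewrite rule from the urlrewrite section of the config
REWRITE_FROM = r"http://archive.org/details/(?P<id>.*)"
REWRITE_TO = r"http://archive.org/compress/\g<id>/formats=FLAC&file=/\g<id>.zip"


def is_accepted(title):
    """True if any accept pattern is found in the title.

    Assumption: case-insensitive search, mimicking flexget's regexp plug-in.
    """
    return any(re.search(pattern, title, re.IGNORECASE) for pattern in ACCEPT)


def rewrite(url):
    """Turn a details URL into the matching FLAC zip download URL."""
    return re.sub(REWRITE_FROM, REWRITE_TO, url)


if __name__ == "__main__":
    # Hypothetical feed titles and item id, for illustration only
    print(is_accepted("Mogwai Live at Some Venue on 2016-05-14"))
    print(is_accepted("Mogwai Live at Some Venue on 2014-05-14"))
    print(rewrite("http://archive.org/details/example_item"))
```

The 2016 title passes and the 2014 one doesn't, because the year part of the pattern only allows 2015 and later. The rewritten URL is the one that ends up in {{url}} for the wget call.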