Grabber
Schedule data for each service is fetched during the grabbing process. The grabber supports various methods for fetching data files. These files can be on a web server, an FTP server or the local filesystem.
The grabber fetches files from the configured source into a target directory.
This target directory is different for each service. The directory is named after
the SID of the service and located inside /var/lib/cherryepg/ingest/.
This directory is created on demand and handled by cherryEPG.
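The directory layout described above can be sketched in a few lines of Python. This is only an illustration: the function name ingest_dir is hypothetical, and it assumes the directory name is the plain decimal SID, which the text does not spell out.

```python
from pathlib import PurePosixPath

# Base directory as named in the text above.
INGEST_ROOT = PurePosixPath("/var/lib/cherryepg/ingest")


def ingest_dir(sid: int) -> PurePosixPath:
    """Return the assumed target directory for the service with this SID.

    Assumption: the directory name is the decimal SID without padding.
    """
    return INGEST_ROOT / str(sid)


print(ingest_dir(42))
```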
The grabber is configured in the scheme file. The columns relevant for configuring the grabber are URL and Update.
URL
Here we configure the source for data fetching.
- String starting with http:// specifies a web address accessible via http.
e.g.:
http://epg.cherryhill.eu/epgdata/schedule29.xml
Before fetching, the given URL is passed to strftime, which inserts date/time related information.
e.g.:
http://sportchannel.example.com/schedule/basket-%Y-%m-%d.xml will fetch basket-2020-07-25.xml when the current date is 25 July 2020.
- String starting with ftp:// specifies an FTP server.
e.g.:
ftp://username:password@ftp.exampleserver.com/epgdata.html
- All other files are assumed to be on the local filesystem in the directory /var/lib/cherryepg/stock/ and are copied from this location. The files are fetched with a “smart copy” tool (rsync). The typical Linux wildcards can be used.
e.g.:
news_*.xml - this will copy all files in the directory /var/lib/cherryepg/stock/ whose names start with news_ and have the extension .xml
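The rules for the URL column can be illustrated with a short Python sketch. It is not cherryEPG's own code: the helper names classify_source and expand_url are hypothetical, Python's strftime stands in for the strftime call mentioned above, and fnmatch is used only to mimic shell-style wildcards (rsync's own matching may differ in details).

```python
from datetime import date
from fnmatch import fnmatch


def classify_source(url: str) -> str:
    """Decide how a URL column value would be fetched, based on its prefix."""
    if url.startswith("http://"):
        return "http"
    if url.startswith("ftp://"):
        return "ftp"
    # Everything else is treated as a file pattern in the local stock directory.
    return "local"


def expand_url(url: str, today: date) -> str:
    """Insert date information into the URL via strftime placeholders."""
    return today.strftime(url)


url = "http://sportchannel.example.com/schedule/basket-%Y-%m-%d.xml"
print(classify_source(url))
print(expand_url(url, date(2020, 7, 25)))

# Shell-style wildcard check for a local pattern such as news_*.xml.
print(fnmatch("news_monday.xml", "news_*.xml"))
```

With the date 25 July 2020, the expanded URL ends in basket-2020-07-25.xml, which matches the example in the list above.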
Update
Grabbing is started on a regular basis. How often it runs depends on the data source. With the parameter
in the Update field you can select the update interval. The available update intervals are defined in the cherryEPG
user crontab file. You can view it by running crontab -l or edit it with crontab -e.
Possible values are:
- monthly - will be run at the end of the month,
- weekly - will be run on Friday night,
- daily - will be run daily at 2:35 am,
- hourly - will be run 5 minutes before the hour.
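To make the daily and hourly timings concrete, the following Python sketch computes the next run time for these two intervals. It is an illustration only: the function names are hypothetical, the real schedule lives in the crontab file, and the monthly and weekly intervals are left out because the text gives no exact time for them.

```python
from datetime import datetime, timedelta


def next_hourly_run(now: datetime) -> datetime:
    """Next run of the hourly interval: 5 minutes before the hour (minute 55)."""
    candidate = now.replace(minute=55, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate


def next_daily_run(now: datetime) -> datetime:
    """Next run of the daily interval: 2:35 in the morning."""
    candidate = now.replace(hour=2, minute=35, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2020, 7, 25, 10, 56)
print(next_hourly_run(now))
print(next_daily_run(now))
```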
Control from Command-line
The grabber is invoked from the command line with the cherryTool script.
When grabbing by target, the ingester is started automatically.
Grabbing and ingesting of schedule files by target is started by specifying the target, e.g. daily:
cherryTool -G daily
or, to grab and ingest schedule files for all services, by specifying all:
cherryTool -G all
It is also possible to grab schedule files for individual services by specifying the service SID. In this case the ingester is not run.
cherryTool -g 42
It is, however, possible to combine grabbing and ingesting:
cherryTool -gi 42
The files inside a service directory can be deleted with
cherryTool -d 42
It is also possible to clean the whole ingest directory. All directories inside /var/lib/cherryepg/ingest/ will be deleted.
cherryTool -d
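For scripting, the command lines shown in this section can be assembled programmatically. The Python sketch below only builds the argument lists and does not run anything. The helper names are hypothetical, and the only flags used are the ones listed above (-G, -g, -gi, -d).

```python
from typing import List, Optional


def grab_by_target(target: str) -> List[str]:
    """Grab and ingest by target, e.g. "daily" or "all"."""
    return ["cherryTool", "-G", target]


def grab_service(sid: int, ingest: bool = False) -> List[str]:
    """Grab a single service by SID, optionally followed by ingesting."""
    return ["cherryTool", "-gi" if ingest else "-g", str(sid)]


def delete_files(sid: Optional[int] = None) -> List[str]:
    """Delete the files of one service, or clean the whole ingest directory."""
    cmd = ["cherryTool", "-d"]
    if sid is not None:
        cmd.append(str(sid))
    return cmd


for cmd in (grab_by_target("daily"), grab_service(42, ingest=True), delete_files()):
    print(" ".join(cmd))
```

A list built this way could be handed to subprocess.run on a host where cherryTool is installed. Note that delete_files() without an SID corresponds to the command that removes every directory inside the ingest directory.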