API
This section describes the resources available in the Scrapyd JSON API.
daemonstatus.json
Check the load status of a service.
Supported Request Methods:
GET
Example request:
curl http://localhost:6800/daemonstatus.json
If basic authentication is enabled:
curl -u yourusername:yourpassword http://localhost:6800/daemonstatus.json
Example response:
{ "status": "ok", "running": "0", "pending": "0", "finished": "0", "node_name": "node-name" }
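The counts in the example response are JSON strings rather than numbers. A minimal sketch of converting them to integers before comparing them, assuming the response has the shape shown above; the helper name parse_daemon_status is hypothetical and not part of Scrapyd:

```python
import json


def parse_daemon_status(body):
    """Parse a daemonstatus.json body and return the job counts as integers."""
    data = json.loads(body)
    # int() accepts both "0" (as in the example response) and a plain 0.
    return {key: int(data[key]) for key in ("pending", "running", "finished")}


# The example response from this section, used as sample input.
example = (
    '{ "status": "ok", "running": "0", "pending": "0", '
    '"finished": "0", "node_name": "node-name" }'
)
counts = parse_daemon_status(example)
print(counts)
```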
addversion.json
Add a version to a project, creating the project if it doesn’t exist.
Supported Request Methods:
POST
Parameters:
project
(string, required) - the project name
version
(string, required) - the project version
egg
(file, required) - a Python egg containing the project’s code
Example request:
$ curl http://localhost:6800/addversion.json -F project=myproject -F version=r23 -F egg=@myproject.egg
Example response:
{"status": "ok", "spiders": 3}
Note
Scrapyd uses the Version class from the packaging library to interpret the version numbers you provide.
The latest version for a project will be used by default whenever necessary.
schedule.json and listspiders.json allow you to explicitly set the desired project version.
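The -F option in the curl example sends the request as multipart/form-data. A rough sketch of building an equivalent request body with only the Python standard library follows. The helper name build_addversion_body and the placeholder filename project.egg are assumptions made for illustration, not part of Scrapyd.

```python
import uuid


def build_addversion_body(project, version, egg_bytes, boundary=None):
    """Return (body, content_type) for a multipart/form-data upload."""
    # A random boundary is used unless the caller supplies one.
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain text fields: project and version.
    for name, value in (("project", project), ("version", version)):
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # File field: the egg, sent as raw bytes (the filename is a placeholder).
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="egg"; filename="project.egg"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + egg_bytes
        + b"\r\n"
    )
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = build_addversion_body("myproject", "r23", b"EGG", boundary="X")
print(content_type)
```

The body could then be sent as a POST to addversion.json with urllib.request.Request, passing content_type as the Content-Type header.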
schedule.json
Schedule a spider run (also known as a job), returning the job id.
Supported Request Methods:
POST
Parameters:
project
(string, required) - the project name
spider
(string, required) - the spider name
setting
(string, optional) - a Scrapy setting to use when running the spider
jobid
(string, optional) - a job id used to identify the job, overrides the default generated UUID
priority
(float, optional) - priority for this project’s spider queue (0 by default)
_version
(string, optional) - the version of the project to use
any other parameter is passed as spider argument
Example request:
$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider
Example response:
{"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511e8247444"}
Example request passing a spider argument (arg1) and a setting (DOWNLOAD_DELAY):
$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
Note
Spiders scheduled with Scrapyd should allow for an arbitrary number of keyword arguments, as Scrapyd sends internally-generated spider arguments to the spider being scheduled.
Note
When a parameter other than setting is entered multiple times with -d, only the first value is sent to the spider.
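The second curl example above can be approximated with the Python standard library. This is a sketch only: the helper name build_schedule_request and its arguments are hypothetical. Settings are encoded as repeated setting fields of the form NAME=value, and spider arguments are taken from a dict so that each one appears once. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_schedule_request(project, spider, settings=None, spider_args=None,
                           base_url="http://localhost:6800"):
    """Build (but do not send) a POST request for schedule.json."""
    fields = [("project", project), ("spider", spider)]
    # Each Scrapy setting becomes its own "setting" field: setting=NAME=value.
    for name, value in (settings or {}).items():
        fields.append(("setting", f"{name}={value}"))
    # Any other field is passed to the spider as an argument.
    fields.extend((spider_args or {}).items())
    data = urlencode(fields).encode("ascii")
    return Request(f"{base_url}/schedule.json", data=data, method="POST")


request = build_schedule_request(
    "myproject",
    "somespider",
    settings={"DOWNLOAD_DELAY": 2},
    spider_args={"arg1": "val1"},
)
print(request.full_url, request.data)
```

Passing the request to urllib.request.urlopen would submit it to a running Scrapyd instance.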
cancel.json
New in version 0.15.
Cancel a spider run (also known as a job). If the job is pending, it will be removed. If the job is running, it will be terminated.
Supported Request Methods:
POST
Parameters:
project
(string, required) - the project name
job
(string, required) - the job id
Example request:
$ curl http://localhost:6800/cancel.json -d project=myproject -d job=6487ec79947edab326d6db28a2d86511e8247444
Example response:
{"status": "ok", "prevstate": "running"}
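A small sketch of interpreting the response, assuming prevstate reports the state the job was in before the call and can take the value "pending" as well as "running"; only "running" appears in the example above. The helper name describe_cancel is hypothetical.

```python
import json


def describe_cancel(body):
    """Describe what a cancel.json call did, based on the prevstate field."""
    prevstate = json.loads(body).get("prevstate")
    if prevstate == "pending":
        # Assumption: a pending job reports prevstate "pending".
        return "removed from the queue"
    if prevstate == "running":
        return "terminated"
    return "no action reported"


outcome = describe_cancel('{"status": "ok", "prevstate": "running"}')
print(outcome)
```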
listprojects.json
Get the list of projects uploaded to this Scrapyd server.
Supported Request Methods:
GET
Parameters: none
Example request:
$ curl http://localhost:6800/listprojects.json
Example response:
{"status": "ok", "projects": ["myproject", "otherproject"]}
listversions.json
Get the list of versions available for a project. The versions are returned in order; the last one is the version currently in use.
Supported Request Methods:
GET
Parameters:
project
(string, required) - the project name
Example request:
$ curl http://localhost:6800/listversions.json?project=myproject
Example response:
{"status": "ok", "versions": ["r99", "r156"]}
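Because the last entry is the version currently in use, it can be read directly from the end of the list. A minimal sketch, with a hypothetical helper name:

```python
import json


def current_version(body):
    """Return the last version in a listversions.json body, or None if empty."""
    versions = json.loads(body)["versions"]
    return versions[-1] if versions else None


latest = current_version('{"status": "ok", "versions": ["r99", "r156"]}')
print(latest)
```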
listspiders.json
Get the list of spiders available in the latest version of a project, unless another version is specified.
Supported Request Methods:
GET
Parameters:
project
(string, required) - the project name
_version
(string, optional) - the version of the project to examine
Example request:
$ curl http://localhost:6800/listspiders.json?project=myproject
Example response:
{"status": "ok", "spiders": ["spider1", "spider2", "spider3"]}
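A sketch of building the request URL with and without the optional _version parameter. The helper name listspiders_url is hypothetical, and the base URL matches the one used in the curl examples.

```python
from urllib.parse import urlencode


def listspiders_url(project, version=None, base_url="http://localhost:6800"):
    """Build the listspiders.json URL, adding _version only when given."""
    params = {"project": project}
    if version is not None:
        params["_version"] = version
    return f"{base_url}/listspiders.json?{urlencode(params)}"


default_url = listspiders_url("myproject")
pinned_url = listspiders_url("myproject", version="r99")
print(default_url)
print(pinned_url)
```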
listjobs.json
New in version 0.15.
Get the list of pending, running and finished jobs of some project.
Supported Request Methods:
GET
Parameters:
project
(string, optional) - restrict results to project name
Example request:
$ curl http://localhost:6800/listjobs.json?project=myproject | python -m json.tool
Example response:
{
    "status": "ok",
    "pending": [
        {
            "project": "myproject", "spider": "spider1",
            "id": "78391cc0fcaf11e1b0090800272a6d06"
        }
    ],
    "running": [
        {
            "id": "422e608f9f28cef127b3d5ef93fe9399",
            "project": "myproject", "spider": "spider2",
            "start_time": "2012-09-12 10:14:03.594664"
        }
    ],
    "finished": [
        {
            "id": "2f16646cfcaf11e1b0090800272a6d06",
            "project": "myproject", "spider": "spider3",
            "start_time": "2012-09-12 10:14:03.594664",
            "end_time": "2012-09-12 10:24:03.594664",
            "log_url": "/logs/myproject/spider3/2f16646cfcaf11e1b0090800272a6d06.log",
            "items_url": "/items/myproject/spider3/2f16646cfcaf11e1b0090800272a6d06.jl"
        }
    ]
}
Note
All job data is kept in memory by default and will be reset when the Scrapyd service is restarted. See jobstorage.
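The start_time and end_time fields of finished jobs can be used to compute how long each job ran. The sketch below assumes the timestamps always use the format shown in the example response; the helper name finished_durations and the TIME_FORMAT constant are hypothetical.

```python
import json
from datetime import datetime

# Assumed timestamp layout, taken from the example response.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def finished_durations(body):
    """Map each finished job id to its run time in seconds."""
    durations = {}
    for job in json.loads(body)["finished"]:
        start = datetime.strptime(job["start_time"], TIME_FORMAT)
        end = datetime.strptime(job["end_time"], TIME_FORMAT)
        durations[job["id"]] = (end - start).total_seconds()
    return durations


# A trimmed copy of the example response, keeping only the fields used here.
example = json.dumps({
    "status": "ok",
    "pending": [],
    "running": [],
    "finished": [
        {
            "id": "2f16646cfcaf11e1b0090800272a6d06",
            "project": "myproject",
            "spider": "spider3",
            "start_time": "2012-09-12 10:14:03.594664",
            "end_time": "2012-09-12 10:24:03.594664",
        }
    ],
})
durations = finished_durations(example)
print(durations)
```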
delversion.json
Delete a project version. If there are no more versions available for a given project, that project will be deleted too.
Supported Request Methods:
POST
Parameters:
project
(string, required) - the project name
version
(string, required) - the project version
Example request:
$ curl http://localhost:6800/delversion.json -d project=myproject -d version=r99
Example response:
{"status": "ok"}
delproject.json
Delete a project and all its uploaded versions.
Supported Request Methods:
POST
Parameters:
project
(string, required) - the project name
Example request:
$ curl http://localhost:6800/delproject.json -d project=myproject
Example response:
{"status": "ok"}