Telnet Console

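Scrapy comes with a built-in telnet console for inspecting and controlling a running Scrapy process. The telnet console is just a regular Python shell running inside the Scrapy process, so you can do anything from it.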

The telnet console is a built-in Scrapy extension. It is enabled by default, but you can disable it if you want. For more information about the extension itself see Telnet console extension.
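As a minimal sketch of disabling the console, assuming a conventional project settings module: the TELNETCONSOLE_ENABLED setting used here is not described in this section, but it is the standard Scrapy switch for this extension.

```python
# settings.py (the usual name for a Scrapy project's settings module)

# Turn off the built-in telnet console extension for this project.
# TELNETCONSOLE_ENABLED is not covered in this section; it is the standard
# Scrapy setting that controls whether the extension is loaded.
TELNETCONSOLE_ENABLED = False
```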

How to access the telnet console

The telnet console listens on the TCP port defined in the TELNETCONSOLE_PORT setting, which defaults to 6023. To access the console you need to type:

telnet localhost 6023
>>>

You need the telnet program, which ships with most Linux distros. On recent versions of Windows the Telnet Client is an optional feature that may need to be enabled first.
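If the telnet command cannot connect, it can help to first check whether anything is listening on the console port. The following is a small sketch using only the Python standard library. The function name console_is_listening is hypothetical, and the host and port defaults simply mirror the values quoted above.

```python
import socket


def console_is_listening(host="localhost", port=6023, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (for example ConnectionRefusedError or a timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("telnet console reachable:", console_is_listening())
```

This only shows that a TCP listener exists on the port. It does not confirm that the listener is a Scrapy telnet console.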

Available variables in the telnet console

The telnet console is like a regular Python shell running inside the Scrapy process, so you can do anything from it, including importing new modules.

However, the telnet console comes with some default variables defined for convenience:

Shortcut      Description
crawler       the Scrapy Crawler (scrapy.crawler.Crawler object)
engine        Crawler.engine attribute
spider        the active spider
slot          the engine slot
extensions    the Extension Manager (Crawler.extensions attribute)
stats         the Stats Collector (Crawler.stats attribute)
settings      the Scrapy settings object (Crawler.settings attribute)
est           print a report of the engine status
prefs         for memory debugging (see Debugging memory leaks)
p             a shortcut to the pprint.pprint function
hpy           for memory debugging (see Debugging memory leaks)

Telnet console usage examples

Here are some example tasks you can do with the telnet console:

View engine status

You can use the est() shortcut to quickly print the state of the Scrapy engine from the telnet console:

telnet localhost 6023
>>> est()
Execution engine status

time()-engine.start_time                        : 8.62972998619
engine.has_capacity()                           : False
len(engine.downloader.active)                   : 16
engine.scraper.is_idle()                        : False
engine.spider.name                              : followall
engine.spider_is_idle(engine.spider)            : False
engine.slot.closing                             : False
len(engine.slot.inprogress)                     : 16
len(engine.slot.scheduler.dqs or [])            : 0
len(engine.slot.scheduler.mqs)                  : 92
len(engine.scraper.slot.queue)                  : 0
len(engine.scraper.slot.active)                 : 0
engine.scraper.slot.active_size                 : 0
engine.scraper.slot.itemproc_size               : 0
engine.scraper.slot.needs_backout()             : False

Pause, resume and stop the Scrapy engine

To pause:

telnet localhost 6023
>>> engine.pause()
>>>

To resume:

telnet localhost 6023
>>> engine.unpause()
>>>

To stop:

telnet localhost 6023
>>> engine.stop()
Connection closed by foreign host.

Telnet console signals

scrapy.extensions.telnet.update_telnet_vars(telnet_vars)

Sent just before the telnet console is opened. You can hook up to this signal to add, remove or update the variables that will be available in the telnet local namespace. In order to do that, you need to update the telnet_vars dict in your handler.

Parameters: telnet_vars (dict): the dict of telnet variables
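A handler for this signal might look like the sketch below. The names add_custom_vars, TelnetVarsExtension and the greet shortcut are hypothetical and chosen only for illustration. The sketch assumes the handler is registered through crawler.signals.connect from an extension's from_crawler method.

```python
def add_custom_vars(telnet_vars):
    """Signal handler: mutate the dict that becomes the console namespace."""
    # Add a new shortcut (hypothetical name) to the console.
    telnet_vars["greet"] = lambda: "hello from the crawl"
    # Remove a default shortcut we do not want exposed, if it is present.
    telnet_vars.pop("hpy", None)


class TelnetVarsExtension:
    """Hypothetical extension that registers the handler above."""

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the handler can be exercised without Scrapy.
        from scrapy.extensions.telnet import update_telnet_vars

        ext = cls()
        crawler.signals.connect(add_custom_vars, signal=update_telnet_vars)
        return ext
```

To try it, the class would need to be listed in the project's EXTENSIONS setting under whatever module path it is placed in. After that, greet() should be callable from the telnet prompt.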

Telnet settings

These are the settings that control the telnet console’s behaviour:

TELNETCONSOLE_PORT

Default: [6023, 6073]

The port range to use for the telnet console. If set to None or 0, a dynamically assigned port is used.

TELNETCONSOLE_HOST

Default: '127.0.0.1'

The interface the telnet console should listen on.
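For illustration, the two settings above could be written out in a project settings module as follows. This is only a sketch that restates the documented defaults. The settings.py file name is the usual convention rather than something this section specifies.

```python
# settings.py (the usual name for a Scrapy project's settings module)

# Port range for the telnet console (the documented default range).
# Setting this to None or 0 requests a dynamically assigned port instead.
TELNETCONSOLE_PORT = [6023, 6073]

# Listen only on the loopback interface (the documented default), so the
# console is not reachable from other machines.
TELNETCONSOLE_HOST = "127.0.0.1"
```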