Crawling framework: php-spider
Introduction:
The easiest way to install PHP-Spider is with Composer. Find it on Packagist.
Features:
(1) supports two traversal algorithms: breadth-first and depth-first
(2) supports crawl depth limiting, queue size limiting and max downloads limiting
(3) supports adding custom URI discovery logic, based on XPath, CSS selectors, or plain old PHP
(4) comes with a useful set of URI filters, such as domain limiting
(5) supports custom URI filters, both prefetch (applied to the URI) and postfetch (applied to the resource content)
(6) supports custom request handling logic
(7) comes with a useful set of persistence handlers (memory, file; Redis soon to follow)
(8) supports custom persistence handlers
(9) collects statistics about the crawl for reporting
(10) dispatches useful events, allowing developers to add even more custom behavior
(11) supports a politeness policy
(12) will soon come with many default discoverers: RSS, Atom, RDF, etc. (not yet implemented)
(13) will soon support multiple queueing mechanisms: file, memcache, redis (not yet implemented)
(14) will eventually support distributed spidering with a central queue
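The first two features (traversal order, plus depth and queue-size limits) can be illustrated with a minimal sketch in plain PHP. It does not use the php-spider API. The link graph, the function name crawlOrder, and its parameters are all hypothetical, and the queue-size limit is assumed here to cap the total number of URIs ever enqueued, which may differ from how the library counts.

```php
<?php
// Hypothetical in-memory link graph standing in for real pages:
// each key is a "URI", each value is the list of URIs it links to.
$links = [
    'a' => ['b', 'c'],
    'b' => ['d'],
    'c' => ['e'],
    'd' => [],
    'e' => [],
];

/**
 * Return the order in which URIs are visited, starting at $start.
 *
 * $mode         'bfs' (breadth-first) or 'dfs' (depth-first)
 * $maxDepth     links found on pages at this depth are not followed
 * $maxQueueSize assumed cap on the total number of URIs ever enqueued
 */
function crawlOrder(array $links, string $start, string $mode, int $maxDepth, int $maxQueueSize): array
{
    $frontier = [[$start, 0]];   // pending [uri, depth] pairs
    $seen = [$start => true];    // URIs already enqueued
    $visited = [];
    $enqueued = 1;

    while ($frontier) {
        // Breadth-first takes the oldest entry, depth-first the newest.
        [$uri, $depth] = $mode === 'bfs' ? array_shift($frontier) : array_pop($frontier);
        $visited[] = $uri;

        if ($depth >= $maxDepth) {
            continue; // depth limit reached: do not follow links from here
        }
        foreach ($links[$uri] ?? [] as $next) {
            if (isset($seen[$next]) || $enqueued >= $maxQueueSize) {
                continue; // already queued, or queue-size limit reached
            }
            $seen[$next] = true;
            $frontier[] = [$next, $depth + 1];
            $enqueued++;
        }
    }

    return $visited;
}

echo implode(',', crawlOrder($links, 'a', 'bfs', 2, 100)), "\n"; // a,b,c,d,e
echo implode(',', crawlOrder($links, 'a', 'dfs', 2, 100)), "\n"; // a,c,e,b,d
echo implode(',', crawlOrder($links, 'a', 'bfs', 1, 100)), "\n"; // a,b,c
```

The only difference between the two traversal modes in this sketch is which end of the pending list the next URI is taken from. The limits simply stop new URIs from being added.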
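Usage tutorial:
Installing on Windows requires a working Composer environment.
(1) Download: get https://github.com/mvdbos/php-spider onto your machine and place it in the htdocs directory of your XAMPP installation.
(2) Enter the project directory and run composer update to install the dependencies.
Then open the example in a browser: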
http://localhost/php-spider-master/example/example_simple.php
Usage tips: the translations above are not literal; corrections are welcome.
Demo: test
Modify: file
Common methods: