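Acta Scientiarum Naturalium Universitatis Sunyatseni (Journal of Sun Yat-sen University, Natural Science Edition) ›› 2018, Vol. 57 ›› Issue (1): 76-82.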


Random forest-based distribution model of street vendors

DAI Ruoying1, DANG Xuewei1, FENG Zhao1, LI Haixia2, LIU Lin1,3

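  1. School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, Guangdong, China;
    2. Lingnan College, Sun Yat-sen University, Guangzhou 510275, Guangdong, China;
    3. Department of Geography, University of Cincinnati, Cincinnati, Ohio 45221-0131, USA
  • Received: 2016-08-27 Online: 2018-01-25 Published: 2018-01-25
  • Corresponding author: LIU Lin (born 1965), male; research interests: geographic information and public safety; E-mail: lin.liu@uc.edu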

Randomforestbased street vendors distribution model: A case study of Haizhu District

DAI Ruoying1, DANG Xuewei1, FENG Zhao1, LI Haixia2, LIU Lin1,3

  1. School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China;
    2. Lingnan (University) College, Sun Yat-sen University, Guangzhou 510275, China;
    3. Department of Geography, University of Cincinnati, Cincinnati 45221-0131, USA
  • Received: 2016-08-27 Online: 2018-01-25 Published: 2018-01-25

Abstract (Chinese version):

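The clustering of street vendors is a persistent problem in urban management in China. Existing studies mostly address the mechanisms of vendor formation and management and present empirical analyses of particular streets; quantitative models of vendor distribution at the district scale are lacking. Vendor formation depends on many factors whose relationships are complex. The random forest algorithm chosen for modeling builds classifiers suited to features with complex relationships, does not depend on any single highly influential factor, and can to some extent avoid the problems that noise, outliers, and overfitting commonly cause in other classification algorithms. It yields a high-accuracy prediction model and also gives the contribution of each factor, providing a basis for further research on street vendors. Based on the literature and field surveys, factors such as housing price, street segment length, number of intersections and forks, and number of bus routes inside and outside the segment were selected as model features. Taking Haizhu District, Guangzhou, as an example, vendor counts were divided into four grades: grade 1 for 0 vendors, grade 2 for 1~10 vendors, grade 3 for 11~20 vendors, and grade 4 for more than 20 vendors. Through model training and parameter tuning, with the kappa coefficient and overall accuracy as criteria, the best-performing prediction model was selected. The random forest-based distribution model predicts the spatial locations of street vendors, reveals to some extent the patterns of their formation and distribution, can be applied to urban planning and management, and supports related research on street vendors.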
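The four-grade scheme described in the abstract can be written as a small mapping from a vendor count to a class label. The following is only a minimal sketch: the function name `vendor_grade` and the sample counts are hypothetical and do not come from the paper, and it assumes counts are non-negative integers per street segment.

```python
def vendor_grade(count: int) -> int:
    """Map a street segment's vendor count to one of the four grades.

    Grade boundaries follow the abstract; the function itself is an
    illustrative assumption, not the authors' code.
    """
    if count < 0:
        raise ValueError("vendor count cannot be negative")
    if count == 0:
        return 1  # grade 1: no vendors
    if count <= 10:
        return 2  # grade 2: 1~10 vendors
    if count <= 20:
        return 3  # grade 3: 11~20 vendors
    return 4      # grade 4: more than 20 vendors


# Hypothetical vendor counts for seven street segments.
counts = [0, 3, 10, 11, 20, 21, 57]
grades = [vendor_grade(c) for c in counts]
print(grades)  # [1, 2, 2, 3, 3, 4, 4]
```

Turning the counts into ordered grades makes the task a four-class classification problem, which is the form a random forest classifier would be trained on.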

Keywords (Chinese version): random forest, street vendors, Guangzhou

Abstract:

Street vending is a challenge for city management in China. Street vendors provide convenience for citizens on the one hand but, to some extent, disturb the daily order of cities on the other. Most studies of street vendors focus on the mechanisms of vendor aggregation and its management, and on vendors in particular streets; quantitative modeling of vendor distribution is rare. The aggregation of street vendors is determined by a number of factors with complex relationships to one another, so the random forest algorithm, which is suited to building classifiers on features with complex relationships, was chosen. Because the algorithm does not rely on any single critical factor, problems that frequently occur in other algorithms, such as those caused by outliers, noise, and overfitting, can be avoided to some extent. The algorithm also provides the importance of each vendor-related factor, which supports further analysis. Based on the literature and field investigation, average house price per square meter, street segment length, number of forks, and number of bus lines were chosen as features. Haizhu District of Guangzhou was taken as the study area. Vendor counts were divided into four grades: no vendors as grade 1, 1~10 vendors as grade 2, 11~20 vendors as grade 3, and 21 or more vendors as grade 4. With the kappa coefficient and overall accuracy as criteria, the optimal prediction model was obtained through training and parameter adjustment. The proposed model, which predicts where street vendors appear, not only facilitates city planning and management but also sheds new light on related research.
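The two selection criteria named in the abstract, overall accuracy and the kappa coefficient, can both be computed from paired observed and predicted grades. The sketch below assumes the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e); the function names and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def overall_accuracy(observed, predicted):
    """Fraction of street segments whose predicted grade equals the observed grade."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)


def cohen_kappa(observed, predicted):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(observed)
    p_o = overall_accuracy(observed, predicted)  # observed agreement
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Expected chance agreement from the marginal class frequencies.
    p_e = sum(obs_counts[k] * pred_counts[k] for k in obs_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one class is present")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical observed and predicted grades for eight street segments.
observed = [1, 1, 2, 2, 3, 3, 4, 4]
predicted = [1, 1, 2, 3, 3, 3, 4, 1]
accuracy = overall_accuracy(observed, predicted)  # 0.75
kappa = cohen_kappa(observed, predicted)          # about 0.667
print(accuracy, round(kappa, 3))
```

Kappa is a useful companion to overall accuracy when the grades are unbalanced, because a model that always predicts the most common grade can score a high accuracy while its kappa stays near zero.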

Key words: random forest, street vendor, Guangzhou

CLC number: