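Design and Implementation of a Crawler-Based System for Acquiring and Presenting Merchant Information from Food-Delivery Platforms (Graduation Thesis)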

 2022-09-18 17:19:28

Total length of the thesis: 43,871 characters

Abstract (Chinese)

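With rising living standards and the development of Internet technology, food-delivery platforms such as Ele.me, Meituan, and Baidu Waimai play a pivotal role in daily life, especially in the daily meals of busy office workers. However, while these online platforms bring convenience to the public, they also create considerable difficulties for the relevant regulatory authorities and add to the burden of inspection work. To improve this situation, this system draws on the crawler technology that has matured in the big-data era and uses the Python programming language to develop a crawler system that acquires, aggregates, and analyzes merchant information on food-delivery platforms. The aim is to reduce the workload of the regulatory authorities and to help consumers on these platforms choose merchants in a more directed and purposeful way, whether ordering online or dining in.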

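The system is developed in Python, currently one of the most popular programming languages, together with the NoSQL database Redis. The main requirements are to model the national map, to plan and carry out targeted crawling of merchant information on the food-delivery platform according to latitude and longitude coordinates, to use the JSON data-interchange format to convert the character encoding of the crawled objects before storing them, and finally to use Python data visualization to chart the spatial distribution of merchants on the platform in one provincial capital (Nanjing). This thesis covers the following aspects: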

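1) The region containing Nanjing is delimited by its upper-left and lower-right corners and divided into blocks; taking the latitude and longitude of each block's center point yields the full set of coordinates for the delimited region.

2) Python and the Redis database are used to write a web crawler that acquires and presents merchant information from the Ele.me food-delivery platform according to latitude and longitude coordinates. The crawler follows a breadth-first strategy and performs both targeted and periodic crawls of the platform's merchant information.

3) A query_redis() method operating on the Redis database extracts data fields from the crawled merchant information: the merchant's ID, name, address, contact phone number, latitude and longitude, and other fields are parsed out and stored in the Redis database.

4) The data fields in the Redis database are analyzed, the relevant latitude and longitude coordinates are extracted, and a Python scatter plot is used to chart the spatial distribution of the crawled merchants.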
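
The block-and-center-point partition described in item 1) can be sketched as follows. This is a minimal illustration rather than the thesis's code: the function name `grid_centers`, the square cell size given in degrees, and the example bounding box are all assumptions made for the sketch, and the box is not the actual extent the thesis uses for Nanjing.

```python
import math
from typing import Iterator, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def grid_centers(top_left: Point, bottom_right: Point,
                 step_deg: float) -> Iterator[Point]:
    """Yield the center of every square cell tiling the bounding box.

    Cells are visited row by row, from the upper-left corner toward the
    lower-right corner. The cell size `step_deg` is a hypothetical parameter.
    """
    lat_top, lon_left = top_left
    lat_bottom, lon_right = bottom_right
    if step_deg <= 0 or lat_top <= lat_bottom or lon_right <= lon_left:
        raise ValueError("expected an upper-left/lower-right box and step > 0")

    # Round up so that the cells cover the whole box, even if the last
    # row or column extends slightly past the lower-right corner.
    n_rows = math.ceil((lat_top - lat_bottom) / step_deg)
    n_cols = math.ceil((lon_right - lon_left) / step_deg)

    for row in range(n_rows):
        for col in range(n_cols):
            lat = lat_top - (row + 0.5) * step_deg
            lon = lon_left + (col + 0.5) * step_deg
            yield round(lat, 6), round(lon, 6)


if __name__ == "__main__":
    # Illustrative box only; not the thesis's coordinates for Nanjing.
    for center in grid_centers((32.5, 118.5), (32.0, 119.0), 0.25):
        print(center)
```

Each yielded coordinate pair would then serve as one query location for the crawler. A smaller `step_deg` gives denser coverage at the cost of more requests.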

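Beyond the parts above, the implementation also has to consider: 1. how to make the crawler acquire merchant information flexibly and efficiently; 2. how to determine topic relevance; 3. how to handle requests and caching during development; 4. whether to use concurrent processes; 5. how to choose the storage method for the crawled data; 6. how to consolidate the crawled data and extract the useful information from it.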

Finally, the developed system is tested for functionality and performance to ensure that users can operate it normally.

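With this crawler system, regulatory staff can supervise merchants on the platform and analyze the related data in real time, and consumers can make better-guided choices both online and offline. The system is of real value for gaining a comprehensive, up-to-date picture of merchant information on the platform.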

Keywords: food-delivery platform; crawler technology; Python; Redis; merchant information

Abstract

With the improvement of living standards and the development of Internet technology, Ele.me, Meituan, Baidu Waimai, and other food-delivery platforms play a decisive role in people's lives, especially in the daily meals of busy office workers. However, while such online platforms bring convenience to the public, they also make management harder for the relevant regulatory departments and increase the burden of supervision on the staff concerned. To improve this situation, this system builds on the crawler technology of the big-data era and uses the Python programming language to develop a crawler system that acquires, aggregates, and analyzes merchant information on food-delivery platforms. It is intended to reduce the workload of the regulatory departments and to help consumers on these platforms make directed, purposeful choices of merchants for ordering online or dining in.

The system is built with Python, one of today's most popular programming languages, and the NoSQL database Redis. The main requirements are to model the national map, to plan and perform targeted crawling of merchant information on the food-delivery platform according to latitude and longitude coordinates, to convert the character encoding of the crawled objects with the help of the JSON data-interchange format and then store them, and finally to use Python data visualization to chart the spatial distribution of merchants on the platform in a provincial capital city (Nanjing). This thesis mainly discusses the following aspects:

1) The area where Nanjing is located is delimited by its upper-left and lower-right corners and cut into blocks; taking the center-point latitude and longitude of each block gives all the coordinates of the delimited area.

2) Python and the Redis database are used to write a web crawler that acquires and presents merchant information from the Ele.me food-delivery platform based on latitude and longitude coordinates. The crawler uses a breadth-first strategy to carry out targeted and periodic crawling of merchant information on the platform.

3) A query_redis() method operating on the Redis database extracts the data fields of the crawled merchant information; the merchant's ID, name, address, contact telephone number, latitude and longitude, and other fields are parsed and stored in the Redis database.

4) The data fields in the Redis database are analyzed, the relevant latitude and longitude coordinates are extracted, and a Python scatter plot is used to chart the spatial distribution of the crawled merchants.
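
As an illustration of items 2) and 3), the sketch below visits a set of grid cells in breadth-first order, keeps only the listed merchant fields from each JSON response, and stores one serialized record per merchant ID. It is not the thesis's implementation: `crawl_breadth_first`, `extract_fields`, the JSON key names, and the caller-supplied `fetch` and `neighbors` functions are all hypothetical, and a plain dict stands in for the Redis database so that the example runs without a server or network access.

```python
import json
from collections import deque
from typing import Callable, Dict, Hashable, Iterable

# Hypothetical key names for the fields the abstract lists.
FIELDS = ("id", "name", "address", "phone", "latitude", "longitude")


def extract_fields(record: dict) -> dict:
    """Keep only the listed fields; missing ones become None."""
    return {key: record.get(key) for key in FIELDS}


def crawl_breadth_first(
    start: Hashable,
    neighbors: Callable[[Hashable], Iterable[Hashable]],
    fetch: Callable[[Hashable], str],
) -> Dict[str, str]:
    """Visit cells breadth-first and store one JSON string per merchant ID.

    `neighbors(cell)` returns the adjacent cells, and `fetch(cell)` returns a
    JSON array of merchant records as text. Both are assumed stand-ins for
    the real grid logic and HTTP request.
    """
    store: Dict[str, str] = {}   # in-memory stand-in for the Redis database
    visited = {start}
    queue = deque([start])       # FIFO queue gives breadth-first order

    while queue:
        cell = queue.popleft()
        for record in json.loads(fetch(cell)):
            merchant = extract_fields(record)
            # ensure_ascii=False keeps Chinese text readable when stored.
            # A merchant seen again overwrites its earlier entry.
            store[str(merchant["id"])] = json.dumps(merchant, ensure_ascii=False)
        for nxt in neighbors(cell):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return store
```

Keying the store by merchant ID means that a merchant returned for several neighboring cells is kept only once. With a real Redis client the dict assignment would be replaced by a write to the database.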

In addition to the parts above, the implementation should also take the following into consideration: 1. how to make the crawler obtain merchant information in a flexible and efficient way; 2. how to determine topic relevance; 3. how to handle requests and caching during development; 4. whether to use concurrent processes; 5. how to choose the way the crawled data is stored; 6. how to integrate the crawled data and extract the useful information.

The last part is to test the functionality and performance of the developed system to ensure that users can operate it normally.

By using this merchant information crawler system, the staff of the regulatory departments can supervise merchants on the platform and analyze the related data in real time; at the same time, the system helps consumers make guided choices online and offline. The system is of great significance for a comprehensive and real-time understanding of merchant information on the platform.

Keywords: food-delivery platform; crawler technology; Python; Redis; merchant information

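Contents

Abstract (Chinese)

Abstract

Contents

Chapter 1 Introduction

1.1 System Design Overview

1.1.1 Background

1.1.2 Research Significance

1.1.3 Current State of Research

1.2 Development Technology and Database

1.2.1 Development Technology

1.2.2 Analysis of Redis Database Technology

1.2.3 Diagramming Tool: Microsoft Visio

1.2.4 Runtime Environment

1.3 Main Work

1.4 Thesis Organization

1.5 Chapter Summary

Chapter 2 System Requirements Analysis

2.1 System Feasibility Analysis

2.1.1 Technical Feasibility

2.1.2 Economic Feasibility

2.1.3 Operational Feasibility

2.2 Functional Requirements Analysis

2.2.1 Purpose of System Development

2.2.2 Development Requirements

2.2.3 Main Functions of the System

2.2.4 System E-R Diagram

2.3 Analysis of the System's Static Model

2.3.1 Class Descriptions

2.3.2 Building the Class Diagram

2.4 Chapter Summary

Chapter 3 Outline Design and Analysis of the System

3.1 Database Design

3.1.1 Introduction to the Redis Database

3.1.2 Database Design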
