Design of Distributed Storage System for University Public Resources Based on Hadoop (Graduation Thesis)

 2022-10-16 11:10

Total thesis length: 32,076 characters

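Abstract (Chinese)

The rapid development of information technology has brought major changes to socio-economic structures and modes of production, and has profoundly affected every aspect of life. In particular, people's reliance on computers has grown over the past decade, and as data volumes keep increasing, big data analysis has become extremely important. On the one hand, with the spread of the mobile Internet and the surge in wearable devices, the volume of data is growing by about 40% per year; at the same time, data types are becoming more diverse and complex, producing ever more structured and unstructured data. On the other hand, powerful and intelligent real-time data interaction is required.

To cope with increasingly complex data, this thesis proposes storing large volumes of data on Hadoop: several virtual machines are built into a cluster, Hadoop serves as the base framework, and the system is implemented on the storage space of the whole cluster. This distributed strategy effectively avoids the single point of failure inherent in centralized storage. The thesis first analyzes and introduces the theory and framework principles of distributed storage systems and their key technologies. It then designs and implements a distributed storage system based on this approach, runs file data access functions once the cluster is shown to be workable, and finally verifies the system's performance.

Keywords: big data; Hadoop; HDFS; distributed file storage system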

Design of Distributed Storage System for University Public Resources Based on Hadoop

Abstract

The rapid development of information technology has led to major changes in socio-economic structures and modes of production, and has profoundly affected all aspects of life. In particular, people's reliance on computers has increased over the past decade, and as data volumes continue to grow, big data analytics has become extremely important. On the one hand, due to the popularity of the mobile Internet and the proliferation of wearable devices, the volume of data is growing at a rate of about 40% per year; at the same time, data types are becoming more diverse and complex, leading to the emergence of more structured and unstructured data. On the other hand, powerful and intelligent real-time data interaction is required.

To cope with the increasing complexity of data, this thesis proposes storing large volumes of data on Hadoop. Multiple virtual machines are built into a cluster, Hadoop serves as the basic framework platform, and the system is realized using the storage space of the entire cluster. This distributed strategy effectively avoids the single point of failure of centralized storage. First, the theory and framework principles of distributed storage systems and their key technologies are analyzed and introduced. Then, based on this distributed approach, a distributed storage system is designed and implemented. File data access functions are run once the cluster is shown to be workable, and finally the performance of the system is verified.

Keywords: Big data, Hadoop, HDFS, Distributed file storage system

Contents

Abstract (Chinese)

Abstract

Chapter 1 Introduction

1.1 Research Background

1.2 Types of Distributed File Systems

1.2.1 GFS

1.2.2 Blue Whale Distributed File System

1.2.3 FastDFS

1.3 Current State of Research

1.4 Research Objectives and Main Tasks

1.5 Thesis Organization

Chapter 2 A Brief Introduction to Distributed File Data Processing Systems

2.1 Overview of Distributed File Data Processing Systems

2.2 Advantages of Distributed File Data Processing Systems

2.3 Key Technologies of Distributed File Storage Systems

2.4 Chapter Summary

Chapter 3 Study of the Core Hadoop Technical Architecture

3.1 History of Hadoop

3.2 HDFS (Hadoop Distributed File System) Mechanisms

3.2.1 Namenode and Datanode

3.2.2 File System Naming

3.2.3 Communication Protocols

3.2.4 Robustness

3.3 Chapter Summary

Chapter 4 Platform Setup and Verification

4.1 Installing the Ubuntu Linux Operating System

4.1.1 Cloning the Virtual Machines

4.1.2 Changing the Ubuntu System Name

4.1.3 Configuring Network Connectivity Between Virtual Machines

4.2 Passwordless SSH Login

4.2.1 Generating the Keys

4.2.2 Sending the Private Key (Local Machine)

4.2.3 Sending the Public Key (Other Machines)

4.2.4 Testing Passwordless Login

4.3 Installing the JDK

4.3.1 Downloading the JDK

4.3.2 Extracting the Archive

4.3.3 Setting Environment Variables

4.3.4 Verifying the JDK

4.4 Installing Hadoop

4.4.1 Downloading Hadoop

4.4.2 Extracting the Archive

4.4.3 Setting the JAVA_HOME Environment Variable

4.4.4 Setting the Hadoop Installation Directory Environment Variable

4.4.5 Verifying Hadoop
