Spring Hadoop Setup Example: Configuring Spring

When you first start trying out Spring Hadoop, you will run into all sorts of strange problems, and people have already begun reporting them.

If you just want to give it a quick try without chasing down those odd issues yourself, you can follow the steps below to get a quick feel for what Spring Hadoop can do.

Spring Hadoop Quick Start

Step 1. Download Spring Hadoop. Here we use git; if you are not familiar with git, you can also download an archive from the official site and unpack it.

The examples below use my home directory; remember to change the paths to your own directory.

/home/evanshsu mkdir springhadoop

/home/evanshsu cd springhadoop

/home/evanshsu/springhadoop git init

/home/evanshsu/springhadoop git pull "git://github.com/SpringSource/spring-hadoop.git"

Step 2. Build spring-hadoop.jar.

After the build, we put all the jar files into /home/evanshsu/springhadoop/lib so that we can later package them all into a single jar.

/home/evanshsu/springhadoop ./gradlew jar

/home/evanshsu/springhadoop mkdir lib

/home/evanshsu/springhadoop cp build/libs/spring-data-hadoop-1.0.0.BUILD-SNAPSHOT.jar lib/

Step 3. Get spring-framework.

Since Spring Hadoop depends on spring-framework, we also need to put the spring-framework jars into lib.

/home/evanshsu/spring wget "http://s3.amazonaws.com/dist.springframework.org/release/SPR/spring-framework-3.1.1.RELEASE.zip"

/home/evanshsu/spring unzip spring-framework-3.1.1.RELEASE.zip

/home/evanshsu/spring cp spring-framework-3.1.1.RELEASE/dist/*.jar /home/evanshsu/springhadoop/lib/

Step 4. Modify the build file so that all the jars can be packaged into a single jar.

/home/evanshsu/spring/samples/wordcount vim build.gradle

[code]
description = 'Spring Hadoop Samples - WordCount'

apply plugin: 'base'
apply plugin: 'java'
apply plugin: 'idea'
apply plugin: 'eclipse'

repositories {
    flatDir(dirs: '/home/evanshsu/springhadoop/lib/')
    // Public Spring artefacts
    maven { url "http://repo.springsource.org/libs-release" }
    maven { url "http://repo.springsource.org/libs-milestone" }
    maven { url "http://repo.springsource.org/libs-snapshot" }
}

dependencies {
    compile fileTree('/home/evanshsu/springhadoop/lib/')
    compile "org.apache.hadoop:hadoop-examples:$hadoopVersion"
    // see HADOOP-7461
    runtime "org.codehaus.jackson:jackson-mapper-asl:$jacksonVersion"

    testCompile "junit:junit:$junitVersion"
    testCompile "org.springframework:spring-test:$springVersion"
}

jar {
    from configurations.compile.collect { it.isDirectory() ? it : zipTree(it).matching {
        exclude 'META-INF/spring.schemas'
        exclude 'META-INF/spring.handlers'
    } }
}
[/code]
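Why does the jar task exclude META-INF/spring.schemas and META-INF/spring.handlers? Each Spring jar ships its own copy of these files, and they are plain java.util.Properties files; when zipTree merges several jars into one archive, only one copy can survive at that path, silently dropping the other jars' namespace mappings. That is why Step 8 below adds hand-merged copies. A small plain-Java sketch of the difference (the class name and file contents here are hypothetical, not part of the sample):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class SchemaMerge {
    // Each jar's META-INF/spring.schemas is an ordinary Properties file.
    static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // StringReader never actually throws
        }
        return p;
    }

    public static void main(String[] args) {
        // Hypothetical contents of two different jars' spring.schemas files.
        String fromContextJar =
            "http\\://www.springframework.org/schema/context/spring-context.xsd=org/springframework/context/config/spring-context-3.1.xsd";
        String fromHadoopJar =
            "http\\://www.springframework.org/schema/hadoop/spring-hadoop.xsd=org/springframework/data/hadoop/config/spring-hadoop-1.0.xsd";

        // Naive jar merging keeps only one of the files: the other jar's mapping vanishes.
        Properties shadowed = load(fromHadoopJar);
        System.out.println(shadowed.size()); // 1

        // Hand-merging (what Step 8 does by writing one combined file) keeps both.
        Properties merged = load(fromContextJar + "\n" + fromHadoopJar);
        System.out.println(merged.size()); // 2
    }
}
```

Running it shows one surviving mapping in the shadowed table and both in the merged one.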

Step 5. A dedicated hadoop.properties file holds the Hadoop-related settings.

Basically, change wordcount.input.path and wordcount.output.path to the directories the wordcount run should use, and remember to put a few text files under wordcount.input.path.

Also, change hd.fs to match your HDFS setup.

If you are using the NCHC grid center's Hadoop, set hd.fs=hdfs://gm2.nchc.org.tw:8020

/home/evanshsu/spring/samples/wordcount vim src/main/resources/hadoop.properties

[code]
wordcount.input.path=/user/evanshsu/input.txt
wordcount.output.path=/user/evanshsu/output

hive.host=localhost
hive.port=12345
hive.url=jdbc:hive://${hive.host}:${hive.port}
hd.fs=hdfs://localhost:9000
mapred.job.tracker=localhost:9001

path.cat=bin${file.separator}stream-bin${file.separator}cat
path.wc=bin${file.separator}stream-bin${file.separator}wc

input.directory=logs
log.input=/logs/input/
log.output=/logs/output/

distcp.src=${hd.fs}/distcp/source.txt
distcp.dst=${hd.fs}/distcp/dst
[/code]
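Notice that hive.url, distcp.src, and distcp.dst are built out of ${...} references to other keys in the same file; Spring's property-placeholder support resolves these before the values are handed to Hadoop. A minimal plain-Java sketch of that substitution (the class name is hypothetical, and this is not Spring's actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderDemo {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Substitute each ${key} from props; recurse in case a substituted
    // value itself contains placeholders (as distcp.src does via hd.fs).
    static String resolve(String value, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer sb = new StringBuffer();
        boolean replaced = false;
        while (m.find()) {
            String repl = props.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(sb, Matcher.quoteReplacement(repl));
            replaced = true;
        }
        m.appendTail(sb);
        String out = sb.toString();
        return (replaced && !out.equals(value)) ? resolve(out, props) : out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hive.host", "localhost");
        props.put("hive.port", "12345");
        props.put("hd.fs", "hdfs://localhost:9000");

        System.out.println(resolve("jdbc:hive://${hive.host}:${hive.port}", props));
        // jdbc:hive://localhost:12345
        System.out.println(resolve("${hd.fs}/distcp/source.txt", props));
        // hdfs://localhost:9000/distcp/source.txt
    }
}
```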

Step 6. This is the most important configuration file; anyone who has used Spring knows that this file is the soul of a Spring application.

/home/evanshsu/spring/samples/wordcount vim src/main/resources/META-INF/spring/context.xml

[code]
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:hdp="http://www.springframework.org/schema/hadoop"
    xmlns:p="http://www.springframework.org/schema/p"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
        http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <context:property-placeholder location="hadoop.properties"/>

    <hdp:configuration>
        fs.default.name=${hd.fs}
    </hdp:configuration>

    <hdp:job id="wordcount-job"
        input-path="${wordcount.input.path}" output-path="${wordcount.output.path}"
        mapper="org.springframework.data.hadoop.samples.wordcount.WordCountMapper"
        reducer="org.springframework.data.hadoop.samples.wordcount.WordCountReducer"
        jar-by-class="org.springframework.data.hadoop.samples.wordcount.WordCountMapper"/>

    <bean id="runner" class="org.springframework.data.hadoop.mapreduce.JobRunner"
        p:jobs-ref="wordcount-job"/>
</beans>
[/code]

Step 7. Add your own mapper and reducer.

/home/evanshsu/spring/samples/wordcount vim src/main/java/org/springframework/data/hadoop/samples/wordcount/WordCountMapper.java

[code]
package org.springframework.data.hadoop.samples.wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
[/code]

/home/evanshsu/spring/samples/wordcount vim src/main/java/org/springframework/data/hadoop/samples/wordcount/WordCountReducer.java

[code]
package org.springframework.data.hadoop.samples.wordcount;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
[/code]
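Taken together, the mapper emits (word, 1) for every whitespace-separated token and the reducer sums those ones per word. Stripped of the Hadoop machinery, the computation is just a map of counters; a plain-Java sketch (class name hypothetical) of what the two classes compute on a single string:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {
    // Same tokenize-then-sum logic as WordCountMapper + WordCountReducer,
    // without InputFormats, shuffling, or Writables.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(text); // same tokenizer as the mapper
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum); // the reducer's sum step
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("spring hadoop spring"));
        // {hadoop=1, spring=2}
    }
}
```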

Step 8. Add spring.schemas and spring.handlers. Because the jar task in Step 4 excluded each jar's own copy of these files, we supply merged versions that map every XML namespace we use to its schema file and namespace handler.

/home/evanshsu/spring/samples/wordcount vim src/main/resources/META-INF/spring.schemas

[code]
http\://www.springframework.org/schema/context/spring-context.xsd=org/springframework/context/config/spring-context-3.1.xsd
http\://www.springframework.org/schema/hadoop/spring-hadoop.xsd=/org/springframework/data/hadoop/config/spring-hadoop-1.0.xsd
[/code]

/home/evanshsu/spring/samples/wordcount vim src/main/resources/META-INF/spring.handlers

[code]
http\://www.springframework.org/schema/p=org.springframework.beans.factory.xml.SimplePropertyNamespaceHandler
http\://www.springframework.org/schema/context=org.springframework.context.config.ContextNamespaceHandler
http\://www.springframework.org/schema/hadoop=org.springframework.data.hadoop.config.HadoopNamespaceHandler
[/code]
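The backslash before each colon is not decoration: spring.schemas and spring.handlers are loaded as java.util.Properties files, and in that format an unescaped ':' terminates the key, so without the escape the key would be truncated to just "http". A small sketch (class name and URLs hypothetical) showing both behaviors:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapedColonDemo {
    static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // StringReader never actually throws
        }
        return p;
    }

    public static void main(String[] args) {
        // Unescaped ':' acts as the key/value separator, truncating the key to "http".
        Properties broken = load("http://example.org/a.xsd=mapped");
        System.out.println(broken.getProperty("http")); // //example.org/a.xsd=mapped

        // Escaping the colon keeps the full URL as the key.
        Properties ok = load("http\\://example.org/a.xsd=mapped");
        System.out.println(ok.getProperty("http://example.org/a.xsd")); // mapped
    }
}
```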

Step 9. Finally, the last step: here we package all the jars together and submit the result to Hadoop.

/home/evanshsu/spring/samples/wordcount ../../gradlew jar

/home/evanshsu/spring/samples/wordcount hadoop jar build/libs/wordcount-1.0.0.M1.jar org.springframework.data.hadoop.samples.wordcount.Main

Step 10. Finally, check whether the results came out.

/home/evanshsu/spring/samples/wordcount hadoop fs -cat /user/evanshsu/output/*


• Author: Internet · Source: compiled by this site · Published: 2019-11-21 14:21:23

