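How to fix the irregular behavior of a string value set by a mapper's setup method in MapReduce

I am new to MapReduce and am learning how to implement the setup method. The new string value supplied through the configuration prints correctly, but when I try to process it further, the string's initial value is used instead. I know strings are immutable, but the variable should still give the other methods the value it currently points to.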
public class EMapper extends Mapper<LongWritable, Text, Text, Text> {
    String wordstring = "abcd";
    // initialized wordstring with "abcd"

    public void setup(Context context) {
        Configuration config = new Configuration(context.getConfiguration());
        wordstring = config.get("mapper.word");
        // As string is immutable, wordstring should now point to
        // the value given by mapper.word.
        // Here mapper.word="ankit" by using -D in the hadoop command
    }

    String def = wordstring;
    String jkl = String.valueOf(wordstring);
    // tried to copy the current value,
    // but string jkl prints the initial value.

    public void map(LongWritable key, Text value, Context context)
            throws InterruptedException, IOException {
        context.write(new Text("wordstring=" + wordstring + "" + "def=" + def),
                new Text("jkl=" + jkl));
    }
}

public class EDriver extends Configured implements Tool {
    private static Logger logger = LoggerFactory.getLogger(EDriver.class);

    public static void main(String[] args) throws Exception {
        logger.info("Driver started");
        int res = ToolRunner.run(new Configuration(), new EDriver(), args);
        System.exit(res);
    }

    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.printf("Usage: %s needs arguments",
                    getClass().getSimpleName());
            return -1;
        }
        Configuration conf = getConf();
        Job job = new Job(conf);
        job.setJarByClass(EDriver.class);
        job.setJobName("E Record Reader");
        job.setMapperClass(EMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setReducerClass(EReducer.class);
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setInputFormatClass(ExcelInputFormat.class);
        return job.waitForCompletion(true) ? 0 : 1;
    }
}
The output I expect is
wordstring=ankitdef=ankitjkl=ankit
The actual output is
wordstring=ankitdef=abcdjkl=abcd
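Answer: This has nothing to do with string mutability; it is about the order in which the code executes.
The setup method is called only after all class-level statements (the field initializers) have run. The order in which you write them in the source file changes nothing. If you rewrite the top of your code in the order it actually executes, you get:

public class EMapper extends Mapper<LongWritable, Text, Text, Text> {
    String wordstring = "abcd";
    String jkl = String.valueOf(wordstring);

    public void setup(Context context) {
        Configuration config = new Configuration(context.getConfiguration());
        wordstring = config.get("mapper.word");
        // By the time this is called, jkl has already been assigned "abcd"
    }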
So it is no surprise that jkl is still abcd. You should assign jkl inside the setup method, like this:

public class EMapper extends Mapper<LongWritable, Text, Text, Text> {
    String wordstring;
    String jkl;

    public void setup(Context context) {
        Configuration config = new Configuration(context.getConfiguration());
        wordstring = config.get("mapper.word");
        jkl = wordstring;
        // Here, jkl and wordstring are two different variables, both pointing to "ankit"
    }

    // Here (at class level), jkl and wordstring are null, as setup(Context context) has not yet run

    public void map(LongWritable key, Text value, Context context)
            throws InterruptedException, IOException {
        // Here, jkl and wordstring are two different variables, both pointing to "ankit"
        context.write(new Text("wordstring=" + wordstring),
                new Text("jkl=" + jkl));
    }
}
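The ordering described in this answer can be reproduced without a cluster. The following is a minimal sketch in plain Java, not Hadoop code. MiniMapper is a hypothetical stand-in whose run() method calls setup() once and then map() for each record, which is the same sequence Hadoop's Mapper.run() follows on a single mapper instance. The Map-based configuration and the names InitOrderDemo, BrokenMapper, FixedMapper, and runOnce are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class InitOrderDemo {

    /** Hypothetical stand-in for a mapper: run() calls setup() once, then map() per record. */
    static abstract class MiniMapper {
        final List<String> output = new ArrayList<>();

        void setup(Map<String, String> config) { }

        abstract void map(String record);

        void run(Map<String, String> config, List<String> records) {
            setup(config);
            for (String record : records) {
                map(record);
            }
        }
    }

    /** Mirrors the question: jkl is copied by a field initializer. */
    static class BrokenMapper extends MiniMapper {
        String wordstring = "abcd";
        // Field initializers run when the object is constructed, before setup() is ever called.
        String jkl = String.valueOf(wordstring);

        @Override
        void setup(Map<String, String> config) {
            wordstring = config.get("mapper.word");
        }

        @Override
        void map(String record) {
            output.add("wordstring=" + wordstring + " jkl=" + jkl);
        }
    }

    /** Mirrors the fix: jkl is assigned inside setup(). */
    static class FixedMapper extends MiniMapper {
        String wordstring;
        String jkl;

        @Override
        void setup(Map<String, String> config) {
            wordstring = config.get("mapper.word");
            jkl = wordstring;
        }

        @Override
        void map(String record) {
            output.add("wordstring=" + wordstring + " jkl=" + jkl);
        }
    }

    /** Runs a mapper over a single dummy record and returns the line it emitted. */
    static String runOnce(MiniMapper mapper, String word) {
        mapper.run(Map.of("mapper.word", word), List.of("row1"));
        return mapper.output.get(0);
    }

    public static void main(String[] args) {
        System.out.println(runOnce(new BrokenMapper(), "ankit")); // wordstring=ankit jkl=abcd
        System.out.println(runOnce(new FixedMapper(), "ankit"));  // wordstring=ankit jkl=ankit
    }
}
```

In the broken variant, jkl keeps the value it copied at construction time, while wordstring is reassigned later in setup(). Moving the copy into setup() is the only change in the fixed variant.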
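Of course, you do not actually need jkl; you can use wordstring directly.

Another answer: The problem is solved. In fact, I was running Hadoop in distributed mode, where SETUP, MAPPER, REDUCER and CLEANUP run in different JVMs, so data cannot be passed directly from SETUP to MAPPER. The first wordstring object was initialized to "abcd" in the mapper. I tried to change wordstring in SETUP (which created another wordstring object), and that actually happened in another JVM. So when I tried to copy wordstring into jkl with
String jkl = String.valueOf(wordstring);
the first value of wordstring (created by the mapper and initialized to "abcd") was copied into jkl.
If I ran Hadoop in standalone mode, it would use a single JVM, and the value SETUP gave to wordstring would be copied into jkl.
So jkl was initialized with the copy of wordstring holding "abcd" rather than the one given by SETUP.
I used
HashMap map = new HashMap();
to transfer data from SETUP to MAPPER, and jkl then received a copy of the value given by the string in SETUP.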