Problem description
I'm playing around with Map Reduce in MongoDB and Python and I've run into a strange limitation. I'm just trying to count the number of "book" records. It works when there are fewer than 100 records, but when it goes over 100 records the count resets for some reason.
Here is my MR code and some sample outputs:
var M = function () {
    // Emit one {count: 1} value per document, keyed by its book field.
    var book = this.book;
    emit(book, {count : 1});
}
var R = function (key, values) {
    var sum = 0;
    values.forEach(function(x) {
        sum += 1;
    });
    var result = {
        count : sum
    };
    return result;
}
MR output when record count is 99:
{u'_id': u'superiors', u'value': {u'count': 99}}
MR output when record count is 101:
{u'_id': u'superiors', u'value': {u'count': 2.0}}
Any ideas?
Recommended answer
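Your reduce function should be summing up the count values, not just adding 1 for each value. Otherwise the output of a reduce can't properly be used as input back into another reduce. MongoDB may call reduce more than once for the same key and pass an earlier result back in as one of the values, which would explain the count of 2 at 101 records: a partial result for the first 100 values plus the one remaining value makes two array elements. Try this instead: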
var R = function (key, values) {
    var sum = 0;
    values.forEach(function(x) {
        sum += x.count;
    });
    var result = {
        count : sum
    };
    return result;
}
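To see why the two versions differ, here is a minimal sketch in plain JavaScript that imitates the re-reduce step outside MongoDB. The helper names simulateReReduce and emitted are hypothetical, not MongoDB APIs, and the batch size of 100 is an assumption chosen to match the behaviour observed in the question, not a documented guarantee.

```javascript
// Reducer from the question: counts array elements instead of summing counts.
function reduceByLength(key, values) {
  var sum = 0;
  values.forEach(function (x) { sum += 1; });
  return { count: sum };
}

// Reducer from the answer: sums the count field of every value.
function reduceByCount(key, values) {
  var sum = 0;
  values.forEach(function (x) { sum += x.count; });
  return { count: sum };
}

// Hypothetical helper (not a MongoDB API): reduces the values in batches and
// feeds each partial result back in as the first value of the next batch.
function simulateReReduce(reduceFn, key, values, batchSize) {
  var partial = null;
  for (var i = 0; i < values.length; i += batchSize) {
    var batch = values.slice(i, i + batchSize);
    if (partial !== null) {
      batch.unshift(partial);
    }
    partial = reduceFn(key, batch);
  }
  return partial;
}

// Hypothetical helper: builds n emitted values of the form {count: 1}.
function emitted(n) {
  var out = [];
  for (var i = 0; i < n; i++) {
    out.push({ count: 1 });
  }
  return out;
}

// Assumed batch size of 100, chosen to mirror the question's symptoms.
console.log(simulateReReduce(reduceByLength, "superiors", emitted(99), 100).count);  // 99
console.log(simulateReReduce(reduceByLength, "superiors", emitted(101), 100).count); // 2
console.log(simulateReReduce(reduceByCount, "superiors", emitted(101), 100).count);  // 101
```

With 101 values the first batch reduces to a partial result of 100, and the second call receives that partial result plus one remaining value. Counting elements yields 2, while summing the count fields yields 101.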