Pig | Pig FOREACH Nested Block

Example: Nested Block

Suppose we have relations A and B. Note that relation B contains an inner bag.

A = LOAD 'data' AS (url:chararray,outlink:chararray);

DUMP A;
(www.ccc.com,www.hjk.com)
(www.ddd.com,www.xyz.org)
(www.aaa.com,www.cvn.org)
(www.www.com,www.kpt.net)
(www.www.com,www.xyz.org)
(www.ddd.com,www.xyz.org)

B = GROUP A BY url;

DUMP B;
(www.aaa.com,{(www.aaa.com,www.cvn.org)})
(www.ccc.com,{(www.ccc.com,www.hjk.com)})
(www.ddd.com,{(www.ddd.com,www.xyz.org),(www.ddd.com,www.xyz.org)})
(www.www.com,{(www.www.com,www.kpt.net),(www.www.com,www.xyz.org)})

In this example we perform two of the operations allowed in a nested block, FILTER and DISTINCT. Note that the last statement in the nested block must be GENERATE. Also, note the use of projection (PA = FA.outlink;).

X = FOREACH B {
        FA = FILTER A BY outlink == 'www.xyz.org';
        PA = FA.outlink;
        DA = DISTINCT PA;
        GENERATE group, COUNT(DA);
}

DUMP X;
(www.aaa.com,0)
(www.ccc.com,0)
(www.ddd.com,1)
(www.www.com,1)

Note that www.ddd.com yields 1 rather than 2: both of its tuples have the outlink www.xyz.org, and DISTINCT collapses them into one before COUNT is applied.
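To make the semantics of the nested block easier to follow, here is a minimal Python sketch that mimics the same steps on the same data. It is an illustration only, not how Pig executes the script. The helper names group_by_url and nested_foreach are hypothetical, and the groups are sorted only to make the output deterministic (an assumption of this sketch, not a guarantee Pig makes).

```python
from collections import defaultdict

# Same rows as relation A: (url, outlink)
A = [
    ("www.ccc.com", "www.hjk.com"),
    ("www.ddd.com", "www.xyz.org"),
    ("www.aaa.com", "www.cvn.org"),
    ("www.www.com", "www.kpt.net"),
    ("www.www.com", "www.xyz.org"),
    ("www.ddd.com", "www.xyz.org"),
]


def group_by_url(rows):
    """Mimic B = GROUP A BY url: map each url to its bag of rows (hypothetical helper)."""
    bags = defaultdict(list)
    for row in rows:
        bags[row[0]].append(row)
    return bags


def nested_foreach(bags, target="www.xyz.org"):
    """Mimic the nested block, one group at a time (hypothetical helper)."""
    result = []
    for group in sorted(bags):  # sorted only for deterministic output
        fa = [row for row in bags[group] if row[1] == target]  # FA = FILTER A BY outlink == target
        pa = [row[1] for row in fa]                            # PA = FA.outlink (projection)
        da = set(pa)                                           # DA = DISTINCT PA
        result.append((group, len(da)))                        # GENERATE group, COUNT(DA)
    return result


X = nested_foreach(group_by_url(A))
for row in X:
    print(row)
```

Each iteration of the loop corresponds to one tuple of B: the inner bag is filtered, projected to a single column, de-duplicated, and counted, and exactly one output tuple is produced per group.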

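Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
[Pig | Pig FOREACH Nested Block] Reposted from: https://www.cnblogs.com/jamesf/p/4751587.html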