From c51a4ebd39833f4ecb654f27a205720057c55353 Mon Sep 17 00:00:00 2001
From: heibaiying <31504331+heibaiying@users.noreply.github.com>
Date: Tue, 4 Jun 2019 14:41:44 +0800
Subject: [PATCH 1/2] Update Spark_Transformation和Action算子.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 notes/Spark_Transformation和Action算子.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/notes/Spark_Transformation和Action算子.md b/notes/Spark_Transformation和Action算子.md
index 595822a..934409f 100644
--- a/notes/Spark_Transformation和Action算子.md
+++ b/notes/Spark_Transformation和Action算子.md
@@ -73,7 +73,7 @@ sc.parallelize(list).filter(_ >= 10).foreach(println)
 
 ### 1.3 flatMap
 
-`flatMap(func)` is similar to `map`, but each input item is mapped to 0 or more output items (the return type of *func* must be of `Seq` type).
+`flatMap(func)` is similar to `map`, but each input item is mapped to 0 or more output items (the return type of *func* must be `Seq`).
 
 ```scala
 val list = List(List(1, 2), List(3), List(), List(4, 5))

From c636fbfd074c15d8e105a3a3516a6eddeb74a5c9 Mon Sep 17 00:00:00 2001
From: heibaiying <31504331+heibaiying@users.noreply.github.com>
Date: Tue, 4 Jun 2019 14:44:17 +0800
Subject: [PATCH 2/2] Update Spark_Transformation和Action算子.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 notes/Spark_Transformation和Action算子.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/notes/Spark_Transformation和Action算子.md b/notes/Spark_Transformation和Action算子.md
index 934409f..a39041d 100644
--- a/notes/Spark_Transformation和Action算子.md
+++ b/notes/Spark_Transformation和Action算子.md
@@ -319,7 +319,7 @@ sc.parallelize(list,numSlices = 2).aggregateByKey(zeroValue = 0,numPartitions =
 (spark,7)
 ```
 
-The second parameter `numPartitions ` of `aggregateByKey(zeroValue = 0, numPartitions = 3)` determines the number of partitions of the output RDD. To verify this, the code above can be rewritten to use the `getNumPartitions` method to get the partition count:
+The second parameter `numPartitions` of `aggregateByKey(zeroValue = 0, numPartitions = 3)` determines the number of partitions of the output RDD. To verify this, the code above can be rewritten to use the `getNumPartitions` method to get the partition count:
 
 ```scala
 sc.parallelize(list,numSlices = 6).aggregateByKey(zeroValue = 0,numPartitions = 3)(
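
Editor's note: a minimal, self-contained sketch of the `flatMap` behavior touched by patch 1/2 — each input item maps to zero or more output items. The object name, `local[*]` master, and `SparkContext` setup are illustrative additions; the original notes assume an existing `sc`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FlatMapDemo {
  def main(args: Array[String]): Unit = {
    // Hypothetical local setup; the notes assume a pre-built `sc`.
    val sc = new SparkContext(new SparkConf().setAppName("FlatMapDemo").setMaster("local[*]"))

    val list = List(List(1, 2), List(3), List(), List(4, 5))
    // Each input item (a List) yields 0..n output elements:
    // List() contributes nothing; List(1, 2) contributes two.
    sc.parallelize(list).flatMap(_.toList).foreach(println)
    // Prints 1 2 3 4 5 (order depends on partitioning).

    sc.stop()
  }
}
```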
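Editor's note: a sketch of the claim in patch 2/2 that `numPartitions` fixes the output RDD's partition count independently of the input's. The excerpt truncates the `aggregateByKey` call, so the sample `list` and the `seqOp`/`combOp` bodies below are assumptions, chosen to be consistent with the `(spark,7)` output shown in the hunk's context lines.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AggregateByKeyPartitions {
  def main(args: Array[String]): Unit = {
    // Hypothetical local setup; the notes assume a pre-built `sc` and `list`.
    val sc = new SparkContext(new SparkConf().setAppName("AggDemo").setMaster("local[*]"))

    // Assumed sample data: (spark, 4) and (spark, 3) produce per-partition
    // maxima summing to 7, matching the (spark,7) output in the excerpt.
    val list = List(("hadoop", 3), ("hadoop", 2), ("spark", 4), ("spark", 3), ("storm", 6), ("storm", 8))

    // Input RDD has 6 partitions; numPartitions = 3 sets the OUTPUT partition count.
    val result = sc.parallelize(list, numSlices = 6)
      .aggregateByKey(zeroValue = 0, numPartitions = 3)(
        seqOp = math.max(_, _), // within a partition: keep the max value per key
        combOp = _ + _          // across partitions: sum the per-partition maxima
      )

    println(result.getNumPartitions) // 3 — set by numPartitions, not by the input's 6
    result.collect.foreach(println)

    sc.stop()
  }
}
```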