I used to delete the existing output before every `spark.write`, but in fact you can simply set the write mode instead.
```scala
val df = spark.read.parquet(input)
df.write.mode("overwrite").parquet(output)
```

DataFrame writes support four save modes:
- `overwrite` — overwrite the files that already exist
- `append` — append to the existing files
- `ignore` — if the file already exists, skip the save operation
- `error` / `default` — if the file already exists, throw an error

The mapping from the mode string to a `SaveMode` in `DataFrameWriter`:

```scala
def mode(saveMode: String): DataFrameWriter = {
  this.mode = saveMode.toLowerCase match {
    case "overwrite" => SaveMode.Overwrite
    case "append" => SaveMode.Append
    case "ignore" => SaveMode.Ignore
    case "error" | "default" => SaveMode.ErrorIfExists
    case _ => throw new IllegalArgumentException(s"Unknown save mode: $saveMode. " +
      "Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
  }
  this
}
```
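The match above shows that the string is lowercased first, so `"OVERWRITE"` works just as well as `"overwrite"`. A minimal sketch of that string-to-mode mapping, runnable without a Spark dependency (the `SaveMode` objects and `parseMode` here are stand-ins for `org.apache.spark.sql.SaveMode` and `DataFrameWriter.mode`, not Spark's actual classes):

```scala
// Stand-in model of DataFrameWriter.mode's string parsing (assumption:
// mirrors the quoted Spark source, not taken from a Spark build).
object SaveModeDemo {
  sealed trait SaveMode
  object SaveMode {
    case object Overwrite extends SaveMode
    case object Append extends SaveMode
    case object Ignore extends SaveMode
    case object ErrorIfExists extends SaveMode
  }

  def parseMode(saveMode: String): SaveMode = saveMode.toLowerCase match {
    case "overwrite"         => SaveMode.Overwrite
    case "append"            => SaveMode.Append
    case "ignore"            => SaveMode.Ignore
    case "error" | "default" => SaveMode.ErrorIfExists
    case other =>
      throw new IllegalArgumentException(s"Unknown save mode: $other. " +
        "Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
  }

  def main(args: Array[String]): Unit = {
    // Case-insensitive: the input is lowercased before matching.
    assert(parseMode("OVERWRITE") == SaveMode.Overwrite)
    // "default" and "error" both map to ErrorIfExists.
    assert(parseMode("default") == SaveMode.ErrorIfExists)
  }
}
```

In real code you can also skip the string entirely and pass the enum, e.g. `df.write.mode(SaveMode.Append)`, which catches typos at compile time instead of at runtime.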