📜  org.apache.spark.sql.avro.IncompatibleSchemaException: Unexpected type org.apache.spark.ml.linalg.VectorUDT - SQL code example

📅  Last modified: 2022-03-11 15:04:51.944000             🧑  Author: Mango

Code example 1
// The Avro writer has no mapping for VectorUDT columns, so convert the
// Vector column to an Array[Double] first. The following UDF does that:

import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.functions.col
import org.apache.spark.ml.linalg.Vector

val vectorToArrayUdf = udf((vector: Vector) => vector.toArray)

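// Add the converted column as "probabilities" and select it in place of the
// original Vector column "probability", so no VectorUDT column reaches the writer.
// dataPredictions is the DataFrame of model predictions.
val output = dataPredictions
    .withColumn("probabilities", vectorToArrayUdf(col("probability")))
    .select("id", "probabilities", "prediction")

// path is the output location. With the built-in spark-avro module
// (Spark 2.4+), the short name format("avro") also works.
output.write.format("com.databricks.spark.avro").save(path)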
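
On Spark 3.0 or later, the same conversion can probably be done without a custom UDF by using the built-in `vector_to_array` function from `org.apache.spark.ml.functions`. The sketch below is a minimal, self-contained illustration under stated assumptions: the object name `VectorToArrayExample`, the local `SparkSession`, and the two sample rows standing in for `dataPredictions` are all hypothetical and not part of the original snippet.

```scala
import org.apache.spark.ml.functions.vector_to_array
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical wrapper object, used only to make the sketch runnable.
object VectorToArrayExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for demonstration purposes.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("vector-to-array")
      .getOrCreate()
    import spark.implicits._

    // Assumption: made-up rows standing in for the real dataPredictions.
    val dataPredictions = Seq(
      (1L, Vectors.dense(0.9, 0.1), 0.0),
      (2L, Vectors.dense(0.2, 0.8), 1.0)
    ).toDF("id", "probability", "prediction")

    // vector_to_array (Spark 3.0+) turns the Vector column into array<double>.
    val output = dataPredictions
      .withColumn("probabilities", vector_to_array(col("probability")))
      .select("id", "probabilities", "prediction")

    // The schema should no longer contain a VectorUDT column.
    output.printSchema()

    spark.stop()
  }
}
```

The resulting `output` DataFrame can then be written to Avro in the same way as in the example above. On Spark versions before 3.0, `vector_to_array` is not available and the UDF approach remains the option to use.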