📅  最后修改于: 2022-03-11 14:45:09.856000             🧑  作者: Mango
scala> val regex = "([0-9\\-TZ\\.:]+) (\\{.*)"
regex: String = ([0-9\-TZ\.:]+) (\{.*)
scala> val dff = df.withColumn("tstamp", regexp_extract('json_content, regex, 1)).withColumn("json", regexp_extract('json_content, regex, 2)).drop("json_content")
dff: org.apache.spark.sql.DataFrame = [country: string, city: string ... 2 more fields]
scala> dff.show(false)
+-------+-------+------------------------+----------------------------------+
|country|city |tstamp |json |
+-------+-------+------------------------+----------------------------------+
|america|chicago|2019-06-28T00:00:00.000Z|{ "a": 123, "b": "456", "c": 789 }|
|india |mumbai |2019-06-28T00:00:00.000Z|{ "a": 123, "b": "456", "c": 789 }|
+-------+-------+------------------------+----------------------------------+