Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others


0 votes
161 views
in Technique by (71.8m points)

apache spark sql - How do I select columns from the below schema?

I read a JSON file and registered a temporary table with the schema below (inferred from the JSON file by Spark SQL's native schema inference).

babynames = spark.read.json('/path/to/json', multiLine=True)
babynames.registerTempTable("babynames")

Now I would like to select the columns

"sid", "id", "position", "created_at", "created_meta", "updated_at", "updated_meta", "meta", "year", "first_name", "county", "sex", "count"

using Spark SQL select statement.

Here is the data source: https://data.cityofnewyork.us/api/views/25th-nujf/rows.json?accessType=DOWNLOAD



1 Answer

0 votes
by (71.8m points)

Once you have the JSON file at a specific location, you can read the columns as shown below, but you need to understand how the JSON elements are nested: in this dataset the requested fields are descriptors inside the meta.view.columns array, not top-level columns.

Using Spark SQL:

val df = spark.read.option("multiline",true).json("/path/to/json")
df.createOrReplaceTempView("TestTable")
val selectedColumnsDf = spark.sql("""SELECT meta.view.columns.id, meta.view.columns.position, meta.view.createdAt FROM TestTable""")

Using the DataFrame API, it can be done as below:

val df = spark.read.option("multiline",true).json("/path/to/json")
val selectedColumnsDf = df.select("meta.view.columns.id", "meta.view.columns.position", "meta.view.createdAt")

I am selecting just three columns to give you the idea; you can add the remaining columns as per your requirement.

