Spark SQL: How to get the 5th column from a Spark SQL query

Rajaniesh Kaushikk 476 Reputation points
2020-06-15T13:48:04.437+00:00

Hi,

I have a headerless, comma-separated file that I am reading with spark.read to create a DataFrame. Now I want to get the value of the 5th column from the file. How can I achieve this? I know it is possible in T-SQL, but I am not sure how to do it in Spark SQL.
Example of the file

myname,1/4/1977,xyz,50,60

Now I want to get the value 60 (the 5th column).

Regards
Rajaniesh

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
    2020-06-16T00:35:08.56+00:00

    Hello Rajaniesh ,

    The piece of code below should do the trick. Please do let me know how it goes.

    cols = ['_c4']  
    df.select(*cols).show()  
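    This works because Spark assigns default names `_c0`, `_c1`, … to the columns of a headerless CSV, so the 5th column is `_c4`. A minimal end-to-end sketch (the file path `data.csv` is a placeholder for your own file), including the Spark SQL variant since the question asks about SQL:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("fifth-col").getOrCreate()

    # Headerless CSV: Spark names the columns _c0, _c1, _c2, _c3, _c4 in order
    df = spark.read.csv("data.csv", header=False)

    # DataFrame API: select the 5th column by its default name
    df.select("_c4").show()

    # Spark SQL equivalent: register a temp view and query it
    df.createOrReplaceTempView("myfile")
    spark.sql("SELECT _c4 FROM myfile").show()
    ```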
    


    Himanshu


    Please do consider clicking "Accept Answer" and "Up-vote" on the post that helped you, as it can be beneficial to other community members.


1 additional answer

Sort by: Most helpful
  1. Dimitri B 66 Reputation points
    2020-06-15T23:06:33.467+00:00

    One way to do it would be to read the file with the "schema" option, assign arbitrary names to the columns, e.g. col_1, col_2, and then reference the column by that name.
    Spark will also name the columns for you even if you don't provide a schema: for a headerless CSV they default to _c0, _c1, and so on.
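    A sketch of the explicit-schema approach; the file path `data.csv` and the column names here are placeholders you would adapt to your data:

    ```python
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    spark = SparkSession.builder.master("local[1]").appName("schema-demo").getOrCreate()

    # Hypothetical names/types for the five comma-separated fields
    schema = StructType([
        StructField("name", StringType()),
        StructField("dob", StringType()),
        StructField("code", StringType()),
        StructField("col_4", IntegerType()),
        StructField("col_5", IntegerType()),
    ])

    df = spark.read.csv("data.csv", schema=schema)

    # Reference the 5th column by the name given in the schema
    df.select("col_5").show()
    ```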
