Hello @Alex,
• Yes, this is good and it's the correct way of doing it.
• You can either decode the stream from IoT Hub as it is read and write the decoded records to Bronze, or write the raw payload and decode it at query time. The question is whether you want to be able to query the data directly in Bronze, or are OK with decoding the payload when querying. If the data volume is large I wouldn't defer it, because calling a decode function (for example, `from_json` on the body cast to a string) for every record is likely to have a significant impact on query performance. It is better to take the hit during the stream read, since the data flow there is regulated by the rate of events coming from the event hub.
• I would define the schema explicitly and pass it as part of the `spark.readStream` call rather than relying on schema inference.
• I don't think so - you will probably want to select F-series VMs or other compute-optimized instances for the cluster.
Hope this helps. Do let us know if you have any further queries.
------------
Please don't forget to Accept Answer and Up-Vote wherever the information provided helps you, as this can be beneficial to other community members.