Azure Synaps serverless scan file that should be filtered
Shlomi Lanton
56
Reputation points
Hey,
We have a large amount of files stored in Storage Account in Parquet format with the following hierarchy.
/m_id=%M_ID%/quarter=%QUARTER%/month=%MONTH%/partition_version=last/part-0.parquet'.
examples for quarter values: 2024Q1, 2024Q2, 2024Q3 etc.
We have a view we use the query the data, the view is defined using OPENROWSET
, when running a query a WHERE filter and [quarter] in ('2023Q2', '2023Q3', '2023Q4', '2024Q1')
we get an error
Error handling external file: 'IO request completed with an error. ERROR = 0x0000000C'. Underlying data description: file '/machine_id=aaaa/quarter=2024Q2/month=6/partition_version=last/part-0.parquet'.
but note that the quarter of 2024Q2
is not in the query, so we expect this file to not be scanned at all.
- How can we dig down to this issue?
- can we get a list of all files scanned in a query?
- We apply multiple filters in the query (on partitions and values), does the order of elements in the query matter?
Thanks