read_statestore
table-valued function
Applies to: Databricks SQL Databricks Runtime 14.3 and above
Important
This feature is in Public Preview.
A table valued function for reading records from the state store of streaming queries. The returned relation only supports running as a batch query.
Syntax
read_statestore ( path [, option_key => option_value ] [ ... ] )
Arguments
read_statestore
requires named parameter invocation for its optional parameters.
path
: ASTRING literal
. The path of the streaming query checkpoint location.option_key
: The name of the option to configure. You need to use backticks (`) for options that contain dots (.).option_value
: A constant expression to set the option to. Accepts literals and scalar functions.
All option_value
s are case-insensitive.
option_key | Type | Default | Description |
---|---|---|---|
batchId |
BIGINT |
latest batch id | Represents the target batch to read from. This option is used when users want to perform time-travel. The batch should be committed but not yet cleaned up. |
OperatorId |
BIGINT |
0 | Represents the target operator to read from. This option is used when the query is using multiple stateful operators. |
storeName |
STRING |
'DEFAULT' |
Represents the target side to read from. This option is used when users want to read the state from stream-stream join. |
joinSide |
STRING |
'None' |
Represents the target side to read from. This option is used when users want to read the state from stream-stream join. One of: 'Left' , 'Right' , 'None' . |
Returns
The function returns a result set with the following columns.
Note
The nested columns for key and value heavily depend on the input schema of the stateful operator and the type of operator.
Name | Data type | Nullable | Standard | Description |
---|---|---|---|---|
id |
STRUCT |
No | Key row of the stateful operator stored in the state checkpoint. | |
value |
STRUCT |
Yes | Value row of the stateful operator stored in the state checkpoint. | |
partition_id |
INTEGER |
No | The partition that contains the record. | |
shardId |
STRING |
No | A unique identifier for the shard where the data was read from. | |
sequenceNumber |
BIGINT |
No | The unique identifier of the record within its shard. | |
approximateArrivalTimestamp |
TIMESTAMP |
No | The approximate time that the record was inserted into the stream. |
Examples
- Read from state
> SELECT * FROM read_statestore('/checkpoint/path');
– Read from state with storeName option
> SELECT * FROM read_statestore(
'/checkpoint/path',
operatorId => 0,
batchId => 2,
storeName => 'default'
);
– Read from state with joinSide option
> SELECT * FROM read_statestore(
'/checkpoint/path',
joinSide => 'left'
);