DataViewRowCursor Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Class used to cursor through rows of an IDataView.
public abstract class DataViewRowCursor : Microsoft.ML.DataViewRow
type DataViewRowCursor = class
inherit DataViewRow
Public MustInherit Class DataViewRowCursor
Inherits DataViewRow
- Inheritance: Object → DataViewRow → DataViewRowCursor
Remarks
Note that this is also a DataViewRow. Position is incremented by MoveNext(). Prior to the first call to MoveNext(), or after MoveNext() returns false, Position is -1. Otherwise, when MoveNext() returns true, Position >= 0.
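The Position behavior described in these remarks can be observed with a short program. This is a minimal sketch, assuming the Microsoft.ML NuGet package is referenced; `SampleRow`, `PositionSketch`, and `TracePositions` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical row type used only for this illustration.
public class SampleRow
{
    public float Value { get; set; }
}

public static class PositionSketch
{
    // Walks a serial cursor and records Position before, during, and after iteration.
    public static List<long> TracePositions(IDataView data)
    {
        var trace = new List<long>();
        // DataViewSchema is itself a list of columns, so this requests every column.
        using (DataViewRowCursor cursor = data.GetRowCursor(data.Schema))
        {
            trace.Add(cursor.Position);      // documented as -1 before the first MoveNext()
            while (cursor.MoveNext())
                trace.Add(cursor.Position);  // documented as >= 0 while MoveNext() returns true
            trace.Add(cursor.Position);      // documented as -1 after MoveNext() returns false
        }
        return trace;
    }

    public static void Main()
    {
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new SampleRow { Value = 1.5f },
            new SampleRow { Value = 2.5f },
            new SampleRow { Value = 3.5f },
        });

        Console.WriteLine(string.Join(", ", TracePositions(data)));
    }
}
```

Because Position counts rows seen by this cursor rather than rows in the underlying data, the same loop over a shuffled cursor would still start counting from 0.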
Constructors
DataViewRowCursor() |
Properties
Batch |
This provides a means of reconciling multiple rows that have been produced, generally, from GetRowCursorSet(IEnumerable<DataViewSchema.Column>, Int32, Random). When getting a set, the aim is that the original order should remain recoverable while parallel processing is allowed to proceed. Whether a caller cares about that original order in a specific application is another matter (as a practical matter most callers do not, otherwise they would not call it), but at least in principle it should be possible to reconstruct the order one would get from an identically configured GetRowCursor(IEnumerable<DataViewSchema.Column>, Random). So, for any cursor implementation, batch numbers should be non-decreasing. Furthermore, any given batch number should appear in only one of the cursors returned by GetRowCursorSet(IEnumerable<DataViewSchema.Column>, Int32, Random). In this way, order is determined by batch number. An operation that reconciles these cursors into a single consistent cursoring could do so by always drawing from the one cursor in the set that has the smallest batch number available. Note that there is no suggestion that the batch for a particular entry will be consistent from cursoring to cursoring, beyond resulting in the same overall ordering: the same entry could have different batch numbers from one cursoring to another. There is also no requirement that any given batch number appear at all. The batch is merely a mechanism for recovering ordering from a possibly arbitrary partitioning of the data. It follows that treating the batch as a property of the data is invalid. (Inherited from DataViewRow) |
Position |
This is incremented when the underlying contents change, giving clients a way to detect change. It should be -1 when the object is in a state where values cannot be fetched. In particular, for a DataViewRowCursor, this will be before MoveNext() is called for the first time, or after the first time MoveNext() is called and returns false. Note that this position is not the position within the underlying data, but the position of this cursor only. If one, for example, opened a set of parallel streaming cursors, or a shuffled cursor, each such cursor's first valid entry would always have position 0. (Inherited from DataViewRow) |
Schema |
Gets a Schema, which provides name and type information for variables (i.e., columns in ML.NET's type system) stored in this row. (Inherited from DataViewRow) |
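The reconciliation strategy described for Batch (always draw from the cursor with the smallest batch number available) can be sketched as follows. This is an illustration under stated assumptions, not the library's own consolidation logic: it assumes the Microsoft.ML NuGet package, and `NumberRow`, `BatchMergeSketch`, and `MergeByBatch` are hypothetical names. A data view is free to return fewer cursors than requested, possibly just one, in which case the merge simply reads that cursor.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

// Hypothetical row type used only for this illustration.
public class NumberRow
{
    public float Value { get; set; }
}

public static class BatchMergeSketch
{
    // Reads a cursor set back in batch order by always drawing from the
    // live cursor whose current Batch is smallest.
    public static List<float> MergeByBatch(IDataView data, int cursorCount)
    {
        DataViewSchema.Column column = data.Schema["Value"];
        DataViewRowCursor[] cursors = data.GetRowCursorSet(new[] { column }, cursorCount);
        var result = new List<float>();
        try
        {
            ValueGetter<float>[] getters =
                cursors.Select(c => c.GetGetter<float>(column)).ToArray();

            // Move each cursor to its first row; keep the indices of those that have one.
            var live = new List<int>();
            for (int i = 0; i < cursors.Length; i++)
            {
                if (cursors[i].MoveNext())
                    live.Add(i);
            }

            float value = 0;
            while (live.Count > 0)
            {
                // A batch number appears in only one cursor, so the minimum is unambiguous.
                int best = live[0];
                foreach (int i in live)
                {
                    if (cursors[i].Batch < cursors[best].Batch)
                        best = i;
                }

                getters[best](ref value);
                result.Add(value);

                // Drop the cursor from the live set once it is exhausted.
                if (!cursors[best].MoveNext())
                    live.Remove(best);
            }
        }
        finally
        {
            foreach (DataViewRowCursor cursor in cursors)
                cursor.Dispose();
        }
        return result;
    }

    public static void Main()
    {
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(
            Enumerable.Range(0, 10).Select(i => new NumberRow { Value = i }));

        Console.WriteLine(string.Join(", ", MergeByBatch(data, 4)));
    }
}
```

Drawing one row at a time keeps the sketch short. Since batch numbers are non-decreasing within a cursor and unique across cursors, the selected cursor stays selected until its batch number changes.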
Methods
Dispose() |
Implementation of dispose. Calls Dispose(Boolean) with true. |
Dispose(Boolean) |
The disposable method for the disposable pattern. This default implementation does nothing. (Inherited from DataViewRow) |
GetGetter<TValue>(DataViewSchema+Column) |
Returns a value getter delegate to fetch the value of the given column. |
GetIdGetter() |
A getter for a 128-bit ID value. It is common for objects to serve multiple DataViewRow instances to iterate over what is supposed to be the same data. For example, in an IDataView, a cursor set will produce the same data as a serial cursor, just partitioned, and a shuffled cursor will produce the same data as a serial cursor or any other shuffled cursor, only shuffled. The ID exists for applications that need to reconcile which entry is actually which. Ideally this ID should be unique, but for practical reasons it suffices if collisions are extremely improbable. Note that this ID, while it must be consistent across multiple streams according to the semantics above, is not considered part of the data per se. So, to take the example of a data view specifically, a single data view must render consistent IDs across all cursorings, but there is no suggestion at all that if the "same" data were presented in a different data view (as by, say, being transformed, cached, or saved), the IDs between the two data views would have any discernible relationship. (Inherited from DataViewRow) |
IsColumnActive(DataViewSchema+Column) |
Returns whether the given column is active in this row. (Inherited from DataViewRow) |
MoveNext() |
Advance to the next row. When the cursor is first created, this method should be called to move to the first row. Returns true if the cursor moved to a row, and false if there are no more rows. |
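The methods listed above are typically used together: getters are obtained once, then invoked after each successful MoveNext(). The following is a minimal sketch, assuming the Microsoft.ML NuGet package; `ScoredRow`, `GetterSketch`, and `ReadScores` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type used only for this illustration.
public class ScoredRow
{
    public float Score { get; set; }
    public bool Label { get; set; }
}

public static class GetterSketch
{
    // Reads the Score column and the row ID for every row of the data view.
    public static List<string> ReadScores(IDataView data)
    {
        var lines = new List<string>();
        DataViewSchema.Column scoreColumn = data.Schema["Score"];

        // Request only the Score column; whether Label is also active is up to the implementation.
        using (DataViewRowCursor cursor = data.GetRowCursor(new[] { scoreColumn }))
        {
            if (!cursor.IsColumnActive(scoreColumn))
                throw new InvalidOperationException("Score was requested but is not active.");

            // Getters are created once and reused for every row.
            ValueGetter<float> scoreGetter = cursor.GetGetter<float>(scoreColumn);
            ValueGetter<DataViewRowId> idGetter = cursor.GetIdGetter();

            float score = 0;
            DataViewRowId id = default(DataViewRowId);
            while (cursor.MoveNext())
            {
                scoreGetter(ref score);
                idGetter(ref id);
                lines.Add($"id={id} position={cursor.Position} score={score}");
            }
        }   // Dispose() runs here and releases the cursor.
        return lines;
    }

    public static void Main()
    {
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new ScoredRow { Score = 0.1f, Label = false },
            new ScoredRow { Score = 0.9f, Label = true },
        });

        foreach (string line in ReadScores(data))
            Console.WriteLine(line);
    }
}
```

The ID values printed here are meaningful only within this one data view. As the GetIdGetter() description notes, a transformed or cached copy of the same data need not produce related IDs.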