Computation of {{ $ctrl.embeddingType }} drift between different evaluations is not supported.
| Feature | Euclidean distance | Cosine similarity | Classifier Gini |
|---|---|---|---|
| {{ embeddingDrift._key }} | {{embeddingDrift.euclidianDistance | nicePrecision: 2 | ifEmpty: '-' }} | {{embeddingDrift.cosineSimilarity | nicePrecision: 2 | ifEmpty: '-' }} | {{embeddingDrift.classifierGini | nicePrecision: 2 | ifEmpty: '-' }} |
To detect data drift in {{ $ctrl.embeddingType }} columns, we convert the {{ $ctrl.embeddingType }}s into embedding vectors. These vectors are numerical representations of the {{ $ctrl.embeddingType }}s, designed to capture semantic relationships and contextual information about the objects they represent.
We then compute statistical and geometrical metrics in this embedding space to detect a shift in the embedding distribution, which in turn reflects a shift in the {{ $ctrl.embeddingType }} distribution.
To quantify this shift, we use the Euclidean distance and cosine similarity metrics. Additionally, we train a binary classifier to differentiate the embeddings of the two datasets being compared, with a resulting metric (the classifier Gini) ranging from 0 (no drift) to 1 (drift).
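The three metrics can be illustrated with a minimal sketch. It is not the product's implementation and rests on several assumptions: the distance and similarity are computed between mean embeddings (centroids), a simple centroid-projection score stands in for the binary classifier, and the Gini is taken as 2 * AUC - 1, clipped at 0. All function names are hypothetical.

```python
import math


def centroid(vectors):
    """Mean embedding of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    # Undefined (division by zero) if either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def auc(negative_scores, positive_scores):
    """Probability that a positive outscores a negative; ties count as half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(positive_scores) * len(negative_scores))


def classifier_gini(reference, current):
    """Drift score in [0, 1] from a stand-in classifier (assumption)."""
    # Fit on even-indexed rows, evaluate on odd-indexed rows (held-out split).
    ref_fit, ref_eval = reference[0::2], reference[1::2]
    cur_fit, cur_eval = current[0::2], current[1::2]
    ref_c, cur_c = centroid(ref_fit), centroid(cur_fit)
    # Score each vector by its projection on the reference-to-current direction.
    direction = [c - r for r, c in zip(ref_c, cur_c)]

    def score(vector):
        return sum(d * x for d, x in zip(direction, vector))

    a = auc([score(v) for v in ref_eval], [score(v) for v in cur_eval])
    # Assumed convention: Gini = 2 * AUC - 1, clipped so that no drift maps to 0.
    return max(0.0, 2.0 * a - 1.0)


# Toy 2-dimensional embeddings (made up for illustration).
reference = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1], [1.0, 0.2]]
current = [[0.0, 1.0], [0.1, 0.9], [-0.1, 1.1], [0.2, 1.0]]

print(euclidean_distance(centroid(reference), centroid(current)))
print(cosine_similarity(centroid(reference), centroid(current)))
print(classifier_gini(reference, current))
```

In this sketch, a larger Euclidean distance or a lower cosine similarity between centroids suggests more drift, and a classifier Gini near 1 means the two sets of embeddings are easy to tell apart.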
See