TabICL

⚠️ Warning: TabICL can lead to high memory consumption. It is recommended not to use this model on datasets with more than 50k samples and 150 features.
TabICL is a pre-trained Tabular Foundation Model for In-Context Learning, built on a transformer architecture. It delivers strong performance on classification tasks with minimal configuration. However, TabICL's memory consumption scales with the size of the dataset. For optimal performance and resource management, it is recommended to run the model in a containerized environment, which ensures reproducibility, isolation, and efficient allocation of computational resources.

The number of CPU threads allocated to run the model. Set to -1 to use all available CPU cores.
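The -1 convention can be sketched as follows. This is a generic illustration, not TabICL's internal code; the function name `resolve_n_threads` is hypothetical.

```python
import os

def resolve_n_threads(n_threads: int) -> int:
    """Resolve a configured thread count: -1 means all available CPU cores.

    Hypothetical helper illustrating the -1 convention; not part of TabICL.
    """
    if n_threads == -1:
        # os.cpu_count() may return None on exotic platforms; fall back to 1.
        return os.cpu_count() or 1
    if n_threads < 1:
        raise ValueError("thread count must be a positive integer or -1")
    return n_threads

four = resolve_n_threads(4)      # explicit count is passed through: 4
all_cores = resolve_n_threads(-1)  # resolves to the machine's core count
```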
By default, the normalization methods are set to 'none' and 'power'.
Defines how input features are scaled or transformed before modeling. Each method is applied during view generation (linked to the "Number of Estimators") and helps create diverse feature representations that enhance robustness and performance. These methods are fixed and are not tuned via grid search.
Controls the z-score threshold for outlier detection and clipping.
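Z-score clipping can be illustrated with a small standalone sketch (the function `clip_outliers` is hypothetical and only demonstrates the idea, not TabICL's implementation):

```python
import statistics

def clip_outliers(values, z_threshold=4.0):
    """Clip values whose z-score magnitude exceeds z_threshold.

    Values are clamped to [mean - z*std, mean + z*std] instead of removed,
    so the sample count is preserved while extreme magnitudes are bounded.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)  # constant column: nothing to clip
    lo, hi = mean - z_threshold * std, mean + z_threshold * std
    return [min(max(v, lo), hi) for v in values]

data = [1.0, 2.0, 3.0, 100.0]
clipped = clip_outliers(data, z_threshold=1.0)
# the extreme value 100.0 is pulled back toward the bulk of the data
```

A lower threshold clips more aggressively; a higher threshold leaves more extreme values untouched.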
At least one option must be selected.
Indicates whether to apply cyclic shifts to the class labels across ensemble members.
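What a cyclic label shift means can be shown in a few lines. This is a minimal sketch of the concept, assuming integer-encoded class labels; it is not TabICL's own code.

```python
def cyclic_shift_labels(labels, shift, n_classes):
    """Shift each integer class label by `shift` positions, wrapping around.

    Each ensemble member can receive a different shift, so no single
    label index is systematically favored across the ensemble.
    """
    return [(label + shift) % n_classes for label in labels]

labels = [0, 1, 2, 0]
shifted = cyclic_shift_labels(labels, shift=1, n_classes=3)
# [1, 2, 0, 1] -- class 2 wraps around to class 0
```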
Temperature value for the softmax function. Lower values make predictions more confident, higher values make them more conservative.
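The effect of temperature on softmax outputs can be demonstrated with plain Python (a generic softmax sketch, not TabICL's internal code):

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling.

    Dividing logits by a temperature below 1 sharpens the distribution
    (more confident); above 1 it flattens it (more conservative).
    """
    scaled = [logit / temperature for logit in logits]
    m = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax(logits, temperature=0.5)  # top class gets more probability mass
flat = softmax(logits, temperature=2.0)   # probabilities move closer together
```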
Number of dataset "views" processed simultaneously during inference. A smaller batch size reduces memory usage but may increase inference time, while a larger batch size can speed up processing at the cost of higher memory consumption.
Used for reproducibility of ensemble generation, affecting feature shuffling and other randomized operations.
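How a fixed seed makes feature shuffling reproducible can be sketched as follows (the helper `shuffle_features` is hypothetical; TabICL's own randomized operations are internal):

```python
import random

def shuffle_features(columns, seed):
    """Return a shuffled copy of the feature columns, driven by a fixed seed.

    Using a dedicated random.Random(seed) instance keeps the shuffle
    deterministic and independent of the global random state.
    """
    rng = random.Random(seed)
    shuffled = list(columns)
    rng.shuffle(shuffled)
    return shuffled

cols = ["age", "income", "height", "weight"]
first = shuffle_features(cols, seed=42)
second = shuffle_features(cols, seed=42)
# the same seed always yields the same ordering
```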