Blok Data AI
210+ Building Blocks

Every block you need to build ML pipelines

Drag-and-drop components for tabular data, computer vision, NLP, time series, and geospatial analysis — all in one studio.


CSV Upload

Tabular

Data Sources

Load structured data from CSV files for analysis and model training.

Excel File

Tabular

Data Sources

Import data from Excel spreadsheets (.xlsx, .xls) including multi-sheet files.

Parquet File

Tabular

Data Sources

Load columnar-format Parquet files — ideal for large datasets.

SQL Database

Tabular

Data Sources

Connect to relational databases (PostgreSQL, MySQL, SQLite) and run custom queries.

API Webhook

Tabular

Data Sources

Fetch data from REST APIs with support for authentication and pagination.

Remove Nulls

Tabular

Data Prep

Handle missing values through deletion or statistical imputation.

Data Imputer

Tabular

Data Prep

Advanced missing value imputation using KNN or iterative model-based strategies.

Feature Scale

Tabular

Data Prep

Normalise or standardise numeric features to comparable scales.
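
As a rough illustration of the two options named above, the sketch below implements min-max normalisation and z-score standardisation in plain Python. The function names and the sample values are hypothetical, and this is not necessarily how the Feature Scale block is implemented.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread; map it to 0.0 instead of dividing by zero.
    return [0.0 if span == 0 else (v - lo) / span for v in values]

def standardise(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]

ages = [20, 30, 40, 60]          # hypothetical feature column
scaled = min_max_scale(ages)
z_scores = standardise(ages)
```

In a real pipeline the minimum, maximum, mean, and standard deviation would be computed on the training split only and reused on the test split.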

Encode Categories

Tabular

Data Prep

Convert categorical variables to numeric representations for ML models.

SMOTE

Tabular

Data Prep

Synthetic Minority Over-sampling Technique for imbalanced classification.

Remove Outliers

Tabular

Data Prep

Detect and remove or cap statistical outliers from numeric features.

Split Data

Tabular

Data Prep

Divide the dataset into training and test sets, with optional stratification.

PCA

Tabular

Feature Engineering

Principal Component Analysis — reduce dimensionality while preserving variance.

Target Encoding

Tabular

Feature Engineering

Replace each categorical value with the smoothed mean of the target variable for that category.
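
One common way to smooth the per-category mean is to blend it with the global mean, weighted by how many rows the category has. The sketch below assumes that formula; the `smoothing` parameter and the function name are hypothetical and may not match the block's own settings.

```python
from collections import defaultdict

def target_encode(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the target.

    encoded = (n * category_mean + smoothing * global_mean) / (n + smoothing)
    where n is the number of rows in the category.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # sums[cat] equals n * category_mean, so the formula simplifies to this.
    return {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }

cities = ["A", "A", "A", "B"]    # hypothetical categorical column
bought = [1, 1, 0, 0]            # hypothetical binary target
encoding = target_encode(cities, bought, smoothing=2.0)
```

Rare categories (such as "B" here, with a single row) are pulled towards the global mean. The encoding should be fitted on training data only, otherwise the target leaks into the features.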

Polynomial Features

Tabular

Feature Engineering

Generate interaction terms and polynomial combinations of numeric features.

Random Forest

Tabular

Classical Models

Ensemble of decision trees — robust, accurate, and scale-invariant.

XGBoost

Tabular

Classical Models

Optimised gradient boosting — frequently wins structured-data competitions.

LightGBM

Tabular

Classical Models

Gradient boosting optimised for speed and large datasets.

Logistic Regression

Tabular

Classical Models

Linear classifier with probabilistic output — fast and interpretable.

SVM

Tabular

Classical Models

Support Vector Machine — effective in high-dimensional spaces.

Decision Tree

Tabular

Classical Models

Interpretable tree-based model — transparent decision rules.

Neural Network (MLP)

Tabular

Deep Learning

Multi-layer perceptron for complex non-linear patterns in tabular data.

LSTM

Tabular

Deep Learning

Long Short-Term Memory — recurrent network for sequential tabular data.

Transformer

Tabular

Deep Learning

Self-attention based model for sequence tasks and, increasingly, tabular data.

Accuracy Score

Tabular

Evaluation

Classification accuracy, precision, recall, and F1 — all in one block.

ROC AUC

Tabular

Evaluation

Area under the ROC curve — threshold-independent classifier evaluation.

F1 Score

Tabular

Evaluation

Harmonic mean of precision and recall; well suited to imbalanced classification.
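
To make the harmonic mean concrete, the sketch below computes F1 from raw counts and contrasts it with accuracy on an imbalanced example. The counts are invented for illustration and the function name is hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical imbalanced case: 990 negatives, 10 positives,
# and a model that predicts "negative" for every row.
accuracy = 990 / 1000
f1_all_negative = f1_from_counts(tp=0, fp=0, fn=10)

# A model that finds 8 of the 10 positives with 2 false alarms.
f1_useful_model = f1_from_counts(tp=8, fp=2, fn=2)
```

The all-negative model scores 99% accuracy but an F1 of zero, which is why F1 is often preferred when the positive class is rare.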

Confusion Matrix

Tabular

Evaluation

Visualise all correct and incorrect predictions across every class.

Correlation Matrix

Tabular

Analysis

Visualise pairwise correlations between numeric features as a colour-coded heatmap.

Correlation Table

Tabular

Analysis

Display pairwise correlations in a sortable table format with numeric precision.

SHAP Values

Tabular

Evaluation

Game-theoretic feature attribution: shows how much each feature contributes to individual predictions.

Cross-Validation

Tabular

Evaluation

K-fold cross-validation for more reliable performance estimates than a single train/test split.
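
The sketch below shows how k-fold splitting partitions row indices so that every row is used for testing exactly once. It is a minimal illustration with a hypothetical function name, using contiguous folds without shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 3))
```

A model would be trained and scored once per fold, and the k scores averaged. If the rows are ordered (for example by class or by date), they should be shuffled or split with a scheme that respects that ordering.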

Image Folder

Vision

Vision Sources

Load image datasets from a folder organised by class subdirectories.

COCO Dataset

Vision

Vision Sources

Load object detection datasets in COCO JSON annotation format.

Resize

Vision

Vision Prep

Resize images to a fixed resolution required by vision models.

Normalize

Vision

Vision Prep

Normalise pixel values with the ImageNet mean and standard deviation for compatibility with pretrained models.

Random Flip

Vision

Vision Prep

Data augmentation via horizontal and vertical random flipping.

EfficientNet-B0

Vision

Vision Models

Efficient and accurate image classifier with a strong accuracy-to-parameter ratio.

ResNet-50

Vision

Vision Models

Classic 50-layer residual network — reliable baseline for image tasks.

YOLOv8

Vision

Vision Models

Real-time object detection with a strong balance of speed and accuracy.

mAP Score

Vision

Vision Evaluation

Mean Average Precision — the standard metric for object detection.

Text CSV

NLP

Text Sources

Load text data from CSV files for NLP pipelines.

PDF Loader

NLP

Text Sources

Extract text from PDF files for document analysis pipelines.

Clean Text

NLP

Text Prep

Remove noise from text — HTML, URLs, special characters, stopwords.

TF-IDF Vectorizer

NLP

Text Prep

Convert text to numeric TF-IDF feature vectors for classical ML models.
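
As an illustration of what a TF-IDF weight is, the sketch below uses the plain textbook variant (term frequency times log of inverse document frequency) with whitespace tokenisation. Both choices are assumptions for this example; libraries such as scikit-learn apply smoothing and normalisation by default, so their numbers will differ.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

docs = ["the cat sat", "the dog barked", "the cat purred"]  # toy corpus
weights = tf_idf(docs)
```

A word that appears in every document ("the") gets weight zero, while a word unique to one document ("sat") gets the highest weight in that document.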

Text Embeddings

NLP

Text Prep

Dense sentence embeddings using sentence-transformers models.

BERT

NLP

NLP Models

Bidirectional transformer encoder: a strong baseline for text classification and NER.

DistilBERT

NLP

NLP Models

Lightweight BERT distillation — 40% smaller, 60% faster, 97% of performance.

LLM Processor

NLP

NLP Models

Connect to GPT-4, Claude, or Llama via API for zero-shot NLP tasks.

Time Series CSV

Time Series

TS Sources

Load time series data from CSV with automatic datetime parsing.

Financial API

Time Series

TS Sources

Fetch historical stock, crypto, or forex prices from market data APIs.

Resample

Time Series

TS Prep

Change time series frequency — upsample or downsample with aggregation.

Differencing

Time Series

TS Prep

Remove trends and make time series stationary via differencing.
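
A short sketch of first-order differencing, assuming a plain list of values and a hypothetical `lag` parameter: each output value is the change from the previous observation, so a steady linear trend becomes a constant.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t that has a predecessor."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

trend = [10, 12, 14, 16, 18]         # hypothetical series with a linear trend
first_diff = difference(trend)        # one value shorter than the input
second_diff = difference(first_diff)  # differencing again removes the constant
```

Using a `lag` equal to the seasonal period (for example 12 for monthly data) gives seasonal differencing instead.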

Lag Features

Time Series

TS Features

Create lagged versions of the target to use as input features.

Rolling Statistics

Time Series

TS Features

Rolling mean, std, min, max over a sliding time window.
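
The Lag Features and Rolling Statistics blocks both derive new columns from past values. The sketch below shows one plausible shape for each in plain Python; the function names and the use of `None` for rows without enough history are assumptions for this example.

```python
from statistics import mean

def lag_feature(series, lag):
    """Shift the series forward by `lag` steps; early rows have no history."""
    if lag <= 0:
        return list(series)
    return [None] * lag + list(series[:-lag])

def rolling_mean(series, window):
    """Mean of the current value and the (window - 1) values before it."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(series[i + 1 - window:i + 1]))
    return out

sales = [5, 7, 9, 11, 13]         # hypothetical daily values
lag_1 = lag_feature(sales, 1)
roll_3 = rolling_mean(sales, 3)
```

When the rolling statistic is a feature for predicting the same series, it is usually computed on the lagged series so that the window only contains values known at prediction time.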

Prophet

Time Series

TS Models

Facebook Prophet — robust forecasting with trend, seasonality, and holidays.

ARIMA

Time Series

TS Models

AutoRegressive Integrated Moving Average — classic statistical forecasting.

XGBoost (TS)

Time Series

TS Models

XGBoost trained on lag and rolling features for time series regression.

MAE

Time Series

TS Evaluation

Mean Absolute Error — average magnitude of forecast errors.

RMSE

Time Series

TS Evaluation

Root Mean Squared Error — penalises large forecast errors more heavily.
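
The difference between MAE and RMSE is easiest to see on two forecasts with the same total absolute error. The values below are invented for illustration.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [100, 100, 100, 100]
steady = [98, 102, 98, 102]     # four errors of size 2
spiky = [100, 100, 100, 108]    # one error of size 8

mae_steady, mae_spiky = mae(actual, steady), mae(actual, spiky)
rmse_steady, rmse_spiky = rmse(actual, steady), rmse(actual, spiky)
```

Both forecasts have an MAE of 2, but the forecast with one large miss has an RMSE of 4 instead of 2.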

Shapefile Loader

Geospatial

Geo Sources

Load vector geospatial data from Shapefile or GeoJSON.

GeoJSON Loader

Geospatial

Geo Sources

Load GeoJSON feature collections directly from file or URL.

Reproject

Geospatial

Geo Prep

Transform geometries between coordinate reference systems (CRS).

Grid Sampling

Geospatial

Geo Prep

Sample spatial data onto a regular grid at a specified resolution.

NDVI / Spectral

Geospatial

Geo Features

Compute vegetation indices (NDVI, EVI) from multispectral satellite bands.
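
NDVI is defined as (NIR - Red) / (NIR + Red), computed per pixel from the near-infrared and red bands. The sketch below applies it to single reflectance values; the sample numbers are hypothetical, and returning 0.0 when both bands are zero is a convention chosen for this example.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel."""
    denom = nir + red
    # Both bands zero (for example, no-data pixels): avoid dividing by zero.
    return 0.0 if denom == 0 else (nir - red) / denom

dense_vegetation = ndvi(nir=0.5, red=0.1)   # high NIR, low red reflectance
bare_surface = ndvi(nir=0.1, red=0.1)       # similar reflectance in both bands
```

Values range from -1 to 1, with healthy vegetation towards the upper end because it reflects near-infrared strongly and absorbs red light.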

Distance Features

Geospatial

Geo Features

Compute distances from points to reference geometries (roads, cities, POIs).
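
For point-to-point distances on latitude/longitude data, a common choice is the haversine great-circle distance. The sketch below assumes a spherical Earth with a mean radius of 6371 km and uses made-up coordinates; distances to lines or polygons (such as roads) need a geometry library and are not covered here.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def nearest_distance_km(point, references):
    """Distance from one (lat, lon) point to the closest reference point."""
    return min(haversine_km(point[0], point[1], lat, lon) for lat, lon in references)

site = (0.0, 0.0)                         # hypothetical sample location
reference_points = [(0.0, 1.0), (0.0, 2.0)]  # hypothetical points of interest
distance_feature = nearest_distance_km(site, reference_points)
```

One degree of longitude at the equator comes out at roughly 111 km, which is a quick sanity check for the formula.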

Spatial RF

Geospatial

Geo Models

Random Forest with spatial cross-validation to prevent leakage.

Kriging

Geospatial

Geo Models

Geostatistical spatial interpolation, closely related to Gaussian process regression; gives the best linear unbiased predictor under its model assumptions.

Spatial Accuracy

Geospatial

Geo Evaluation

Evaluate predictions with spatially aware metrics (RMSE, coverage error).

No ML experience required

Start building in minutes

Drop blocks onto the canvas, connect them, and run your pipeline. Blok handles the infrastructure.