## How it works
All data integrations follow the same pattern (illustrated in the sketch after this list):

- Define inputs: Specify which columns contain the data to research (company name, website, etc.)
- Define outputs: Describe what information you want to extract ("CEO name", "Founding year", etc.)
- Choose a processor: Select speed vs. thoroughness based on your needs
- Get enriched data: Receive structured results with optional citations
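As a rough illustration of this pattern, the sketch below uses plain Python. The `enrich_rows` helper, its parameters, and the `processor` value are hypothetical placeholders rather than the actual parallel-web-tools API; see each integration page for the real entry points.

```python
# Hypothetical sketch of the shared enrichment pattern; not the real API.
from typing import Any


def enrich_rows(
    rows: list[dict[str, Any]],
    input_columns: list[str],             # 1. columns that contain the data to research
    output_descriptions: dict[str, str],  # 2. what to extract, described in natural language
    processor: str = "base",              # 3. speed vs. thoroughness trade-off
) -> list[dict[str, Any]]:
    # Placeholder body: a real integration wires this pattern into its own
    # execution model (UDF, remote function, batch job, ...).
    return [dict(row, **{key: None for key in output_descriptions}) for row in rows]


# 4. Get enriched data: structured results, optionally with citations.
enriched = enrich_rows(
    rows=[{"company_name": "Acme Corp", "website": "acme.example"}],
    input_columns=["company_name", "website"],
    output_descriptions={"ceo_name": "CEO name", "founding_year": "Founding year"},
)
```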
## Available integrations
### Apache Spark
Distributed enrichment for large-scale data processing with PySpark UDFs
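A minimal sketch of what a UDF-based flow can look like with PySpark. The `ceo_name` UDF below is a hypothetical stand-in, not the library's actual UDF; in the real integration the UDF body would batch values into enrichment API calls and run them concurrently within each partition.

```python
# Hedged sketch: applying a (hypothetical) enrichment call as a pandas UDF in PySpark.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("enrichment-sketch").getOrCreate()
df = spark.createDataFrame(
    [("Acme Corp", "acme.example")], ["company_name", "website"]
)


@pandas_udf(StringType())
def ceo_name(company: pd.Series, website: pd.Series) -> pd.Series:
    # Placeholder: a real UDF would send these values to the enrichment API in
    # batches and process them concurrently within the partition.
    return pd.Series(["<ceo name>"] * len(company))


df.withColumn("ceo_name", ceo_name("company_name", "website")).show()
```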
### Google BigQuery
SQL-native remote functions for enrichment directly in BigQuery queries
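A minimal sketch of calling an already-deployed remote function from the BigQuery Python client. The project, dataset, table, and function names are hypothetical placeholders, and the remote function itself would be created as part of the integration's setup.

```python
# Hedged sketch: querying a (hypothetical) enrichment remote function from Python.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
SELECT
  company_name,
  `my_project.my_dataset.enrich_ceo_name`(company_name, website) AS ceo_name
FROM `my_project.my_dataset.companies`
"""
for row in client.query(sql).result():
    print(row["company_name"], row["ceo_name"])
```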
### Snowflake
SQL-native UDTF with batched processing via External Access Integration
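A minimal sketch of invoking an already-registered enrichment UDTF from Python with the Snowflake connector. The connection parameters, table, UDTF name, and output column are hypothetical placeholders; the UDTF itself, along with its External Access Integration, would be created during the integration's setup, and the real integration partitions input rows so the UDTF can batch its API calls.

```python
# Hedged sketch: querying a (hypothetical) enrichment UDTF from Python.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="my_wh", database="my_db", schema="public",
)
sql = """
SELECT c.company_name, e.ceo_name
FROM companies AS c,
     TABLE(enrich_companies(c.company_name, c.website)) AS e
"""
for company_name, ceo_name in conn.cursor().execute(sql):
    print(company_name, ceo_name)
```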
### DuckDB
Batch processing and SQL UDFs for local analytics databases
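A minimal sketch of the recommended batch-processing path with the duckdb Python client. The table layout and the `enrich_batch` helper are hypothetical placeholders for the integration's batched enrichment call; the SQL-UDF path would instead register a function callable directly from queries.

```python
# Hedged sketch: batch enrichment over a DuckDB table.
import duckdb

con = duckdb.connect()  # in-memory database for the sketch
con.execute("CREATE TABLE companies (company_name TEXT, website TEXT)")
con.execute("INSERT INTO companies VALUES ('Acme Corp', 'acme.example')")


def enrich_batch(rows):
    # Placeholder: a real implementation would send the whole batch to the
    # enrichment API and return one structured result per input row.
    return [{"ceo_name": "<ceo name>"} for _ in rows]


rows = con.execute("SELECT company_name, website FROM companies").fetchall()
results = enrich_batch(rows)

con.execute("ALTER TABLE companies ADD COLUMN ceo_name TEXT")
for (company_name, _website), result in zip(rows, results):
    con.execute(
        "UPDATE companies SET ceo_name = ? WHERE company_name = ?",
        [result["ceo_name"], company_name],
    )
print(con.execute("SELECT * FROM companies").fetchall())
```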
### Polars
DataFrame-native enrichment with batch processing and LazyFrame support
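A minimal sketch of batch processing over a Polars DataFrame. The `enrich_batch` helper is a hypothetical placeholder for the integration's batched enrichment call; the same idea extends to LazyFrame pipelines.

```python
# Hedged sketch: batch enrichment of a Polars DataFrame.
import polars as pl

df = pl.DataFrame(
    {"company_name": ["Acme Corp"], "website": ["acme.example"]}
)


def enrich_batch(rows: list[dict]) -> list[dict]:
    # Placeholder: a real implementation would send the whole batch to the
    # enrichment API and return one structured result per input row.
    return [{"ceo_name": "<ceo name>", "founding_year": None} for _ in rows]


# Enrich all rows in one batch and attach the results as new columns.
enriched = df.hstack(pl.DataFrame(enrich_batch(df.to_dicts())))
print(enriched)
```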
### Supabase
Edge Functions for enrichment in Supabase applications
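A minimal sketch of calling a deployed enrichment Edge Function from Python over HTTPS. The function name, project URL, and request body shape are hypothetical placeholders, and the Edge Function itself would be deployed as part of the integration's setup.

```python
# Hedged sketch: invoking a (hypothetical) enrichment Edge Function over HTTPS.
import requests

SUPABASE_URL = "https://my-project-ref.supabase.co"   # placeholder project URL
SUPABASE_ANON_KEY = "..."                              # placeholder API key

response = requests.post(
    f"{SUPABASE_URL}/functions/v1/enrich-companies",
    headers={"Authorization": f"Bearer {SUPABASE_ANON_KEY}"},
    json={
        "rows": [{"company_name": "Acme Corp", "website": "acme.example"}],
        "outputs": {"ceo_name": "CEO name"},
    },
)
response.raise_for_status()
print(response.json())
```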
## Choosing an integration
| Integration | Best for | Processing model |
|---|---|---|
| Spark | Large-scale distributed processing | UDF with concurrent processing per partition |
| BigQuery | Google Cloud data warehouses | Remote function with batched API calls |
| Snowflake | Snowflake data warehouses | Batched UDTF (partition-based) |
| DuckDB | Local analytics, embedded databases | Batch processing (recommended) or SQL UDF |
| Polars | Python DataFrame workflows | Batch processing |
| Supabase | PostgreSQL/Supabase applications | Edge Function |
## Installation
All Python-based integrations are available via the `parallel-web-tools` package:
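```bash
pip install parallel-web-tools
```

Depending on the integration you use, you may also need the corresponding client library (for example, pyspark, duckdb, or polars) installed in the same environment.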