Collect equality delete entries during manifest scanning instead of rejecting them. Match equality deletes to data files by partition and sequence number (strictly greater, per spec). Apply hash-based anti-join filter in the Arrow record processing pipeline. Key encoding uses type-tagged, length-prefixed values to avoid collisions. Delete files with different equality field IDs are applied as independent filter groups. Partitioned tables are not yet tested for the read path.
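The equality-delete handling summarized above can be illustrated with a minimal sketch. This is not the iceberg-go implementation: the names `encodeKey`, `antiJoin`, and `deleteApplies`, the tag values, and the use of plain Go slices instead of Arrow records are all assumptions made for illustration, and only `int64` and `string` values are handled. Each row here holds only the equality columns; a real reader would first project those columns by field ID.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Hypothetical type tags: they keep values of different types with
// identical payload bytes from producing the same key.
const (
	tagNull   byte = 0
	tagInt64  byte = 1
	tagString byte = 2
)

// appendValue appends one value as: tag byte, 4-byte length, payload.
func appendValue(key []byte, v any) []byte {
	switch x := v.(type) {
	case nil:
		return append(key, tagNull)
	case int64:
		key = append(key, tagInt64)
		key = binary.BigEndian.AppendUint32(key, 8)
		return binary.BigEndian.AppendUint64(key, uint64(x))
	case string:
		key = append(key, tagString)
		key = binary.BigEndian.AppendUint32(key, uint32(len(x)))
		return append(key, x...)
	default:
		panic(fmt.Sprintf("unsupported type %T", v))
	}
}

// encodeKey builds the hash key for one row's equality columns.
func encodeKey(vals ...any) string {
	var key []byte
	for _, v := range vals {
		key = appendValue(key, v)
	}
	return string(key)
}

// deleteApplies reports whether a delete file covers a data file:
// the delete's sequence number must be strictly greater.
func deleteApplies(dataSeq, deleteSeq int64) bool {
	return deleteSeq > dataSeq
}

// antiJoin keeps the rows whose key is absent from the deleted set.
func antiJoin(rows [][]any, deleted map[string]struct{}) [][]any {
	var kept [][]any
	for _, row := range rows {
		if _, hit := deleted[encodeKey(row...)]; !hit {
			kept = append(kept, row)
		}
	}
	return kept
}

func main() {
	// One equality delete on the pair (2, "b").
	deleted := map[string]struct{}{
		encodeKey(int64(2), "b"): {},
	}
	rows := [][]any{
		{int64(1), "a"},
		{int64(2), "b"},
		{int64(3), "c"},
	}
	for _, row := range antiJoin(rows, deleted) {
		fmt.Println(row...)
	}
}
```

The length prefix is what stops `("ab", "c")` and `("a", "bc")` from encoding to the same bytes. Delete files with different equality field IDs would each get their own `deleted` set, and a row survives only if it passes every set.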
Iceberg Golang
iceberg is a Golang implementation of the Iceberg table spec.
Build From Source
Prerequisites
- Go 1.23 or later
Build
$ git clone https://github.com/apache/iceberg-go.git
$ cd iceberg-go/cmd/iceberg && go build .
Running Tests
Use the Makefile so commands stay in sync with CI (e.g. golangci-lint version).
Unit tests
make test
Linting
make lint
Install the linter first
make lint-install
# or: go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.8.0
Integration tests
Prerequisites: Docker, Docker Compose
- Start the Docker containers using docker compose:
  make integration-setup
- Export the required environment variables:
  export AWS_S3_ENDPOINT=http://$(docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' minio):9000
  export AWS_REGION=us-east-1
  export SPARK_CONTAINER_ID=$(docker ps -qf 'name=spark-iceberg')
  export DOCKER_API_VER=$(docker version -f '{{.Server.APIVersion}}')
- Run the integration tests:
  make integration-test
  Or run a single suite: make integration-scanner, make integration-io, make integration-rest, make integration-spark.
Feature Support / Roadmap
FileSystem Support
| Filesystem Type | Supported |
|---|---|
| S3 | X |
| Google Cloud Storage | X |
| Azure Blob Storage | X |
| Local Filesystem | X |
Metadata
| Operation | Supported |
|---|---|
| Get Schema | X |
| Get Snapshots | X |
| Get Sort Orders | X |
| Get Partition Specs | X |
| Get Manifests | X |
| Create New Manifests | X |
| Plan Scan | X |
| Plan Scan for Snapshot | X |
Catalog Support
| Operation | REST | Hive | Glue | SQL |
|---|---|---|---|---|
| Load Table | X | X | X | X |
| List Tables | X | X | X | X |
| Create Table | X | X | X | X |
| Register Table | X | X | X | |
| Update Current Snapshot | X | X | X | X |
| Create New Snapshot | X | X | X | X |
| Rename Table | X | X | X | X |
| Drop Table | X | X | X | X |
| Alter Table | X | X | X | X |
| Check Table Exists | X | X | X | X |
| Set Table Properties | X | X | X | X |
| List Namespaces | X | X | X | X |
| Create Namespace | X | X | X | X |
| Check Namespace Exists | X | X | X | X |
| Drop Namespace | X | X | X | X |
| Update Namespace Properties | X | X | X | X |
| Create View | X | X | X | |
| Load View | X | X | | |
| List View | X | X | X | |
| Drop View | X | X | X | |
| Check View Exists | X | X | X | |
Read/Write Data Support
- Data can currently be read as an Arrow Table or as a stream of Arrow record batches.
Supported Write Operations
As long as the FileSystem is supported and the Catalog supports altering the table, the following tracks the current write support:
| Operation | Supported |
|---|---|
| Append Stream | X |
| Append Data Files | X |
| Rewrite Files | |
| Rewrite manifests | |
| Overwrite Files | X |
| Copy-On-Write Delete | X |
| Write Pos Delete | X |
| Write Eq Delete | |
| Row Delta | |
CLI Usage
Run go build ./cmd/iceberg from the root of this repository to build the CLI executable. Alternatively, run go install github.com/apache/iceberg-go/cmd/iceberg@latest to install it to the bin directory of your GOPATH.
The iceberg CLI usage is very similar to the pyiceberg CLI.
You can pass the catalog URI with the --uri argument.
Example:
You can start the Iceberg REST API docker image, which runs on port 8181 by default:
docker pull apache/iceberg-rest-fixture:latest
docker run -p 8181:8181 apache/iceberg-rest-fixture:latest
Then run the iceberg CLI, pointing it at the REST API server:
./iceberg --uri http://0.0.0.0:8181 list
┌─────┐
| IDs |
| --- |
└─────┘
Create Namespace
./iceberg --uri http://0.0.0.0:8181 create namespace taxitrips
List Namespace
./iceberg --uri http://0.0.0.0:8181 list
┌───────────┐
| IDs |
| --------- |
| taxitrips |
└───────────┘
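For readers who want to query the same catalog from Go without the CLI, the following is a minimal sketch that lists namespaces through the `GET /v1/namespaces` endpoint of the Iceberg REST catalog spec. It does not use the iceberg-go catalog package, and whether the CLI issues exactly this request is an assumption. The helpers `listNamespaces` and `newStubCatalog` are hypothetical; the stub server stands in for the Docker fixture so the example runs on its own, and joining nested namespace parts with "." is a display choice made here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// listNamespacesResponse mirrors the response body of GET /v1/namespaces:
// each namespace is a list of name parts.
type listNamespacesResponse struct {
	Namespaces [][]string `json:"namespaces"`
}

// listNamespaces fetches the namespaces from a REST catalog at baseURI.
func listNamespaces(client *http.Client, baseURI string) ([]string, error) {
	resp, err := client.Get(strings.TrimRight(baseURI, "/") + "/v1/namespaces")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var body listNamespacesResponse
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(body.Namespaces))
	for _, parts := range body.Namespaces {
		names = append(names, strings.Join(parts, "."))
	}
	return names, nil
}

// newStubCatalog starts an in-process server that answers like a catalog
// holding the single namespace created in the walkthrough above.
func newStubCatalog() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v1/namespaces" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"namespaces": [["taxitrips"]]}`)
	}))
}

func main() {
	srv := newStubCatalog()
	defer srv.Close()

	names, err := listNamespaces(srv.Client(), srv.URL)
	if err != nil {
		fmt.Println("list namespaces:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

To try it against the Docker fixture instead of the stub, call `listNamespaces(http.DefaultClient, "http://0.0.0.0:8181")`.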