feat(connectors): Clickhouse Sink Connector #2886
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@              Coverage Diff               @@
##             master     #2886      +/-    ##
==============================================
- Coverage     74.57%    47.42%   -27.15%
  Complexity      937       937
==============================================
  Files          1249      1252        +3
  Lines        123564    110008    -13556
  Branches      99837     86313    -13524
==============================================
- Hits          92143     52172    -39971
- Misses        28438     55018    +26580
+ Partials       2983      2818      -165
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution!
hubcio left a comment:
overall good direction, just needs a little bit of polishing
/ready
/author could you look into it
/ready
/// Build a RowBinaryWithDefaults body.
/// Each `Payload::Json` message is serialised to binary using the table schema.
pub(crate) fn build_row_binary_body(
body.rs:58-77 (build_row_binary_body) vs. module doc body.rs:18-23: build_row_binary_body propagates the Err from serialize_row on the first bad row, which fails the whole batch. The module doc claims "mixed batch never causes complete failure"; that is true for the json/string builders but false here. Fix: skip and log the bad row (matching the other builders), or document fail-fast as an intentional poison-pill in the README Reliability section.
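A minimal sketch of the skip-and-log option, assuming simplified stand-ins for the connector's types. The `Payload` enum, the `serialize_row` signature, and `build_body_skipping_bad_rows` below are hypothetical and only illustrate the control flow; the real serializer works from the table schema.

```rust
/// Hypothetical stand-in for the connector's message payload type.
#[derive(Debug)]
enum Payload {
    Json(String),
    Raw(Vec<u8>),
}

/// Hypothetical per-row serializer. Here a row is valid only if it is a
/// non-empty JSON payload; the real serialize_row uses the table schema.
fn serialize_row(payload: &Payload, out: &mut Vec<u8>) -> Result<(), String> {
    match payload {
        Payload::Json(text) if !text.is_empty() => {
            out.extend_from_slice(text.as_bytes());
            Ok(())
        }
        Payload::Json(_) => Err("empty JSON payload".to_string()),
        Payload::Raw(_) => Err("unsupported payload type".to_string()),
    }
}

/// Builds the request body, skipping rows that fail to serialize instead of
/// failing the whole batch. Returns the body and the number of skipped rows.
fn build_body_skipping_bad_rows(messages: &[Payload]) -> (Vec<u8>, usize) {
    let mut body = Vec::new();
    let mut skipped = 0;
    for (index, message) in messages.iter().enumerate() {
        // Serialize into a scratch buffer so a partially written bad row
        // never ends up in the final body.
        let mut row = Vec::new();
        match serialize_row(message, &mut row) {
            Ok(()) => body.extend_from_slice(&row),
            Err(reason) => {
                skipped += 1;
                eprintln!("skipping row {index}: {reason}");
            }
        }
    }
    (body, skipped)
}
```

Returning the skipped count lets the caller decide whether an all-bad batch should still be reported as an error.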
if attempts >= max_retries {
    error!(
        "Insert failed after {attempts} attempts (HTTP {status}): {body_text}"
    );
client.rs:243-256 (insert): Error::CannotStoreData is used for BOTH retryable failures (429/5xx/network) and non-retryable 4xx responses. Other connectors (doris_sink, influxdb_sink) use Error::PermanentHttpError for 4xx, per the SDK doc: "so circuit breakers are not tripped by bad data" (sdk/src/lib.rs:420-425).
Fix: 4xx branch → Error::PermanentHttpError.
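One way the split could look, as a sketch only. The `Error` enum here is a local stand-in: the two variant names come from this comment, but their `String` payloads, the `Outcome` type, and `classify_insert_response` are assumptions, not the SDK's actual definitions.

```rust
/// Local stand-in for the SDK error type. Only the variant names are taken
/// from the review comment; the String payloads are an assumption.
#[derive(Debug, PartialEq)]
enum Error {
    CannotStoreData(String),
    PermanentHttpError(String),
}

/// Hypothetical result of inspecting one HTTP response in the insert loop.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    Retry,
    Fail(Error),
}

fn classify_insert_response(
    status: u16,
    attempts: u32,
    max_retries: u32,
    body_text: &str,
) -> Outcome {
    let message = format!("HTTP {status}: {body_text}");
    match status {
        200..=299 => Outcome::Success,
        // 429 and 5xx are transient: retry until the budget is exhausted.
        429 | 500..=599 => {
            if attempts >= max_retries {
                Outcome::Fail(Error::CannotStoreData(message))
            } else {
                Outcome::Retry
            }
        }
        // Any other 4xx means the request itself is bad; retrying will not
        // help, so report a permanent error without consuming retries.
        400..=499 => Outcome::Fail(Error::PermanentHttpError(message)),
        // Unexpected 1xx/3xx: treated here as a generic storage failure.
        _ => Outcome::Fail(Error::CannotStoreData(message)),
    }
}
```

The 429 arm is listed before the general 4xx arm so that rate limiting stays retryable.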
    container: ContainerAsync<GenericImage>,
    pub base_url: String,
}
fixtures/clickhouse/container.rs:90-96: ClickHouseContainer::start() is missing .with_container_name(fixtures::unique_container_name("clickhouse")). The postgres/influxdb fixtures set this; fixtures/mod.rs:32-36 documents the iggy-test-* sweep convention (docker ps -aqf 'name=^iggy-test-'). Without it, the container gets a random testcontainers name, is invisible to the cleanup sweep, and leaks on crash/SIGKILL. Fix: add the call.
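To illustrate why the prefix matters, here is a hypothetical sketch of a name helper. It is not the actual `fixtures::unique_container_name` implementation; the process-id plus counter scheme is an assumption, and only the `iggy-test-` prefix comes from the sweep convention above.

```rust
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};

/// Prefix matched by the cleanup sweep (docker ps -aqf 'name=^iggy-test-').
const TEST_CONTAINER_PREFIX: &str = "iggy-test-";

/// Per-process counter so fixtures started in the same test binary differ.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper: returns a name such as
/// "iggy-test-clickhouse-<pid>-<sequence>".
fn unique_container_name(label: &str) -> String {
    let sequence = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    format!("{TEST_CONTAINER_PREFIX}{label}-{}-{sequence}", process::id())
}
```

Any container named this way matches the `^iggy-test-` filter, so a sweep can find it even if the test process died before its own cleanup ran.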
/author
Which issue does this PR close?
Closes #2539
Rationale
ClickHouse is a real-time analytics database that is widely used in modern analytics architectures.
What changed?
This PR introduces a ClickHouse Sink Connector that writes data from Iggy to ClickHouse.
The write logic is heavily inspired by the official ClickHouse Kafka Connector.
Local Execution
Images 1 & 2: Produced 30456 + 29060 rows into Iggy in two batches
Image 3: Verified schema and number of rows in Clickhouse
AI Usage