<!DOCTYPE html><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="viewport" content="width=device-width"><meta name="x-apple-disable-message-reformatting"><title>TLDR Data</title><meta name="color-scheme" content="light dark"><meta name="supported-color-schemes" content="light dark"><style type="text/css">
:root {
color-scheme: light dark; supported-color-schemes: light dark;
}
*,
*:after,
*:before {
-webkit-box-sizing: border-box; -moz-box-sizing: border-box; box-sizing: border-box;
}
* {
-ms-text-size-adjust: 100%; -webkit-text-size-adjust: 100%;
}
html,
body,
.document {
width: 100% !important; height: 100% !important; margin: 0; padding: 0;
}
body {
-webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; text-rendering: optimizeLegibility;
}
div[style*="margin: 16px 0"] {
margin: 0 !important;
}
table,
td {
mso-table-lspace: 0pt; mso-table-rspace: 0pt;
}
table {
border-spacing: 0; border-collapse: collapse; table-layout: fixed; margin: 0 auto;
}
img {
-ms-interpolation-mode: bicubic; max-width: 100%; border: 0;
}
*[x-apple-data-detectors] {
color: inherit !important; text-decoration: none !important;
}
.x-gmail-data-detectors,
.x-gmail-data-detectors *,
.aBn {
border-bottom: 0 !important; cursor: default !important;
}
.btn {
-webkit-transition: all 200ms ease; transition: all 200ms ease;
}
.btn:hover {
background-color: #f67575; border-color: #f67575;
}
* {
font-family: Arial, Helvetica, sans-serif; font-size: 18px;
}
@media screen and (max-width: 600px) {
.container {
width: 100%; margin: auto;
}
.stack {
display: block!important; width: 100%!important; max-width: 100%!important;
}
.btn {
display: block; width: 100%; text-align: center;
}
}
body,
p,
td,
tr,
.body,
table,
h1,
h2,
h3,
h4,
h5,
h6,
div,
span {
background-color: #FEFEFE !important; color: #010101 !important;
}
@media (prefers-color-scheme: dark) {
body,
p,
td,
tr,
.body,
table,
h1,
h2,
h3,
h4,
h5,
h6,
div,
span {
background-color: #27292D !important; color: #FEFEFE !important;
}
}
a {
color: inherit !important; text-decoration: underline !important;
}
</style><!--[if mso | ie]>
<style type="text/css">
a {
background-color: #FEFEFE !important; color: #010101 !important;
}
@media (prefers-color-scheme: dark) {
a {
background-color: #27292D !important; color: #FEFEFE !important;
}
}
</style>
<![endif]--></head><body class="">
<table align="center" class="document"><tbody><tr><td valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" class="container" width="600"><tbody><tr class="inner-body"><td>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr class="header"><td bgcolor="" class="container">
<table width="100%"><tbody><tr><td class="container">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" style="margin-top: 0px;" width="100%"><tbody><tr><td style="padding: 0px;">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="text-align: center;"><span data-darkreader-inline-color="" style="--darkreader-inline-color:#3db3ff; color: rgb(51, 175, 255) !important; font-size: 30px;">T</span><span style="font-size: 30px;"><span data-darkreader-inline-color="" style="color: rgb(232, 192, 96) !important; --darkreader-inline-color:#e8c163; font-size:30px;">L</span><span data-darkreader-inline-color="" style="color: rgb(101, 195, 173) !important; --darkreader-inline-color:#6ec7b2; font-size:30px;">D</span></span><span data-darkreader-inline-color="" style="--darkreader-inline-color:#dd6e6e; color: rgb(220, 107, 107) !important; font-size: 30px;">R</span>
<br>
</td></tr></tbody></table>
<br>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr id="together-with"><td align="center" height="20" style="vertical-align:middle !important;" valign="middle" width="100%"><strong style="vertical-align:middle !important; height: 100%;">Together With </strong>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fsifted.com%2Fresources%2F2026-fedex-gri-analysis%2F%3Futm_source=TLDR%26utm_medium=email%26utm_campaign=25Q4_TOFU_BG_2026FedExGRIAnalysis_EM_TLDR_Data%26utm_content=2026FedExGRIAnalysis/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/HIDYN25vJ3P23wQKNa3_ONtKJzyyuMu2LE2giPSbXAo=426"><img src="https://images.tldr.tech/sifted.png" valign="middle" style="vertical-align: middle !important; height: 100%;" alt="Sifted"></a></td></tr></tbody></table>
<table style="table-layout: fixed; width:100%;" width="100%"><tbody><tr><td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;">
<div style="text-align: center;">
<h1><strong>TLDR Data <span id="date">2025-10-09</span></strong></h1>
</div>
</td></tr></tbody></table>
<table style="table-layout: fixed; width:100%;" width="100%"><tbody><tr id="sponsy-copy"><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fsifted.com%2Fresources%2F2026-fedex-gri-analysis%2F%3Futm_source=TLDR%26utm_medium=email%26utm_campaign=25Q4_TOFU_BG_2026FedExGRIAnalysis_EM_TLDR_Data%26utm_content=2026FedExGRIAnalysis/2/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/UKfdyFxPLpRlhMKhfhOi77WoJjDItbmwO7IT59niKSo=426">
<span>
<strong>How AI software is helping parcel shippers navigate FedEx rate increases (Sponsor)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
<p>FedEx announced a 5.9% General Rate Increase (GRI) for 2026, leaving parcel shippers to analyze the impact.</p><p>But 5.9% is only an average, which complicates the analysis. The real cost depends on surcharges, dimensional weight calculations, zone-based pricing, minimum charges, and similar factors.</p><p><a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fsifted.com%2Fresources%2F2026-fedex-gri-analysis%2F%3Futm_source=TLDR%26utm_medium=email%26utm_campaign=25Q4_TOFU_BG_2026FedExGRIAnalysis_EM_TLDR_Data%26utm_content=2026FedExGRIAnalysis/3/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/dYacMKDhjI-5QirhNaHSovxgAf3DZ6js6bMj4jCEMkc=426" rel="noopener noreferrer nofollow" target="_blank"><span>SiftedAI is helping companies adapt</span></a> to pricing changes by taking their parcel data and modeling the true impact on their business. From all the data points, <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fsifted.com%2Fresources%2F2026-fedex-gri-analysis%2F%3Futm_source=TLDR%26utm_medium=email%26utm_campaign=25Q4_TOFU_BG_2026FedExGRIAnalysis_EM_TLDR_Data%26utm_content=2026FedExGRIAnalysis/4/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/b1aLsaa-LhmjkhsvBos8PRb3z8YtWXURagKUnh100zY=426" rel="noopener noreferrer nofollow" target="_blank"><span>the software can build a personalized impact analysis</span></a> that shows their effective GRI and how it is distributed across services and fees.</p>
<p>Learn more about the FedEx rate increases on the <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fsifted.com%2Fresources%2F2026-fedex-gri-analysis%2F%3Futm_source=TLDR%26utm_medium=email%26utm_campaign=25Q4_TOFU_BG_2026FedExGRIAnalysis_EM_TLDR_Data%26utm_content=2026FedExGRIAnalysis/5/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/3QHSGoYJR8IGptduyweB0v_ynCgACCFE5e4gJGHQA4o=426" rel="noopener noreferrer nofollow" target="_blank"><span>SiftedAI blog</span></a>.
</p>
</span></span></div>
</td></tr></tbody></table>
</td></tr></tbody></table>
</td></tr></tbody></table>
</td></tr>
<tr bgcolor=""><td class="container">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td style="padding: 0px;">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">📱</span></div></div>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Deep Dives</strong></h1>
</div>
</div>
</td></tr></tbody></table>
<table style="table-layout: fixed; width: 100%;" width="100%"><tbody><tr><td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.onehouse.ai%2Fblog%2Fapache-hudi-vs-delta-lake-vs-apache-iceberg-lakehouse-feature-comparison%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/r89CT0hV7j5Z_HVXkmtTzuE8zdCmTuQuaGNIW0GL8kA=426">
<span>
<strong>Apache Iceberg vs Delta Lake vs Apache Hudi - Feature Comparison Deep Dive (15 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
All three formats (Hudi, Delta Lake, and Iceberg) support ACID transactions, Copy-On-Write, schema evolution, and time travel. Hudi excels with Merge-On-Read, advanced indexing, partial updates, non-blocking concurrency, and automated compaction/clustering. Delta shines in Databricks integration and Z-ordering but relies on experimental features and proprietary elements. Iceberg leads in partition evolution and read/write support, yet it demands manual maintenance, has slower metadata handling, and lacks CDC and primary keys.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fklaviyo.tech%2Fbuilding-a-resilient-event-publisher-with-dual-failure-capture-518749cb5600%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/kOO2wJ0ve6s3oCbP2NXlccGd0rNd7QcW2JI9mQCUdlY=426">
<span>
<strong>Building a Resilient Event Publisher with Dual Failure Capture (9 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Klaviyo revamped its event publishing system, which processes up to 170,000 events/sec at peak, to eliminate data loss during network hiccups, Kafka timeouts, or serialization errors. The solution implements a dual failure capture strategy: events that fail are written to a self-hosted Kafka dead-letter queue (DLQ, retained for 7 days) and retried automatically, while persistent failures or serialization bugs route events to S3 for indefinite retention and manual recovery.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fblog.bytebytego.com%2Fp%2Fhow-openai-uses-kubernetes-and-apache%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/AGPRF61iqq8OtGEsacl3yEGaAmd4_nC5jFLwLnCEAKo=426">
<span>
<strong>How OpenAI Uses Kubernetes And Apache Kafka for GenAI (15 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
OpenAI's engineering team built a stream processing platform using PyFlink on Kubernetes to handle massive data volumes for AI systems, shifting from batch to real-time processing for fresher data. Apache Kafka serves as the multi-primary event streaming backbone for logs, training data, and experiments.
</span>
</span>
</div>
</td></tr></tbody></table>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">🚀</span></div>
</div>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Opinions & Advice</strong></h1>
</div>
</div>
</td></tr></tbody></table>
<table style="table-layout: fixed; width: 100%;" width="100%"><tbody><tr><td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.dataengineeringweekly.com%2Fp%2Fengineering-growth-the-data-layers%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/VD5bINndSDZqMRlph7J7lPMPOQvi7X-gGb8_NzruG68=426">
<span>
<strong>Engineering Growth: The Data Layers Powering Modern GTM (12 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Growth no longer rewards the widest net. Modern go-to-market (GTM) teams win with precision, not volume, building revenue on infrastructure such as pipelines, warehouses, and customer data platforms that turn signals into action. Not all data is created equal, however: these insights draw on five distinct data sources, each with its own engineering challenges, governance requirements, and strategic value.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fdataengineeringcentral.substack.com%2Fp%2Fthe-single-node-rebellion%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/hX1mK0vjawxYBMFGGI0ZPZONuZfXVFlkbgZhSoj8mUU=426">
<span>
<strong>The Single Node Rebellion (6 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Tools like DuckDB and Polars are challenging distributed systems (e.g., Spark, AWS EC2, and Databricks) for most workloads: datasets are rarely true "Big Data," and single-node solutions offer cost savings and simplicity amid rising cloud expenses.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fseattledataguy.substack.com%2Fp%2F7-questions-every-data-team-should%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/S6_sWQzwZQO9E1P3VF9gY3SnhZ6UsvKwdcm8uEu2PNE=426">
<span>
<strong>7 Questions Every Data Team Should Ask the Business (5 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Data teams often face vague or misaligned project requests from business partners. Instead of asking "What data do you need?", they should use targeted questions to uncover pain points, decision gaps, and opportunities. For example, ask what recent win they want to scale (to build rapport and amplify successes), and when a lack of data led to a bad decision (to reveal gaps and value perceptions).
</span>
</span>
</div>
</td></tr></tbody></table>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">💻</span></div>
</div>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Launches & Tools</strong></h1>
</div>
</div>
</td></tr></tbody></table>
<table style="table-layout: fixed; width: 100%;" width="100%"><tbody><tr><td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fossinsight.io%2F%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/r-Npf1S3hA-5Kb9w-dyJ5l9W6w9UvmnwTVqDyb76OvM=426">
<span>
<strong>OSS Insight (Tool)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
OSS Insight lets you gauge open source momentum by tracking GitHub activity, especially pull requests, which signal innovation speed and community engagement. Comparing trends across areas like analytics engines, event streaming, orchestration, and lakehouse formats can reveal where the ecosystem is moving.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fengineering.fb.com%2F2025%2F10%2F06%2Fdeveloper-tools%2Fopenzl-open-source-format-aware-compression-framework%2F%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/FKo8QnMOhf5l202IXUUMQ1LdZix_0s3CPryfLzrZB08=426">
<span>
<strong>Introducing OpenZL: An Open Source Format-Aware Compression Framework (8 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Meta has open-sourced OpenZL, a format-aware, lossless compression framework that outperforms generic compressors by leveraging explicit descriptions of data structure. Tailored for structured data such as database tables, time series, and ML tensors, OpenZL achieves higher compression ratios and speeds while keeping a single universal decompressor, which reduces operational complexity. An offline trainer generates data-specific compression configs, enabling rapid adaptation to schema changes without redeployment.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.databricks.com%2Fblog%2Fexamining-versionless-apache-sparktm-ai-powered-upgrades-and-seamless-stability-2-billion%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/hFC4O_VkznX7kN4BZJDSi9dWPoLY0uM2YY8jJrCRh0g=426">
<span>
<strong>Examining Versionless Apache Spark: AI-powered upgrades and seamless stability for 2 billion workloads (4 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Versionless Spark decouples clients from servers via a stable Spark Connect API so that upgrades can happen automatically, using environment versioning with base images (Spark Connect and Python dependencies). Its AI-powered Release Stability System (RSS) manages upgrades using workload fingerprints, historical metadata, ML-driven error triage, and anomaly detection. The result is a 99.99% success rate across 2 billion jobs transitioned from DBR 14 to 17 (including Spark 4), unlocking features like collation and bloom filters.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fgithub.com%2FBasekick-Labs%2Farc%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/nhcTS82fpH92vpktU8U7dDVHe12bwffKADyBHsBEwog=426">
<span>
<strong>Arc Core (GitHub Repo)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Arc Core is a high-performance time-series data warehouse designed for rapid data ingestion, achieving 1.89 million records per second on native deployment. It is built on DuckDB, Parquet, and MinIO for efficient storage and querying of time-series data.
</span>
</span>
</div>
</td></tr></tbody></table>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">🎁</span></div></div>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><h1><strong>Miscellaneous</strong></h1></div>
</div>
</td></tr></tbody></table>
<table bgcolor="" style="table-layout: fixed; width: 100%;" width="100%"><tbody><tr><td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fbrooker.co.za%2Fblog%2F2025%2F10%2F05%2Flocality.html%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/kOq-OZsKBNLAzlA9yVkKoxcqsAT_U-rAZc-OJnYGohk=426">
<span>
<strong>Locality, and Temporal-Spatial Hypothesis (8 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
The "temporal-spatial locality hypothesis" states that data written around the same time is likely to be read around the same time, which justifies storing it close together for efficiency. The example given is forward scans, which are fast thanks to read-ahead caching, versus backward scans, which are slower because the database must block on IO to fetch the next page. The hypothesis holds trivially in time-ordered systems such as streaming and time-series data, but hash-based stores like DynamoDB reject it, using random keys to avoid write hotspots and trading read spatial locality for write scalability.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fdeveloper.nvidia.com%2Fblog%2Faccelerating-large-scale-data-analytics-with-gpu-native-velox-and-nvidia-cudf%2F%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/BzljuBTa-qYwyPgVaf8mgBSTpvg1NOqbERmnsNy7wRw=426">
<span>
<strong>Accelerating Large-Scale Data Analytics with GPU-Native Velox and NVIDIA cuDF (7 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
IBM and NVIDIA have integrated cuDF with the Velox execution engine to enable GPU-native SQL query execution in systems like Presto and Apache Spark. Velox rewrites plans to use GPU operators (joins, scans, and aggregations) and supports UCX-based exchange for multi-GPU data routing. In benchmarks, single-node Presto showed an order-of-magnitude performance improvement over CPU.
</span>
</span>
</div>
</td></tr></tbody></table>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">⚡</span></div></div>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Quick Links</strong></h1>
</div>
</div>
</td></tr></tbody></table>
<table bgcolor="" style="table-layout: fixed; width: 100%;" width="100%"><tbody><tr><td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fluminousmen.com%2Fpost%2Fhow-not-to-partition-data-in-s3-and-what-to-do-instead%2F%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/ZCirF-wX1wbXUKm5rWua9wXJHA7vRnrzoVUB-vC6YrM=426">
<span>
<strong>How Not to Partition Data in S3 (And What to Do Instead) (5 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Partitioning S3 data lakes by year/month/day seems logical but often degrades performance by creating many small files that increase scan costs and slow queries.
</span>
</span>
</div>
</td></tr></tbody></table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fdagster.io%2Fblog%2Fbuilding-a-better-lakehouse-from-airflow-to-dagster%3Futm_source=tldrdata/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/rUqOjddKV7ShMF21K3H3jB8gCfPJmb3sh7Be9FSpbgk=426">
<span>
<strong>Building a Better Lakehouse: From Airflow to Dagster (7 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Replacing Airflow with Dagster enabled smarter partitioning, event-driven monitoring, and pure SQL data loading, significantly improving lakehouse efficiency and capabilities.
</span>
</span>
</div>
</td></tr></tbody></table>
</td></tr></tbody></table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody><tr><td align="left" style="word-break: break-word; vertical-align: top; padding: 5px 10px;">
<p style="padding: 0; margin: 0; font-size: 22px; color: #000000; line-height: 1.6; font-weight: bold;">
Want to advertise in TLDR? 📰
</p>
<div class="text-block" style="margin-top: 10px;">
If your company is interested in reaching an audience of data engineering professionals and decision makers, you may want to <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fadvertise.tldr.tech%2F%3Futm_source=tldrdata%26utm_medium=newsletter%26utm_campaign=advertisecta/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/E4vjJlCSPks8OcBj30PUDH0bYP8SfzY3hQXQ53izw2c=426"><strong><span>advertise with us</span></strong></a>.
</div>
<br>
<p style="padding: 0; margin: 0; font-size: 22px; color: #000000; line-height: 1.6; font-weight: bold;">
Want to work at TLDR? 💼
</p>
<div class="text-block" style="margin-top: 10px;">
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fjobs.ashbyhq.com%2Ftldr.tech/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/aY57mfDCWQdm-oVdZhOb7Ls3Xnm6nroZVK0ab_F0_t4=426" rel="noopener noreferrer" style="color: #0000EE; text-decoration: underline;" target="_blank"><strong>Apply here</strong></a> or send a friend's resume to <a href="mailto:jobs@tldr.tech" style="color: #0000EE; text-decoration: underline;">jobs@tldr.tech</a> and get $1k if we hire them!
</div>
<br>
<div class="text-block">
If you have any comments or feedback, just respond to this email!
<br>
<br> Thanks for reading,
<br>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.linkedin.com%2Fin%2Fjoelvanveluwen%2F/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/8ZXKT4RLFk0ugyjzKzXutdM72jWzXnbfUXAqA5Zmm5Y=426"><span>Joel Van Veluwen</span></a>, <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.linkedin.com%2Fin%2Fjennytzurueyching%2F/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/nQLGOC8lRVfiJ_yi7ViF5WBTE7QEOk-UlJXl8asl-68=426"><span>Tzu-Ruey Ching</span></a> & <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.linkedin.com%2Fin%2Fremi-turpaud%2F/1/01000199c86fc6bb-c4163ca0-652c-48bc-8c1f-8c9fb7150184-000000/MmmJlfbw7VJy_X37vUziPsaJ-Y51XRAot9lptTJHUGQ=426"><span>Remi Turpaud</span></a>
<br>
<br>
</div>
<br>
</td></tr></tbody></table>
</td></tr></tbody></table>
</td></tr></tbody></table>
</td></tr></tbody></table>
</td></tr></tbody></table>
</body></html>