<!DOCTYPE html><html lang="en"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width">
<meta name="x-apple-disable-message-reformatting">
<title>TLDR Data</title>
<meta name="color-scheme" content="light dark">
<meta name="supported-color-schemes" content="light dark">
<style type="text/css">
:root {
color-scheme: light dark; supported-color-schemes: light dark;
}
*,
*:after,
*:before {
-webkit-box-sizing: border-box; -moz-box-sizing: border-box; box-sizing: border-box;
}
* {
-ms-text-size-adjust: 100%; -webkit-text-size-adjust: 100%;
}
html,
body,
.document {
width: 100% !important; height: 100% !important; margin: 0; padding: 0;
}
body {
-webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; text-rendering: optimizeLegibility;
}
div[style*="margin: 16px 0"] {
margin: 0 !important;
}
table,
td {
mso-table-lspace: 0pt; mso-table-rspace: 0pt;
}
table {
border-spacing: 0; border-collapse: collapse; table-layout: fixed; margin: 0 auto;
}
img {
-ms-interpolation-mode: bicubic; max-width: 100%; border: 0;
}
*[x-apple-data-detectors] {
color: inherit !important; text-decoration: none !important;
}
.x-gmail-data-detectors,
.x-gmail-data-detectors *,
.aBn {
border-bottom: 0 !important; cursor: default !important;
}
.btn {
-webkit-transition: all 200ms ease; transition: all 200ms ease;
}
.btn:hover {
background-color: #f67575; border-color: #f67575;
}
* {
font-family: Arial, Helvetica, sans-serif; font-size: 18px;
}
@media screen and (max-width: 600px) {
.container {
width: 100%; margin: auto;
}
.stack {
display: block!important; width: 100%!important; max-width: 100%!important;
}
.btn {
display: block; width: 100%; text-align: center;
}
}
body,
p,
td,
tr,
.body,
table,
h1,
h2,
h3,
h4,
h5,
h6,
div,
span {
background-color: #FEFEFE !important; color: #010101 !important;
}
@media (prefers-color-scheme: dark) {
body,
p,
td,
tr,
.body,
table,
h1,
h2,
h3,
h4,
h5,
h6,
div,
span {
background-color: #27292D !important; color: #FEFEFE !important;
}
}
a {
color: inherit !important; text-decoration: underline !important;
}
</style>
<!--[if mso | ie]>
<style type="text/css">
a {
background-color: #FEFEFE !important; color: #010101 !important;
}
@media (prefers-color-scheme: dark) {
a {
background-color: #27292D !important; color: #FEFEFE !important;
}
}
</style>
<![endif]-->
</head>
<body class="">
<div style="display: none; max-height: 0px; overflow: hidden;">Data contracts are most effective when implemented at the "first mile": directly at the data source within software code, rather than downstream </div>
<div style="display: none; max-height: 0px; overflow: hidden;">
<br>
</div>
<table align="center" class="document">
<tbody>
<tr>
<td valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" class="container" width="600">
<tbody>
<tr class="inner-body">
<td>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr class="header">
<td bgcolor="" class="container">
<table width="100%">
<tbody>
<tr>
<td class="container">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" style="margin-top: 0px;" width="100%">
<tbody>
<tr>
<td style="padding: 0px;">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div style="text-align: center;">
<span style="margin-right: 0px;"><a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Ftldr.tech%2Fdata%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/X_myqMU6edWJnxcwvVoLvSW6NGFsKcJZFeyv9XYCjp0=426" rel="noopener noreferrer" target="_blank"><span>Sign Up</span></a>
|<span style="margin-right: 2px; margin-left: 2px;"><a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fadvertise.tldr.tech%3Futm_source=tldrdata%26utm_medium=newsletter%26utm_campaign=advertisetopnav/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/lquoPQjkn-KjJj2Gs-JmQyJ5LCPAoOkrFEydDSeWjJQ=426" rel="noopener noreferrer" target="_blank"><span>Advertise</span></a></span>|<span style="margin-left: 2px;"><a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fa.tldrnewsletter.com%2Fweb-version%3Fep=1%26lc=1670a604-84b7-11f0-bcf5-55fc1d40139c%26p=de6b7c8c-a7f4-11f0-87c6-c986eb7c7d96%26pt=campaign%26t=1760349966%26s=0d62eec6e91afa32e69038e43968ffe88b682c8313ea7a55e61c738ae3753213/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/7hiyt_hj0apZ6LWM5GRgF7LLiNw8nrA05IcsuvLs_1A=426"><span>View Online</span></a></span>
<br>
</span></div>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="text-align: center;"><span data-darkreader-inline-color="" style="--darkreader-inline-color:#3db3ff; color: rgb(51, 175, 255) !important; font-size: 30px;">T</span><span style="font-size: 30px;"><span data-darkreader-inline-color="" style="color: rgb(232, 192, 96) !important; --darkreader-inline-color:#e8c163; font-size:30px;">L</span><span data-darkreader-inline-color="" style="color: rgb(101, 195, 173) !important; --darkreader-inline-color:#6ec7b2; font-size:30px;">D</span></span><span data-darkreader-inline-color="" style="--darkreader-inline-color:#dd6e6e; color: rgb(220, 107, 107) !important; font-size: 30px;">R</span>
<br>
</td>
</tr>
</tbody>
</table>
<br>
<table style="table-layout: fixed; width:100%;" width="100%">
<tbody>
<tr>
<td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;">
<div style="text-align: center;">
<h1><strong>TLDR Data <span id="date">2025-10-13</span></strong></h1>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr bgcolor="">
<td class="container">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td style="padding: 0px;">
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">📱</span></div></div>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Deep Dives</strong></h1>
</div>
</div>
</td>
</tr>
</tbody>
</table>
<table style="table-layout: fixed; width: 100%;" width="100%">
<tbody>
<tr>
<td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fjack-vanlightly.com%2Fblog%2F2025%2F10%2F8%2Fbeyond-indexes-how-open-table-formats-optimize-query-performance%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/g8ymBTAgo6CKJjvRhRvZsOIn0GNw2X7KdjnF_svDSvQ=426">
<span>
<strong>Beyond Indexes: How Open Table Formats Optimize Query Performance (20 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Traditional secondary indexes in OLTP databases drive efficient point lookups but are fundamentally misaligned with the large-scale, columnar, append-only architecture of open table formats like Apache Iceberg, Delta Lake, and Hudi. Read performance boils down to minimizing I/O via data layout optimization (partitioning, sorting, and compaction) and leveraging auxiliary structures such as metadata-based column statistics, Bloom filters, and materialized views for efficient pruning. Analytical query speed hinges on how well physical data organization supports common access patterns, fundamentally diverging from the "index" paradigm of the OLTP world.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
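<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
As a rough illustration of pruning with column statistics (not taken from the linked article), the sketch below keeps only files whose min/max range can overlap a query range. The <code>FileStats</code> class, the file names, and the values are hypothetical; real table formats store comparable statistics in their own metadata structures.
</span>
<pre style="font-family: monospace; font-size: 14px; white-space: pre-wrap; text-align: left;">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file column statistics, as a table format might keep in metadata."""
    path: str
    min_value: int
    max_value: int


def prune_files(files, lo, hi):
    """Keep only files whose [min, max] range can overlap the query range [lo, hi]."""
    return [f for f in files if f.max_value >= lo and hi >= f.min_value]


# Data sorted on the filter column gives narrow, non-overlapping ranges.
sorted_layout = [
    FileStats("part-0.parquet", 0, 99),
    FileStats("part-1.parquet", 100, 199),
    FileStats("part-2.parquet", 200, 299),
]
# Unsorted data gives wide ranges, so the same statistics prune nothing.
unsorted_layout = [
    FileStats("part-0.parquet", 0, 299),
    FileStats("part-1.parquet", 3, 297),
    FileStats("part-2.parquet", 1, 298),
]

sorted_hits = [f.path for f in prune_files(sorted_layout, 120, 130)]
unsorted_hits = [f.path for f in prune_files(unsorted_layout, 120, 130)]
print(sorted_hits)
print(unsorted_hits)
```
</pre>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
With the sorted layout only one file has to be read, while the unsorted layout forces a scan of all three. This is why layout work such as sorting and compaction matters as much as the statistics themselves.
</span>
</div>
</td>
</tr>
</tbody>
</table>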
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fblog.bytebytego.com%2Fp%2Fhow-facebooks-distributed-priority%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/7yc1NVPs5OtHUI7vv1bnumB_OEXHbxNwpEH3QabjhYY=426">
<span>
<strong>How Facebook's Distributed Priority Queue Handles Trillions of Items (13 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Facebook's FOQS is a fully managed, horizontally scalable, multi-tenant distributed priority queue processing over one trillion items per day. Built on sharded MySQL, FOQS utilizes namespaces, topics, and item-level metadata to ensure strict isolation and high throughput. Features include in-memory buffering, batching, adaptive prefetching, and demand-aware routing. The pull-based delivery model, robust disaster recovery, and idempotent ack/nack operations enable resilient, low-latency task processing suitable for massive workloads and varied enterprise use cases.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
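<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
To make the pull-based model and idempotent ack/nack concrete, here is a toy, single-process sketch. It is not FOQS's API or design (FOQS is sharded and MySQL-backed); the class name, method names, and the "lower number means higher priority" rule are assumptions made for illustration.
</span>
<pre style="font-family: monospace; font-size: 14px; white-space: pre-wrap; text-align: left;">
```python
import heapq
import itertools


class ToyPriorityQueue:
    """Single-process toy; names and behavior are illustrative, not FOQS's API."""

    def __init__(self):
        self._heap = []        # entries are (priority, sequence, item_id)
        self._payloads = {}    # item_id to (priority, payload) for items not yet acked
        self._leased = set()   # item_ids handed to a consumer, awaiting ack or nack
        self._seq = itertools.count()

    def enqueue(self, item_id, priority, payload):
        self._payloads[item_id] = (priority, payload)
        heapq.heappush(self._heap, (priority, next(self._seq), item_id))

    def dequeue(self):
        """Pull model: the consumer asks for the next item (lowest priority number first)."""
        while self._heap:
            _, _, item_id = heapq.heappop(self._heap)
            if item_id in self._payloads and item_id not in self._leased:
                self._leased.add(item_id)
                return item_id, self._payloads[item_id][1]
        return None

    def ack(self, item_id):
        """Idempotent: acking an unknown or already-acked item is a no-op."""
        self._leased.discard(item_id)
        self._payloads.pop(item_id, None)

    def nack(self, item_id):
        """Idempotent: only a currently leased item is returned to the queue."""
        if item_id in self._leased:
            self._leased.discard(item_id)
            priority, _ = self._payloads[item_id]
            heapq.heappush(self._heap, (priority, next(self._seq), item_id))


q = ToyPriorityQueue()
q.enqueue("a", priority=5, payload="low")
q.enqueue("b", priority=1, payload="high")
first = q.dequeue()
q.nack("b")
q.nack("b")  # repeated nack changes nothing
again = q.dequeue()
q.ack("b")
q.ack("b")   # repeated ack changes nothing
last = q.dequeue()
empty = q.dequeue()
```
</pre>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Because repeated ack and nack calls are harmless, a consumer can safely retry them after a timeout, which is the property that makes this style of queue resilient to flaky networks.
</span>
</div>
</td>
</tr>
</tbody>
</table>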
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fmedium.com%2Fagoda-engineering%2Fwhy-we-bet-on-rust-to-supercharge-feature-store-at-agoda-ed4a70d2efb7%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/PCK4cpNRyHa9DsHcAfLG272IKGxLqkUPBGdKl5PONWY=426">
<span>
<strong>Why We Bet on Rust to Supercharge Feature Store at Agoda (8 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
To overcome unpredictable latency and scaling limitations in its JVM/Scala-based Feature Store, Agoda rewrote its Feature Store Serving component in Rust. The switch cut CPU and memory use, let the service handle 5x more traffic, and reduced projected compute costs by about 84%. A careful migration strategy (“shadow testing,” an incremental proof of concept, and leaning on Copilot and the Rust compiler) ensured correctness while onboarding a team with no prior Rust experience.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">🚀</span></div>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Opinions &amp; Advice</strong></h1>
</div>
</div>
</td>
</tr>
</tbody>
</table>
<table style="table-layout: fixed; width: 100%;" width="100%">
<tbody>
<tr>
<td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fdataproducts.substack.com%2Fp%2Fyour-data-contracts-are-in-the-wrong%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/ZUFjFO9-ZMUdnbzWB63ga9d4WqjdL-gsrVzkJarhLcs=426">
<span>
<strong>Your Data Contracts Are in the Wrong Spot (6 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Data contracts are most effective when implemented at the "first mile": directly at the data source within software code, rather than downstream in analytical warehouses where they only address symptoms. Teams often misplace contracts, leading to persistent data quality bottlenecks or undetected upstream issues. Successful enforcement requires coordinated cultural change, seamless onboarding, management of tech sprawl, contract versioning, and contextual alerting. Target mission-critical data products with end-to-end "steel threads," ensuring visible, organization-wide quality improvements.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
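<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
A minimal sketch of what a "first mile" contract could look like: the producing service validates an event before it is emitted, so bad data never reaches the warehouse. The <code>OrderCreated</code> event, its fields, its rules, and the <code>emit</code> helper are hypothetical and are not from the linked article.
</span>
<pre style="font-family: monospace; font-size: 14px; white-space: pre-wrap; text-align: left;">
```python
from dataclasses import dataclass


class ContractViolation(ValueError):
    """Raised by the producer before a bad event leaves the service."""


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical event; the fields and rules below are illustrative."""
    order_id: str
    amount_cents: int
    currency: str

    def __post_init__(self):
        if not self.order_id:
            raise ContractViolation("order_id must be non-empty")
        if not isinstance(self.amount_cents, int) or 0 > self.amount_cents:
            raise ContractViolation("amount_cents must be a non-negative integer")
        if self.currency not in {"USD", "EUR", "GBP"}:
            raise ContractViolation("currency must be one of USD, EUR, GBP")


def emit(event, sink):
    """Stand-in for publishing to a queue or topic; `sink` is just a list here."""
    sink.append(event)


published = []
emit(OrderCreated("o-1", 1299, "USD"), published)

try:
    emit(OrderCreated("o-2", -5, "USD"), published)
    rejected = False
except ContractViolation:
    rejected = True
```
</pre>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
The invalid event fails inside the producer's own code path, where the owning team sees the error, rather than surfacing later as a broken dashboard.
</span>
</div>
</td>
</tr>
</tbody>
</table>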
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fblog.langchain.com%2Fnot-another-workflow-builder%2F%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/NGmGVltVOYd_j9Kt9_1_pjVgoZHrCoq_i-CowDDyOFg=426">
<span>
<strong>Not Another Workflow Builder (4 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Visual workflow builders limit scalable AI development. They are too complex for nontechnical users and too limited for advanced use cases. LangChain sees a future where no-code agents handle simple tasks, and code-based workflows with AI-generated code manage complex ones, combining flexibility, maintainability, and practical usability for enterprises.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.confluent.io%2Fblog%2Fdata-lake-governance-tableflow%2F%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/RRZRR9TfJCDQ-2-dxkYkA2Wo84yKjoumy9uftTO2N7Q=426">
<span>
<strong>No More Swamps: Building a Better-Governed Data Lake Architecture (8 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Tableflow bridges Kafka topics and schemas to open table formats such as Iceberg and Delta Lake, automating preprocessing, validation, and transfer to lakes and warehouses. It uses Confluent Cloud for Flink-based cleansing near sources, enforces schemas and their evolution via Schema Registry (e.g., suspending flows on breaking changes), and pushes read-only metadata directly to catalogs (AWS Glue, Polaris, and Databricks) for real-time sync and consistency across catalogs.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">💻</span></div>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Launches &amp; Tools</strong></h1>
</div>
</div>
</td>
</tr>
</tbody>
</table>
<table style="table-layout: fixed; width: 100%;" width="100%">
<tbody>
<tr>
<td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fmultitudes.typeform.com%2Fto%2FuHFHmXrj%3Futm_source=tldr/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/v9o9eJlfHpZOQYpE-A9pUTLHHriX0yGfoeU7s4T0nvI=426">
<span>
<strong>How has AI impacted your productivity? See how others have answered (Sponsor)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Want to see how other tech leaders are rolling out AI and measuring impact? Fill out the <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fmultitudes.typeform.com%2Fto%2FuHFHmXrj%3Futm_source=tldr/2/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/vOi8_ZW7Fv-UoQL_3BXlODaXcJblmrxKGTsx8GCTjnc=426" rel="noopener noreferrer nofollow" target="_blank"><span>AI impact survey</span></a> to get early access to insights from eng, data, and product leaders like you. It's anonymous and takes 15 minutes. <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fmultitudes.typeform.com%2Fto%2FuHFHmXrj%3Futm_source=tldr/3/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/dAj1yz4HLkTCdxeQPn5dZiVe1ig-Ofq2144y-0KnFsI=426" rel="noopener noreferrer nofollow" target="_blank"><span>Take the survey and get free access</span></a>
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fgithub.com%2Folooney%2Fjellyjoin%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/8192_2dVc5Oxe6fekcS0ThYDi3GruMJsy6Xs-tf7qyQ=426">
<span>
<strong>Jellyjoin: Join Dataframes or Lists Based on Semantic Similarity (GitHub Repo)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
The Jellyjoin Python package facilitates soft joins using embedding vectors, allowing data engineers to efficiently merge datasets based on similarity rather than exact matches. Key features include handling high-dimensional data and optimizing join operations, making it relevant for tasks involving machine learning and data integration.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
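<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
The idea of a similarity-based "soft join" can be sketched in plain Python. This does not use Jellyjoin's API: <code>toy_embed</code> is a stand-in that counts character bigrams instead of calling a real embedding model, and the function names and the 0.5 threshold are assumptions for illustration.
</span>
<pre style="font-family: monospace; font-size: 14px; white-space: pre-wrap; text-align: left;">
```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: counts of character bigrams."""
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def soft_join(left, right, threshold=0.5):
    """Pair each left value with its most similar right value, if above the threshold."""
    right_vecs = [(r, toy_embed(r)) for r in right]
    pairs = []
    for left_value in left:
        lv = toy_embed(left_value)
        best, score = max(((r, cosine(lv, rv)) for r, rv in right_vecs), key=lambda p: p[1])
        if score >= threshold:
            pairs.append((left_value, best, round(score, 2)))
    return pairs


matches = soft_join(
    ["Intl Business Machines", "Microsoft Corp"],
    ["Microsoft Corporation", "International Business Machines"],
)
for row in matches:
    print(row)
```
</pre>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
An exact-match join would return nothing for these inputs; the similarity threshold is what lets abbreviated and full company names pair up.
</span>
</div>
</td>
</tr>
</tbody>
</table>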
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.onehouse.ai%2Fblog%2Fmeasuring-etl-price-performance-on-cloud-data-platforms%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/eiO4wZbNyS5_DeZkRL4UORGBOfxGs-ELHucjQJfuKQk=426">
<span>
<strong>Measuring ETL Price-Performance On Cloud Data Platforms (22 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
ETL benchmarks typically overlook the load stage, which can account for 20-50% of pipeline time and significantly distort cost-performance results. Lake Loader, a newly open-sourced tool from Onehouse, can be combined with TPC-DS to measure realistic incremental loads across dimension, fact, and event tables. This method enables ETL price-performance comparisons across data platforms and attributes cost to different phases (ET vs. L) and target object types. A simulator with EMR, Databricks, and Snowflake is included in the article.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.databricks.com%2Fblog%2Fintroducing-variant-new-open-standard-semi-structured-data-apache-parquettm-delta-lake%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/PGbIAokacWtZ3llAXBKVA5gv58PPHZFxXfZrxev-9DM=426">
<span>
<strong>Introducing Variant: A New Open Standard for Semi-Structured Data in Apache Parquet, Delta Lake, and Apache Iceberg (5 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Variant is now a native Parquet type, with Delta Lake support for over a year, Iceberg v3 approval in May 2025, and over 9,600 lines of code contributed by Databricks to Parquet-java. Variant employs a compact binary format with offsets for fast field navigation (e.g., accessing "order.item.name" without full parsing). Shredding extracts common fields into typed Parquet columns for pruned I/O, data skipping, and compression, supported via SQL table creation and automatic optimization in Delta and Iceberg tables.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
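<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
The intuition behind shredding can be shown with ordinary JSON. This sketch is only an analogy: real Variant shredding operates on a binary encoding inside Parquet, and the <code>get_path</code> and <code>shred</code> helpers and the sample rows here are hypothetical.
</span>
<pre style="font-family: monospace; font-size: 14px; white-space: pre-wrap; text-align: left;">
```python
import json


def get_path(record, path):
    """Walk nested dicts along `path`; return None if any key is missing."""
    node = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def shred(json_rows, path):
    """Return (typed_column, raw_rows): the value at `path` per row, plus the untouched JSON."""
    typed_column = [get_path(json.loads(raw), path) for raw in json_rows]
    return typed_column, json_rows


rows = [
    '{"order": {"item": {"name": "lamp", "qty": 2}}, "note": "gift"}',
    '{"order": {"item": {"name": "desk", "qty": 1}}}',
    '{"order": {}}',
]
names, raw = shred(rows, ["order", "item", "name"])

# A filter on the shredded column no longer needs to parse any JSON.
desk_rows = [i for i, name in enumerate(names) if name == "desk"]
print(names)
print(desk_rows)
```
</pre>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Parsing happens once at write time; afterwards a query that filters on the extracted field reads a plain column, which is what makes pruning and data skipping possible.
</span>
</div>
</td>
</tr>
</tbody>
</table>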
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fgithub.com%2Flakekeeper%2Flakekeeper%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/UMGEv1iTIlcBt5zThJ9GEAMxJEuqabe0cPzDhtfqbgI=426">
<span>
<strong>Lakekeeper (GitHub Repo)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Lakekeeper is an Apache-licensed, secure, fast, and user-friendly implementation of the Apache Iceberg REST Catalog spec, written in Rust (based on apache/iceberg-rust). Integrated with Spark, PyIceberg, Trino, and StarRocks, it acts as a catalog for Apache Iceberg tables in open lakehouses and supports multi-table commits. To get started, use a Docker image from the catalog or deploy via Helm on Kubernetes.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">🎁</span></div></div>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><h1><strong>Miscellaneous</strong></h1></div>
</div>
</td>
</tr>
</tbody>
</table>
<table bgcolor="" style="table-layout: fixed; width: 100%;" width="100%">
<tbody>
<tr>
<td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fengineering.salesforce.com%2Fbuilding-real-time-multimodal-ai-pipelines-scaling-file-processing-to-50m-daily-uploads%2F%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/x3IepnA5Jn7St-bZmMh9fUDSJ-iCHcp_fIYiE6jqmqM=426">
<span>
<strong>Engineering Real-Time Multimodal AI Pipelines: Scaling File Processing to 50M Daily Uploads (5 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Salesforce's Prompt Builder now delivers real-time multimodal AI integration, enabling large language models to process and extract structured data from unindexed files (such as PDFs, images, and policy documents) across both Data Cloud and non-Data Cloud environments. The platform's new pipeline manages up to 50 million daily file uploads, validates diverse file types dynamically, and supports seamless interoperability with major LLMs (OpenAI, Gemini, and Anthropic) through a compatibility abstraction layer.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fdiscord.com%2Fblog%2Ffrom-single-node-to-multi-gpu-clusters-how-discord-made-distributed-compute-easy-for-ml-engineers%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/y9PcWKn-Y6jmLGZaDhWY54HgiQRe2_uPAFZEAjD3xBo=426">
<span>
<strong>From Single-Node to Multi-GPU Clusters: How Discord Made Distributed Compute Easy for ML Engineers (5 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
Discord's ML systems scaled from simple classifiers to complex models serving millions, facing hurdles like multi-GPU training needs, datasets exceeding single-node capacity, inconsistent manual Ray cluster setups, non-standardized resource management, and siloed custom solutions across teams that hindered reproducibility and efficiency. The team built a Ray-centric platform with a custom CLI for simplified cluster lifecycle management, Dagster-KubeRay orchestration for automated provisioning and workflows, and X-Ray for real-time observability.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;"><span style="font-size: 36px;">⚡</span></div></div>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding-top: 0px; padding-bottom: 0px;">
<div class="text-block">
<div style="text-align: center;">
<h1><strong>Quick Links</strong></h1>
</div>
</div>
</td>
</tr>
</tbody>
</table>
<table bgcolor="" style="table-layout: fixed; width: 100%;" width="100%">
<tbody>
<tr>
<td style="padding:0;border-collapse:collapse;border-spacing:0;margin:0;" valign="top">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fduckdb.org%2F2025%2F10%2F09%2Fbenchmark-results-14-lts.html%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/_dcoJTnlJk1CqVEZVfrDDemE_SCm4lfjqZT-d1WgN00=426">
<span>
<strong>Benchmark Results for DuckDB v1.4 LTS (1 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
DuckDB scales to 100 TB TPC-H on a single node, proving the in-process engine's viability for massive analytic workloads.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block">
<span>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fpostgresql.verite.pro%2Fblog%2F2025%2F10%2F01%2Fpsql-pipeline.html%3Futm_source=tldrdata/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/05r5TNIgxWq3pkiEIjMmNrXJBzkD9dyX6zvODCyEBgU=426">
<span>
<strong>Pipelining in psql (PostgreSQL 18) (4 minute read)</strong>
</span>
</a>
<br>
<br>
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, Verdana, sans-serif;">
PostgreSQL 18 brings pipelining to the psql command-line client, letting it send multiple queries without waiting for earlier results, which can significantly boost query throughput.
</span>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td align="left" style="word-break: break-word; vertical-align: top; padding: 5px 10px;">
<p style="padding: 0; margin: 0; font-size: 22px; color: #000000; line-height: 1.6; font-weight: bold;">
Want to advertise in TLDR? 📰
</p>
<div class="text-block" style="margin-top: 10px;">
If your company is interested in reaching an audience of data engineering professionals and decision makers, you may want to <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fadvertise.tldr.tech%2F%3Futm_source=tldrdata%26utm_medium=newsletter%26utm_campaign=advertisecta/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/zOeQFQYEoRY_1jOsYrUYqltlq6KZ4ioQyFjxQKwjoRQ=426"><strong><span>advertise with us</span></strong></a>.
</div>
<br>
<p style="padding: 0; margin: 0; font-size: 22px; color: #000000; line-height: 1.6; font-weight: bold;">
Want to work at TLDR? 💼
</p>
<div class="text-block" style="margin-top: 10px;">
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fjobs.ashbyhq.com%2Ftldr.tech/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/bwN9zcFJlS55VFnbHnOsc4TRGQgXLqaEPXedep7eWQI=426" rel="noopener noreferrer" style="color: #0000EE; text-decoration: underline;" target="_blank"><strong>Apply here</strong></a> or send a friend's resume to <a href="mailto:jobs@tldr.tech" style="color: #0000EE; text-decoration: underline;">jobs@tldr.tech</a> and get $1k if we hire them!
</div>
<br>
<div class="text-block">
If you have any comments or feedback, just respond to this email!
<br>
<br> Thanks for reading,
<br>
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.linkedin.com%2Fin%2Fjoelvanveluwen%2F/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/F_qN_tSpaIOFivHtEZRS5M4FZKWsjCHxD4nLyfISG48=426"><span>Joel Van Veluwen</span></a>, <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.linkedin.com%2Fin%2Fjennytzurueyching%2F/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/yfwVdpyG1CTeUnuGtEQd4MW0seMYB7oz4lMbXiJ5ink=426"><span>Tzu-Ruey Ching</span></a> &amp; <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fwww.linkedin.com%2Fin%2Fremi-turpaud%2F/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/7Tma71LXie2F56vuf8kYzIzrVakfGK5fYxaulg-IJUg=426"><span>Remi Turpaud</span></a>
<br>
<br>
</div>
<br>
</td>
</tr>
</tbody>
</table>
<table align="center" bgcolor="" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="container" style="padding: 15px 15px;">
<div class="text-block" id="testing-id">
<a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Ftldr.tech%2Fdata%2Fmanage%3Femail=silk.theater.56%2540fwdnl.com/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/2hbuHNdCWsADoZEPrIi1PW3MAs6WqpGyzc1UrLAS4KQ=426">Manage your subscriptions</a> to our other newsletters on tech, startups, and programming. Or if TLDR Data isn't for you, please <a href="https://tracking.tldrnewsletter.com/CL0/https:%2F%2Fa.tldrnewsletter.com%2Funsubscribe%3Fep=1%26l=037ede50-92cc-11ee-b0f2-b761aa2217ad%26lc=1670a604-84b7-11f0-bcf5-55fc1d40139c%26p=de6b7c8c-a7f4-11f0-87c6-c986eb7c7d96%26pt=campaign%26pv=4%26spa=1760349652%26t=1760349966%26s=68955b658acdc63f26583ac9afff06b51a7acd2845c7872def1cbc7cfdbe22e0/1/01000199dd08d0d5-2062b390-f7b4-487d-a701-3346a3987b84-000000/C_BiuzFXwuLqf3vHNqL29MI4VOI_5RhhYgwTFMk7VZw=426">unsubscribe</a>.
<br>
</div>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</body></html>