Most data teams believe that adding LIMIT 100 to an exploratory query makes the database stop reading after finding 100 rows. In BigQuery, that is an expensive illusion. We have seen teams burning thousands of dollars per month on queries that should cost cents — and the culprit is a SQL pattern that looks harmless.
The pricing model nobody reads
BigQuery charges by bytes read, not by rows returned. That is the fundamental rule that changes everything. In a traditional relational database, LIMIT 100 is efficient because the engine reads row by row and stops. In a columnar database like BigQuery, reads operate on entire column blocks.
When you run SELECT * FROM table LIMIT 100, the engine needs to retrieve the requested columns from storage blocks before applying the cutoff. On partitioned tables without a partition filter, this frequently means scanning entire partitions — though partitioning and clustering can partially mitigate the cost by limiting the scan scope. You pay for the full banquet and consume only the olive.
At Tech86, when we audit clients' BigQuery bills, the SELECT * ... LIMIT pattern on large tables consistently ranks among the top sources of waste. It is not a logic error — it is a mental model error. The developer designs the query thinking in relational terms, but the engine executes in columnar architecture.
TABLESAMPLE SYSTEM: the physics change
The TABLESAMPLE SYSTEM (1 PERCENT) command changes the physics of the operation. Instead of scanning the entire table and then cutting, it instructs the engine to read random data blocks from disk until the defined sampling percentage is reached. The engine never reads more than specified.
In our internal testing at Tech86, an exploratory query that processed 3.45 GB with LIMIT 100 dropped to 172 MB with TABLESAMPLE SYSTEM (1 PERCENT). A 95% reduction in data volume processed — and in query cost.
The difference is not marginal. In an operation running dozens of exploratory queries per day, monthly savings can reach thousands of dollars. And the code change is minimal: one SQL clause replacing another.
The problem is not exclusive to BigQuery
The concept applies to any database with block-level reads. In PostgreSQL, the syntax changes to TABLESAMPLE BERNOULLI (1) or TABLESAMPLE SYSTEM (1), available since version 9.5. The difference between the two methods:
- BERNOULLI: row-level sampling. More statistically precise, but slower because the engine must visit each row to decide whether to include it.
- SYSTEM: page-level sampling. Faster because the engine decides per entire block, but with higher variance since rows within the same block tend to be correlated.
For quick exploration, SYSTEM is sufficient. For analyses requiring rigorous sample representativeness, BERNOULLI is safer. In both cases, I/O drops dramatically compared to a SELECT * ... LIMIT that scans the entire table.
The FinOps lesson behind the syntax
FinOps is not about shutting down servers on weekends. FinOps is about understanding the internal architecture of the tool you operate. When a data team does exploration with LIMIT on petabyte-scale tables, they are not just testing code — they are draining the company's OPEX out of ignorance of the platform's cost model.
Knowing the syntax is common. Understanding the computational cost of the syntax is what separates professionals from amateurs. The same query, written differently, can cost $0.05 or $50. The difference is not in the result — it is in the physics of execution.
At Tech86, we include query pattern auditing as a standard part of our FinOps engagements. Identifying SELECT * ... LIMIT on large tables is one of the first checks. The fix is simple, the savings are immediate, and the impact compounds month over month.
Conclusion
LIMIT 100 in BigQuery is the equivalent of filling up your gas tank to drive around the block. It works, but it costs far more than it should. TABLESAMPLE SYSTEM is the right tool for the right job: data exploration with cost proportional to what you actually consume.
If your data team runs exploratory queries on large tables without TABLESAMPLE, you are paying for bytes you never use. We can help identify and fix these patterns.
