Pular para o conteúdo principal
Close
FinOps

BigQuery LIMIT 100: The Illusion Draining Your OPEX

Gabriel Ferraresi· CEO | Tech86June 11, 20264 min
finopsbigquerytablesampleoptimizationdata-engineering

Most data teams believe that adding LIMIT 100 to an exploratory query makes the database stop reading after finding 100 rows. In BigQuery, that is an expensive illusion. We have seen teams burning thousands of dollars per month on queries that should cost cents — and the culprit is a SQL pattern that looks harmless.

The pricing model nobody reads

BigQuery charges by bytes read, not by rows returned. That is the fundamental rule that changes everything. In a traditional relational database, LIMIT 100 is efficient because the engine reads row by row and stops. In a columnar database like BigQuery, reads operate on entire column blocks.

When you run SELECT * FROM table LIMIT 100, the engine needs to retrieve the requested columns from storage blocks before applying the cutoff. On partitioned tables without a partition filter, this frequently means scanning entire partitions — though partitioning and clustering can partially mitigate the cost by limiting the scan scope. You pay for the full banquet and consume only the olive.

At Tech86, when we audit clients' BigQuery bills, the SELECT * ... LIMIT pattern on large tables consistently ranks among the top sources of waste. It is not a logic error — it is a mental model error. The developer designs the query thinking in relational terms, but the engine executes in columnar architecture.

TABLESAMPLE SYSTEM: the physics change

The TABLESAMPLE SYSTEM (1 PERCENT) command changes the physics of the operation. Instead of scanning the entire table and then cutting, it instructs the engine to read random data blocks from disk until the defined sampling percentage is reached. The engine never reads more than specified.

In our internal testing at Tech86, an exploratory query that processed 3.45 GB with LIMIT 100 dropped to 172 MB with TABLESAMPLE SYSTEM (1 PERCENT). A 95% reduction in data volume processed — and in query cost.

The difference is not marginal. In an operation running dozens of exploratory queries per day, monthly savings can reach thousands of dollars. And the code change is minimal: one SQL clause replacing another.

The problem is not exclusive to BigQuery

The concept applies to any database with block-level reads. In PostgreSQL, the syntax changes to TABLESAMPLE BERNOULLI (1) or TABLESAMPLE SYSTEM (1), available since version 9.5. The difference between the two methods:

  • BERNOULLI: row-level sampling. More statistically precise, but slower because the engine must visit each row to decide whether to include it.
  • SYSTEM: page-level sampling. Faster because the engine decides per entire block, but with higher variance since rows within the same block tend to be correlated.

For quick exploration, SYSTEM is sufficient. For analyses requiring rigorous sample representativeness, BERNOULLI is safer. In both cases, I/O drops dramatically compared to a SELECT * ... LIMIT that scans the entire table.

The FinOps lesson behind the syntax

FinOps is not about shutting down servers on weekends. FinOps is about understanding the internal architecture of the tool you operate. When a data team does exploration with LIMIT on petabyte-scale tables, they are not just testing code — they are draining the company's OPEX out of ignorance of the platform's cost model.

Knowing the syntax is common. Understanding the computational cost of the syntax is what separates professionals from amateurs. The same query, written differently, can cost $0.05 or $50. The difference is not in the result — it is in the physics of execution.

At Tech86, we include query pattern auditing as a standard part of our FinOps engagements. Identifying SELECT * ... LIMIT on large tables is one of the first checks. The fix is simple, the savings are immediate, and the impact compounds month over month.

Conclusion

LIMIT 100 in BigQuery is the equivalent of filling up your gas tank to drive around the block. It works, but it costs far more than it should. TABLESAMPLE SYSTEM is the right tool for the right job: data exploration with cost proportional to what you actually consume.

If your data team runs exploratory queries on large tables without TABLESAMPLE, you are paying for bytes you never use. We can help identify and fix these patterns.

Interested in this solution?

Explore our managed services and infrastructure.

Explore FinOps for Cloud

Frequently Asked Questions

Not in most cases. BigQuery charges by bytes read, not by rows returned. With SELECT * ... LIMIT 100, the engine frequently needs to scan the entire table to retrieve the requested columns before applying the cutoff. You pay for the full scan and receive only 100 rows.

TABLESAMPLE SYSTEM is a SQL clause that instructs the engine to read a percentage of random data blocks from disk, instead of scanning the entire table. In BigQuery, TABLESAMPLE SYSTEM (1 PERCENT) reads approximately 1% of storage blocks. This dramatically reduces bytes processed and query cost.

Yes. PostgreSQL has supported TABLESAMPLE since version 9.5, with two methods: BERNOULLI (row-level sampling, more precise but slower) and SYSTEM (page-level sampling, faster but with higher variance). The concept is the same: reduce I/O by reading less data from disk.

It depends on the use case. For visual exploration, schema validation, and quick data inspection, the precision is sufficient. For statistical analyses requiring rigorous sample representativeness, increase the percentage or use BERNOULLI in PostgreSQL. TABLESAMPLE SYSTEM does not replace production queries — it is an exploration tool.

Blog — Get in Touch

Have a question about our articles or services? Our team is ready to help.

Schedule a Meeting

Book a time slot.

Schedule Now

Email

Send us a message.

[email protected]

WhatsApp

Quick conversation.

Address

Avenida Paulista, 1636 - São Paulo - SP - 01310-200

Tech86 Specialist

Online now

Hello! How can we help scale your business today?

Tech86 Engineering

We Value Your Privacy

We use cookies and similar technologies to optimize your experience, analyze site traffic, and personalize content. By clicking "Accept All", you agree to the use of all cookies. Read our Privacy Policy.