Block nested loop explained
A block-nested loop (BNL) is an algorithm used to join two relations in a relational database.[1]
and
(the "outer" and "inner" join operands, respectively). Suppose
. In a traditional nested loop join,
will be scanned once for every
tuple of
. If there are many qualifying
tuples, and particularly if there is no applicable
index for the join key on
, this operation will be very expensive.
The block nested loop join algorithm improves on the simple nested loop join by only scanning
once for every
group of
tuples. Here groups are
disjoint sets of tuples in
and the
union of all groups has the same tuples as
. For example, one variant of the block nested loop join reads an entire
page of
tuples into
memory and loads them into a
hash table. It then scans
, and probes the hash table to find
tuples that match any of the tuples in the current page of
. This reduces the number of scans of
that are necessary.
algorithm block_nested_loop_join is for each page pr in R do for each page ps in S do for each tuple r in pr do for each tuple s in ps do if r and s satisfy the join condition then yield tuple <r,s>
A more aggressive variant of this algorithm loads as many pages of
as can be fit in the available memory, loading all such tuples into a hash table, and then repeatedly scans
. This further reduces the number of scans of
that are necessary. In fact, this algorithm is essentially a special-case of the classic
hash join algorithm.
The block nested loop runs in
I/Os where
is the number of available pages of internal memory and
and
is size of
and
respectively in pages. Notethat block nested loop runs in
I/Os if
fits in the available internal memory.
Notes and References
- Web site: 8.2.1.14 Block Nested-Loop and Batched Key Access Joins. MySQL 5.6 Reference Manual. Oracle Corporation. 2 August 2015.
- Web site: Block Nested Loop Join. MariaDB. MariaDB Corporation Ab. 2 August 2015.