mirror of
https://github.com/krahets/hello-algo.git
synced 2026-02-03 10:53:35 +08:00
* Review the EN heading format. * Fix pythontutor headings. * Fix pythontutor headings. * bug fixes * Fix headings in **/summary.md * Revisit the CN-to-EN translation for Python code using Claude-4.5 * Revisit the CN-to-EN translation for Java code using Claude-4.5 * Revisit the CN-to-EN translation for Cpp code using Claude-4.5. * Fix the dictionary. * Fix cpp code translation for the multipart strings. * Translate Go code to English. * Update workflows to test EN code. * Add EN translation for C. * Add EN translation for CSharp. * Add EN translation for Swift. * Trigger the CI check. * Revert. * Update en/hash_map.md * Add the EN version of Dart code. * Add the EN version of Kotlin code. * Add missing code files. * Add the EN version of JavaScript code. * Add the EN version of TypeScript code. * Fix the workflows. * Add the EN version of Ruby code. * Add the EN version of Rust code. * Update the CI check for the English version code. * Update Python CI check. * Fix cmakelists for en/C code. * Fix Ruby comments
84 lines
5.5 KiB
Markdown
84 lines
5.5 KiB
Markdown
# Binary Search
|
|
|
|
<u>Binary search</u> is an efficient searching algorithm based on the divide-and-conquer strategy. It leverages the orderliness of data to reduce the search range by half in each round until the target element is found or the search interval becomes empty.
|
|
|
|
!!! question
|
|
|
|
Given an array `nums` of length $n$ with elements arranged in ascending order and no duplicates, search for and return the index of element `target` in the array. If the array does not contain the element, return $-1$. An example is shown in the figure below.
|
|
|
|

|
|
|
|
As shown in the figure below, we first initialize pointers $i = 0$ and $j = n - 1$, pointing to the first and last elements of the array respectively, representing the search interval $[0, n - 1]$. Note that square brackets denote a closed interval, which includes the boundary values themselves.
|
|
|
|
Next, perform the following two steps in a loop:
|
|
|
|
1. Calculate the midpoint index $m = \lfloor {(i + j) / 2} \rfloor$, where $\lfloor \: \rfloor$ denotes the floor operation.
|
|
2. Compare `nums[m]` and `target`, which results in three cases:
|
|
1. When `nums[m] < target`, it indicates that `target` is in the interval $[m + 1, j]$, so execute $i = m + 1$.
|
|
2. When `nums[m] > target`, it indicates that `target` is in the interval $[i, m - 1]$, so execute $j = m - 1$.
|
|
3. When `nums[m] = target`, it indicates that `target` has been found, so return index $m$.
|
|
|
|
If the array does not contain the target element, the search interval will eventually shrink to empty. In this case, return $-1$.
|
|
|
|
=== "<1>"
|
|

|
|
|
|
=== "<2>"
|
|

|
|
|
|
=== "<3>"
|
|

|
|
|
|
=== "<4>"
|
|

|
|
|
|
=== "<5>"
|
|

|
|
|
|
=== "<6>"
|
|

|
|
|
|
=== "<7>"
|
|

|
|
|
|
It's worth noting that since both $i$ and $j$ are of `int` type, **$i + j$ may exceed the range of the `int` type**. To avoid large number overflow, we typically use the formula $m = \lfloor {i + (j - i) / 2} \rfloor$ to calculate the midpoint.
|
|
|
|
The code is shown below:
|
|
|
|
```src
|
|
[file]{binary_search}-[class]{}-[func]{binary_search}
|
|
```
|
|
|
|
**Time complexity is $O(\log n)$**: In the binary loop, the interval is reduced by half each round, so the number of loops is $\log_2 n$.
|
|
|
|
**Space complexity is $O(1)$**: Pointers $i$ and $j$ use constant-size space.
|
|
|
|
## Interval Representation Methods
|
|
|
|
In addition to the closed interval mentioned above, another common interval representation is the "left-closed right-open" interval, defined as $[0, n)$, meaning the left boundary includes itself while the right boundary does not. Under this representation, the interval $[i, j)$ is empty when $i = j$.
|
|
|
|
We can implement a binary search algorithm with the same functionality based on this representation:
|
|
|
|
```src
|
|
[file]{binary_search}-[class]{}-[func]{binary_search_lcro}
|
|
```
|
|
|
|
As shown in the figure below, under the two interval representations, the initialization, loop condition, and interval narrowing operations of the binary search algorithm are all different.
|
|
|
|
Since both the left and right boundaries in the "closed interval" representation are defined as closed, the operations to narrow the interval through pointers $i$ and $j$ are also symmetric. This makes it less error-prone, **so the "closed interval" approach is generally recommended**.
|
|
|
|

|
|
|
|
## Advantages and Limitations
|
|
|
|
Binary search performs well in both time and space aspects.
|
|
|
|
- Binary search has high time efficiency. With large data volumes, the logarithmic time complexity has significant advantages. For example, when the data size $n = 2^{20}$, linear search requires $2^{20} = 1048576$ loop rounds, while binary search only needs $\log_2 2^{20} = 20$ rounds.
|
|
- Binary search requires no extra space. Compared to searching algorithms that require additional space (such as hash-based search), binary search is more space-efficient.
|
|
|
|
However, binary search is not suitable for all situations, mainly for the following reasons:
|
|
|
|
- Binary search is only applicable to sorted data. If the input data is unsorted, sorting specifically to use binary search would be counterproductive, as sorting algorithms typically have a time complexity of $O(n \log n)$, which is higher than both linear search and binary search. For scenarios with frequent element insertions, maintaining array orderliness requires inserting elements at specific positions with a time complexity of $O(n)$, which is also very expensive.
|
|
- Binary search is only applicable to arrays. Binary search requires jump-style (non-contiguous) element access, and jump-style access has low efficiency in linked lists, making it unsuitable for linked lists or data structures based on linked list implementations.
|
|
- For small data volumes, linear search performs better. In linear search, each round requires only 1 comparison operation; while in binary search, it requires 1 addition, 1 division, 1-3 comparison operations, and 1 addition (subtraction), totaling 4-6 unit operations. Therefore, when the data volume $n$ is small, linear search is actually faster than binary search.
|