8.5. bisect — 数组二分算法

版本2.1中的新特性

源代码: Lib/bisect.py

本模块提供支持按顺序来对列表进行排序,而不必每次在列表中插入值后再去排序。对于项目中的长列表用比较运算成本太高,这个模块可以改善那些通常的用法。这个模块也被叫做 二分模块 因为它是使用基本的二分算法来工作的。作为二分法的工作示例该模块的源代码可能是最有用的(至少已经是正确的!).

该模块提供了以下方法:

bisect.bisect_left(a, x, lo=0, hi=len(a))

定位到插入点 x数组a中 并保持数组的排序顺序。参数 lohi 可用于指定列表a的子集; 默认情况下是整个列表都被使用了。如果 x 已经存在于 a 中, 那么将插入在已存在值的最左侧。返回的值可以作为第一个参数被 list.insert() 调用,前提是 a 是已经排序好了的。

The returned insertion point i partitions the array a into two halves so that all(val < x for val in a[lo:i]) for the left side and all(val >= x for val in a[i:hi]) for the right side.

bisect.bisect_right(a, x, lo=0, hi=len(a))
bisect.bisect(a, x, lo=0, hi=len(a))

Similar to bisect_left(), but returns an insertion point which comes after (to the right of) any existing entries of x in a.

The returned insertion point i partitions the array a into two halves so that all(val <= x for val in a[lo:i]) for the left side and all(val > x for val in a[i:hi]) for the right side.

bisect.insort_left(a, x, lo=0, hi=len(a))

Insert x in a in sorted order. This is equivalent to a.insert(bisect.bisect_left(a, x, lo, hi), x) assuming that a is already sorted. 注意,虽然搜索是 O(log n), 但是插入的时候,是 O(n)。

bisect.insort_right(a, x, lo=0, hi=len(a))
bisect.insort(a, x, lo=0, hi=len(a))

Similar to insort_left(), but inserting x in a after any existing entries of x.

See also

SortedCollection recipe that uses bisect to build a full-featured collection class with straight-forward search methods and support for a key-function. The keys are precomputed to save unnecessary calls to the key function during searches.

8.5.1. Searching Sorted Lists

bisect()函数通常用于寻找插入点, 但用于一般情况下的元素查找却是有困难的.下面的5个函数将实现如何通过bisect模块方法在排序表中查寻.

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError

def find_lt(a, x):
    'Find rightmost value less than x'
    i = bisect_left(a, x)
    if i:
        return a[i-1]
    raise ValueError

def find_le(a, x):
    'Find rightmost value less than or equal to x'
    i = bisect_right(a, x)
    if i:
        return a[i-1]
    raise ValueError

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

def find_ge(a, x):
    'Find leftmost item greater than or equal to x'
    i = bisect_left(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

8.5.2. 更多例子

The bisect() function can be useful for numeric table lookups. This example uses bisect() to look up a letter grade for an exam score (say) based on a set of ordered numeric breakpoints: 90 and up is an ‘A’, 80 to 89 is a ‘B’, and so on:

>>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
        i = bisect(breakpoints, score)
        return grades[i]

>>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
['F', 'A', 'C', 'C', 'B', 'A', 'A']

Unlike the sorted() function, it does not make sense for the bisect() functions to have key or reversed arguments because that would lead to an inefficient design (successive calls to bisect functions would not “remember” all of the previous key lookups).

Instead, it is better to search a list of precomputed keys to find the index of the record in question:

>>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
>>> data.sort(key=lambda r: r[1])
>>> keys = [r[1] for r in data]         # precomputed list of keys
>>> data[bisect_left(keys, 0)]
('black', 0)
>>> data[bisect_left(keys, 1)]
('blue', 1)
>>> data[bisect_left(keys, 5)]
('red', 5)
>>> data[bisect_left(keys, 8)]
('yellow', 8)