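Identify groups of consecutive numbers in a list

I'd like to identify groups of consecutive numbers in a list, so that:

    myfunc([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20])

returns:

    [(2,5), (12,17), 20]

I'm wondering what the best way to do this is (particularly if there is something built into Python).

Edit: Note that I originally forgot to mention that individual numbers should be returned as individual numbers, not ranges.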
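Edit 2: To answer the OP's new requirement

    # Python 2 syntax (tuple-unpacking lambda, xrange)
    from itertools import groupby
    from operator import itemgetter

    data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
    ranges = []
    for key, group in groupby(enumerate(data), lambda (index, item): index - item):
        group = map(itemgetter(1), group)
        if len(group) > 1:
            ranges.append(xrange(group[0], group[-1]))
        else:
            ranges.append(group[0])

Output:

    [xrange(2, 5), xrange(12, 17), 20]

You can replace xrange with range or any other custom class.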
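For readers on Python 3, where xrange and tuple-unpacking lambdas no longer exist, the same idea might look like the sketch below. The name group_ranges is made up for illustration, and the code assumes the input is sorted with no duplicates. Unlike the snippet above, it uses an inclusive end, so range(2, 6) covers 2 through 5.

```python
from itertools import groupby


def group_ranges(data):
    """Hypothetical helper: collapse runs of consecutive integers.

    Assumes `data` is sorted in ascending order with no duplicates.
    Runs of two or more become range objects; lone numbers stay as ints.
    """
    result = []
    # index - value is constant within a run of consecutive numbers
    for _, run in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        values = [value for _, value in run]
        if len(values) > 1:
            # range excludes its stop value, hence the + 1
            result.append(range(values[0], values[-1] + 1))
        else:
            result.append(values[0])
    return result


print(group_ranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))
# [range(2, 6), range(12, 18), 20]
```

Swapping the range(...) call for a tuple (values[0], values[-1]) would give the exact output format asked for in the question.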
The Python documentation has a very clean recipe for this:

    # Python 2 syntax (tuple-unpacking lambda, print statement)
    from operator import itemgetter
    from itertools import groupby

    data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
    for k, g in groupby(enumerate(data), lambda (i, x): i - x):
        print map(itemgetter(1), g)

Output:

    [2, 3, 4, 5]
    [12, 13, 14, 15, 16, 17]

If you want to get the exact same output as in the question, you can do this:

    ranges = []
    for k, g in groupby(enumerate(data), lambda (i, x): i - x):
        group = map(itemgetter(1), g)
        ranges.append((group[0], group[-1]))

Output:

    [(2, 5), (12, 17)]
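Edit: The example is already explained in the documentation, but maybe I should explain it more:

The key to the solution is taking the difference against a range (the indices produced by enumerate), so that consecutive numbers all end up in the same group.

If the data is:

    [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]

then groupby(enumerate(data), lambda (i, x): i - x) is equivalent to the following:

    groupby(
        [(0, 2), (1, 3), (2, 4), (3, 5), (4, 12),
         (5, 13), (6, 14), (7, 15), (8, 16), (9, 17)],
        lambda (i, x): i - x
    )

The lambda function subtracts the element value from the element index. So when you apply the lambda to each item, you get the following keys for groupby:

    [-2, -2, -2, -2, -8, -8, -8, -8, -8, -8]

groupby groups elements by equal key value, so the first 4 elements are grouped together, and so on.

I hope this makes it more readable.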
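To see those keys for yourself, here is a small Python 3 sketch of the same idea. The variable names keys and groups are mine, and the lambda indexes into the pair because Python 3 dropped tuple-unpacking parameters.

```python
from itertools import groupby

data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]

# The key computed for each (index, value) pair
keys = [index - value for index, value in enumerate(data)]
print(keys)
# [-2, -2, -2, -2, -8, -8, -8, -8, -8, -8]

# groupby starts a new group every time the key changes
groups = [
    [value for _, value in pairs]
    for _, pairs in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1])
]
print(groups)
# [[2, 3, 4, 5], [12, 13, 14, 15, 16, 17]]
```

Note that groupby only merges adjacent items with equal keys, which is why the input needs to be sorted first.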
Here is the "naive" solution, which I find at least somewhat readable.

    x = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 22, 25, 26, 28, 51, 52, 57]

    def group(L):
        first = last = L[0]
        for n in L[1:]:
            if n - 1 == last:  # Part of the group, bump the end
                last = n
            else:  # Not part of the group, yield current group and start a new
                yield first, last
                first = last = n
        yield first, last  # Yield the last group

    >>> print(list(group(x)))
    [(2, 5), (12, 17), (22, 22), (25, 26), (28, 28), (51, 52), (57, 57)]
Assuming your list is sorted:

    >>> from itertools import groupby
    >>> def ranges(lst):
    ...     pos = (j - i for i, j in enumerate(lst))
    ...     t = 0
    ...     for i, els in groupby(pos):
    ...         l = len(list(els))
    ...         el = lst[t]
    ...         t += l
    ...         yield range(el, el+l)
    ...
    >>> lst = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
    >>> list(ranges(lst))
    [range(2, 6), range(12, 18)]
Here is something that should work, without any imports needed:

    def myfunc(lst):
        ret = []
        a = b = lst[0]                           # a and b are range's bounds
        for el in lst[1:]:
            if el == b+1:
                b = el                           # range grows
            else:                                # range ended
                ret.append(a if a==b else (a,b)) # is a single or a range?
                a = b = el                       # let's start again with a single
        ret.append(a if a==b else (a,b))         # corner case for last single/range
        return ret
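Note that the code using groupby doesn't work as given in Python 3, so use this instead:

    from itertools import groupby
    from operator import itemgetter

    ranges = []
    for k, g in groupby(enumerate(data), lambda x: x[0] - x[1]):
        group = list(map(itemgetter(1), g))
        ranges.append((group[0], group[-1]))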
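This doesn't use a standard function, it just iterates over the input, but it should work:

    def myfunc(l):
        r = []
        p = q = None
        for x in l + [-1]:
            if x - 1 == q:
                q += 1
            else:
                if p:
                    if q > p:
                        r.append('%s-%s' % (p, q))
                    else:
                        r.append(str(p))
                p = q = x
        return '(%s)' % ', '.join(r)

Note that it requires that the input contains only positive numbers in ascending order. You should validate the input, but that code was omitted for clarity.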
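The omitted validation could be as small as the sketch below. The name validate_input and the exact rules are my own assumptions rather than part of the answer: it insists on integers and on strictly ascending values, which also rules out duplicates.

```python
def validate_input(numbers):
    """Hypothetical check: raise ValueError unless `numbers` holds
    strictly ascending positive integers."""
    previous = 0  # anything <= 0 fails the first comparison
    for n in numbers:
        # bool is a subclass of int, so reject it explicitly
        if not isinstance(n, int) or isinstance(n, bool):
            raise ValueError('not an integer: %r' % (n,))
        if n <= previous:
            raise ValueError('not positive and strictly ascending at %r' % (n,))
        previous = n


validate_input([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20])  # passes silently
```

Calling it at the top of myfunc would turn bad input into an early error instead of a quietly wrong result.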
Here is the answer I came up with. I'm writing the code for other people to understand, so I'm fairly verbose with variable names and comments.

First, a quick helper function:

    def getpreviousitem(mylist, myitem):
        '''Given a list and an item, return previous item in list'''
        for position, item in enumerate(mylist):
            if item == myitem:
                # First item has no previous item
                if position == 0:
                    return None
                # Return previous item
                return mylist[position-1]

And then the actual code:

    def getranges(cpulist):
        '''Given a sorted list of numbers, return a list of ranges'''
        rangelist = []
        inrange = False
        for item in cpulist:
            previousitem = getpreviousitem(cpulist, item)
            if previousitem == item - 1:
                # We're in a range
                if inrange == True:
                    # It's an existing range - change the end to the current item
                    newrange[1] = item
                else:
                    # We've found a new range.
                    newrange = [item-1, item]
                    # Update to show we are now in a range
                    inrange = True
            else:
                # We were in a range but now it just ended
                if inrange == True:
                    # Save the old range
                    rangelist.append(newrange)
                # Update to show we're no longer in a range
                inrange = False
        # Add the final range found to our list
        if inrange == True:
            rangelist.append(newrange)
        return rangelist
Example run:

    getranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17])

returns:

    [[2, 5], [12, 17]]
    import numpy as np

    myarray = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
    # Split wherever the gap between neighbouring values is larger than 1
    sequences = np.split(myarray, np.array(np.where(np.diff(myarray) > 1)[0]) + 1)
    l = []
    for s in sequences:
        if len(s) > 1:
            l.append((np.min(s), np.max(s)))
        else:
            l.append(s[0])
    print(l)

Output:

    [(2, 5), (12, 17), 20]