Richard Bucker

Slicing a list into a list of variable length sub-lists

Posted at — Aug 12, 2011

[Update 2011-08-15] Please see my update at the end of the document.

This was not much of a programming challenge, but it was a lot of fun and it only took a few minutes to decide how to implement it. The background for this task is a larger problem that I have not yet decided how to address. Confused?

I’m reviewing a payment system’s transaction specification. The messages are not the standard ISO 8583 messages that many vendors use; they are more like a ‘C’ struct in that each one is just a collection of concatenated strings. The individual strings, or columns, are in a predictable format: each is fixed length, but the lengths differ from one field to the next. Consider this:

    Field 1 : N : 6 bytes
    Field 2 : A/N : 4 bytes
    Field 3 : N : 12 bytes
    … and so on …

There is a useful example for “Slicing a list into a list of variable length sub-lists”; however, if you look at the examples you will see that the “step” is set to a fixed length (n=2). And if you look at the range() function (in Python 2) you’ll see that range() creates a list and then the for loop iterates over it. I’m sure there is a way to create a lazy range() function, but for my example this will do.

The original code looked like:

    >>> n = 2
    >>> listo = '1234567890'
    >>> [listo[i:i+n] for i in range(0, len(listo), n)]
    ['12', '34', '56', '78', '90']

Here is my replacement for range(): lrange(), where the step param is a list. I suppose I should test the step type, but this will do for now.

    def lrange(start, stop, step=[1]):
        # Fall back to a step of 1 if the step list is empty.
        if not step:
            step = [1,]
        i = 0
        retval = []
        while start < stop:
            # Each entry is a (begin, end) pair suitable for slicing.
            retval.append((start, start+step[i]))
            start += step[i]
            i += 1
            # Wrap around when the step list is exhausted.
            if i >= len(step):
                i = 0
        return retval

Here is the working example.
Notice that we ran out of data before we exhausted the step list.

    >>> n = [1, 2, 3, 4, 5]
    >>> listo = '1234567890'
    >>> [listo[i:l] for (i,l) in lrange(0, len(listo), n)]
    ['1', '23', '456', '7890']

And here is another working example, where we exhausted the step list before we exhausted the input.

    >>> n = [1, 2]
    >>> listo = '1234567890'
    >>> [listo[i:l] for (i,l) in lrange(0, len(listo), n)]
    ['1', '23', '4', '56', '7', '89', '0']

Finally, the code also works when listo is a list and not a string… although they are closely related.

    >>> listo = [1,2,3,4,5,6,7,8,9,0]
    >>> [listo[i:l] for (i,l) in lrange(0, len(listo), n)]
    [[1], [2, 3], [4], [5, 6], [7], [8, 9], [0]]

[UPDATE] I hated the idea that I was creating a collection of lists upfront. If the set were large enough, then both a) the time to pre-calculate the collection and b) the amount of storage required would become a problem. Just think about lrange(0,1000000, [2,3,5,6,8,7,5,43,23,4,67]) or maybe even lrange(0,1000000, range(1,5)). These samples can get rough. This update for lrange uses a “generator” design pattern in Python.

    def lrange(start, stop=None, step=[1]):
        # Like range(): a single argument is treated as the stop value.
        if stop is None:
            stop = start
            start = 0
        else:
            stop = int(stop)
            start = int(start)
        # Fall back to a step of 1 if the step list is empty.
        if not step:
            step = [1,]
        i = 0
        while start < stop:
            # Yield one (begin, end) pair at a time instead of building a list.
            yield (start, start+step[i])
            start += step[i]
            i += 1
            # Wrap around when the step list is exhausted.
            if i >= len(step):
                i = 0

I also have another sample test, as I mentioned in the update text. Since n = [1, 2, 3, 4, 5] can be represented as range(1, 6), what would it look like nested:

    >>> listo = '1234567890'
    >>> [listo[i:l] for (i,l) in lrange(0, len(listo), range(1, 6))]
    ['1', '23', '456', '7890']

Exactly the same.
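
To tie this back to the fixed-length record described at the top of the post, here is a minimal sketch of how a generator version of lrange() could split one record into its fields. Only the widths (6, 4, and 12 bytes) come from the post. The field names, the `parse_record` helper, and the sample record are hypothetical and exist purely for illustration. The sketch also assumes one character per byte.

```python
def lrange(start, stop, step):
    """Yield (begin, end) index pairs, cycling through the widths in `step`."""
    i = 0
    while start < stop:
        yield (start, start + step[i])
        start += step[i]
        i = (i + 1) % len(step)


# Hypothetical layout: the widths are from the post, the names are made up.
FIELDS = [("field_1", 6), ("field_2", 4), ("field_3", 12)]


def parse_record(record, fields=FIELDS):
    """Split a fixed-width record into a dict keyed by (assumed) field name."""
    widths = [width for _, width in fields]
    expected = sum(widths)
    # Reject short or long records so lrange() never wraps around the widths.
    if len(record) != expected:
        raise ValueError("expected %d characters, got %d" % (expected, len(record)))
    spans = lrange(0, len(record), widths)
    return {name: record[b:e] for (name, _), (b, e) in zip(fields, spans)}


# Made-up sample record: 6 + 4 + 12 characters.
sample = "123456" + "AB12" + "000000001999"
print(parse_record(sample))
```

The length check is a design choice for this sketch: because lrange() wraps around to the start of the step list, a record longer than the layout would otherwise be sliced silently with the first width again.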