Python – how to flatten a 2D list to 1D without using numpy?

arrayslistpython

I have a list looks like this:

[[1,2,3],[1,2],[1,4,5,6,7]]

and I want to flatten it into [1,2,3,1,2,1,4,5,6,7]

is there a light weight function to do this without using numpy?

Best Answer

Without numpy ( ndarray.flatten ) one way would be using chain.from_iterable which is an alternate constructor for itertools.chain :

>>> list(chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]]))
[1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

Or as another yet Pythonic approach you can use a list comprehension :

[j for sub in [[1,2,3],[1,2],[1,4,5,6,7]] for j in sub]

Another functional approach very suitable for short lists could also be reduce in Python2 and functools.reduce in Python3 (don't use this for long lists):

In [4]: from functools import reduce # Python3

In [5]: reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])
Out[5]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

To make it slightly faster you can use operator.add, which is built-in, instead of lambda:

In [6]: from operator import add

In [7]: reduce(add ,[[1,2,3],[1,2],[1,4,5,6,7]])
Out[7]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

In [8]: %timeit reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])
789 ns ± 7.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [9]: %timeit reduce(add ,[[1,2,3],[1,2],[1,4,5,6,7]])
635 ns ± 4.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

benchmark:

:~$ python -m timeit "from itertools import chain;chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 1.58 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 0.791 usec per loop
:~$ python -m timeit "[j for i in [[1,2,3],[1,2],[1,4,5,6,7]] for j in i]"
1000000 loops, best of 3: 0.784 usec per loop

A benchmark on @Will's answer that used sum (its fast for short list but not for long list) :

:~$ python -m timeit "sum([[1,2,3],[4,5,6],[7,8,9]], [])"
1000000 loops, best of 3: 0.575 usec per loop
:~$ python -m timeit "sum([range(100),range(100)], [])"
100000 loops, best of 3: 2.27 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[range(100),range(100)])"
100000 loops, best of 3: 2.1 usec per loop