Velocity Reviews > testing if a list contains a sublist

# testing if a list contains a sublist

Johannes
Guest
Posts: n/a

 08-15-2011
hi list,
what is the best way to check if a given list (lets call it l1) is
totally contained in a second list (l2)?

for example:
l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2

my problem is the second example, which makes it impossible to work with
sets insteads of lists. But something like set.issubset for lists would
be nice.

greatz Johannes

Roy Smith
Guest
Posts: n/a

 08-16-2011
In article <(E-Mail Removed)>,
Johannes <(E-Mail Removed)> wrote:

> hi list,
> what is the best way to check if a given list (lets call it l1) is
> totally contained in a second list (l2)?
>
> for example:
> l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
> l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2
>
> my problem is the second example, which makes it impossible to work with
> sets insteads of lists. But something like set.issubset for lists would
> be nice.
>
> greatz Johannes

import re

def sublist(l1, l2):
s1 = ''.join(map(str, l1))
s2 = ''.join(map(str, l2))
return re.search(s1, s2)

assert sublist([1,2], [1,2,3,4,5])
assert not sublist ([1,2,2], [1,2,3,4,5])
assert not sublist([1,2,3], [1,3,5,7])

(running and ducking)

Steven D'Aprano
Guest
Posts: n/a

 08-16-2011
On Tue, 16 Aug 2011 09:26 am Johannes wrote:

> hi list,
> what is the best way to check if a given list (lets call it l1) is
> totally contained in a second list (l2)?

This is not the most efficient algorithm, but for short lists it should be
plenty fast enough:

def contains(alist, sublist):
if len(sublist) == 0 or len(sublist) > len(alist):
return False
start = 0
while True:
try:
p = alist.index(sublist[0], start)
except ValueError:
return False
for i,x in enumerate(sublist):
if alist[p+i] != x:
start = p+1
break
else: # for loop exits without break
return True

--
Steven

ChasBrown
Guest
Posts: n/a

 08-16-2011
On Aug 15, 4:26*pm, Johannes <(E-Mail Removed)> wrote:
> hi list,
> what is the best way to check if a given list (lets call it l1) is
> totally contained in a second list (l2)?
>
> for example:
> l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
> l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2
>
> my problem is the second example, which makes it impossible to work with
> sets insteads of lists. But something like set.issubset for lists would
> be nice.
>
> greatz Johannes

My best guess:

from collections import Counter

def contains(alist, sublist):
alist = Counter(alist)
for k in sublist:
try:
alist[k] -= 1
if alist[k] < 0:
return False
except KeyError:
return False
return True

'Counter' being better understood as 'Multiset'.

If m = len(alist) and n = len(sublist), looks to be roughly O(m+n).

Cheers - Chas

ChasBrown
Guest
Posts: n/a

 08-16-2011
On Aug 15, 4:26*pm, Johannes <(E-Mail Removed)> wrote:
> hi list,
> what is the best way to check if a given list (lets call it l1) is
> totally contained in a second list (l2)?
>
> for example:
> l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
> l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2
>
> my problem is the second example, which makes it impossible to work with
> sets insteads of lists. But something like set.issubset for lists would
> be nice.
>
> greatz Johannes

My best guess:

from collections import Counter

def contains(alist, sublist):
alist = Counter(alist)
for k in sublist:
try:
alist[k] -= 1
if alist[k] < 0:
return False
except KeyError:
return False
return True

'Counter' being better understood as 'Multiset'.

If m = len(alist) and n = len(sublist), looks to be roughly O(m+n).

Cheers - Chas

ChasBrown
Guest
Posts: n/a

 08-16-2011
On Aug 15, 4:26*pm, Johannes <(E-Mail Removed)> wrote:
> hi list,
> what is the best way to check if a given list (lets call it l1) is
> totally contained in a second list (l2)?
>
> for example:
> l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
> l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2
>
> my problem is the second example, which makes it impossible to work with
> sets insteads of lists. But something like set.issubset for lists would
> be nice.
>
> greatz Johannes

My best guess:

from collections import Counter

def contains(alist, sublist):
alist = Counter(alist)
for k in sublist:
try:
alist[k] -= 1
if alist[k] < 0:
return False
except KeyError:
return False
return True

'Counter' being better understood as 'Multiset'.

If m = len(alist) and n = len(sublist), looks to be roughly O(m+n).

Cheers - Chas

Laszlo Nagy
Guest
Posts: n/a

 08-16-2011

>> hi list,
>> what is the best way to check if a given list (lets call it l1) is
>> totally contained in a second list (l2)?
>>
>> for example:
>> l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
>> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
>> l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2
>>
>> my problem is the second example, which makes it impossible to work with
>> sets insteads of lists. But something like set.issubset for lists would
>> be nice.
>>
>> greatz Johannes

Fastest, error-free and simplest solution is to use sets:

>>> l1 = [1,2]
>>> l2 = [1,2,3,4,5]
>>> set(l1)-set(l2)

set([])
>>> set(l2)-set(l1)

set([3, 4, 5])
>>>

Although with big lists, this is not very memory efficient. But I must
tell you, sometimes I use this method for lists with millions of
integers, and it is very fast and reliable, and memory is not a concern
for me, at least - some million integers will fit into a few MB of
difference etc.

Best,

Laszlo

alex23
Guest
Posts: n/a

 08-16-2011
On Aug 16, 4:51*pm, Laszlo Nagy <(E-Mail Removed)> wrote:
> >> hi list,
> >> what is the best way to check if a given list (lets call it l1) is
> >> totally contained in a second list (l2)?

>
> >> for example:
> >> l1 = [1,2], l2 = [1,2,3,4,5] -> *l1 is contained in l2
> >> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> *l1 is not contained in l2
> >> l1 = [1,2,3], l2 = [1,3,5,7] -> *l1 is not contained in l2

>
> >> my problem is the second example, which makes it impossible to work with
> >> sets insteads of lists. But something like set.issubset for lists would
> >> be nice.

>
> >> greatz Johannes

>
> Fastest, error-free and simplest solution is to use sets:
>
> *>>> l1 = [1,2]
> *>>> l2 = [1,2,3,4,5]
> *>>> set(l1)-set(l2)
> set([])
> *>>> set(l2)-set(l1)
> set([3, 4, 5])

Error free? Consider this stated requirement:
> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2

>>> l1 = [1,2,2]
>>> l2 = [1,2,3,4,5]
>>> set(l1)-set(l2)

set()
>>> set(l2)-set(l1)

{3, 4, 5}

So set([1,2]) == set([1,2,2]), despite [1,2,2] not being a sublist of
the original.

It also completely ignores list order, which would make [9,8,7] a
sublist of [5,6,7,8,9].

alex23
Guest
Posts: n/a

 08-16-2011
Laszlo Nagy <(E-Mail Removed)> wrote:
> Fastest, error-free and simplest solution is to use sets:
>
> *>>> l1 = [1,2]
> *>>> l2 = [1,2,3,4,5]
> *>>> set(l1)-set(l2)
> set([])
> *>>> set(l2)-set(l1)
> set([3, 4, 5])
> *>>>

Error-free? Not given the stated requirements:

> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2

>>> l1 = [1,2,2]
>>> l2 = [1,2,3,4,5]
>>> set(l1)-set(l2)

set()
>>> set(l2)-set(l1)

{3, 4, 5}

As you can see, using sets provides the exact same result for a non-
contained sub-list as it does for your valid example. The list
[1,2,2,2,2,2,2,2,2] is clearly not contained with [1,2,3], but your
code would say it is.

Similarly, [9,8,7] would appear to be a sub-list of [5,6,7,8,9],
something I'd also considered to be false in this instance.

(My apologies if this is a double-up, the original post hasn't
appeared yet)

ChasBrown
Guest
Posts: n/a

 08-16-2011
On Aug 15, 11:51*pm, Laszlo Nagy <(E-Mail Removed)> wrote:
> >> hi list,
> >> what is the best way to check if a given list (lets call it l1) is
> >> totally contained in a second list (l2)?

>
> >> for example:
> >> l1 = [1,2], l2 = [1,2,3,4,5] -> *l1 is contained in l2
> >> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> *l1 is not contained in l2
> >> l1 = [1,2,3], l2 = [1,3,5,7] -> *l1 is not contained in l2

>
> >> my problem is the second example, which makes it impossible to work with
> >> sets insteads of lists. But something like set.issubset for lists would
> >> be nice.

>
> >> greatz Johannes

>
> Fastest, error-free and simplest solution is to use sets:
>
> *>>> l1 = [1,2]
> *>>> l2 = [1,2,3,4,5]
> *>>> set(l1)-set(l2)
> set([])
> *>>> set(l2)-set(l1)
> set([3, 4, 5])
> *>>>
>

This approach fails the OP's desires when

>>> l1 = [1,2,2,]
>>> l2 = [1,2,3,4,5]

The OP explicitly desires 'l2 contains l1' to be false in that case.

Cheers - Chas