Velocity Reviews > testing if a list contains a sublist

# testing if a list contains a sublist

Johannes
Guest
Posts: n/a

 08-16-2011
Am 16.08.2011 10:00, schrieb Laszlo Nagy:
>
>> Error free? Consider this stated requirement:
>>> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2

> If you look it the strict way, "containment" relation for lists is meant
> this way:
>
>
> l1 = []
> l2 = [1,l1,2] # l2 CONTAINS l1
>
> But you are right, I was wrong. So let's clarify what the OP wants!
>
> For example:
>
> l1 = [1,2,2,], l2 = [2,1,2,3,4,5]

I've chosen the following solution

> def _list_contained_in_list(l1,l2):
> d1 = {}
> d2 = {}
> for i in l1:
> if i in d1:
> d1[i] += 1
> else:
> d1[i] = 1
> for i in l2:
> if i in d2:
> d2[i] += 1
> else:
> d2[i] = 1
> if not all([k in d2.keys() for k in d1.keys()]):
> return false
> return all([d1[i] <= d2[i] for i in d1])

greatz Johannes

Alain Ketterlin
Guest
Posts: n/a

 08-16-2011
Laszlo Nagy <(E-Mail Removed)> writes:

>>>>> def sublist(lst1, lst2):

>> s1 = ','.join(map(str, lst1))
>> s2 = ','.join(map(str, lst2))
>> return False if s2.find(s1)==-1 else True
>>
>> I don't know about best, but it works for the examples given.
>>

> For numbers, it will always work.

I'm not even sure: if str() is locale-sensitive, it may introduce commas
in numbers (I don't know whether str() is locale-sensitive, the doc
doesn't mention anything.)

>
> lst1 = [",",",,"]
> lst1 = [",",",",","]

Yes.

It will also fail on nested lists, for fundamental reasons which are
impossible to handle with regexps. (Tough I'm not sure the OP had nested
lists in mind.)

The "brute-force" algorithm given somewhere else in this thread is
probably the way to go, unless the lists are really long, in which case
one of the "string searching" algorithm should be used (I would be
surprised noone has implemented Boyer-Moore or Karp-Rabin).

-- Alain.

Neil Cerutti
Guest
Posts: n/a

 08-16-2011
On 2011-08-16, nn <(E-Mail Removed)> wrote:
> That can be easily fixed:
>
>>>> def sublist(lst1, lst2):

> s1 = ','.join(map(str, lst1))
> s2 = ','.join(map(str, lst2))
> return False if s2.find(s1)==-1 else True
>
>>>> sublist([1,2,3],[1,2,3,4,5])

> True
>>>> sublist([1,2,2],[1,2,3,4,5])

> False
>>>> sublist([1,2,3],[1,3,5,7])

> False
>>>> sublist([12],[1,2])

> False
>>>>

String conversion is risky:

>>> sublist(['1,2', '3,4'], [1, 2, 3, 4])

True

Since we're bike-shedding, here's my entry. It's not clear to me
if accepting iterables rather than lists is a win, but I thought,
"Why not be general if the implementation is easy?"

def is_subseq_of(x, y):
r"""Return True if the elements in iterable x are found contiguously in
iterable y.

>>> is_subseq_of([], [])

True
>>> is_subseq_of([], [1, 2, 3])

True
>>> is_subseq_of([1], [1, 2, 3])

True
>>> is_subseq_of([1], [])

False
>>> is_subseq_of([4, 5], [1, 2, 3, 4, 5])

True
>>> is_subseq_of([1, 2], [1, 3, 2])

False
>>> is_subseq_of([2, 3], [1, 2, 3, 4])

True
>>> is_subseq_of([1, 2, 2], [1, 2, 3, 4, 5])

False
>>> is_subseq_of([1, 2], [1, 2])

True
>>> is_subseq_of([1, 2, 3], [1, 2])

False
>>> is_subseq_of(['1,2', '3,4'], [1, 2, 3, 4])

False
"""
x = tuple(x)
ix = 0
lenx = len(x)
if lenx == 0:
return True
for elem in y:
if x[ix] == elem:
ix += 1
else:
ix = 0
if ix == lenx:
return True
return False

if __name__ == '__main__':
import doctest
doctest.testmod()

--
Neil Cerutti

ChasBrown
Guest
Posts: n/a

 08-17-2011
On Aug 16, 1:37*am, Steven D'Aprano <steve
(E-Mail Removed)> wrote:
> On Tue, 16 Aug 2011 04:14 pm ChasBrown wrote:
>
>
>
> > On Aug 15, 4:26*pm, Johannes <(E-Mail Removed)> wrote:
> >> hi list,
> >> what is the best way to check if a given list (lets call it l1) is
> >> totally contained in a second list (l2)?

>
> >> for example:
> >> l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
> >> l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
> >> l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2

>
> >> my problem is the second example, which makes it impossible to work with
> >> sets insteads of lists. But something like set.issubset for lists would
> >> be nice.

>
> >> greatz Johannes

>
> > My best guess:

>
> > from collections import Counter

>
> There's no reason to think that the Original Poster wants a multiset based
> solution. He asked about lists and sublists. That's a standard term, like
> substring:
>
> "12" is a substring of "01234".
> "21" and "13" are not.
>
> [1, 2] is a sublist of [0, 1, 2, 3, 4].
> [2, 1] and [1, 3] are not.
>
> Since lists are ordered, so are sublists.
>

That's reasonable; although except in the subject, the OP never uses
the term 'sublist'; instead using more ambiguous terms like
'contains', 'is totally contained', etc., with definition by limited
example. So it was a bit of a guess on my part of what was wanted.

> If the OP does want a solution that ignores order, then he needs to describe
> his problem better.

As it turns out, in another response the OP says he wants [2,1,2] to
be 'contained' by [1,2,2]. But in any case he always has sorted lists,
in which case, interestingly, the multiset approach and your more
canonical sublist approach yield the same results.

Cheers - Chas

Nobody
Guest
Posts: n/a

 08-17-2011
On Tue, 16 Aug 2011 09:57:57 -0400, John Posner wrote:

> How about using Python's core support for "==" on list objects:

> for i in range(alist_sz - slist_sz + 1):
> if slist == alist[i:i+slist_sz]:
> return True

This is bound to be asymptotically O(alist_sz * slist_sz), even if the
constant factor is reduced by use of ==. Boyer-Moore and regexps are
asymptotically O(alist_sz). However, the setup costs are much higher, so
you might need alist_sz to be very large before they win out.

Steven D'Aprano
Guest
Posts: n/a

 08-20-2011
Johannes wrote:

> hi list,
> what is the best way to check if a given list (lets call it l1) is
> totally contained in a second list (l2)?

[...]

For anyone interested, here's a pair of functions that implement
sub-sequence testing similar to str.find and str.rfind:

http://code.activestate.com/recipes/...-sub-sequence/

--
Steven