Velocity Reviews > grouping array

# grouping array

pkilambi@gmail.com
Guest
Posts: n/a

 09-29-2005
hi if I have an array

say x = [[2,2,0,0,1,1],
[1,1,0,0,1,1],
[1,1,0,0,1,1]]
I basically want to group regions that are non zero like I want to get
the coordinates of non zero regions..as (x1,y1,x2,y2)
[(0,0,2,1),(0,4,2,5)] which show the top left(x1,y1) and bottom
right(x2,y2) corners of each group.hope i am clear.

Thanks

Bill Mill
Guest
Posts: n/a

 09-29-2005
On 29 Sep 2005 10:01:40 -0700, http://www.velocityreviews.com/forums/(E-Mail Removed) <(E-Mail Removed)> wrote:
> hi if I have an array
>
> say x = [[2,2,0,0,1,1],
> [1,1,0,0,1,1],
> [1,1,0,0,1,1]]
> I basically want to group regions that are non zero like I want to get
> the coordinates of non zero regions..as (x1,y1,x2,y2)
> [(0,0,2,1),(0,4,2,5)] which show the top left(x1,y1) and bottom
> right(x2,y2) corners of each group.hope i am clear.
>

I don't understand. Could you give some inputs with expected outputs
and some explanation?

Peace
Bill Mill
bill.mill at gmail.com

pkilambi@gmail.com
Guest
Posts: n/a

 09-29-2005
sure:
basically I am looking for clustering of non zero groups in that 2D
list...so in the above array the first non zero cluster is 2,2 in row
0, 1,1 in row 1 and 1,1 in row 1 so if we think of this as one group we
have the first element of the group is at (0,0) in the list and last is
at (2,1) in the list so we have that group represented as (0,0,2,1)
similarly the second non group...and finally we get the result list
with the location of that group in the whole list as
[(0,0,2,1),(0,4,2,5)...]

hope I am clear.

Fredrik Lundh
Guest
Posts: n/a

 09-29-2005
(E-Mail Removed) wrote:

> hi if I have an array
>
> say x = [[2,2,0,0,1,1],
> [1,1,0,0,1,1],
> [1,1,0,0,1,1]]
> I basically want to group regions that are non zero like I want to get
> the coordinates of non zero regions..as (x1,y1,x2,y2)
> [(0,0,2,1),(0,4,2,5)] which show the top left(x1,y1) and bottom
> right(x2,y2) corners of each group.hope i am clear.

given your definitions, neither (0, 0, 2, 1) nor (0, 4, 2, 5) are clusters
correct, and your clusters are always this simple, here's a snippet that
does what I think you want:

x = [[2,2,0,0,1,1],
[1,1,0,0,1,1],
[1,1,0,0,1,1]]

# http://www.pythonware.com/products/pil/
import Image

h = len(x)
w = len(x[0])

data = []
for row in x:
data.extend(row)

im = Image.new("L", (w, h), None)
im.putdata(data)

def runlength(x):
out = []
u = 0
for i, v in enumerate(x):
if v:
if not u:
lo = i
elif u:
out.append((lo, i))
u = v
if u: out.append((lo, i+1))
return out

xx, yy = im.getprojection()

for y in runlength(yy):
y0, y1 = y
for x in runlength(xx):
x0, x1 = x
print (y0, x0, y1-1, x1-1)

</F>

pkilambi@gmail.com
Guest
Posts: n/a

 09-29-2005
1. why are you creating an Image object here? cant this be done by
handling lists?
2. What exactly is getprojection doing?

George Sakkis
Guest
Posts: n/a

 09-29-2005
<(E-Mail Removed)> wrote:

> hi if I have an array
>
> say x = [[2,2,0,0,1,1],
> [1,1,0,0,1,1],
> [1,1,0,0,1,1]]
> I basically want to group regions that are non zero like I want to get
> the coordinates of non zero regions..as (x1,y1,x2,y2)
> [(0,0,2,1),(0,4,2,5)] which show the top left(x1,y1) and bottom
> right(x2,y2) corners of each group.hope i am clear.
>
> Thanks

I guess you imply rectangular regions only ? If not, what's the output supposed to be for

[[2,2,0,0,1,1],
[1,1,3,0,0,1],
[1,1,3,0,1,1]]

or

[[2,2,2,2],
[1,0,3,3],
[1,1,3,0]]

George

Michael Spencer
Guest
Posts: n/a

 09-30-2005
(E-Mail Removed) wrote:
> hi if I have an array
>
> say x = [[2,2,0,0,1,1],
> [1,1,0,0,1,1],
> [1,1,0,0,1,1]]
> I basically want to group regions that are non zero like I want to get
> the coordinates of non zero regions..as (x1,y1,x2,y2)
> [(0,0,2,1),(0,4,2,5)] which show the top left(x1,y1) and bottom
> right(x2,y2) corners of each group.hope i am clear.
>
> Thanks
>

def getregions(grid):
"""Yield lists of adjancent points, not necessarily rectangular"""

points = set((y,x) for y, row in enumerate(grid)
for x, cell in enumerate(row)
if cell)

while points: # set of (y,x) non-zero points
region = [points.pop()] # start a new region with any remaining point
ptr = 0
while ptr < len(region):
y, x = region[ptr]
adjpoints = set((y + j, x + i) for j, i in adj)
adjpoints &= points # keep only the non-zero, unseen points
ptr += 1
yield region

def getregioncoords(grid):
"""Get top left and bottom right of *rectangular* regions"""
regions = getregions(grid)
return [(reg[0], reg[-1]) for reg in regions if reg.sort() or True]

>>> x = [[2,2,0,0,1,1],

... [1,1,0,0,1,1],
... [1,1,0,0,1,1]]
...
...
>>> getregioncoords(x)

[((0, 0), (2, 1)), ((0, 4), (2, 5))]
>>> x = [[1,0,1,0,1]]
>>> getregioncoords(x)

[((0, 0), (0, 0)), ((0, 2), (0, 2)), ((0, 4), (0, 4))]
>>> x = [[random.choice([0,1,2]) for x in range(6)] for y in range(6)]
>>> pprint.pprint(x)

[[1, 1, 2, 1, 2, 0],
[2, 0, 0, 2, 0, 1],
[1, 2, 2, 0, 2, 0],
[0, 1, 0, 0, 0, 0],
[2, 0, 0, 1, 1, 0],
[2, 2, 2, 0, 1, 0]]
>>> print "\n".join(str(reg) for reg in getregions(x))

[(0, 1), (0, 0), (0, 2), (1, 0), (0, 3), (2, 0), (1, 3), (0, 4), (2, 1), (3,
1), (2, 2)]
[(5, 4), (4, 4), (4, 3)]
[(5, 0), (5, 1), (4, 0), (5, 2)]
[(1, 5)]
[(2, 4)]
>>>

Unfortunately, it's rather slow. This one is much faster, using just one data
structure

def getregions2(grid):
"""Yield lists of adjancent points, not necessarily rectangular"""
rows = len(grid)
cols = len(grid[0])
griddata = []
for row in grid:
griddata.extend(row)
for y in xrange(rows):
ybase = y * cols
for x in xrange(cols):
if griddata[ybase + x]:
griddata[ybase + x] = 0
region = [(y, x)]
append = region.append
ptr = 0
while ptr < len(region):
y1, x1 = region[ptr]
y2, x2 = y1 + j, x1 + i
if y2 < 0 or y2 == rows: continue
if x2 < 0 or x2 == cols: continue
if griddata[y2 * cols + x2]:
append((y2, x2))
griddata[y2 * cols + x2] = 0
ptr += 1
yield region

Michael

pkilambi@gmail.com
Guest
Posts: n/a

 09-30-2005
fredrick's solutions seems to be more closer to what I was looking
for.But I am still not sure if that could be done without the use of
Image module.
[[1, 1, 2, 1, 2, 0],
[2, 0, 0, 2, 0, 1],
[1, 2, 2, 0, 2, 0],
[0, 1, 0, 0, 0, 0],
[2, 0, 0, 1, 1, 0],
[2, 2, 2, 0, 1, 0]]
>>> print "\n".join(str(reg) for reg in getregions(x))

[(0, 1), (0, 0), (0, 2), (1, 0), (0, 3), (2, 0), (1, 3), (0, 4), (2,
1), (3,
1), (2, 2)]
[(5, 4), (4, 4), (4, 3)]
[(5, 0), (5, 1), (4, 0), (5, 2)]
[(1, 5)]
[(2, 4)]
This is kind of confusing...could you please correlate the grid to the
result and explain

Michael Spencer
Guest
Posts: n/a

 09-30-2005
(E-Mail Removed) wrote:
> fredrick's solutions seems to be more closer to what I was looking
> for.But I am still not sure if that could be done without the use of
> Image module.

What do you mean by "closer to what I was looking
for"? For the single test case you provided:

> say x = [[2,2,0,0,1,1],
> [1,1,0,0,1,1],
> [1,1,0,0,1,1]]
> I basically want to group regions that are non zero like I want to get
> the coordinates of non zero regions..as (x1,y1,x2,y2)
> [(0,0,2,1),(0,4,2,5)] which show the top left(x1,y1) and bottom
> right(x2,y2) corners of each group.hope i am clear.
>

my solution provides the correct output:

>>> x = [[2,2,0,0,1,1],

... [1,1,0,0,1,1],
... [1,1,0,0,1,1]]
...
...
>>> getregioncoords(x)

[((0, 0), (2, 1)), ((0, 4), (2, 5))]

* except that the points aren't flattened. If that's important to you, rewrite
getregioncoords as follows:

def getregioncoords(grid):
"""Get top left and bottom right of *rectangular* regions"""
regions = getregions(grid)
return [reg[0]+reg[-1] for reg in regions if reg.sort() or True]

>>> getregioncoords(x)

[(0, 0, 2, 1), (0, 4, 2, 5)]
>>>

I broke the solution into two parts:

1) the getregions generator yields a list of all the contiguous regions. The
output below is the lists of coordinates that are contiguous non-zero cells in
the grid.

> [[1, 1, 2, 1, 2, 0],
> [2, 0, 0, 2, 0, 1],
> [1, 2, 2, 0, 2, 0],
> [0, 1, 0, 0, 0, 0],
> [2, 0, 0, 1, 1, 0],
> [2, 2, 2, 0, 1, 0]]
> >>> print "\n".join(str(reg) for reg in getregions(x))

> [(0, 1), (0, 0), (0, 2), (1, 0), (0, 3), (2, 0), (1, 3), (0, 4), (2,
> 1), (3,
> 1), (2, 2)]
> [(5, 4), (4, 4), (4, 3)]
> [(5, 0), (5, 1), (4, 0), (5, 2)]
> [(1, 5)]
> [(2, 4)]

2) If the regions are rectangular, the getregioncoords functions returns the
coordinates of the top-left and bottom-right points. You did not answer the
previous post which asked what to do if the regions were not rectangular.

HTH

Michael