Velocity Reviews


Rustom Mody 05-10-2009 04:10 PM

help with recursive whitespace filter in
 
I am trying to write a recursive filter to remove whitespace-only
nodes for minidom.
The code is below.

Strangely, it deletes some whitespace nodes and leaves others.
If I keep calling it, like so: fws(fws(fws(doc))), then at some
stage all the ws nodes disappear.

Does anybody have a clue?


from xml.dom.minidom import parse

#The input to fws is the output of parse("something.xml")


def fws(ele):
    """ filter white space (recursive)"""

    for c in ele.childNodes:
        if isWsNode(c):
            ele.removeChild(c)
            #c.unlink() Makes no diff whether this is there or not
        elif c.nodeType == ele.ELEMENT_NODE:
            fws(c)


def isWsNode(ele):
    return (ele.nodeType == ele.TEXT_NODE and not ele.data.strip())

Steve Howell 05-10-2009 04:49 PM

Re: help with recursive whitespace filter in
 
On May 10, 9:10 am, Rustom Mody <rustompm...@gmail.com> wrote:
> I am trying to write a recursive filter to remove whitespace-only
> nodes for minidom.
> The code is below.
>
> Strangely it deletes some whitespace nodes and leaves some.
> If I keep calling it -- like so: fws(fws(fws(doc))) then at some
> stage all the ws nodes disappear
>
> Does anybody have a clue?
>
> from xml.dom.minidom import parse
>
> #The input to fws is the output of parse("something.xml")
>
> def fws(ele):
>     """ filter white space (recursive)"""
>
>     for c in ele.childNodes:
>         if isWsNode(c):
>             ele.removeChild(c)
>             #c.unlink() Makes no diff whether this is there or not
>         elif c.nodeType == ele.ELEMENT_NODE:
>             fws(c)
>
> def isWsNode(ele):
>     return (ele.nodeType == ele.TEXT_NODE and not ele.data.strip())


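I would avoid doing things like delete/remove in a loop over the same
list. Each removal shifts the remaining items down, so the loop skips
the node right after the one it just removed. Instead, build a list of
things to delete and remove them afterwards.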
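
To see the skipping in action, here is a minimal sketch. The sample XML string and the names is_ws_node, fws_buggy and count_ws are made up for illustration; fws_buggy has the same shape as the posted loop.

```python
from xml.dom.minidom import parseString


def is_ws_node(node):
    # True for text nodes that contain only whitespace.
    return node.nodeType == node.TEXT_NODE and not node.data.strip()


def fws_buggy(ele):
    # Same shape as the posted code: removes from childNodes while iterating it.
    for c in ele.childNodes:
        if is_ws_node(c):
            ele.removeChild(c)
        elif c.nodeType == c.ELEMENT_NODE:
            fws_buggy(c)


def count_ws(node):
    # Count whitespace-only text nodes anywhere below node.
    return sum(is_ws_node(c) + count_ws(c) for c in node.childNodes)


# Hypothetical sample document, chosen so that an element follows a removed node.
doc = parseString("<r> <a> <x/> </a> <b/> </r>")
before = count_ws(doc)
fws_buggy(doc.documentElement)
after_one = count_ws(doc)   # <a> was skipped, so its whitespace is still there
fws_buggy(doc.documentElement)
after_two = count_ws(doc)
print(before, after_one, after_two)
```

In this sample the first pass never visits <a>, because removing the text node in front of it moves <a> into the slot the loop has already passed. That would explain why repeated calls eventually clean everything up.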

rustom 05-10-2009 05:23 PM

Re: help with recursive whitespace filter in
 
On May 10, 9:49 pm, Steve Howell <showel...@yahoo.com> wrote:
> I would avoid doing things like delete/remove in a loop. Instead
> build a list of things to delete.


Yeah, I know. I would write the whole damn thing functionally if I knew
how, but I can't figure out the API.
I actually started out to write a (Haskell-style) function that copies
out the whole tree minus the unwanted nodes, but could not figure it out.

MRAB 05-10-2009 05:35 PM

Re: help with recursive whitespace filter in
 
rustom wrote:
> On May 10, 9:49 pm, Steve Howell <showel...@yahoo.com> wrote:
>> I would avoid doing things like delete/remove in a loop. Instead
>> build a list of things to delete.

>
> Yeah I know. I would write the whole damn thing functionally if I knew
> how. But cant figure out the API.
> I actually started out to write a (haskell-style) copy out the whole
> tree minus the unwanted nodes but could not figure it out
>

def fws(ele):
    """ filter white space (recursive)"""
    empty_nodes = []
    for c in ele.childNodes:
        if isWsNode(c):
            empty_nodes.append(c)
        elif c.nodeType == ele.ELEMENT_NODE:
            fws(c)
    for c in empty_nodes:
        ele.removeChild(c)
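
A variation on the same idea, as a self-contained sketch: loop over a snapshot of the child list, so removals cannot disturb the iteration. The sample XML and the name is_ws_node are made up for illustration, and returning ele is only a convenience so that calls can be chained.

```python
from xml.dom.minidom import parseString


def is_ws_node(node):
    # True for text nodes that contain only whitespace.
    return node.nodeType == node.TEXT_NODE and not node.data.strip()


def fws(ele):
    """Filter white space (recursive), iterating over a copy of childNodes."""
    for c in list(ele.childNodes):   # list(...) takes a snapshot
        if is_ws_node(c):
            ele.removeChild(c)
        elif c.nodeType == c.ELEMENT_NODE:
            fws(c)
    return ele


# Hypothetical sample document.
doc = parseString("<r> <a> <x/> </a> <b/> </r>")
result = fws(doc.documentElement).toxml()
print(result)
```

Either version should do the job in one pass; the snapshot just saves the second loop.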

Steve Howell 05-11-2009 12:17 AM

Re: help with recursive whitespace filter in
 
On May 10, 10:23 am, rustom <rustompm...@gmail.com> wrote:
> On May 10, 9:49 pm, Steve Howell <showel...@yahoo.com> wrote:
> > I would avoid doing things like delete/remove in a loop. Instead
> > build a list of things to delete.

>
> Yeah I know. I would write the whole damn thing functionally if I knew
> how. But cant figure out the API.
> I actually started out to write a (haskell-style) copy out the whole
> tree minus the unwanted nodes but could not figure it out


You can use list comprehensions for a more functional style.

Instead of deleting the unwanted nodes in place, try to create new
lists of just the wanted results.
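
One possible reading of that, as a rough sketch: copy the tree and use a list comprehension to keep only the wanted children at each level. The name without_ws and the sample XML are made up for illustration. It assumes it is called on an element rather than on the Document itself, since minidom's Document.cloneNode(False) returns None.

```python
from xml.dom.minidom import parseString


def is_ws_node(node):
    # True for text nodes that contain only whitespace.
    return node.nodeType == node.TEXT_NODE and not node.data.strip()


def without_ws(node):
    """Return a copy of node with whitespace-only text nodes left out."""
    copy = node.cloneNode(False)   # shallow clone: attributes but no children
    kept = [without_ws(c) for c in node.childNodes if not is_ws_node(c)]
    for child in kept:
        copy.appendChild(child)
    return copy


# Hypothetical sample document.
doc = parseString('<r id="1"> <a> hi </a> <b/> </r>')
clean = without_ws(doc.documentElement)
print(clean.toxml())
print(doc.documentElement.toxml())   # the original tree is left untouched
```

The copy still belongs to the same owner document as the original, so it can be swapped in with replaceChild if that is what you want.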


All times are GMT.