Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Memset is faster than simple loop?

Reply
Thread Tools

Memset is faster than simple loop?

 
 
AndersWang@gmail.com
Guest
Posts: n/a
 
      03-21-2007
Eric Sosman,

I appreciate your replies. I have to say I partially agree with you

> An observation: If the time spent in memset() is a
> significant or even a noticeable fraction of your program's
> running time, it is highly unlikely that the cure is to use
> a faster memset() equivalent.


That's right! I agree

>(After all, memset() generates
> almost no "new information," and does not "advance the state
> of the computation" by very much.) Rather, the cure is to
> consider what it is about the program that makes the memset()
> necessary, and to rearrange things so it isn't.


It really depands on what you are gonna do. Let say, I have an
initialized integer array with ten thousand elements,all elements
inside are zero, and then I will fill out this array by user input,
actually I don't know how many elements will be assigned new values
other than 0.(assuming 0 is not valid from user input).

So, if I want to retrieve these values, I have to loop the array until
seeing the first zero element. In this scenario,"new information" is
also very important. If I can not initialize the array with 0 , no
expected result will happen next. memset() becomes crucial to my
program.

> Just about the only time memset() speed is crucial is when
> one needs to "destroy" information. For example, an O/S may want
> to ensure that a new memory page given to Program X doesn't still
> contain data left in it by Previous Program Y, and so uses memset()
> or an equivalent to clobber Y's data. The speed of memset() in
> this kind of setting can indeed be important -- but in most user
> programs, memset() should be a negligible fraction of the running
> time, and even an infinite speedup makes a negligible overall
> improvement.


If the size of the memory block is very huge, it'll be a different
story,isn't it?


 
Reply With Quote
 
 
 
 
Eric Sosman
Guest
Posts: n/a
 
      03-21-2007
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> Eric Sosman,
>> [...]
>> (After all, memset() generates
>> almost no "new information," and does not "advance the state
>> of the computation" by very much.) Rather, the cure is to
>> consider what it is about the program that makes the memset()
>> necessary, and to rearrange things so it isn't.

>
> It really depands on what you are gonna do. Let say, I have an
> initialized integer array with ten thousand elements,all elements
> inside are zero, and then I will fill out this array by user input,
> actually I don't know how many elements will be assigned new values
> other than 0.(assuming 0 is not valid from user input).
>
> So, if I want to retrieve these values, I have to loop the array until
> seeing the first zero element. In this scenario,"new information" is
> also very important. If I can not initialize the array with 0 , no
> expected result will happen next. memset() becomes crucial to my
> program.


Probably not. Compared to the time taken by the input
itself, the time for memset() will likely be negligible. Ten
thousand elements, you say? Let's suppose we can get them all
with one disk read, which will take perhaps ten milliseconds.
In that ten milliseconds, your CPU's clock will tick ten to
thirty million times. With a dual-issue instruction pipeline
that's twenty to sixty million instruction opportunities, forty
to one-twenty with a quad-issue. ("Instruction opportunities"
are not necessarily "instructions executed" because the CPU will
eventually need to wait for memory to catch up to it, but you
get the idea: The speed difference between the CPU and the I/O
is about six or seven decimal orders of magnitude.)

"File system buffer cache," I hear you cry. Fine, but you've
told us this is some kind of sparse array, which means you can't
just read the data "in place." Instead, you need to read it into
some kind of buffer, then examine what you find in the buffer,
decide where it belongs in the big array, and deposit it there.
Do you think that will be faster or slower than the memset()?
And then you've got to scan the whole array afterwards, searching
out the data you deposited there -- faster than memset(), or
slower?

"But it's a *very* sparse array, and I'll only need to take
the decide-and-deposit step a few times, whereas memset() needs
to clear the entire thing." If the array is *that* sparse, it's
probably the wrong data structure: Stick the data on a linked
list, for example, and memset() will be entirely unnecessary.

> If the size of the memory block is very huge, it'll be a different
> story,isn't it?


Yes, clearing a large memory area to all-bytes-constant
will probably take a noticeable amount of time. My point is
that the clearing itself is often a waste, and could be avoided
altogether by rearranging the program.

An old friend of mine used to characterize this sort of thing
as a laudable effort to clean the beach of bottle caps and other
kinds of trash, to make the sand nice and neat around the whale
carcases.

--
Eric Sosman
(E-Mail Removed)lid
 
Reply With Quote
 
 
 
 
AndersWang@gmail.com
Guest
Posts: n/a
 
      03-21-2007
Eric,

Now I understand you

Don't put too much energy on this sort of tiny things, try to figure
out how to get rid of the whale carcases. Don't use memset(),if not
necessary. If have to, memset() won't be ur concern on performance,
unless the purpose of the program is to test the speed of memset()
itself

 
Reply With Quote
 
Jens Thoms Toerring
Guest
Posts: n/a
 
      03-22-2007
(E-Mail Removed) wrote:
> Don't put too much energy on this sort of tiny things, try to figure
> out how to get rid of the whale carcases. Don't use memset(),if not
> necessary. If have to, memset() won't be ur concern on performance,
> unless the purpose of the program is to test the speed of memset()
> itself


If I understood Eric correctly he wasn't advising against the use
of memset() but against worrying about possible speed differences
between using memset() and a loop. Actually there are arguments for
using memset() instead of a loop: it's shorter (just one line of
code instead of 2 or 3), no need for a loop variable and it might
be easier to understand what you're doing there (and it might even
be a bit faster. On the other hand you have to know where you
can't use memset() (e.g. for zeroing out an array of doubles or
pointers).
Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
Reply With Quote
 
user923005
Guest
Posts: n/a
 
      03-22-2007
On Mar 21, 1:36 pm, Eric Sosman <(E-Mail Removed)> wrote:
[snip]
> An old friend of mine used to characterize this sort of thing
> as a laudable effort to clean the beach of bottle caps and other
> kinds of trash, to make the sand nice and neat around the whale
> carcases.


Too bad it needs context, or that would be a Jim-Dandy Sig.

Duff's device makes for a faster memset, except when it makes things
slower (Dik Winter provided an example IIRC).

Typically, library memsets use CPU specific assembly instructions to:
Fill in register wide words at a time,
then with modulus remainder -- fill in bytes.

Alternatively, some memset variants will use fancy-pants registers
like the MMX stuff or other CPU specific instuctions.

At any rate, trying to speed up memset() is a tempest in a teapot if
there ever was one.

Once again, I am forced to cough up Mike Lee's mantra:
"The number one rule of code optimization is: Don't do it.
The number two rule of code optimization (for experts only) is: Don't
do it yet."

IMO-YMMV.

 
Reply With Quote
 
Eric Sosman
Guest
Posts: n/a
 
      03-22-2007
Jens Thoms Toerring wrote:
> (E-Mail Removed) wrote:
>> Don't put too much energy on this sort of tiny things, try to figure
>> out how to get rid of the whale carcases. Don't use memset(),if not
>> necessary. If have to, memset() won't be ur concern on performance,
>> unless the purpose of the program is to test the speed of memset()
>> itself

>
> If I understood Eric correctly he wasn't advising against the use
> of memset() but against worrying about possible speed differences
> between using memset() and a loop. Actually there are arguments for
> using memset() instead of a loop: it's shorter (just one line of
> code instead of 2 or 3), no need for a loop variable and it might
> be easier to understand what you're doing there (and it might even
> be a bit faster. On the other hand you have to know where you
> can't use memset() (e.g. for zeroing out an array of doubles or
> pointers).


Right: that's my argument, with the further point that if
you're doing so much memory-clearing that memset() becomes a
bottleneck, you're doing far too much memory-clearing and should
reconsider your design.

--
Eric Sosman
(E-Mail Removed)lid
 
Reply With Quote
 
Eric Sosman
Guest
Posts: n/a
 
      03-22-2007
user923005 wrote:
> [...]
> Duff's device makes for a faster memset, except when it makes things
> slower (Dik Winter provided an example IIRC).


FWIW, Duff's original Device was not memset-like.

http://en.wikipedia.org/wiki/Duff's_device
> [...]
> Once again, I am forced to cough up Mike Lee's mantra:
> "The number one rule of code optimization is: Don't do it.
> The number two rule of code optimization (for experts only) is: Don't
> do it yet."


s/Mike Lee/Michael A. Jackson/

Since I harp on this theme at such nauseating length, some
people have accused me of being in favor of slow code. I reject
the accusation: I am in favor of optimization, but only when it
makes economic sense. If I spend five minutes on an optimization
that shaves a microsecond off a piece of code that will run one
million times, I have made a bad bargain. But if the code will
run a million times a minute twenty-four by seven, or if the
microsecond savings is the difference between keeping the video
streaming smoothly or suffering a glitch, that's another matter.

Like Theodore Roosevelt's Big Stick, optimization skills
should be carefully honed, well maintained, and rarely used.

--
Eric Sosman
(E-Mail Removed)lid
 
Reply With Quote
 
user923005
Guest
Posts: n/a
 
      03-22-2007
On Mar 22, 6:02 am, Eric Sosman <(E-Mail Removed)> wrote:
> user923005 wrote:
> > [...]
> > Duff's device makes for a faster memset, except when it makes things
> > slower (Dik Winter provided an example IIRC).

>
> FWIW, Duff's original Device was not memset-like.
>
> http://en.wikipedia.org/wiki/Duff's_device

[snip]
It's also a C-FAQ, but (of course) the idea is used to unroll loops
and (ostensibly) make them faster.
20.35: What is "Duff's Device"?

A: It's a devastatingly deviously unrolled byte-copying loop,
devised by Tom Duff while he was at Lucasfilm. In its "classic"
form, it looks like:

register n = (count + 7) / 8; /* count > 0 assumed */
switch (count %
{
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}

where count bytes are to be copied from the array pointed to by
from to the memory location pointed to by to (which is a memory-
mapped device output register, which is why to isn't
incremented). It solves the problem of handling the leftover
bytes (when count isn't a multiple of by interleaving a
switch statement with the loop which copies bytes 8 at a time.
(Believe it or not, it *is* legal to have case labels buried
within blocks nested in a switch statement like this. In his
announcement of the technique to C's developers and the world,
Duff noted that C's switch syntax, in particular its "fall
through" behavior, had long been controversial, and that "This
code forms some sort of argument in that debate, but I'm not
sure whether it's for or against.")

But appearances can be deceiving. Despite that wonderfully arcane
switch goo, it might not be faster (and could even be slower).

Most optimizing compilers can unroll loops automatically anyway.

 
Reply With Quote
 
Randy Howard
Guest
Posts: n/a
 
      03-22-2007
On Thu, 22 Mar 2007 14:06:33 -0500, user923005 wrote
(in article <(E-Mail Removed) .com>):

[duff's device]

> But appearances can be deceiving. Despite that wonderfully arcane
> switch goo, it might not be faster (and could even be slower).
>
> Most optimizing compilers can unroll loops automatically anyway.


It turns out that isn't always a true. There was an interesting
article on this in DDJ in the last year or two that goes into some
depth on it and some variations on the same theme.



--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw





 
Reply With Quote
 
Kevin D. Quitt
Guest
Posts: n/a
 
      03-28-2007
On 21 Mar 2007 11:08:54 -0700, (E-Mail Removed) wrote:
>You are also able to find the c run-time library source codes in your
>VSStudio/vc_version/crt/src folder


I've checked my debian system all over and I can't find any such directory.


--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Faster way to get PHP script than LWP::Simple Jason Carlton Perl Misc 2 11-29-2009 07:06 PM
Re: "memset" vs "= {0}"...Are they equivalent if your initializing variables? C++ 0 09-23-2004 01:28 PM
"memset" vs "= {0}"...Are they equivalent if your initializing variables? Nollie@runtime.com C++ 17 09-22-2004 06:06 PM
memset vs fill and iterators vs pointers Joe C C++ 5 08-24-2004 11:51 AM
2 questions: speed of memset() and pointer to multi-arrays k-man C++ 4 12-18-2003 08:52 PM



Advertisments