Memset is faster than simple loop?

 
 
AndersWang@gmail.com
      03-21-2007
Hi,

Can anybody here explain to me why memset would be faster than a
simple loop? I doubt it!

In an int array scenario:

int array[10];

for(int i=0;i<10;i++) //ten loops
array[i]=0;

or

memset(array,0,sizeof(array));

So, what will memset do inside? Here is a snippet from the MS c-run-time
code:

void * __cdecl memset (
    void *dst,
    int val,
    size_t count
    )
{
    void *start = dst;

#if defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64)
    {
        extern void RtlFillMemory( void *, size_t count, char );

        RtlFillMemory( dst, count, (char)val );
    }
#else /* defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64) */
    while (count--) { // Watch here!!!
        *(char *)dst = (char)val;
        dst = (char *)dst + 1;
    }
#endif /* defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64) */

    return(start);
}

memset initializes a block of memory byte by byte. So, the while loop
will be executed 4*10 times!
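
To make that count concrete, here is a minimal sketch that mirrors the byte loop above and reports how many stores it performs. The name `byte_fill_count` is made up for illustration, and the total is 40 for `int array[10]` only on a platform where sizeof(int) is 4.

```c
#include <stddef.h>

/* Hypothetical helper, not part of any library: fills `count` bytes one
   at a time, like the fallback loop quoted above, and returns the number
   of loop iterations it executed. */
size_t byte_fill_count(void *dst, int val, size_t count)
{
    unsigned char *p = dst;
    size_t iterations = 0;

    while (count--) {
        *p++ = (unsigned char)val;
        iterations++;
    }
    return iterations;
}
```

Calling `byte_fill_count(array, 0, sizeof array)` returns `sizeof array`, that is, 10 * sizeof(int) iterations.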


I'm confused. Why do people still believe memset is faster than a loop
in most scenarios?


Thank you for your help!

 
Chris Dollin
      03-21-2007
(E-Mail Removed) wrote:

> Can anybody here explain to me why memset would be faster than a
> simple loop?


`memset` is provided by the implementation. It may use
lots of Cunning Implementation Tricks that -- for whatever
reason -- might not be applied by default to ordinary
C code.

Whether or not it's /actually/ faster will depend on all
sorts of things.

> I doubt it!


Doubt is good. I think. Well, I'm not sure.

> In an int array scenario:
>
> int array[10];
>
> for(int i=0;i<10;i++) //ten loops
> array[i]=0;
>
> or
>
> memset(array,0,sizeof(array));
>
> So, what will memset do inside?


That depends.

> Here is a snippet from the MS c-run-time
> code:


(fx:snip #ifdef)

> extern void RtlFillMemory( void *, size_t count, char );
>
> RtlFillMemory( dst, count, (char)val );


(fx:snip #else)
> while (count--) { // Watch here!!!
> *(char *)dst = (char)val;
> dst = (char *)dst + 1;
> }


(fx:snip #endif)

> memset initializes a block of memory byte by byte. So, the while loop
> will be executed 4*10 times!


Urm, did you not notice the call to RtlFillMemory, which happens
on suitable architectures?

Also you're /assuming/ that the compiler doesn't implement
`memset` with inline code, which it's allowed to. Perhaps
the original call

> memset(array,0,sizeof(array));


was replaced by 10 move-integer instructions.
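
One way to check this on a given compiler is to put both forms into small functions and read the generated assembly (gcc and clang accept `-S` for that). A sketch, with `zero_loop`, `zero_memset` and `N` as made-up names; what the compiler actually emits depends on the implementation and the optimization level.

```c
#include <string.h>

#define N 10

/* The explicit loop from the question. */
void zero_loop(int *array)
{
    for (int i = 0; i < N; i++)
        array[i] = 0;
}

/* The memset form. A compiler is allowed to expand this inline
   instead of calling the library function. */
void zero_memset(int *array)
{
    memset(array, 0, N * sizeof *array);
}
```

Both functions leave the array in the same state, so any difference is purely in the instructions chosen.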

> I'm confused. Why do people still believe memset is faster than a loop
> in most scenarios?


The essence of the truth here is measurement as opposed to
speculation.

--
Chris Dollin
RIP John "BNF, Fortran, FP" Backus 3Dec1924 - 17Mar2007

 
Chris Dollin
      03-21-2007
Chris Dollin wrote:

> Also you're /assuming/ that the compiler doesn't implement
> `memset` with inline code, which it's allowed to. Perhaps
> the original call
>
>> memset(array,0,sizeof(array));

>
> was replaced by 10 move-integer instructions.


Make that 10 clear-integer instructions.

--
Chris "moving zero" Dollin
RIP John "BNF, Fortran, FP" Backus 3Dec1924 - 17Mar2007

 
Jens Thoms Toerring
      03-21-2007
(E-Mail Removed) wrote:
> Can anybody here explain to me why memset would be faster than a
> simple loop? I doubt it!
>
> In an int array scenario:
>
> int array[10];
>
> for(int i=0;i<10;i++) //ten loops
> array[i]=0;
>
> or
>
> memset(array,0,sizeof(array));
>
> So, what will memset do inside?


There's no guarantee that memset() is faster; that is probably just an
observation a number of people have made. And if it is faster, it is
probably because the implementation uses some carefully tuned method
that exploits features of the processor the program is running on,
which the compiler may not be able to find when it compiles the loop.
But that doesn't have to be the case. It's a question of how good the
implementation of memset() is on the one hand and how good the compiler
is on the other (and the compiler could even be clever enough to
replace the loop with a single call to memset(), and there goes all your
difference in speed).
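
As a rough illustration of one such tuned method, here is a sketch that stores a whole word per iteration instead of a single byte. This is an assumption about the general kind of trick, not the code of any real library; `word_fill` is a made-up name, and real implementations typically also deal with alignment and use processor-specific instructions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical word-at-a-time fill, for illustration only. */
void *word_fill(void *dst, int val, size_t count)
{
    unsigned char *p = dst;
    unsigned char byte = (unsigned char)val;
    unsigned long pattern;
    size_t i;

    /* Build a word that holds the fill byte in every position. */
    for (i = 0; i < sizeof pattern; i++)
        ((unsigned char *)&pattern)[i] = byte;

    /* Store one word per iteration. A memcpy of a small fixed size
       is commonly compiled to a single store instruction. */
    while (count >= sizeof pattern) {
        memcpy(p, &pattern, sizeof pattern);
        p += sizeof pattern;
        count -= sizeof pattern;
    }

    /* Finish the remaining tail one byte at a time. */
    while (count--)
        *p++ = byte;

    return dst;
}
```

With an 8-byte unsigned long, clearing 40 bytes takes 5 iterations of the word loop rather than 40 of the byte loop.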

> Here is a snippet from the MS c-run-time
> code:
>
> void * __cdecl memset (
> void *dst,
> int val,
> size_t count
> )
> {
> void *start = dst;
>
> #if defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64)
> {
> extern void RtlFillMemory( void *, size_t count, char );
>
> RtlFillMemory( dst, count, (char)val );
> }
> #else /* defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64) */
> while (count--) { // Watch here!!!
> *(char *)dst = (char)val;
> dst = (char *)dst + 1;
> }
> #endif /* defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64) */
>
> return(start);
> }
>
> memset initializes a block of memory byte by byte. So, the while loop
> will be executed 4*10 times!


That's just one of many implementations and you can't deduce any
general statement from looking at a certain one. And, as you will
notice when you have a close look, memset() seems to be implemented
in a different way depending on the architecture, so you can't
even say how memset() is implemented for "MS c-run-time" but only
for "MS c-run-time" on a certain architecture.

> I'm confused. Why do people still believe memset is faster than a loop
> in most scenarios?


It isn't a question of beliefs. You have to carefully measure the
behaviour for the implementation and architecture you are using.
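
A crude way to take such a measurement is sketched below, using only clock() from <time.h>. The names `time_loop`, `time_memset`, `array` and `N` are made up, and the volatile read is only an attempt to keep the optimizer from discarding the stores; treat the numbers as indicative for one compiler, one set of options and one machine.

```c
#include <string.h>
#include <time.h>

#define N 10

static int array[N];

/* Time `reps` fills with the explicit loop; returns CPU seconds. */
double time_loop(long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        for (int i = 0; i < N; i++)
            array[i] = 0;
        /* Reading through a volatile pointer discourages the
           compiler from dropping the stores as dead code. */
        (void)*(volatile int *)&array[r % N];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Time `reps` fills with memset; returns CPU seconds. */
double time_memset(long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        memset(array, 0, sizeof array);
        (void)*(volatile int *)&array[r % N];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call each function with a repetition count large enough to run for a second or so, repeat the runs a few times, and compare; with an array this small the loop overhead and timer resolution can easily dominate.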

Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
AndersWang@gmail.com
      03-21-2007
Thank you for your replies. I have been measuring, and I have to
agree that you are right: memset is faster than the loop,
even in the example I wrote above.


 
Eric Sosman
      03-21-2007
(E-Mail Removed) wrote:
>
> Can anybody here explain to me why memset would be faster than a
> simple loop? I doubt it!


It might be faster, it might be slower, or there
might be no difference. The C language Standard says
nothing about the speeds of language constructs or
library functions, and those speeds -- and relative
speeds -- will be different in different implementations
of C.

> [...] Here is a snippet from the MS c-run-time
> code: [...]


Compilers play many tricks in pursuit of speedier or
smaller code. One of the tricks often played with the
Standard functions (which the compiler can "know about")
is to generate in-line machine instructions instead of
generating an actual function call. What looks like a
call on the fabs() function might actually produce an
FABS instruction in the generated code, and no function
call at all.

memset() is a fairly simple function -- not like
printf(), say -- and many compilers will play this kind
of game with it. If you write what looks like a call
to memset(), the program might actually use a block-clear
instruction or instruction sequence instead of a call.

But even if the implementation plays this kind of
game with memset() or sqrt() or whatever, it must still
provide an actual, callable function that does the same
thing (possibly at a different speed). This enables a
program to use a function pointer to call a library
function, without necessarily being able to predict
what function will be called until run-time:

#include <string.h> /* for memset */

/* Same signature as memset(), but stores short-sized elements. */
void *shortset(void *s, int c, size_t n) {
    short *sp = s;
    for (n /= sizeof(short); n > 0; --n)
        *sp++ = c;
    return s;
}

/* Same signature as memset(), but stores int-sized elements. */
void *intset(void *s, int c, size_t n) {
    int *ip = s;
    for (n /= sizeof(int); n > 0; --n)
        *ip++ = c;
    return s;
}

/* memset() returns void *, so the pointer type must match. */
void *(*fptr)(void *, int, size_t);
...
switch (something_unpredictable) {
default: fptr = memset; break;
case 1: fptr = shortset; break;
case 2: fptr = intset; break;
}
fptr(buffer, 42, sizeof buffer);

In short, the code you have found (MS disclosed
their source? Surprising, but not astonishing) may
be used only in oddball circumstances like the above,
while "ordinary" memset() calls wind up using another
mechanism altogether. You'll need to dig deeper.

... and you'll need to remember that all such
tricks and timings vary from one implementation to
the next; they aren't the province of the language
as such.

--
Eric Sosman
(E-Mail Removed)lid


 
Racaille
      03-21-2007
Eric Sosman wrote:
> In short, the code you have found (MS disclosed
> their source? Surprising, but not astonishing) may


The source code to MSVCRT.DLL (the C library)
is available as part of their SDK, which is free to download
from Microsoft itself.

 
Randy Howard
      03-21-2007
On Wed, 21 Mar 2007 10:42:11 -0500, (E-Mail Removed) wrote
(in article <(E-Mail Removed). com>):

> Thank you for your replies. I have been measuring, and I have to
> agree that you are right: memset is faster than the loop,
> even in the example I wrote above.


Well, just keep in mind that what's true for you today, on a particular
system with a particular development environment installed, may not be
true anywhere else, or even on your own system 6 months from now. You
cannot extrapolate "rules" for this sort of thing and actually know
anything at all about what will happen in general.



--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw





 
AndersWang@gmail.com
      03-21-2007
Racaille wrote:
>The source code to MSVCRT.DLL (the C library)
>is available as part of their SDK, which is free to download
>from microsoft itself.


You can also find the C run-time library source code in your
VSStudio/vc_version/crt/src folder.

 
Eric Sosman
      03-21-2007
(E-Mail Removed) wrote:
> Thank you for your replies. I have been measuring, and I have to
> agree that you are right: memset is faster than the loop,
> even in the example I wrote above.


An observation: If the time spent in memset() is a
significant or even a noticeable fraction of your program's
running time, it is highly unlikely that the cure is to use
a faster memset() equivalent. (After all, memset() generates
almost no "new information," and does not "advance the state
of the computation" by very much.) Rather, the cure is to
consider what it is about the program that makes the memset()
necessary, and to rearrange things so it isn't. For example,

    char buffer[BIGSIZE];
    ...
    memset (buffer, 0, sizeof buffer);
    if (arriving)
        strcat (buffer, "Hello,");
    else
        strcat (buffer, "Goodbye, cruel");
    strcat (buffer, " world!");

... doesn't really need the full effect of memset(), but just
its effect on buffer[0]. With that in mind, the code can be

    char buffer[BIGSIZE];
    ...
    buffer[0] = 0;
    if (arriving)
        ... etc ...

Still better:

    char buffer[BIGSIZE];
    ...
    if (arriving)
        strcpy (buffer, "Hello,");
    else
        strcpy (buffer, "Goodbye, cruel");
    strcat (buffer, " world!");

And, of course, there are lots more alternatives. My point is
that a very large fraction of memset() calls can be eliminated
by transformations not much more complex than these, and when
you get rid of an operation altogether you get an infinite
improvement in its speed. It is said that "the fastest I/O is
the one you don't do," and this generalizes to "the fastest X
is the one you don't do."

Just about the only time memset() speed is crucial is when
one needs to "destroy" information. For example, an O/S may want
to ensure that a new memory page given to Program X doesn't still
contain data left in it by Previous Program Y, and so uses memset()
or an equivalent to clobber Y's data. The speed of memset() in
this kind of setting can indeed be important -- but in most user
programs, memset() should be a negligible fraction of the running
time, and even an infinite speedup makes a negligible overall
improvement.

--
Eric Sosman
(E-Mail Removed)lid
 