vectorized computation in C++ such as those in Matlab (Matlab to C++)?

 
 
Giovanni Gherdovich
      08-06-2008
Hello,

my question is pretty much related to this topic so I continue
the thread instead of opening another one.
I'm a former Matlab coder and a wannabe C++ coder.

I've read about std::valarray<> in Stroustrup's book
"The C++ Programming Language", discovering that some operators
(like the multiplication *), and some basic math functions
like sin and cos, are overloaded in order to behave very similarly
to the Matlab ones when dealing with valarrays (mainly, they act
component-wise).
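
As a minimal sketch of the behaviour described above (assuming a C++11 compiler; the function names product_valarray and sine_valarray are made up for illustration and are not part of the standard library):

```cpp
#include <cassert>
#include <cmath>
#include <valarray>

// Component-wise product written "the Matlab way": one expression,
// no explicit loop in user code.
std::valarray<double> product_valarray(const std::valarray<double>& a,
                                       const std::valarray<double>& b)
{
    assert(a.size() == b.size());  // the operators expect equal sizes
    return a * b;                  // operator* acts element by element
}

// sin applied to every element, like sin(x) on a Matlab array.
std::valarray<double> sine_valarray(const std::valarray<double>& x)
{
    return std::sin(x);            // <valarray> overloads sin for valarray
}
```

Calling product_valarray on {1, 2, 3} and {4, 5, 6} gives {4, 10, 18}, the same as the Matlab expression a .* b.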

I'm aware of Matlab "vectorization" techniques; I use them to
avoid for-loops.
But when I do that I _don't_ do linear algebra: I just do
component-wise operations between matrices, in order to
save Matlab from doing serial calls to the same routine (via
for-loops).
I don't think such code gains anything from using BLAS or
similar libraries.
I mean: taking the inverse of a matrix is linear algebra,
but multiplying two vectors component-wise is just... multiplying.

rocksportrocker:
> If it is for performance: Writing native loops in C/C++ with some
> optimization flags will give you best performance in most cases.


Better performance than Matlab, or better performance than
"vectorized" C++?

Rune Allnor:
> The problem is that users who only know matlab and no other
> programming languages are conditioned to believe that the
> problem lies with for-loops as such, and not with matlab.


My question:
When writing C++ code, do you think I can get faster code
if I use std::valarray<> in "the Matlab way", instead
of using, say, std::vector<> and for-loops?

I must admit that this curiosity comes from my previous Matlab
experience, when I used to think of for-loops as the devil,
but from a naive point of view I can imagine that "vectorized"
operations on std::valarray<> are optimized by smart compilation
techniques... after all, the programmer doesn't specify the
order in which the operations have to be done, as in a for-loop...
Is this just fantasy?
--NOTE that the scenario I have in mind is a single core machine.

Regards,
Giovanni Gherdovich
 
Rune Allnor
      08-06-2008
On 6 Aug, 14:54, Giovanni Gherdovich
<(E-Mail Removed)> wrote:
> Hello,
>
> my question is pretty much related to this topic so I continue
> the thread instead of opening another one.
> I'm a former Matlab coder and a wannabe C++ coder.
>
> I've read about std::valarray<> in Stroustrup's book
> "The C++ Programming Language", discovering that some operators
> (like the multiplication *), and some basic math functions
> like sin and cos, are overloaded in order to behave very similarly
> to the Matlab ones when dealing with valarrays (mainly, they act
> component-wise).
>
> I'm aware of Matlab "vectorization" techniques; I use them to
> avoid for-loops.


That's a *matlab* problem. 'Vectorization' is a concept
exclusive to matlab, which historically was caused by
what I consider to be bugs in the matlab interpreter.

> But when I do that I _don't_ do linear algebra: I just do
> component-wise operations between matrices, in order to
> save Matlab from doing serial calls to the same routine (via
> for-loops).
> I don't think such code gains anything from using BLAS or
> similar libraries.
> I mean: taking the inverse of a matrix is linear algebra,
> but multiplying two vectors component-wise is just... multiplying.
>
> rocksportrocker
>
> > If it is for performance: Writing native loops in C/C++ with some
> > optimization flags will give you best performance in most cases.

>
> Better performance than Matlab,


Depending on exactly what you do, matlab *can* get very
close to best-possible performance since it uses highly
tuned low-level libraries. If your operation is covered by
such a function, you might find it difficult to beat matlab.
If not, don't be surprised if C++ code beats matlab by
a factor of 5-10 or more.

> or better performance than
> "vectorized" C++?


There is no such thing as 'vectorized C++.'

> Rune Allnor:
>
> > The problem is that users who only know matlab and no other
> > programming languages are conditioned to believe that the
> > problem lies with for-loops as such, and not with matlab.

>
> My question:
> When writing C++ code, do you think I can get faster code
> if I use std::valarray<> in "the Matlab way", instead
> of using, say, std::vector<> and for-loops?


I don't know. I haven't used std::valarray<>. I recall seeing
a comment somewhere that std::valarray<> was an early
attempt at a standardized way to handle number crunching in C++,
which was, well, not quite as successful as one might have
wished for.

> I must admit that this curiosity comes from my previous Matlab
> experience, when I used to think of for-loops as the devil,
> but from a naive point of view I can imagine that "vectorized"
> operations on std::valarray<> are optimized by smart compilation
> techniques...


You would be surprised: There are for-loops at the core of all
those libraries, even the BLAS libraries matlab is based on.
There are smart compilation techniques involved, but to *optimize*
the for-loops, not to *eliminate* them.

> after all, the programmer doesn't specify the
> order in which the operations have to be done, as in a for-loop...


I wrote a sketch to illustrate how this is done in a previous
post in this thread, which was posted only to comp.soft-sys.matlab:

http://groups.google.no/group/comp.s...d663e1fa?hl=no

As you can see, the 'vector' version myfunction(std::vector<double>)
calls the scalar version myfunction(double) in a for-loop. This is
essentially what is done in all the libraries you use, including
matlab.

The for-loops are at the core, and the smart compiler techniques
optimize the executable code to avoid any unnecessary run-time
overhead.
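
For readers who cannot follow the link, here is a minimal sketch of that pattern. It is a reconstruction and not the code from the linked post: only the two signatures myfunction(double) and myfunction(std::vector<double>) come from the description above, and the formula inside the scalar version is a placeholder assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar version. The formula is a placeholder chosen for illustration.
double myfunction(double x)
{
    return x * x + 1.0;
}

// 'Vector' version: nothing more than a for-loop over the scalar version.
std::vector<double> myfunction(const std::vector<double>& x)
{
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = myfunction(x[i]);   // resolves to the double overload
    }
    assert(y.size() == x.size());  // one output per input
    return y;
}
```

With optimization switched on, a compiler will typically inline the scalar call, so the 'vector' interface costs little or nothing over a hand-written loop.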

> Is this just fantasy?


You might want to have a look at the basic texts on modern C++.
Try "Accelerated C++" by Koenig & Moo, or "You can do it!" by
Glassborow. Or both.

Rune
 
dj3vande@csclub.uwaterloo.ca.invalid
      08-06-2008
In article <(E-Mail Removed)>,
Rune Allnor <(E-Mail Removed)> wrote:
>On 6 Aug, 14:54, Giovanni Gherdovich
><(E-Mail Removed)> wrote:


>> I'm aware of Matlab "vectorization" techniques; I use them to
>> avoid for-loops.

>
>That's a *matlab* problem. 'Vectorization' is a concept
>exclusive to matlab, which historically was caused by
>what I consider to be bugs in the matlab interpreter.


To describe this property of matlab as "buggy" is inordinately harsh.

It's an inherent property of interpreters: If you're interpreting a
loop, you have to look at the loop condition code, and the loop
bookkeeping code, and the code inside the loop, every time through.
Unless you go out of your way to make this fast, you end up having to
do a lookup-decode-process for each of those steps.
Compiling to native code lets you do the lookup-decode at compile time,
and for typical loops only generates a few machine-code instructions
for the loop bookkeeping and condition checking, which substantially
reduces the total amount of work the processor is doing. But making an
interpreter clever enough to do interpreted loops that fast is a Much
Harder Problem.

(So, the answer to the OP's question is (as already noted): Don't worry
about vectorizing, write loops and ask the compiler to optimize them, and
you'll probably come close enough to Matlab's performance that you
won't be able to tell the difference.)

Since Matlab is targeting numerical work with large arrays anyways,
there's not much benefit to speeding up this part of the interpreter;
if the program is spending most of its time inside the large-matrix
code (which is compiled to native code, aggressively optimized by the
compiler, and probably hand-tuned for speed), then speeding up the
interpreter's handling of the loop won't gain you any noticeable
speedup anyways. If you're writing loopy code to do things Matlab has
primitives for, you're probably better off vectorizing it anyways,
since that will make it both clearer and faster.
So (unlike with general-purpose interpreted languages that don't have
primitives that replace common loop idioms) there's no real benefit to
speeding up the Matlab interpreter's loop handling, and there are
obvious costs (development time, increased complexity, more potential
for bugs), so there are good reasons not to bother.

If you do have code that doesn't fit Matlab's vectorization model, you
can always write it in C or Fortran and wrap it up in a Matlab FFI
wrapper; Matlab's FFI is not hard to use on the compiled-to-native-code
side, and looks exactly like a Matlab function on the Matlab code side,
so it's almost always the Right Tool For The Job in that case.
(At my day job, I've been asked to do this for the Matlab programmers a
few times, and for hard-to-vectorize loopy code getting a speedup of
two or three orders of magnitude just by doing a reasonably direct
translation into C and compiling to native code with an optimizing
compiler is pretty much expected.)



dave

--
Dave Vandervies dj3vande at eskimo dot com
Erm... wouldn't clock(), used with Bill Godfrey's follow-up, ignoring my
follow-up to him (as suggested in your follow-up to me), do the trick
quite nicely? --Joona I Palaste in comp.lang.c
 
Uwe Schmitt
      08-07-2008
Giovanni Gherdovich wrote:
> Hello,
>
> my question is pretty much related to this topic so I continue
> the thread instead of opening another one.
> I'm a former Matlab coder and a wannabe C++ coder.
>
> I've read about std::valarray<> in Stroustrup's book
> "The C++ Programming Language", discovering that some operators
> (like the multiplication *), and some basic math functions
> like sin and cos, are overloaded in order to behave very similarly
> to the Matlab ones when dealing with valarrays (mainly, they act
> component-wise).
>
> I'm aware of Matlab "vectorization" techniques; I use them to
> avoid for-loops.
> But when I do that I _don't_ do linear algebra: I just do
> component-wise operations between matrices, in order to
> save Matlab from doing serial calls to the same routine (via
> for-loops).
> I don't think such code gains anything from using BLAS or
> similar libraries.
> I mean: taking the inverse of a matrix is linear algebra,
> but multiplying two vectors component-wise is just... multiplying.
>

Yes, but you can apply loop unrolling or change the
access pattern to optimize cache access.
This is a broad field; have a look at:
http://en.wikipedia.org/wiki/Loop_transformation

>
>> If it is for performance: Writing native loops in C/C++ with some
>> optimization flags will give you best performance in most cases.
>>

>
> Better performance than Matlab, or better performance than
> "vectorized" C++?
>

Best performance compared to vectorized code.
The optimization flags of your compiler can enable
loop optimization and other strategies.

>
>> The problem is that users who only know matlab and no other
>> programming languages are conditioned to believe that the
>> problem lies with for-loops as such, and not with matlab.
>>

>
> My question:
> When writing C++ code, do you think I can get faster code
> if I use std::valarray<> in "the Matlab way", instead
> of using, say, std::vector<> and for-loops?
>

I do not know how well optimized valarray<> is. You
should compare it using different matrix and vector sizes
and different optimization flags
of your compiler and post your results.

If you use the GNU compilers, the flags are -O<level>;
as far as I know the levels run from -O0 up to -O3.
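
A rough sketch of the pieces such a comparison needs, assuming a C++11 compiler. The names multiply_loop, multiply_valarray and seconds_per_call are invented for this example, and the timing helper is deliberately simple rather than a rigorous benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <valarray>
#include <vector>

// Plain for-loop over std::vector.
std::vector<double> multiply_loop(const std::vector<double>& a,
                                  const std::vector<double>& b)
{
    assert(a.size() == b.size());
    std::vector<double> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] * b[i];
    }
    return c;
}

// Same operation through valarray's overloaded operator*.
std::valarray<double> multiply_valarray(const std::valarray<double>& a,
                                        const std::valarray<double>& b)
{
    assert(a.size() == b.size());
    return a * b;
}

// Calls f() `repeats` times and returns the average wall-clock
// seconds per call.
template <typename F>
double seconds_per_call(F f, int repeats)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        f();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

To use it, wrap each variant in a lambda that also consumes the result (for example by adding one element to a running sum), otherwise the optimizer may remove the work being timed. Then build the same file once per optimization level, e.g. g++ -O0 and g++ -O3, and repeat for several array sizes.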

And you should compare it to
http://math-atlas.sourceforge.net/

which is supposed to achieve very good performance.

> I must admit that this curiosity comes from my previous Matlab
> experience, when I used to think of for-loops as the devil,
> but from a naive point of view I can imagine that "vectorized"
> operations on std::valarray<> are optimized by smart compilation
> techniques... after all, the programmer doesn't specify the
> order in which the operations have to be done, as in a for-loop...
> Is this just fantasy?
>

In Matlab, for-loops are the devil because the interpreter
has to handle the loops, which slows things down.
If you write a naive C implementation, your for-loops
are compiled to machine code, which runs much faster
than the interpreted Matlab for-loop.
Vectorization gives Matlab the ability to hand the
operation to an optimized C routine, where the
essential and fast looping happens.

Greetings, Uwe

--
Dr. rer. nat. Uwe Schmitt
R&D Mathematics

mineway GmbH
Science Park 2
D-66123 Saarbrücken

Phone: +49 (0)681 8390 5334
Fax: +49 (0)681 830 4376

www.mineway.de

Managing Director: Dr.-Ing. Mathias Bauer
Commercial register: Amtsgericht Saarbrücken, HRB 12339


 
Giovanni Gherdovich
      08-07-2008
Hello,

thank you for your answers.

Rune Allnor:
> Depending on exactly what you do, matlab *can* get very
> close to best-possible performance since it uses highly
> tuned low-level libraries. If your operation is covered by
> such a function, you might find it difficult to beat matlab.
> If not, don't be surprised if C++ code beats matlab by
> a factor of 5-10 or more.


dave:
> (So, the answer to the OP's question is (as already noted): Don't worry
> about vectorizing, write loops and ask the compiler to optimize it, and
> you'll probably come close enough to Matlab's performance that you
> won't be able to tell the difference.)


Uwe:
> I mean: taking the inverse of a matrix is linear algebra,
> but multiplying two vectors component-wise is just... multiplying.
>
> > Yes, but you can apply loop unrolling or change the
> > access pattern to optimize cache access.
> > This is a broad field; have a look at:
> > http://en.wikipedia.org/wiki/Loop_transformation


Rune Allnor:
> You would be surprised: There are for-loops at the core of all
> those libraries, even the BLAS libraries matlab is based on.
> There are smart compilation techniques involved, but to *optimize*
> the for-loops, not to *eliminate* them.


I was among the users who are "conditioned to believe that the
problem lies with for-loops as such, and not with matlab",
to use Rune's words.
Thank you all for pointing it out.

About the performance of numerical computation done using
std::valarray<>'s features:

Uwe:
> My question:
> When writing C++ code, do you think I can get faster code
> if I use std::valarray<> in "the Matlab way", instead
> of using, say, std::vector<> and for-loops?
>
> > I do not know how optimized valarray<> is. You
> > should compare it using different matrix-/vector-sizes
> > and different optimization flags
> > of your compiler and post your results.


Rune Allnor:
> I don't know. I haven't used std::valarray<>. I recall seeing
> a comment somewhere that std::valarray<> was an early
> attempt at a standardized way to handle number crunching in C++,
> which was, well, not quite as successful as one might have
> wished for.


It seems that nobody knows whether it's worth using std::valarray<>
and the related "vectorized" operators (provided by the standard
library) to do numerical computing in C++.

Googling this topic, I've found this interesting thread in
a forum of a site called "www.velocityreviews.com"
http://www.velocityreviews.com/forum...s-vectors.html

One of the posters, who (like me) took the chapter "Vector Arithmetic"
in Stroustrup's book as The Truth, says that with valarray<>
you can do math at the speed of light, blah blah optimization
blah blah vectorization and so on.

Another user answers with what I find a more reasonable argument:
std::valarray<> was designed to suit the characteristics of vector
machines, like the Cray. If you don't have a Cray, there is
no point in doing math with valarray<> and related operators.

Anyway, as soon as I have some spare time I will check it on
my own, comparing the results with ATLAS as Uwe suggests.

Regards,
Giovanni Gherdovich
 
Jerry Coffin
      08-08-2008
In article <a47fac46-bb33-4608-bd97-
(E-Mail Removed)>,
(E-Mail Removed) says...

[ ... ]

> Another user answers with what I find a more reasonable argument:
> std::valarray<> was designed to suit the characteristics of vector
> machines, like the Cray. If you don't have a Cray, there is
> no point in doing math with valarray<> and related operators.


In theory that's right: the basic idea was to provide something that
could be implemented quite efficiently on vector machines. In fact, I've
never heard of anybody optimizing the code for a vector machine, so it
may be open to question whether it provides any real advantage on them.

OTOH, valarray _can_ make some code quite readable, so it's not always a
complete loss anyway.

--
Later,
Jerry.

The universe is a figment of its own imagination.
 
Giovanni Gherdovich
      08-08-2008
Hello,

> During the mid-90s both C and C++ were involved in adding features that
> would support numerically intense programming. Unfortunately a couple of
> years later the companies whose numerical experts were doing the grunt work
> withdrew support.


Just for the sake of historical investigation, I found a thread on
this newsgroup from as far back as 1991, where Walter Bright (who might
be the same Walter Bright who designed the D programming language,
http://www.walterbright.com/
http://en.wikipedia.org/wiki/Walter_Bright , but I'm not sure)
lists some shortcomings for the C++ numerical programmer,
and item #6 is

"Optimization of array operations is inhibited by the 'aliasing'
problems."
http://en.wikipedia.org/wiki/Aliasing_(computing)

(retrieved from
http://groups.google.com/group/comp....b0ec8ea7b24189 )

Then he mentions some solutions to this (two libraries, which
might be completely out of date nowadays).
Just to say that the Original Poster isn't the first to
address this issue...
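
To make the aliasing point concrete, here is a small sketch (the function names add_first and add_first_hoisted are invented for this example). Because both arguments are plain pointers, the compiler has to allow for them pointing into the same array, so it cannot in general read x[0] once and keep it in a register. With overlapping arguments the two versions really do give different results: for an array of three ones passed as both arguments, the first yields {2, 3, 3} and the second {2, 2, 2}.

```cpp
#include <cassert>
#include <cstddef>

// Adds x[0] to every element of y. Writing y[i] might change x[0]
// if the two pointers overlap, so x[0] is re-read on each iteration.
void add_first(double* y, const double* x, std::size_t n)
{
    assert(y != nullptr && x != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += x[0];
    }
}

// Same loop with the read hoisted out by hand. This is equivalent to
// add_first only when x and y do not overlap.
void add_first_hoisted(double* y, const double* x, std::size_t n)
{
    assert(y != nullptr && x != nullptr);
    if (n == 0) {
        return;                // nothing to do, and x[0] is not read
    }
    const double x0 = x[0];    // read once, before any write to y
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += x0;
    }
}
```

The standard describes valarray as free of certain forms of aliasing, which is what is meant to give implementations room for this kind of optimization.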

> In hindsight it might have been better to have shelved
> the work but both WG14 and WG21 opted to continue hoping that they would
> still produce something useful.


Mmmh... I skimmed over the pages of Working Groups 14 and 21
http://www.open-std.org/jtc1/sc22/wg14
http://www.open-std.org/jtc1/sc22/wg21
and they don't seem to have vector arithmetic among their priorities.
Anyway, from what I've learned from this thread, the whole
topic may well make no sense, because of the characteristics of CPUs.

Regards,
Giovanni Gh.
 