Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > multi-core software

Reply
Thread Tools

multi-core software

 
 
Xah Lee
Guest
Posts: n/a
 
      06-04-2009
Of interest:

• Why Must Software Be Rewritten For Multi-Core Processors?
http://xahlee.org/UnixResource_dir/w..._software.html

plain text version follows.

--------------------------------------------------
Why Must Software Be Rewritten For Multi-Core Processors?

Xah Lee, 2009-06-04

I had a revelation today, namely, that it is necessary to rewrite
software to use multi-processor in order to benefit from it.

This may sound stupid, but is a revelation to me. For the past decade,
the question has been on my mind, about why should software needs to
be rewritten to take advantage of multi-processors. Because, in my
mind, i thought that software are at some fundamental level just
algorithms, and algorithms, have nothing to do with hardware
implementation aspects such as number of processors. I always felt,
that those talks about the need or difficulty of rewriting software
for multi-processor (or multi-core these days) must be the product of
idiocy of industrial imperative coding monkies. In particular, some
languages such as java, the way they deal with it, seems to me
extremely stupid. e.g. the concept of threads. In my mind, there
should be a layer between the software and the hardware, such as the
operating system, or the compiler, that should be able to
automatically handle or compile the software so that it FULLY use the
multi-processors when present. In short, i thought that a algorithm
really has nothing to do with hardware details.

I never really thought hard about this issue, but today, since i got a
quad-core PC, so i looked into the issue, and thought about it, and i
realized the answer. The gist is that, algorithm, fundamentally means
manipulating some hardware, in fact, algorithm is a step by step
instruction about some form of hardware, even the hardware may be
abstract or virtual. For example, let's say the simplest case of 1+1.
It is a algorithm, but where is the hardware? You may say it's
abstract concept, or it being a mathematical model. What you call 1+1
depends on the context, but in our context, those numbers are the
hardware. To see this, lets say more complex example of listing primes
by sieve. Again, it comes down to “what is a number”? Here, numbers
can be stones, or arrangement of beads on abacus, it's hardware! As
another example, say sorting. To begin with, you have to have some
something to sort, that's hardware.

Another way to see this is that, for a given computing problem, there
are infinite number of algorithms to achieve the same thing. Some,
will be better ones, requiring less steps, or requiring less storage.
All these are concrete manipulation issues, and the thing being
manipulated, ultimately we have to call it hardware. So, when hardware
changes, say from one cpu to multi-cpu, there's no way for the
algorithm to magically change and adopt the changed hardware. If you
need a algorithm that is catered to the new hardware, you need a new
algorithm.

One might think that there might be algorithm Omega, such that it
takes input of old hardware O, new hardware N, and a algorithm A, and
output a algorithm B, such that B has the same behavior as A, but N+B
performs better than O+A. This is like asking for Strong AI.

One thing we often forgot is that algorithms ultimately translates to
manipulating hardware. In a modern digital computer, that means
software algorithms ultimately becomes machine instructions in CPU,
which manipulate the 1s and 0s in register, or electricity voltage in
transisters.

In a more mundane point of view, a automatic system for software to
work on multi-processors is a problem of breaking a problem into
discrete units (so that they can be computed in parallel). The problem
of finding a algorithm is entirely different from the problem of
breaking a problem into distinct units. The problem of breaking a
problem into distinct units is a entire new branch of mathematics. For
example, let's say factoring. Factoring is a well defined mathematical
problem. There are millions algorithms to do it, each class has
different properties such as number of steps or storage units.
However, if we look at these algorithms from the point of view of
distinct units, it's a new perspective on classification of
algorithms. Software are in general too ill-defined and fuzzy and
complex, the software we use on personal computers such as email,
browsers, games, don't even have mathematical models. They don't even
have mathematical models of their inputs and outputs. To talk about
automatic system of unitizing software, would be more like a AI
fantasy. Roughly, a term that describes this aspect of research is
Parallel computing.

In the Wikipedia article, it talks about types of parallelism: Bit-
level parallelism, Instruction-level parallelism, Data parallelism,
Task parallelism. Then it also discusses hardware aspects classes:
multicore, symmetric multiprocessing, distributed computing, cluster,
grid. The subjects mentioned there more close to this essay are “data
parallelism” and “task parallelism”. However, neither are high level
enough as i discussed here. The issue discussed here would be like
“algorithmic parallelism”. Indeed, Wikipedia mentioned “Automatic
parallelization”, which is exactly what i'm talking about here. Quote:

Automatic parallelization of a sequential program by a compiler is
the holy grail of parallel computing. Despite decades of work by
compiler researchers, automatic parallelization has had only limited
success.[40]

Mainstream parallel programming languages remain either explicitly
parallel or (at best) partially implicit, in which a programmer gives
the compiler directives for parallelization. A few fully implicit
parallel programming languages exist — SISAL, Parallel Haskell, and
(for FPGAs) Mitrion-C.

It says “A few fully implicit parallel programming languages exist”.
If you are a comp lang nutcase, don't get carried away by what those
words might seem to mean.

(Note: Wikipedia has a dedicated article on the subject: Automatic
parallelization)

Xah
http://xahlee.org/


 
Reply With Quote
 
 
 
 
Roedy Green
Guest
Posts: n/a
 
      06-04-2009
On Thu, 4 Jun 2009 09:46:44 -0700 (PDT), Xah Lee <>
wrote, quoted or indirectly quoted someone who said :

> Why Must Software Be Rewritten For Multi-Core Processors?


Threads have been part of Java since Day 1. Using threads complicates
your code, but even with a single core processor, they can improve
performance, particularly if you are doing something like combing
multiple websites.

The nice thing about Java is whether you are on a single core
processor or a 256 CPU machine (We got to run our Athena Integer Java
spreadsheet engine on such a beast), does not concern your code.

You just have to make sure your threads don't interfere with each
other, and Java/the OS, handle exploiting all the CPUs available.

--
Roedy Green Canadian Mind Products
http://mindprod.com

Never discourage anyone... who continually makes progress, no matter how slow.
~ Plato 428 BC died: 348 BC at age: 80
 
Reply With Quote
 
 
 
 
Kaz Kylheku
Guest
Posts: n/a
 
      06-04-2009
["Followup-To:" header set to comp.lang.lisp.]
On 2009-06-04, Roedy Green <> wrote:
> On Thu, 4 Jun 2009 09:46:44 -0700 (PDT), Xah Lee <>
> wrote, quoted or indirectly quoted someone who said :
>
>>• Why Must Software Be Rewritten For Multi-Core Processors?

>
> Threads have been part of Java since Day 1.


Unfortunately, not sane threads designed by people who actually understand
multithreading.

> The nice thing about Java is whether you are on a single core
> processor or a 256 CPU machine (We got to run our Athena Integer Java
> spreadsheet engine on such a beast), does not concern your code.


You are dreaming if you think that there are any circumstances (other than
circumstances in which performance doesn't matter) in which you don't have to
concern yourself about the difference between a uniprocessor and a 256 CPU
machine.
 
Reply With Quote
 
MRAB
Guest
Posts: n/a
 
      06-04-2009
Kaz Kylheku wrote:
> ["Followup-To:" header set to comp.lang.lisp.]
> On 2009-06-04, Roedy Green <> wrote:
>> On Thu, 4 Jun 2009 09:46:44 -0700 (PDT), Xah Lee <>
>> wrote, quoted or indirectly quoted someone who said :
>>
>>> • Why Must Software Be Rewritten For Multi-Core Processors?

>> Threads have been part of Java since Day 1.

>
> Unfortunately, not sane threads designed by people who actually understand
> multithreading.
>
>> The nice thing about Java is whether you are on a single core
>> processor or a 256 CPU machine (We got to run our Athena Integer Java
>> spreadsheet engine on such a beast), does not concern your code.

>
> You are dreaming if you think that there are any circumstances (other than
> circumstances in which performance doesn't matter) in which you don't have to
> concern yourself about the difference between a uniprocessor and a 256 CPU
> machine.


If you're interested in parallel programming, have a look at Flow-Based
Programming:

http://www.jpaulmorrison.com/fbp/

 
Reply With Quote
 
Seamus MacRae
Guest
Posts: n/a
 
      06-05-2009
Kaz Kylheku wrote:
[snip]

I see you're no longer confining yourself to the thread from hell, and
now randomly crashing other threads comp.lang.java.programmer to
badmouth Java and crosspost into cll.

> Unfortunately, not sane threads designed by people who actually understand
> multithreading.


Then let me enlighten you! We have had some nice concurrency support
over here in Java-land since Java 5, and especially Java 6, what with
java.util.concurrent, nonblocking streams in NIO, and
SwingWorker/SwingUtilities.
 
Reply With Quote
 
Kaz Kylheku
Guest
Posts: n/a
 
      06-05-2009
On 2009-06-05, Vend <> wrote:
> On Jun 4, 8:35*pm, Roedy Green <see_webs...@mindprod.com.invalid>
> wrote:
>> On Thu, 4 Jun 2009 09:46:44 -0700 (PDT), Xah Lee <xah...@gmail.com>
>> wrote, quoted or indirectly quoted someone who said :
>>
>> >• Why Must Software Be Rewritten For Multi-Core Processors?

>>
>> Threads have been part of Java since Day 1. *Using threads complicates
>> your code, but even with a single core processor, they can improve
>> performance, particularly if you are doing something like combing
>> multiple websites.
>>
>> The nice thing about Java is whether you are on a single core
>> processor or a 256 CPU machine (We got to run our Athena Integer Java
>> spreadsheet engine on such a beast), does not concern your code.
>>
>> You just have to make sure your threads don't interfere with each
>> other, and Java/the OS, handle exploiting all the CPUs available.

>
> You need to decompose your problem in 256 independent tasks.
>
> It can be trivial for some problems and difficult or perhaps
> impossible for some others.


Even for problems where it appears trivial, there can be hidden
issues, like false cache coherency communication where no actual
sharing is taking place. Or locks that appear to have low contention and
negligible performance impact on ``only'' 8 processors suddenly turn into
bottlenecks. Then there is NUMA. A given address in memory may be
RAM attached to the processor accessing it, or to another processor,
with very different access costs.
 
Reply With Quote
 
Scott Burson
Guest
Posts: n/a
 
      06-05-2009
On Jun 4, 9:46*am, Xah Lee <xah...@gmail.com> wrote:
> Of interest:
>
> • Why Must Software Be Rewritten For Multi-Core Processors?
> *http://xahlee.org/UnixResource_dir/w..._software.html
>
> plain text version follows.
>
> --------------------------------------------------
> Why Must Software Be Rewritten For Multi-Core Processors?
>
> Xah Lee, 2009-06-04
>
> I had a revelation today, namely, that it is necessary to rewrite
> software to use multi-processor in order to benefit from it.
>
> This may sound stupid, but is a revelation to me. For the past decade,
> the question has been on my mind, about why should software needs to
> be rewritten to take advantage of multi-processors. Because, in my
> mind, i thought that software are at some fundamental level just
> algorithms, and algorithms, have nothing to do with hardware
> implementation aspects such as number of processors. I always felt,
> that those talks about the need or difficulty of rewriting software
> for multi-processor (or multi-core these days) must be the product of
> idiocy of industrial imperative coding monkies. In particular, some
> languages such as java, the way they deal with it, seems to me
> extremely stupid. e.g. the concept of threads. In my mind, there
> should be a layer between the software and the hardware, such as the
> operating system, or the compiler, that should be able to
> automatically handle or compile the software so that it FULLY use the
> multi-processors when present. In short, i thought that a algorithm
> really has nothing to do with hardware details.
>
> I never really thought hard about this issue, but today, since i got a
> quad-core PC, so i looked into the issue, and thought about it, and i
> realized the answer. The gist is that, algorithm, fundamentally means
> manipulating some hardware, in fact, algorithm is a step by step
> instruction about some form of hardware, even the hardware may be
> abstract or virtual. For example, let's say the simplest case of 1+1.
> It is a algorithm, but where is the hardware? You may say it's
> abstract concept, or it being a mathematical model. What you call 1+1
> depends on the context, but in our context, those numbers are the
> hardware. To see this, lets say more complex example of listing primes
> by sieve. Again, it comes down to “what is a number”? Here, numbers
> can be stones, or arrangement of beads on abacus, it's hardware! As
> another example, say sorting. To begin with, you have to have some
> something to sort, that's hardware.
>
> Another way to see this is that, for a given computing problem, there
> are infinite number of algorithms to achieve the same thing. Some,
> will be better ones, requiring less steps, or requiring less storage.
> All these are concrete manipulation issues, and the thing being
> manipulated, ultimately we have to call it hardware. So, when hardware
> changes, say from one cpu to multi-cpu, there's no way for the
> algorithm to magically change and adopt the changed hardware. If you
> need a algorithm that is catered to the new hardware, you need a new
> algorithm.
>
> One might think that there might be algorithm Omega, such that it
> takes input of old hardware O, new hardware N, and a algorithm A, and
> output a algorithm B, such that B has the same behavior as A, but N+B
> performs better than O+A. This is like asking for Strong AI.
>
> One thing we often forgot is that algorithms ultimately translates to
> manipulating hardware. In a modern digital computer, that means
> software algorithms ultimately becomes machine instructions in CPU,
> which manipulate the 1s and 0s in register, or electricity voltage in
> transisters.
>
> In a more mundane point of view, a automatic system for software to
> work on multi-processors is a problem of breaking a problem into
> discrete units (so that they can be computed in parallel). The problem
> of finding a algorithm is entirely different from the problem of
> breaking a problem into distinct units. The problem of breaking a
> problem into distinct units is a entire new branch of mathematics. For
> example, let's say factoring. Factoring is a well defined mathematical
> problem. There are millions algorithms to do it, each class has
> different properties such as number of steps or storage units.
> However, if we look at these algorithms from the point of view of
> distinct units, it's a new perspective on classification of
> algorithms. Software are in general too ill-defined and fuzzy and
> complex, the software we use on personal computers such as email,
> browsers, games, don't even have mathematical models. They don't even
> have mathematical models of their inputs and outputs. To talk about
> automatic system of unitizing software, would be more like a AI
> fantasy. Roughly, a term that describes this aspect of research is
> Parallel computing.
>
> In the Wikipedia article, it talks about types of parallelism: Bit-
> level parallelism, Instruction-level parallelism, Data parallelism,
> Task parallelism. Then it also discusses hardware aspects classes:
> multicore, symmetric multiprocessing, distributed computing, cluster,
> grid. The subjects mentioned there more close to this essay are “data
> parallelism” and “task parallelism”. However, neither are high level
> enough as i discussed here. The issue discussed here would be like
> “algorithmic parallelism”. Indeed, Wikipedia mentioned “Automatic
> parallelization”, which is exactly what i'm talking about here. Quote:
>
> * * Automatic parallelization of a sequential program by a compiler is
> the holy grail of parallel computing. Despite decades of work by
> compiler researchers, automatic parallelization has had only limited
> success.[40]
>
> * * Mainstream parallel programming languages remain either explicitly
> parallel or (at best) partially implicit, in which a programmer gives
> the compiler directives for parallelization. A few fully implicit
> parallel programming languages exist — SISAL, Parallel Haskell, and
> (for FPGAs) Mitrion-C.
>
> It says “A few fully implicit parallel programming languages exist”.
> If you are a comp lang nutcase, don't get carried away by what those
> words might seem to mean.
>
> (Note: Wikipedia has a dedicated article on the subject: Automatic
> parallelization)
>
> * Xah
> ∑http://xahlee.org/
>
> ☄


 
Reply With Quote
 
Roedy Green
Guest
Posts: n/a
 
      06-05-2009
On Fri, 5 Jun 2009 18:15:00 +0000 (UTC), Kaz Kylheku
<> wrote, quoted or indirectly quoted someone who
said :

>Even for problems where it appears trivial, there can be hidden
>issues, like false cache coherency communication where no actual
>sharing is taking place. Or locks that appear to have low contention and
>negligible performance impact on ``only'' 8 processors suddenly turn into
>bottlenecks. Then there is NUMA. A given address in memory may be
>RAM attached to the processor accessing it, or to another processor,
>with very different access costs.


Could what you are saying be summed up by saying, "The more threads
you have the more important it is to keep your threads independent,
sharing as little data as possible."
--
Roedy Green Canadian Mind Products
http://mindprod.com

Never discourage anyone... who continually makes progress, no matter how slow.
~ Plato 428 BC died: 348 BC at age: 80
 
Reply With Quote
 
Raymond Wiker
Guest
Posts: n/a
 
      06-06-2009
Roedy Green <> writes:

> On Fri, 5 Jun 2009 18:15:00 +0000 (UTC), Kaz Kylheku
> <> wrote, quoted or indirectly quoted someone who
> said :
>
>>Even for problems where it appears trivial, there can be hidden
>>issues, like false cache coherency communication where no actual
>>sharing is taking place. Or locks that appear to have low contention and
>>negligible performance impact on ``only'' 8 processors suddenly turn into
>>bottlenecks. Then there is NUMA. A given address in memory may be
>>RAM attached to the processor accessing it, or to another processor,
>>with very different access costs.

>
> Could what you are saying be summed up by saying, "The more threads
> you have the more important it is to keep your threads independent,
> sharing as little data as possible."


Absolutely... not a new observation, either, as it follows
directly from Amdahl's law.
 
Reply With Quote
 
George Neuner
Guest
Posts: n/a
 
      06-06-2009
On Fri, 05 Jun 2009 16:26:37 -0700, Roedy Green
<> wrote:

>On Fri, 5 Jun 2009 18:15:00 +0000 (UTC), Kaz Kylheku
><> wrote, quoted or indirectly quoted someone who
>said :
>
>>Even for problems where it appears trivial, there can be hidden
>>issues, like false cache coherency communication where no actual
>>sharing is taking place. Or locks that appear to have low contention and
>>negligible performance impact on ``only'' 8 processors suddenly turn into
>>bottlenecks. Then there is NUMA. A given address in memory may be
>>RAM attached to the processor accessing it, or to another processor,
>>with very different access costs.

>
>Could what you are saying be summed up by saying, "The more threads
>you have the more important it is to keep your threads independent,
>sharing as little data as possible."


And therein lies the problem of leveraging many cores. There is a lot
of potential parallelism in programs (even in Java that is lost
because it is too fine a grain for threads. Even the lightest weight
user space ("green") threads need a few hundred instructions, minimum,
to amortize the cost of context switching.

Add to that the fact that programmers have shown themselves, on
average, to be remarkably bad at figuring out what _should_ be done in
parallel - as opposed to what _can_ be done - and you've got a clear
indicator that threads, as we know them, are not scalable except under
a limited set of conditions.

George
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
SMS gateway software, SMS gateway, SMS software, SMS server, SMPP software, WAP Push John UK VOIP 0 08-29-2007 05:14 AM
SMS gateway software, SMS gateway, SMS software, SMS server, SMPP software, WAP Push John ASP .Net 0 08-29-2007 05:08 AM
SMS gateway software, SMS gateway, SMS software, SMS server, SMPP software, WAP Push John Java 0 08-28-2007 05:53 AM
Formsys Maxsurf v11, other Naval Architecture Software, Marine Architecture Software, ship design software, boat building, Maxsurf Pro 9.52 (and Addons), Autoship 8.xx (and Addons), Proteus.Engineering.FastShip.v*6.1.25, HYDROSOFT NAVCAD 4.23.0061, S loa210@freemail.gr NZ Computing 0 01-21-2006 07:43 PM
Software Synoptics LattisRing 2715b Remote Software. Howard Huntley Cisco 1 08-27-2004 01:34 AM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57