# Re: artificial intelligence

Duncan Smith

 09-01-2003

"Arthur" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> >Maybe there was some notice about using Python in
> >geophysic and the symposium book in one journal, so there was a sudden
> >spat of, say, three people who bought both.
>
> You would think the parameter for a statistically significant sample size
> would be a fundamental concept in this kind of thing. And no action taken
> before one was determined to exist.
>

Statistical tests take sample sizes into account (so e.g. a larger effect
will tend to be statistically significant for a smaller sample size).
Sample size calcs. are more useful when you're in a position to determine
how large the sample will be.
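A minimal sketch of that trade-off, assuming a one-sample two-sided z-test with known standard deviation (the function names, the 0.05 threshold and the effect sizes are illustrative, not from the thread). It searches for the smallest sample size at which a given effect becomes significant:

```python
import math


def z_test_p_value(effect, sigma, n):
    """Two-sided p-value for a one-sample z-test of 'mean == 0'."""
    z = effect * math.sqrt(n) / sigma
    # For a standard normal Z, P(|Z| > z) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def smallest_significant_n(effect, sigma=1.0, alpha=0.05, max_n=10_000):
    """Smallest n at which the observed effect reaches significance."""
    for n in range(1, max_n + 1):
        if z_test_p_value(effect, sigma, n) < alpha:
            return n
    return None  # not significant for any n up to max_n


# Hypothetical effect sizes, in units of the standard deviation.
for effect in (0.2, 0.5, 1.0):
    print(effect, smallest_significant_n(effect))
```

The larger the effect, the smaller the sample needed, which is the point made above: the test itself already accounts for sample size.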

> OTOH, the concept of "coincidence" must necessarily be ruled out in AI, I
> would think.
>

Coincidence can't generally be ruled out, but you can look for relationships
in the (sample) data that would be unlikely to be present if the same
relationships weren't also present in the population.

> *Our* intelligence seems to give us a read as to where on the bell curve a
> particular event may lie, or at least some sense of when we are at an
> extreme on the curve. Which we call coincidence. AI would probably have a
> particularly difficult time with this concept - it seems to me.
>

Some people have a difficult time with (or are unaware of) "statistical
thinking". Maybe some of them are involved in AI? (Well, of course some of
them are.)

> Spam filtering software must need to tackle these kinds of issues.
>

It can do, and I've no doubt some of it does. Spam filtering is a
classification problem and can be handled in a variety of ways. It's
generally easy to come up with an overly complex set of rules / model that
will correctly classify sample data. But (as you know) the idea's to come
up with a set of rules / model that will correctly (as far as possible)
classify future data. As many spam filters use Bayesian methods, I would
guess that they might be fitted using Bayesian methods; in which case overly
complex models can be (at least partially) avoided through the choice of
prior, rather than significance testing.
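As a rough illustration of how a prior can rein in an over-fitted classifier, here is a toy naive Bayes sketch. It is not how any particular spam filter works; the training messages, the `train` function and the `prior_strength` parameter are all made up for the example. The pseudo-count plays the role of a symmetric Dirichlet prior on the word probabilities: the larger it is, the more the word estimates are pulled toward "no difference between spam and ham".

```python
import math
from collections import Counter


def train(messages, prior_strength=1.0):
    """Fit a toy naive Bayes model; messages is a list of (label, text).

    prior_strength (> 0) is a pseudo-count added to every word count,
    i.e. the posterior mean under a symmetric Dirichlet prior.
    Returns a function giving the log posterior odds of spam vs ham.
    """
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for label, text in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    totals = {label: sum(c.values()) for label, c in word_counts.items()}

    def word_prob(word, label):
        return (word_counts[label][word] + prior_strength) / (
            totals[label] + prior_strength * len(vocab)
        )

    def log_odds(text):
        score = math.log(
            (doc_counts["spam"] + prior_strength)
            / (doc_counts["ham"] + prior_strength)
        )
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log(word_prob(word, "spam"))
                score -= math.log(word_prob(word, "ham"))
        return score

    return log_odds


# Hypothetical training data.
MESSAGES = [
    ("spam", "cheap pills buy now"),
    ("spam", "buy cheap watches"),
    ("ham", "meeting notes attached"),
    ("ham", "lunch meeting tomorrow"),
]

weak_prior = train(MESSAGES, prior_strength=0.01)
strong_prior = train(MESSAGES, prior_strength=100.0)
print(weak_prior("buy cheap pills"), strong_prior("buy cheap pills"))
```

With the weak prior the model is very sure of itself on four messages; with the strong prior the same evidence moves the odds only slightly.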

What do Amazon use? My guess (unless it's something really naive) would be
association rules.
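For anyone unfamiliar with association rules, a minimal sketch follows. This is only a guess at the general technique, not Amazon's actual method; the baskets, thresholds and the `pair_rules` name are invented for the example. Each rule "A -> B" gets a support (how often A and B occur together), a confidence (how often B occurs given A) and a lift (confidence relative to B's overall frequency).

```python
from itertools import permutations


def pair_rules(baskets, min_support=0.2, min_confidence=0.5):
    """Return (a, b, support, confidence, lift) for single-item rules a -> b."""
    baskets = [set(b) for b in baskets]
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def support(*wanted):
        # Fraction of baskets containing every item in 'wanted'.
        return sum(1 for b in baskets if set(wanted) <= b) / n

    rules = []
    for a, b in permutations(items, 2):
        s_ab = support(a, b)
        if s_ab < min_support:
            continue  # too rare to say anything (could be coincidence)
        confidence = s_ab / support(a)
        lift = confidence / support(b)
        if confidence >= min_confidence:
            rules.append((a, b, s_ab, confidence, lift))
    return rules


# Hypothetical purchase histories.
BASKETS = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"B"}, {"C"}]

for a, b, s, c, l in pair_rules(BASKETS):
    print(f"{a} -> {b}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

The `min_support` threshold is the crude guard against the "three people bought both" problem raised earlier in the thread: rules resting on too few baskets are never reported.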

Duncan

> Art
>
>

Dennis Lee Bieber

 09-01-2003
Duncan Smith fed this fish to the penguins on Monday 01 September 2003
07:06 am:

>
> It can do, and I've no doubt some of it does. Spam filtering is a
> classification problem and can be handled in a variety of ways. It's
> generally easy to come up with an overly complex set of rules / model that
> will correctly classify sample data. But (as you know) the idea's to come
> up with a set of rules / model that will correctly (as far as possible)
> classify future data. As many spam filters use Bayesian methods, I
> would guess that they might be fitted using Bayesian methods; in which
> case overly complex models can be (at least partially) avoided through
> the choice of prior, rather than significance testing.
>
> What do Amazon use? My guess (unless it's something really naive)
> would be association rules.
>

If I may insert an off-the-cuff comment...

The goal of spam filtering is normally to reduce the amount of traffic
permitted through to the client.

However, Amazon's goal would seem to be to increase the potential
sales. Hence, I'd suspect their algorithm is rigged on a quite
optimistic mode (Hey, out of set A and set B, we have an overlap of
x... maybe we can increase x by suggesting that set A would like the
stuff from set B...)


Duncan Smith

 09-01-2003

"Dennis Lee Bieber" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Duncan Smith fed this fish to the penguins on Monday 01 September 2003
> 07:06 am:
>
>
> >
> > It can do, and I've no doubt some of it does. Spam filtering is a
> > classification problem and can be handled in a variety of ways. It's
> > generally easy to come up with an overly complex set of rules / model that
> > will correctly classify sample data. But (as you know) the idea's to come
> > up with a set of rules / model that will correctly (as far as possible)
> > classify future data. As many spam filters use Bayesian methods, I
> > would guess that they might be fitted using Bayesian methods; in which
> > case overly complex models can be (at least partially) avoided through
> > the choice of prior, rather than significance testing.
> >
> > What do Amazon use? My guess (unless it's something really naive)
> > would be association rules.
> >

> If I may insert an off-the-cuff comment...
>
> The goal of spam filtering is normally to reduce the amount of traffic
> permitted through to the client.
>
> However, Amazon's goal would seem to be to increase the potential
> sales. Hence, I'd suspect their algorithm is rigged on a quite
> optimistic mode (Hey, out of set A and set B, we have an overlap of
> x... maybe we can increase x by suggesting that set A would like the
> stuff from set B...)
>

(although I don't suppose they'd make it public). I'd guess that for 'very
low' and 'very high' values of x the increase in sales would be less than
for 'middling' values of x.

Duncan

Istvan Albert

 09-02-2003
Duncan Smith wrote:

> What do Amazon use? My guess (unless it's something really naive) would be
> association rules.

In my previous job I worked for a research group studying recommender
systems. We have our own, called MovieLens

http://www.movielens.org

Among others, we use a so-called item-item recommender.
We compute similarities between items, then look at a
given basket and, based on it, choose to recommend the most
similar items. The advantage is that item similarities are
more static, so they don't need to be recomputed as often.
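A rough sketch of the item-item idea, for illustration only. It is not the MovieLens code: the cosine measure over 0/1 co-occurrence vectors, the function names and the sample data are all assumptions. The similarity table is computed once up front, and recommending for a basket is then just a lookup and a sum.

```python
import math
from collections import defaultdict


def item_similarities(baskets):
    """Cosine similarity between items, each seen as a 0/1 vector over baskets."""
    count = defaultdict(int)  # number of baskets containing each item
    pair = defaultdict(int)   # number of baskets containing both items
    for basket in baskets:
        items = sorted(set(basket))
        for i in items:
            count[i] += 1
        for idx, i in enumerate(items):
            for j in items[idx + 1:]:
                pair[(i, j)] += 1
    sims = defaultdict(dict)
    for (i, j), both in pair.items():
        s = both / math.sqrt(count[i] * count[j])
        sims[i][j] = s
        sims[j][i] = s
    return sims


def recommend(basket, sims, k=3):
    """Rank items not in the basket by summed similarity to the basket's items."""
    scores = defaultdict(float)
    for item in basket:
        for other, s in sims.get(item, {}).items():
            if other not in basket:
                scores[other] += s
    return sorted(scores, key=lambda x: (-scores[x], x))[:k]


# Hypothetical viewing histories.
HISTORY = [
    ["alien", "blade runner"],
    ["alien", "blade runner", "amelie"],
    ["amelie", "chocolat"],
    ["alien", "blade runner"],
]

SIMS = item_similarities(HISTORY)  # the slow part, recomputed only occasionally
print(recommend(["alien"], SIMS))
```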

Istvan.