On May 25, 10:41*pm, (E-Mail Removed) wrote:

> So, basically, V = (v_1, v_2, ... , v_{k-1}, v_k) can be regarded as

> an abstract, k-dimensional vector, right?
Yes.

> If I understand your revised problem statement correctly, what you

> really want to do is build a graph of these vectors, where graph

> adjacency is equivalent to adjacency in your sense. *That is, imagine

> you have V_1, V_2, ... , V_n all sitting out in front of you,

> represented abstractly as simple points. *Draw a line between V_i and

> V_j if they are "adjacent" in your sense of the word. *What you have

> then is a graph structure where your version of adjacency exactly

> corresponds to graph adjacency. *Then, in your language, a stratum is

> simply a path in this graph, and finding those is easy.
You're solving an earlier part of the problem which I call stratum

generation. I've never thought to use a graph representation for

stratum generation -- very interesting, and I'll pursue it. Would you

be willing to outline how you'd do it here?

I've had fun "breeding" strata using genetic algorithms, and a lot of

interesting ideas came out of that experiment, in particular the

utility of building randomly shaped strata to do indirect estimates

(the objective of the set arithmetic described here). I used GA's

because I think that the space of possible strata is too big to search

by brute force.

I'd still like to have the graph representation for the brute force

solution. Or, in another random algorithm, the graph could be a

sampling frame from which random paths could be pulled.

In a real dataset, some of the strata will contain adequate data to

make an estimate, and these are called valid strata. Other legal

strata (as shown by a path through the graph) may not have adequate

data to make an estimate (thus, legal but invalid).

We've done this problem by hand: observing that we can estimate a

stratum (1,2,3) and another (2,3) but not (1). So

E(1) = E(1,2,3) - E(2,3)

There are issues with the variance of E(1), but that's a very

different problem. Using the valid strata E(1,2,3) and E(2,3) we've

made an indirect estimate of E(1).

I'm looking for an algorithm which automates the search for

combinations like the one above. Given a set of valid strata[1], yield

combinations of valid strata which, when combined via union or

difference[2], evaluate to a stratum (adjacent but possibly not

valid). The combination of valid strata is called an indirect

estimate

[1] strata can be generated via many methods; they all end up in the

same database

[2] Note that the set of operations is shrinking

Thanks in advance for your thoughts -- PB.