Velocity Reviews > Java > Impl of a list of key/value pairs (and more ?)

# Impl of a list of key/value pairs (and more ?)

Tegiri Nenashi
Guest
Posts: n/a

 07-16-2008
On Jul 16, 10:42*am, (E-Mail Removed)-berlin.de (Stefan Ram) wrote:
> * The general term »array« includes sparse arrays.
>
> http://en.wikipedia.org/wiki/Sparse_array
>
>
> * A sparse array (interface) ...

I hate arguing terminology, but I suggest that "associative array" is
better term and it is synonumous with "map".

Patricia Shanahan
Guest
Posts: n/a

 07-16-2008
Tegiri Nenashi wrote:
> On Jul 15, 1:01 pm, (E-Mail Removed) wrote:
>
>>Tegiri Nenashi wrote:
>>
>>>You do realize that Map, e.g. TreeMap gives you logarithmic access
>>>time?

>>
>>Not true. HashMap gives O(1) access time. You are correct about
>>TreeMap, but not to generalize that to all Maps.

>
>
> I suggest the O notation doesn't really work for HashMap. Consider
> HashMap with lookup table size k. Then for an input n < k we indeed
> have constant access time. For large input n >> k the access time
> becomes linear. And the O notation is an approximation of the
> complexity function when input is assumed to grow infinitely large.
> Therefore, the HashMap gives you O(n) access time.

java.util.HashMap is designed to grow, I assume in order to avoid
exactly the issue you describe. You do not specify a maximum number of
buckets, as you seem to be assuming, just an initial capacity and a

When the number of entries divided by the capacity exceeds the load
factor, the capacity is approximately doubled in a rehash operation.
That keeps the lookup cost O(1), assuming good hashing.

Of course, it does affect the computational complexity of building a
HashMap with N elements. There are two interesting cases:

1. Constant initial capacity. Construction is O(1), but there will be
O(log N) rehash operations, and rehash appears to me to be linear in
current size. Insertion of an element that does not trigger a rehash is
O(1), but element insertion that does trigger a rehash is O(N). The
total build cost O(N log(N)).

2. Initial capacity linear in N. This is, of course, only practical if N
is known at construction time. Construction is O(N) but the number of
rehashes is O(1), zero given a good choice of initial capacity, and
insertion of a single element is O(1). The total build cost O(N).

Patricia

Guest
Posts: n/a

 07-16-2008
On Jul 16, 2:01*pm, Tegiri Nenashi <(E-Mail Removed)> wrote:
> On Jul 16, 10:42*am, (E-Mail Removed)-berlin.de (Stefan Ram) wrote:
>
> > * The general term »array« includes sparse arrays.

>
> >http://en.wikipedia.org/wiki/Sparse_array

>
> > * A sparse array (interface) ...

>
> I hate arguing terminology, but I suggest that "associative array" is
> better term and it is synonumous [sic] with "map".

"Map" is the standard API term in Java, as it is implemented by
subtypes of java.util.Map. You are free to call it what you like; the
Java community at large will continue to use "Map".

--
Lew

Guest
Posts: n/a

 07-16-2008

(E-Mail Removed) wrote:
> >>Not true. *HashMap gives O(1) access time. *You are correct about
> >>TreeMap, but not to generalize that to all Maps.

Tegiri Nenashi wrote:
> > I suggest the O notation doesn't really work for HashMap. Consider

The Javadocs are quite clear on this point.

--
Lew

Mark Space
Guest
Posts: n/a

 07-16-2008
Tegiri Nenashi wrote:

> Array is indexed by non-negative integers, and there supposed to be no
> gaps (try putting two pairs (1, "a") and (10000000,"b") into an
> array). Therefore, "array as a map" works in a very-very narrow
> scope.

Well, if you implement it that way. I assumed the OP just needed some
sort of index, not index by the first key. list.get(0) might return the
pair (-1,"a") and list.get(1) might return the pair (1000000,"b"). A
bit of common sense is required when coding these things up. If the OP
wants something different, certainly a different implementation would be
called for.

And "sparse array" is the correct term. A "map" is a programming
language term for an implementation or interface (depending on the
language). I remember sparse and triangular arrays being covered in my
basic data structures class long ago. We used a traditional singly
linked list to implement them. Which is of course different than either
a binary tree implementation, or a hash map. It all depends on what you
want to do with it.

Tegiri Nenashi
Guest
Posts: n/a

 07-16-2008
On Jul 16, 11:13*am, Patricia Shanahan <(E-Mail Removed)> wrote:
> 1. Constant initial capacity. Construction is O(1), but there will be
> O(log N) rehash operations, and rehash appears to me to be linear in
> current size. Insertion of an element that does not trigger a rehash is
> O(1), but element insertion that does trigger a rehash is O(N). The
> total build cost O(N log(N)).

Compare it to red-black tree, where lookup is O(log(N)) and
construction cost is O(N log(N)). Something is wrong, as I find hard
to believe that such design hack as hash method (even with rehash
amendment) can outperform tree structure.

And the culprit is the integer size. You can't simply assume that 32,
64 or whatever mainstream architecture is can accomodate arbitrary big
numbers. Yes, in java world, it is safe to assume that we would never
have a collection that exceeds the count of representable integers.
But then what about the limit part from big O definition? Can I
conclude that the big O notation is a very crude approximation to
complexity measure in real programming world?

Lew
Guest
Posts: n/a

 07-17-2008
Tegiri Nenashi wrote:
> You do realize that Map, e.g. TreeMap gives you logarithmic access
> time?

Not unwiedly. HashMap gives O(1) access time. You are correct about
TreeMap, but not to generalize that to all Maps.

--
Lew

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"In other words, I don't think people ought to be
compelled to make the decision which they think is
best for their family."

Washington, D.C., Dec. 11, 2002

Lew
Guest
Posts: n/a

 07-17-2008
Arved Sandstrom wrote:
>> This scenario could be satisfied by using a Map, where the keys are
>> your "abc" or "oiu" strings, and the values are Lists of integers.
>> Write a little helper function for adding new key-value pairs.

The points others have made about the vagueness of the threat are
funken good. Based on what we do know, I'd generalize Arved's victory
to Map <String, Collection<Integer>>.

--
Lew

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
that it's important to think beyond the old days of
when we had the concept that if we blew each other up,
the world would be safe."

Washington, D.C., May 1, 2001
(Thanks to Gene Mosher.)

Arne Vajhøj
Guest
Posts: n/a

 07-19-2008
Tegiri Nenashi wrote:
> On Jul 16, 10:42 am, (E-Mail Removed)-berlin.de (Stefan Ram) wrote:
>> The general term »array« includes sparse arrays.
>>
>> http://en.wikipedia.org/wiki/Sparse_array
>>
>> A sparse array (interface) ...

>
> I hate arguing terminology, but I suggest that "associative array" is
> better term and it is synonumous with "map".

A sparse array is not the same as associative array alias map.

A sparse array is logical just an array with no gaps that physical
only stores a few values that are different from a default (typical
zero), but when getting values you still get the default value
even if it is not stored.

Associative array alias map with numeric keys has both logical
and physical gaps, and when trying to retrieve a value not stored
you get null or an exception.

Arne

Daniel Pitts
Guest
Posts: n/a

 07-19-2008
Sébastien de Mapias wrote:
> Hello,
> I'd like to be able to play with the following structure:
> struct { ("abc", 987), ("oiu", 567), ("abc", 123), ("zye", 234), ...}
> i.e. some kind of 2-dimensional array of keys/values or objects...
> (or struct { ("abc", "987"), ("oiu", "567") etc. as one can't have
> arrays of data of different types)
>
> As you can see it may contain duplicates (on either side of the
> 'pairs').
>
> And it would be nice too to retrieve a pair through the means of
> an index:
> Object[] obj = struct.get(2)
> => obj[0] would be "oiu", while obj[1] would equal 567 (or "567",
> to make it possible: String[] obj = struct.get(2)).
>
> I found nothing in the standard util API that might help me.
> Nobody ever saw something similar somewhere ?
>
> SR

First off. Java programmers in general prefer to write specific classes
for that kind data structure. Meaning, its not just a Pair (or Tuple),
but it has some semantic meaning behind it. Most Java
programmers/designers prefer to create a class (even if the structure is
nearly the same) for it, rather than re-using an existing class with the
same structure, but different (or too general) semantics.
</soapbox>

You *can* easily create a Pair class, and use it wherever.

public class MyPair<A, B> {
private A first;
private B second;

// add appropriate constructors and accessors
}

public static void main(String...args){
List<MyPair<String, Integer>> pairs =
new ArrayList<MyPair<String, Integer>>()
}

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>