Go Back   Velocity Reviews > Newsgroups > PERL
User Name
Password
Register FAQ Members List Calendar Search Today's Posts Mark Forums Read

Reply

PERL - Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem)

 
Thread Tools Search this Thread
Old 07-08-2003, 09:48 AM   #1
Default Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem)


Hope that some perl guru will help me:

I've implemented an inverted index using PERL + DB_File (using hash,
and b-tree). Therefore, i have my indexer and my searcher (a little
search engine, indead).

Given a word in the lexicon i stored in B-tree the corrispondent
inverted list,
using gap coding. For istance i have:

word_r -> id_1r:freq_1r,id_2r:freq_2r,id_3r:freq_3r,........ ,
id_nr:freq_nr
word_s -> ...

where freqij is just the freq of word_j in document doc_i.

gap encoding means that id_ij is expressed as difference with
id_(i-1)j

Now, i need to write some Cursor obj with state (call it
postingCursor).
The posting in fetched in a scalar and (a reference to it!) is saved
in the Cursor.

I need to write efficiently a function:

$postingCursorOBJ->getNext($minID)

which should return the next available (if any) id_lr >= minID.

1) I need some way to fetch from the disk, just the portion of
inverted list involved in current operations. Do tie or DBI support
this?

2) I need a way to unroll incrementally the inverted list ? Is this a
case
where pointer arithmethic could help ? or there is any efficent
solution
which i missing. I think that splitting it is the wrong way to do it.

PS: the original problem is to find the common IDs, given two inverted
lists
L1 and L2 which store IDs sorted in ascending way. This should be done
in
O(min {n, m)) instead of O(m+n).


Antonio
  Reply With Quote
Old 07-09-2003, 08:25 AM   #2
Antonio
 
Posts: n/a
Default Re: Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem)

(Antonio) wrote in message news:<. com>...
>
> PS: the original problem is to find the common IDs, given two inverted
> lists
> L1 and L2 which store IDs sorted in ascending way. This should be done
> in
> O(min {n, m)) instead of O(m+n).


It seems that i found a solution using index and substr (it's my fault
to forget about the importance of index !).

A question: given a scalar, do index/substr access to the specified offset
in O(1)?
  Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Forum Jump