![]() |
|
|
|||||||
![]() |
PERL - Pointer Artithmetic on DB_File values: i need it (or a very efficient solution for this problem) |
|
|
Thread Tools | Search this Thread |
|
|
#1 |
|
Hope that some perl guru will help me:
I've implemented an inverted index using PERL + DB_File (using hash, and b-tree). Therefore, i have my indexer and my searcher (a little search engine, indead). Given a word in the lexicon i stored in B-tree the corrispondent inverted list, using gap coding. For istance i have: word_r -> id_1r:freq_1r,id_2r:freq_2r,id_3r:freq_3r,........ , id_nr:freq_nr word_s -> ... where freqij is just the freq of word_j in document doc_i. gap encoding means that id_ij is expressed as difference with id_(i-1)j Now, i need to write some Cursor obj with state (call it postingCursor). The posting in fetched in a scalar and (a reference to it!) is saved in the Cursor. I need to write efficiently a function: $postingCursorOBJ->getNext($minID) which should return the next available (if any) id_lr >= minID. 1) I need some way to fetch from the disk, just the portion of inverted list involved in current operations. Do tie or DBI support this? 2) I need a way to unroll incrementally the inverted list ? Is this a case where pointer arithmethic could help ? or there is any efficent solution which i missing. I think that splitting it is the wrong way to do it. PS: the original problem is to find the common IDs, given two inverted lists L1 and L2 which store IDs sorted in ascending way. This should be done in O(min {n, m)) instead of O(m+n). Antonio |
|
|
|
|
#2 |
|
Posts: n/a
|
(Antonio) wrote in message news:<. com>...
> > PS: the original problem is to find the common IDs, given two inverted > lists > L1 and L2 which store IDs sorted in ascending way. This should be done > in > O(min {n, m)) instead of O(m+n). It seems that i found a solution using index and substr (it's my fault to forget about the importance of index !). A question: given a scalar, do index/substr access to the specified offset in O(1)? |
|