Java - Re: A Logging data structure

 
Old 02-21-2009, 01:57 PM   #1
Default Re: A Logging data structure


anal_aviator wrote:
> Hi,
>
> I have about 40,000 blocks sequentially in a file. each block is 1024 bytes,
>
> My application can either read or write blocks, but what i need to do is
> keep the original file intact , and build up log , so that when a block is
> written , any further reads , read from 'that' block and not the original
> file.


Sounds like a cache. You could hold the updated blocks in
a hash table or tree or some such, keyed on block number, search
the cache first and refer to the original file only when the
cache doesn't hold the block of interest.
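
For what it's worth, here is a minimal sketch of that cache idea, assuming the 1024-byte blocks from the original post. The class name OverlayBlockStore and its methods are made up for illustration; rewritten blocks live in a HashMap keyed on block number and the original file is opened read-only.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;

/** Overlay of rewritten blocks on top of a read-only original file (hypothetical names). */
class OverlayBlockStore implements AutoCloseable {
    static final int BLOCK_SIZE = 1024;          // block size taken from the original post

    private final RandomAccessFile original;     // opened read-only, never modified
    private final Map<Integer, byte[]> updated = new HashMap<>(); // keyed on block number

    OverlayBlockStore(String path) throws IOException {
        this.original = new RandomAccessFile(path, "r");
    }

    /** Returns the current contents of a block: the cached rewrite if any, else the original. */
    byte[] read(int blockNo) throws IOException {
        byte[] cached = updated.get(blockNo);
        if (cached != null) {
            return cached.clone();               // defensive copy
        }
        byte[] buf = new byte[BLOCK_SIZE];
        original.seek((long) blockNo * BLOCK_SIZE);
        original.readFully(buf);
        return buf;
    }

    /** Records a rewrite in memory only; the original file is left untouched. */
    void write(int blockNo, byte[] data) {
        if (data.length != BLOCK_SIZE) {
            throw new IllegalArgumentException("block must be " + BLOCK_SIZE + " bytes");
        }
        updated.put(blockNo, data.clone());
    }

    @Override
    public void close() throws IOException {
        original.close();
    }
}
```

Note that this keeps only the latest contents of each block, so on its own it gives a snapshot, not a history.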

But at forty megabytes it's probably simpler just to make
a working copy of the original file and muck with the copy as
freely as you like. Maybe just read it all into memory and
hammer on it if persistence isn't needed.

> So basically we start off with a virgin file, which is the root, then log and
> continually repoint as blocks are written to some place else.
> The sticky problem is that we need to keep a history of the blocks that were
> re-written along with the data , that they held.


So you create another sequential file in which you log each
change as you make it. You'll surely want the pre-change data,
possibly the post-change data as well, maybe with time stamps
and other decorations of your choosing.
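
A rough sketch of such a sequential log, under assumptions of my own: the record layout (block number, time stamp, pre-change data, post-change data) and the names ChangeLog and Entry are invented for this example, and the blocks are the fixed 1024 bytes from the original post.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Append-only change log; the record layout is an assumption made for this sketch. */
class ChangeLog {
    static final int BLOCK_SIZE = 1024;

    /** One logged write: block number, time stamp, pre-change and post-change data. */
    static final class Entry {
        final int blockNo;
        final long timestampMillis;
        final byte[] before;
        final byte[] after;

        Entry(int blockNo, long timestampMillis, byte[] before, byte[] after) {
            if (before.length != BLOCK_SIZE || after.length != BLOCK_SIZE) {
                throw new IllegalArgumentException("blocks must be " + BLOCK_SIZE + " bytes");
            }
            this.blockNo = blockNo;
            this.timestampMillis = timestampMillis;
            this.before = before;
            this.after = after;
        }
    }

    /** Appends one fixed-size record to the end of the log file. */
    static void append(String logPath, Entry e) throws IOException {
        try (DataOutputStream out =
                 new DataOutputStream(new FileOutputStream(logPath, true))) {
            out.writeInt(e.blockNo);
            out.writeLong(e.timestampMillis);
            out.write(e.before);
            out.write(e.after);
        }
    }

    /** Reads every record back in the order it was written. */
    static List<Entry> readAll(String logPath) throws IOException {
        List<Entry> entries = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new FileInputStream(logPath))) {
            while (true) {
                int blockNo;
                try {
                    blockNo = in.readInt();
                } catch (EOFException end) {
                    break;                       // end of log reached
                }
                long ts = in.readLong();
                byte[] before = new byte[BLOCK_SIZE];
                byte[] after = new byte[BLOCK_SIZE];
                in.readFully(before);
                in.readFully(after);
                entries.add(new Entry(blockNo, ts, before, after));
            }
        }
        return entries;
    }
}
```

Because records are only ever appended, their order in the file is also the order of the writes across the whole file.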

> Ideally each block in the master file should have some sort of linked list/
> history attached to it, that not only shows the history of writes/reads to
> a given block, but maintains them in the context of r/w to the overall file.


Hunh? Didn't you say you needed to keep the original file
intact? How does that square with making changes to the file?
I don't get it.

You can record the reads as well as the writes in your log,
if you feel like it. (What is this: Some kind of security app
where you want to be able to prove that So-And-So peeked at
Such-And-Such's tax records outside office hours?) You'll need
to make some decisions about how to archive your ever-expanding
log, though.

> it is very much like a transactional file system, fortunately the number of
> writes will be minimal, so it is easily manageable from a storage point of
> view.


If you want the semantics of a transactional file system,
you might consider using -- forgive me if this idea is simply
too weird -- a transactional file system ...

> It must also be very fast , generally it should not get slower if a
> particular block starts to build up a chain of writes, the idea being that a
> read goes straight to the head of the last write to a particular block. (so
> a linked list is out), also maintenance should be minimal.


Sorry; I'm unable to decipher this last set of requirements.
What, exactly, must be "fast?" How "fast" must it be to qualify
as "very fast?" Are you talking about throughput or about
latency? And what do you mean by "maintenance" (programmer time,
reorg time, backup time, ...)? And how far are you prepared to
compromise on the other requirements to keep it "minimal?"

--
Eric Sosman
Old 02-21-2009, 03:43 PM   #2
Martin Gregorie
 
On Sat, 21 Feb 2009 08:57:03 -0500, Eric Sosman wrote:

> anal_aviator wrote:
>> it is very much like a transactional file system, fortunately the
>> number of writes will be minimal, so it is easily manageable from a
>> storage point of view.

>
> If you want the semantics of a transactional file system,
> you might consider using -- forgive me if this idea is simply too weird
> -- a transactional file system ...
>

So why not make it one and use a database table?

Each row would hold one copy of a block together with a two-field primary
key. The first field would be the sequence number of the original block
in its file and the second would start from zero for the original and be
incremented for each edited copy of the block. Add any more information
you might need, e.g. the edit timestamp and user name, and you're done.

Access is fast since the primary key index is only 40,000+ entries. Changes
are easily found by comparing the block keyed with n,k-1 with n,k and
change logs are easily extracted.

>> It must also be very fast , generally it should not get slower if a
>> particular block starts to build up a chain of writes,

>

The speed impact should be small:

select block_content from block_table where seqno=?
order by editno desc limit 1;

The rows are fairly small (a payload of only 1K bytes is nothing) so the
extract and sort needed to retrieve the latest edit should be fast and only
the required data would be returned to the JDBC client.
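
To illustrate why a lookup on a two-field key stays cheap as edits pile up, here is an in-memory stand-in (not a database and not JDBC) that assumes the key is held in an ordered index. The class VersionedBlockIndex and its methods are hypothetical; the names seqno and editno follow the query above.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a table keyed on (seqno, editno); illustration only. */
class VersionedBlockIndex {
    // Composite key packed into a long: high 32 bits = seqno, low 32 bits = editno.
    // Both values are assumed to be non-negative so the packed keys sort correctly.
    private final TreeMap<Long, byte[]> rows = new TreeMap<>();

    private static long key(int seqno, int editno) {
        return ((long) seqno << 32) | (editno & 0xFFFFFFFFL);
    }

    /** Stores a new copy of a block and returns its edit number (0 for the first copy). */
    int insert(int seqno, byte[] content) {
        int editno = latestEditNo(seqno) + 1;
        rows.put(key(seqno, editno), content.clone());
        return editno;
    }

    /** Highest edit number stored for seqno, or -1 if the block has no rows. */
    int latestEditNo(int seqno) {
        // One ordered-index seek: the greatest key not above (seqno, MAX_VALUE).
        Map.Entry<Long, byte[]> e = rows.floorEntry(key(seqno, Integer.MAX_VALUE));
        if (e == null || (int) (e.getKey() >>> 32) != seqno) {
            return -1;
        }
        return e.getKey().intValue();            // low 32 bits = editno
    }

    /** Content of the latest edit, or null if the block has no rows. */
    byte[] latest(int seqno) {
        int editno = latestEditNo(seqno);
        return editno < 0 ? null : rows.get(key(seqno, editno)).clone();
    }

    /** Content of one specific edit, or null if it does not exist. */
    byte[] get(int seqno, int editno) {
        byte[] row = rows.get(key(seqno, editno));
        return row == null ? null : row.clone();
    }
}
```

The cost of latest() depends on the total number of rows in the index, not on how many edits a particular block has accumulated. Whether a given database plans the query the same way is something to check against that product.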


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |


Old 02-22-2009, 02:21 AM   #3
steve
 
On Sat, 21 Feb 2009 21:57:03 +0800, Eric Sosman wrote
(in article <gnp17k$kpm$>):

> anal_aviator wrote:
>> Hi,
>>
>> I have about 40,000 blocks sequentially in a file. each block is 1024 bytes,
>>
>> My application can either read or write blocks, but what i need to do is
>> keep the original file intact , and build up log , so that when a block
>> is
>> written , any further reads , read from 'that' block and not the original
>> file.

>
> Sounds like a cache. You could hold the updated blocks in
> a hash table or tree or some such, keyed on block number, search
> the cache first and refer to the original file only when the
> cache doesn't hold the block of interest.
>
> But at forty megabytes it's probably simpler just to make
> a working copy of the original file and muck with the copy as
> freely as you like. Maybe just read it all into memory and
> hammer on it if persistence isn't needed.
>

Yep, we could do that, but it does not give us a history, only a 'snapshot'.


>> So basically we start off with a virgin file, which is the root, then log
>> and
>> continually repoint as blocks are written to some place else.
>> The sticky problem is that we need to keep a history of the blocks that
>> were
>> re-written along with the data , that they held.

>
> So you create another sequential file in which you log each
> change as you make it. You'll surely want the pre-change data,
> possibly the post-change data as well, maybe with time stamps
> and other decorations of your choosing.
>
>> Ideally each block in the master file should have some sort of linked list/
>> history attached to it, that not only shows the history of writes/reads
>> to a given block, but maintains them in the context of r/w to the overall
>> file.

>
> Hunh? Didn't you say you needed to keep the original file
> intact? How does that square with making changes to the file?
> I don't get it.

Basically the master file is the 'root', and writes would branch off in some
sort of data structure.


>
> You can record the reads as well as the writes in your log,
> if you feel like it. (What is this: Some kind of security app
> where you want to be able to prove that So-And-So peeked at
> Such-And-Such's tax records outside office hours?) You'll need
> to make some decisions about how to archive your ever-expanding
> log, though.
>
>> it is very much like a transactional file system, fortunately the number
>> of writes will be minimal, so it is easily manageable from a storage point
>> of view.

>
> If you want the semantics of a transactional file system,
> you might consider using -- forgive me if this idea is simply
> too weird -- a transactional file system ...
>

That was just an example to try and convey the idea. The only issue with a
TFS is that the history is trashed once the transactions are written to
disk.

>> It must also be very fast , generally it should not get slower if a
>> particular block starts to build up a chain of writes, the idea being that
>> a
>> read goes straight to the head of the last write to a particular block.
>> (so
>> a linked list is out), also maintenance should be minimal.

>
> Sorry; I'm unable to decipher this last set of requirements.
> What, exactly, must be "fast?" How "fast" must it be to qualify
> as "very fast?" Are you talking about throughput or about
> latency? And what do you mean by "maintenance" (programmer time,
> reorg time, backup time, ...)? And how far are you prepared to
> compromise on the other requirements to keep it "minimal?"


As fast as it can be, compared to other available solutions. Linked lists are
potentially slow when they get longer, because you have to chain down the
length of the transactions.
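
One possible shape for this, as a sketch only: keep each block's writes in an array-backed list so the latest write is the last element, and stamp every write with a global sequence number so the order across the whole file is preserved. The names BlockHistory, Version and globalSeq are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Per-block write history with constant-time access to the newest write (hypothetical). */
class BlockHistory {
    /** One write to a block, tagged with its position among all writes to the file. */
    static final class Version {
        final long globalSeq;
        final byte[] data;

        Version(long globalSeq, byte[] data) {
            this.globalSeq = globalSeq;
            this.data = data;
        }
    }

    private final Map<Integer, List<Version>> byBlock = new HashMap<>();
    private long nextSeq = 0;

    /** Appends a write to the block's history; older versions are kept. */
    void write(int blockNo, byte[] data) {
        byBlock.computeIfAbsent(blockNo, k -> new ArrayList<>())
               .add(new Version(nextSeq++, data.clone()));
    }

    /** Newest write to the block, or null if it was never rewritten (use the original file). */
    Version latest(int blockNo) {
        List<Version> history = byBlock.get(blockNo);
        return history == null ? null : history.get(history.size() - 1);
    }

    /** All writes to the block, oldest first, as a read-only view. */
    List<Version> history(int blockNo) {
        return Collections.unmodifiableList(
            byBlock.getOrDefault(blockNo, Collections.emptyList()));
    }
}
```

A read costs one hash lookup plus one indexed access however long the chain for that block has grown. This sketch is in-memory only; persistence would need something like a log file on top.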

steve
Old 02-22-2009, 04:40 AM   #4
Lew
 
Please do not top-post.

anal_aviator wrote:
> Yes It could be a solution, but it means cracking open a database, extra
> support infrastructure and something else to go wrong.


Don't be afraid of database programming. Once you get used to it, it isn't all
that hard. The built-in Java DB (a.k.a. "Derby") is fairly easy to set up,
comes free with Sun's JDK, and is well worth learning.

> I had looked at oracle [sic], but things just seemed to get more complicated and
> bigger.


Go with Java DB, then.

> I'm looking for a single machine solution, plus i [sic] have to keep network
> traffic to a minimum , since my 'monitored' data traffic is coming in over
> tcp


Databases need not add to network traffic, nor be excessively complicated.

I'm not saying that database is the answer for you necessarily, only that
complexity and network traffic concerns need not stop you from using one.

It is true that mastering database usage involves a learning curve, but simple
uses with simple table structures don't take all that long to put together, nor
do they require massive administration.

--
Lew

