Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > Database usage best practices

Reply
Thread Tools

Database usage best practices

 
 
Rainer Weikusat
Guest
Posts: n/a
 
      09-26-2012
I just came accross the following statement in the DBI::Class
documentation,

,----
| Class:BI provides a convenient abstraction layer to a database.
|
| It not only provides a simple database to object mapping layer, but
| can be used to implement several higher order database functions
| (triggers, referential integrity, cascading delete etc.), at the
| application level, rather than at the database.
|
| This is particularly useful when using a database which doesn't
| support these (such as MySQL), or when you would like your code to be
| portable across multiple databases which might implement these things
| in different ways.
|
| In short, Class:BI aims to make it simple to introduce 'best
| practice' when dealing with data stored in a relational database.
`----

I strongly disagree with the opinion that it would be 'best practice'
to treat a RDBMS as 'dumb' system for storing structured data in
binary files and reimplement all the features it already has in
application code on top of it. This rather strikes me as 'back to the
1960!' idea which likely comes from the fact that RDBMSes, originally
supposed to enable people to perform operations on datasets without
having to learn programming in some imperative language first, were so
successful that people who already know how to program in an imperative
language are more or less forced to use them but - of course - they
don't have the slightest interest in actually learning how to so do
efficiently, especially if this means 'learning a second' (or even
more than 'the second') other programming language.

Ironically, this step backwards in the last century markets itself as
modern ...
 
Reply With Quote
 
 
 
 
Charlton Wilbur
Guest
Posts: n/a
 
      09-27-2012
>>>>> "RW" == Rainer Weikusat <(E-Mail Removed)> writes:

RW> I strongly disagree with the opinion that it would be 'best
RW> practice' to treat a RDBMS as 'dumb' system for storing
RW> structured data in binary files and reimplement all the features
RW> it already has in application code on top of it.

Point the 1st: it depends very much on the DB. If you're using MySQL,
the only sane path is to treat it as a dumb system and reimplement the
features you need on top of it.

Point the 2nd: you're treating this as a black and white issue. It's
not whether an ORM layer is saintly or wicked, it's what costs you pay
in terms of performance and maintainability versus what benefits you get
in terms of expressivity, error-resistance, and risk of change when you
include an ORM layer.

In my experience, if you have an application large enough to consider in
terms of MVC that uses a database as a data store in any but the most
trivial of manners, you are a moron if you *don't* use an ORM layer. In
particular, this architecture allows new features in the application to
be developed entirely in software, and then once the desired behavior is
understood and stable, the logic can be pushed down into the database.
The risk and cost of change at levels higher than the ORM layer is much
lower than the risk and cost of change at levels below the ORM layer,
and this architecture offers a way to mitigate that risk.

RW> This rather strikes me as 'back to the 1960!' idea which likely
RW> comes from the fact that RDBMSes, originally supposed to enable
RW> people to perform operations on datasets without having to learn
RW> programming in some imperative language first, were so
RW> successful that people who already know how to program in an
RW> imperative language are more or less forced to use them but - of
RW> course - they don't have the slightest interest in actually
RW> learning how to so do efficiently, especially if this means
RW> 'learning a second' (or even more than 'the second') other
RW> programming language.

I'm sure waving your dick around like this is very pleasant, but please
tuck it back in your pants; people are pointing and laughing.

Charlton


--
Charlton Wilbur
http://www.velocityreviews.com/forums/(E-Mail Removed)
 
Reply With Quote
 
 
 
 
Rainer Weikusat
Guest
Posts: n/a
 
      09-28-2012
Charlton Wilbur <(E-Mail Removed)> writes:

[...]

> I'm sure waving your dick around like this is very pleasant, but please
> tuck it back in your pants; people are pointing and laughing.


I would like to add an additional metapoint: The opinions of worthless
people are worthless.

 
Reply With Quote
 
Rainer Weikusat
Guest
Posts: n/a
 
      09-28-2012
Mladen Gogala <(E-Mail Removed)> writes:
> On Thu, 27 Sep 2012 14:45:52 -0400, Charlton Wilbur wrote:
>> Point the 2nd: you're treating this as a black and white issue. It's
>> not whether an ORM layer is saintly or wicked, it's what costs you pay
>> in terms of performance and maintainability versus what benefits you get
>> in terms of expressivity, error-resistance, and risk of change when you
>> include an ORM layer.

>
> How do you measure expressivity and error resistance? How would you even
> define expressivity? I am not sure what is the connection between risk
> and ORM, so I will not comment on this one.


I don't know what 'expressivity' is supposed to be but I have some
ideas of 'error resistance' for program interacting with databases:
Usually, this means catching database errors and deciding on a
sensible recovery strategy. This is especially important for programs
which are supposed to run continously without supervision. In Perl,
this relatively easy: DBI supports defining a database error
handler. This handler can throw an exception (meaning, an exception
object containing information about the error, not something like
"Can't database!"). This exception can be caught at a suitable place
in the application code. Provided the problem seems transient, the
usually sensible way to deal with that it to disconnect from the
database, reconnect and re-execute the query at the front of the query
queue. This is not anyhow 'magically' tied to 'ORM layers' and not
really difficult to implement.

In simpler cases, it is sufficient to fork the 'worker process' (this
job can also be performed by a special supervisor program) and do the
actual work in the child. Should the child hit a transient problem, it
exits with a non-zero exit code and after seeing that, the parent
simply forks again and the procedure starts over (I know this is
difficult to imagine but 'fork' is actually a useful primitive, not
just an encumbrance on the way to 'execute another program' ...).

[...]

>> In my experience, if you have an application large enough to consider in
>> terms of MVC that uses a database as a data store in any but the most
>> trivial of manners, you are a moron if you *don't* use an ORM layer. In

>
> Oh yes, ORM layer is a good thing, it enables to program very quickly and
> generates classes for you. ORM is usually accompanied by an application
> generator, like Groovy on Grails (Hibernate) which means that the
> projects will be done much more quickly.


This is simply not true: Over the lifetime of a program, the amortized
impplementation cost of every infrastructure facility contained in it
is zero provided that it actually works and works as it would need to
work. As a practical example, the largest Perl program I'm currently
working with is just about to break through the 10,000 LOC barrier
(I've done a couple of large ones in the past). This includes a
database interface layer designed to be suitable for this particular
program and writing that has maybe costed me something like a week
back in 2010. Presently, it amounts to 6% of the code (678 lines) and
something like this is by far too trivial to warrant downloading
seriously large amounts of unknown code 'from the internet'. In
exchange for this minor effort (and some assorted fixes/ enhancements
to the DBD:g module), this has been working 24x7 in dozens of
installations world-wide (which is not much in absolute terms but a
fair lot for a single person to support) without ever causing the
slightest headache to me or the people who use this code. In contrast
to that, 'the Java team' (larger than 1 person) uses Hibernate and
they've certainly meanwhile spend as much time as I spent writing this
code with debugging, investigating and working-around bugs/ quirks in
the ORM. These chores are still ongoing and in exchange for that, they
got a technically inferior solution both from an 'error resilience'
and a user experience standpoint (of course, it can be argued that the
solution to any problem really ought to be 'download more stuff from
the internet' .
 
Reply With Quote
 
Rainer Weikusat
Guest
Posts: n/a
 
      09-28-2012
Rainer Weikusat <(E-Mail Removed)> writes:

[Hibernate]

> in exchange for that, they got a technically inferior solution both
> from an 'error resilience' and a user experience standpoint (of
> course, it can be argued that the solution to any problem really
> ought to be 'download more stuff from the internet' .


Some details on this:

By default, the ORM which happens to be used here does not support
reconnecting to the database after an existing connection was severed
for whatever reason. Should this happen, the UI code becomes dead in
water until the corresponding process is restarted. Meanwhile, this
default policy has been changed and the program survives TCP-related
malaises. But this didn't happen until it caused problems for
customers and understandingly so: The theory behind using third-party
developed infrastructure code is that the people who did that are 'the
experts' in this domain and that a poor application programmer should
be happy that he can concentrate on his problems instead of having to
learn how to tame 'wild beasts' like a RDBMS. Because of this, people
who use this code naturally tend to leaving 'the defaults' alone.

Also by default (or at least when being used in the 'obvious,
straight-forward way') the ORM performs database operations
synchronously and 'the user session' becomes unresponsive until they
are completed. In contrast to this, the Perl program I mentioned in
the previous posting has been designed as 'traditional'
single-threaded, event-driven UNIX(*) process whose 'processing core'
is an I/O multiplexing loop (since it has lots of other stuff to do
than interact with the database). Because of this, it uses the
PostgreSQL asynchronous query processing interface. At the application
level, this means that a query is 'started' based on a query object
and some parameters for this particular invocation and once it has
been completed, a suitable 'continuation' will be invoked (while I
still think the way 'continuations' are supposed to work in scheme is
completely mad, the concept/ term itself is IMHO very useful). Since
Perl supports closures, this is an easy thing to do. The same can very
likely be accomplished with 'the ORM' as well, at the very least by
employing threading in a suitable way, but it wasn't done: The
difference is really that I started with designing how 'the program',
considering its purpose, was generally supposed to operate and then
started looking for already existing code which could be helpful for
accomplishing this instead of starting with "download what everybody
else uses" and then code according the properties it happens to have.

The morale of this is that 'complex abstraction layers' are only
really useful when the people using them already have the knowledge
and skills they are not supposed to need because of the abstraction
layer.

A note the 'dicksizing' auto-responder: I understand that you are
usually motivated by showing off, however, this is not necessarily
true for everyone else. I try to avoid using my own experiences as
an example as hard as possible exactly to avoid soliciting these kind
of useless retorts. But this isn't always possible and I'm not really
convinced that it is sensible.
 
Reply With Quote
 
Kaz Kylheku
Guest
Posts: n/a
 
      09-28-2012
On 2012-09-27, Mladen Gogala <(E-Mail Removed)> wrote:
> How do you measure expressivity and error resistance? How would you even
> define expressivity?


Expressivity is the extent to which you can express the solution to a problem
in terms of only the symbols and language of the problem domain. Let us
call those relevant symbols.

So for instance if we're solving some problem in linguistics, but stuff like
"malloc(sizeof *node * n)" occurs in the solution, then that takes away
from the expressivity. The entity "node" perhaps represents something in the
problem domain, and perhaps n also (being the number of such things), but
malloc, sizeof, and the * dereferencing operator are three irrelevant symbols,
and so is the * multiplicative operator, because it's involved in a memory
management calculation which is irrelevant to the problem domain.

To obtain a measure of expressivity, we could count the total number of symbols
in the code, excluding any punctuation. That yields a denominator by which we
can divide the number of just the relevant symbols to obtain a fraction.

Expressivity is, intuitively, a very real, concrete concept which we can
readily recognize, and that tells us we can put a number on it, just like we
can put a number on how funny a joke is, or how moving is a symphony.
 
Reply With Quote
 
Rainer Weikusat
Guest
Posts: n/a
 
      09-28-2012
Kaz Kylheku <(E-Mail Removed)> writes:
> On 2012-09-27, Mladen Gogala <(E-Mail Removed)> wrote:
>> How do you measure expressivity and error resistance? How would you even
>> define expressivity?

>
> Expressivity is the extent to which you can express the solution to a problem
> in terms of only the symbols and language of the problem domain. Let us
> call those relevant symbols.
>
> So for instance if we're solving some problem in linguistics, but stuff like
> "malloc(sizeof *node * n)" occurs in the solution, then that takes away
> from the expressivity. The entity "node" perhaps represents something in the
> problem domain, and perhaps n also (being the number of such things), but
> malloc, sizeof, and the * dereferencing operator are three irrelevant symbols,
> and so is the * multiplicative operator, because it's involved in a memory
> management calculation which is irrelevant to the problem domain.


It is irrelevant 'to the problem domain' because 'the problem' is
different from the 'the solution'. Because of this, a problem can be
stated abstractly while any technical solution to the problem will
need to make use of existing tools suitable for solving problems. Eg,
let's assume the problem is "I want a coffee!". Neither the ground
coffee beans nor the filter nor the water kettle, the water itself,
the desk all of this is placed on, the material which was used to
construct this desk nor even the coffeepot I employ in order to drink
the result has any 'problem domain' relation to the problem
itself. They are just necessary (or helpful) parts of a technical
solution to it.






 
Reply With Quote
 
Rainer Weikusat
Guest
Posts: n/a
 
      09-28-2012
Rainer Weikusat <(E-Mail Removed)> writes:

[...]

> let's assume the problem is "I want a coffee!". Neither the ground
> coffee beans nor


I was actually being to conservative: As demonstrated by
' !Iaaceeffnotw', stating the problem already requires a lot of things
with absolutely no inherent relation to it. Not even the morphemes or
the sounds the spoken sentence would be composed have any: All just
perfectly abitrary conventions forced onto the poor users of language
because of 'technicalities' of the solution domain.

 
Reply With Quote
 
Kaz Kylheku
Guest
Posts: n/a
 
      09-28-2012
On 2012-09-28, Rainer Weikusat <(E-Mail Removed)> wrote:
> Kaz Kylheku <(E-Mail Removed)> writes:
>> On 2012-09-27, Mladen Gogala <(E-Mail Removed)> wrote:
>>> How do you measure expressivity and error resistance? How would you even
>>> define expressivity?

>>
>> Expressivity is the extent to which you can express the solution to a problem
>> in terms of only the symbols and language of the problem domain. Let us
>> call those relevant symbols.
>>
>> So for instance if we're solving some problem in linguistics, but stuff like
>> "malloc(sizeof *node * n)" occurs in the solution, then that takes away
>> from the expressivity. The entity "node" perhaps represents something in the
>> problem domain, and perhaps n also (being the number of such things), but
>> malloc, sizeof, and the * dereferencing operator are three irrelevant symbols,
>> and so is the * multiplicative operator, because it's involved in a memory
>> management calculation which is irrelevant to the problem domain.

>
> It is irrelevant 'to the problem domain' because 'the problem' is
> different from the 'the solution'. Because of this, a problem can be
> stated abstractly while any technical solution to the problem will
> need to make use of existing tools suitable for solving problems. Eg,
> let's assume the problem is "I want a coffee!". Neither the ground
> coffee beans nor the filter nor the water kettle, the water itself,
> the desk all of this is placed on, the material which was used to
> construct this desk nor even the coffeepot I employ in order to drink
> the result has any 'problem domain' relation to the problem
> itself. They are just necessary (or helpful) parts of a technical
> solution to it.


Those things may be necessary and helpful, but they are not expressive.
(Well, they are "espressive", haha.)

The details of the process for brewing coffee is irrelevant to the problem
domain of "I would like a coffee". The relevant concepts are: are what kind of
coffee I want: latte, cappucino, black. How dark a roast, how much sugar, if
any and so on.

The barista at the coffee shop speaks more or less the relevant language of
this problem domain, as does any well-designed coffee dispensing automaton.
They hide the irrelevant details from the consumer, giving us an abstract
cofee-making interface whereby we can "program" the coffee that we want
in "coffee shop lingo".

In computing, we can similarly hide the necessary and helpful, but irrelevant
concepts. Those are not always bits, bytes, pointers and memory management.

For instance "Low pass filter these samples with a cutoff frequency of
4000 Hz, rolling off at 18 db per octave" is more expressive than constructing
a matrix full of coefficients and then explicitly munging the data
trhough it.
 
Reply With Quote
 
Kaz Kylheku
Guest
Posts: n/a
 
      09-28-2012
On 2012-09-28, Mladen Gogala <(E-Mail Removed)> wrote:
> On Fri, 28 Sep 2012 18:12:30 +0000, Kaz Kylheku wrote:
>
>> Expressivity is the extent to which you can express the solution to a
>> problem in terms of only the symbols and language of the problem domain.
>> Let us call those relevant symbols.

>
> Thanks for the nice explanation. I haven't heard about the expressivity
> before, my spelling checker is still flagging it out. What does
> expressivity show me and what can I use it for? May I conclude anything
> about the program unit being studied, based on its expressivity?


Yes. For example, if you have sufficient expressivity, there isn't much
difference between the specification of a problem and the solution. And, of
course, correctness means that the solution implements the specification of the
problem, so if they are not so different, correctness is easier to verify.

So certain conclusions are easier to obtain from expressive code. But not
necessarily others. If you require conclusions about the details which
are hidden, that may be difficult to impossible.

For instance, suppose you want reach a conclusion about how much memory the
solution requires, but there is no explicit memory management. Then you have to
infer it somehow by knowing how it works "under the hood", or rely on
profiling tools.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Database Database Database Database scott93727@gmail.com Computer Information 0 09-27-2012 02:43 AM
DataBase DataBase DataBase DataBase scott93727@gmail.com Computer Information 0 09-26-2012 09:40 AM
Searching Database - Design Best Practices thomson ASP .Net 4 11-16-2007 06:19 PM
XML Database Best Practices Yehia A.Salam ASP .Net 9 05-31-2007 07:28 PM
Best practices for cookies in classic ASP - memory usage MartyN ASP General 2 05-30-2007 06:44 PM



Advertisments