Uniquely identifying each & every html template

 
 
Ferrous Cranus
Guest
Posts: n/a
 
      01-21-2013
On Saturday, 19 January 2013 11:00:15 AM UTC+2, Dave Angel wrote:
> On 01/19/2013 03:39 AM, Ferrous Cranus wrote:
> > On Saturday, 19 January 2013 12:09:28 AM UTC+2, Dave Angel wrote:
> >> I don't understand the problem. A trivial Python script could scan
> >> through all the files in the directory, checking which ones are missing
> >> the identifier, and rewriting the file with the identifier added.
> >>
> >> So, since you didn't come to that conclusion, there must be some other
> >> reason you don't want to edit the files. Is it that the real sources
> >> are elsewhere (e.g. Dreamweaver), and whenever one recompiles those
> >> sources, these files get replaced (without identifiers)?
> >
> > Exactly. Files get modified/updated, thus the embedded identifier will be missing each time. So, relying on embedding code in the html template content is not practical.
> >
> >> If that's the case, then I figure you have about 3 choices:
> >> 1) use the file path as your key, instead of requiring a number
> >
> > No, I cannot, because it would mess things up at a later time when I, for example:
> >
> > 1. mv name.html othername.html (document's filename altered)
> > 2. mv name.html /subfolder/name.html (document's filepath altered)
> >
> > Hence, new database counters will be created for each of the above actions, therefore I will have 2 counters for the same file, and the latter one will start from zero.
> >
> > Pros: If the file's contents get updated, that won't affect the counter.
> > Cons: If the filepath is altered, then duplication will happen.
> >
> >> 2) use a hash of the page (e.g. md5) as your key. Of course this could
> >> mean that you get a new value whenever the page is updated. That's good
> >> in many situations, but you don't give enough information to know if
> >> that's desirable for you or not.
> >
> > That sounds nice! A hash is a mathematical algorithm that produces a unique number after analyzing each file's contents? But then again, what if the html template gets updated? That update will create a new hash for the file, hence another counter will be created for the same file: the same end result as solution (1).
> >
> > Pros: If the filepath is altered, that won't affect the counter.
> > Cons: If the file's contents get updated, then duplication will happen.
> >
> >> 3) Keep an external list of filenames, and their associated id numbers.
> >> The database would be a good place to store such a list, in a separate table.
> >
> > I did not understand that solution.
> >
> > We need to find a way so that even IF:
> >
> > (filepath gets modified && file contents get modified) simultaneously, the counter will STILL retain its value.
>
> You don't yet have a programming problem, you have a specification
> problem. Somehow, you want a file to be considered "the same" even when
> it's moved, renamed and/or modified. So all files are the same, and you
> only need one id.
>
> Don't pick a mechanism until you have a self-consistent spec.



I do have the specification.

An .html page must retain its database counter value even if it is:

(renamed && moved && contents altered)


[original attributes of the file]:

filename: index.html
filepath: /home/nikos/public_html/
contents: <html> Hello </html>

[get modified to]:

filename: index2.html
filepath: /home/nikos/public_html/folder/subfolder/
contents: <html> Hello, people </html>


The file is still the same, even though its attributes got modified.
We want the counter.py script to still be able to "identify" the .html page, and hence its counter value, so that the counter gets incremented properly.
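
The trade-offs weighed above can be checked directly. Below is a minimal sketch, assuming Python 3 and a scratch directory standing in for /home/nikos/public_html/. The helper names path_key, content_key and demo are made up for illustration and are not part of counter.py.

```python
import hashlib
import os
import tempfile


def path_key(path):
    """Key a page by its absolute path (choice 1 in the quoted list)."""
    return os.path.abspath(path)


def content_key(path):
    """Key a page by the MD5 digest of its bytes (choice 2 in the quoted list)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def demo():
    """Rename, move and edit a page, then report which keys survived."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "index.html")
        with open(old, "w") as f:
            f.write("<html> Hello </html>")
        before = (path_key(old), content_key(old))

        # Rename and move the page, then alter its contents.
        new_dir = os.path.join(root, "folder", "subfolder")
        os.makedirs(new_dir)
        new = os.path.join(new_dir, "index2.html")
        os.rename(old, new)
        with open(new, "w") as f:
            f.write("<html> Hello, people </html>")
        after = (path_key(new), content_key(new))

    return {
        "path_key_survived": before[0] == after[0],
        "content_key_survived": before[1] == after[1],
    }


print(demo())
```

Both entries come back False: once the page is renamed, moved and edited, neither the path nor the content hash still points at the old counter.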
 
Chris Angelico
Guest
Posts: n/a
 
      01-21-2013
On Mon, Jan 21, 2013 at 6:08 PM, Ferrous Cranus <> wrote:
> An .html page must retain its database counter value even if it is:
>
> (renamed && moved && contents altered)


Then you either need to tag them in some external way, or have some
kind of tracking operation - for instance, if you require that all
renames/moves be done through a script, that script can update its
pointer. Otherwise, you need magic, and lots of it.

ChrisA
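
For what it's worth, the "tracking operation" idea could look roughly like the following. It is only a sketch under assumptions: SQLite stands in for whatever database counter.py really uses, the table names pages and counters and the functions open_db, record_hit and tracked_move are hypothetical, and it only helps if every rename/move actually goes through tracked_move.

```python
import os
import shutil
import sqlite3


def open_db(db_path=":memory:"):
    """Create two hypothetical tables: path -> id, and id -> hit count."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counters ("
        "page_id INTEGER PRIMARY KEY REFERENCES pages(id), "
        "hits INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_hit(conn, path):
    """Register the page on first sight, bump its counter, return the total."""
    path = os.path.abspath(path)
    conn.execute("INSERT OR IGNORE INTO pages (path) VALUES (?)", (path,))
    (page_id,) = conn.execute(
        "SELECT id FROM pages WHERE path = ?", (path,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO counters (page_id, hits) VALUES (?, 0)",
        (page_id,),
    )
    conn.execute(
        "UPDATE counters SET hits = hits + 1 WHERE page_id = ?", (page_id,)
    )
    conn.commit()
    (hits,) = conn.execute(
        "SELECT hits FROM counters WHERE page_id = ?", (page_id,)
    ).fetchone()
    return hits


def tracked_move(conn, old_path, new_path):
    """Move the file and repoint its row, so the id and counter are kept."""
    old_path = os.path.abspath(old_path)
    new_path = os.path.abspath(new_path)
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    shutil.move(old_path, new_path)
    conn.execute(
        "UPDATE pages SET path = ? WHERE path = ?", (new_path, old_path)
    )
    conn.commit()
```

The counter hangs off the numeric id rather than the path or the contents, so editing the page changes nothing, and a move done through tracked_move only rewrites the path column. A move done behind the script's back still loses the link.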
 
Ferrous Cranus
Guest
Posts: n/a
 
      01-21-2013
On Monday, 21 January 2013 9:20:15 AM UTC+2, Chris Angelico wrote:
> On Mon, Jan 21, 2013 at 6:08 PM, Ferrous Cranus <> wrote:
> > An .html page must retain its database counter value even if it is:
> >
> > (renamed && moved && contents altered)
>
> Then you either need to tag them in some external way, or have some
> kind of tracking operation - for instance, if you require that all
> renames/moves be done through a script, that script can update its
> pointer. Otherwise, you need magic, and lots of it.
>
> ChrisA


This Python script acts upon websites that other people use, and
each html template has been written with a different tool (Notepad++, Dreamweaver, Joomla).

Renames and moves are performed by the website owners, either via shell access or via cPanel.

That being said, I have no control over HOW and WHEN users alter their html pages.
 
Chris Angelico
Guest
Posts: n/a
 
      01-21-2013
On Mon, Jan 21, 2013 at 8:19 PM, Ferrous Cranus <> wrote:
> This Python script acts upon websites that other people use, and
> each html template has been written with a different tool (Notepad++, Dreamweaver, Joomla).
>
> Renames and moves are performed by the website owners, either via shell access or via cPanel.
>
> That being said, I have no control over HOW and WHEN users alter their html pages.


Then I recommend investing in some magic. There's an old-established
business JW Wells & Co, Family Sorcerers. They've a first-rate
assortment of magic, and for raising a posthumous shade with effects
that are comic, or tragic, there's no cheaper house in the trade! If
anyone anything lacks, he'll find it all ready in stacks, if he'll
only look in on the resident Djinn, number seventy, Simmery Axe!

Seriously, you're asking for something that's beyond the power of
humans or computers. You want to identify that something's the same
file, without tracking the change or having any identifiable tag.
That's a fundamentally impossible task.

ChrisA
 
Ferrous Cranus
Guest
Posts: n/a
 
      01-21-2013
On Monday, 21 January 2013 11:31:24 AM UTC+2, Chris Angelico wrote:
> On Mon, Jan 21, 2013 at 8:19 PM, Ferrous Cranus <> wrote:
> > This Python script acts upon websites that other people use, and
> > each html template has been written with a different tool (Notepad++, Dreamweaver, Joomla).
> >
> > Renames and moves are performed by the website owners, either via shell access or via cPanel.
> >
> > That being said, I have no control over HOW and WHEN users alter their html pages.
>
> Then I recommend investing in some magic. There's an old-established
> business JW Wells & Co, Family Sorcerers. They've a first-rate
> assortment of magic, and for raising a posthumous shade with effects
> that are comic, or tragic, there's no cheaper house in the trade! If
> anyone anything lacks, he'll find it all ready in stacks, if he'll
> only look in on the resident Djinn, number seventy, Simmery Axe!
>
> Seriously, you're asking for something that's beyond the power of
> humans or computers. You want to identify that something's the same
> file, without tracking the change or having any identifiable tag.
> That's a fundamentally impossible task.


No, it is difficult but not impossible.
It just cannot be done by tagging the file by:

1. filename
2. filepath
3. hash (math algorithm producing a string based on the file's contents)

We need another way to identify the file WITHOUT using the above attributes.
 
Oscar Benjamin
Guest
Posts: n/a
 
      01-21-2013
On 21 January 2013 12:06, Ferrous Cranus <> wrote:
> On Monday, 21 January 2013 11:31:24 AM UTC+2, Chris Angelico wrote:
>>
>> Seriously, you're asking for something that's beyond the power of
>> humans or computers. You want to identify that something's the same
>> file, without tracking the change or having any identifiable tag.
>>
>> That's a fundamentally impossible task.
>
> No, it is difficult but not impossible.
> It just cannot be done by tagging the file by:
>
> 1. filename
> 2. filepath
> 3. hash (math algorithm producing a string based on the file's contents)
>
> We need another way to identify the file WITHOUT using the above attributes.


This is a very old problem (still unsolved I believe):
http://en.wikipedia.org/wiki/Ship_of_Theseus


Oscar
 
Joel Goldstick
Guest
Posts: n/a
 
      01-21-2013
This is trolling, Ferrous. You are a troll. Go away.


On Mon, Jan 21, 2013 at 7:39 AM, Oscar Benjamin <> wrote:

> On 21 January 2013 12:06, Ferrous Cranus <> wrote:
> > On Monday, 21 January 2013 11:31:24 AM UTC+2, Chris Angelico wrote:
> >>
> >> Seriously, you're asking for something that's beyond the power of
> >> humans or computers. You want to identify that something's the same
> >> file, without tracking the change or having any identifiable tag.
> >>
> >> That's a fundamentally impossible task.
> >
> > No, it is difficult but not impossible.
> > It just cannot be done by tagging the file by:
> >
> > 1. filename
> > 2. filepath
> > 3. hash (math algorithm producing a string based on the file's contents)
> >
> > We need another way to identify the file WITHOUT using the above attributes.
>
> This is a very old problem (still unsolved I believe):
> http://en.wikipedia.org/wiki/Ship_of_Theseus
>
> Oscar
> --
> http://mail.python.org/mailman/listinfo/python-list




--
Joel Goldstick
http://joelgoldstick.com

 