MD5 Alive?

Discussion in 'Computer Security' started by JuiceMan, Aug 20, 2007.

  1. JuiceMan

    JuiceMan Guest

    I'm hoping I can find somebody who might know about the guts of MD5 ,
    really on a platform or implementation independent way.

    Background: I've got an existing web app (Perl/CGI) chugging along
    doing its thing. It's been happy for some time now. I'm relying on MD5
    to give me a unique value for a password, data (m/d/y), some other
    flags, prefereneces, or status as to what the user is doing all told
    about 70-80 characters. I'm not relying solely on MD5 for password
    authentication. I have other alogrithmns being used for that. This has
    nothing to do with on-line commerce. If the user submits a form either
    one or all of the following could be wrong: username, not todays date,
    my code may not be able to understand where they are. I find something
    wrong I reset user has to relog in and starts from the beginning. What
    has become a nice little artifact of this is that users have to relog
    in everyday.

    So I was young back then I choose to use authentication over the use
    of cookies. I'm not going to update this to change, but as I ponder my
    next design for the current client I do have opportunity to try
    something new. Though I'd rather not. So MD5:

    >From what I've heard and gotten as feedback I think I'm isolated from

    the vulnerabilties of MD5? (Q1. True or False). In general, its a low
    risk that anybody is go looking into the source and see the hidden
    fields. It's an even less of a risk that somebody would even think I'm
    doing MD5 for this. I was young back then :) Also in the grand scheme
    things not that major user log in is not impacted password is
    maintained under another encryption function of the database itself.
    Worse case user will avoid logging in each day or confuse my script as
    to what there doing. There's nothing for them to see that they can't
    see now.

    So now as I ponder my new design I would be interested in answers to
    the following questions, and it sounds like I just stumbled onto a
    site that might be able to answer them:

    What's even better (I think?) is that I've got a twist on MD5. I'm
    running the digest 25 times on itself, breaking the digest into 2
    parts and inserting a known string (like a salt) in between the parts
    and rerunning it another 25 times all 3 reassembled pieces(original
    first, my salt, and original second). Am I interfering with the nature
    MD5 by doing this and losing anything it's giving me, by doing this do
    I run the risk of not getting a unique value from MD5? (Q2 Yes or No).

    I've visited one of those cracker sites. I've ran MD5 a 5-6 times on
    itself and they were able to tell me what my previous digest was and
    eventually back to the original Ok fine. I ran it 1250 times on itself
    and they weren't so successful.

    I've gotten a detail implmentation of MD5 which matches what the
    cracker site tells me, and the results both match for "abc" (just for
    fun). I also have a detailed write up as to MD5's working. I might
    just fall in love with this. Now instead being in some library
    somewhere I have a sense of control of it and I know my host won't
    take it away from me. As I read it there are 4 functions that are used
    each "round" summarized below:
    1. AND(x,y) OR AND(Notx, y)
    2. AND(x,z) OR AND(Notz,y)
    3. EXOR (X,Y,Z)
    4. EXOR(Y, (AND (X, NotZ)
    Sorry, notation's not the best but hopefully you can decipher. So what
    if I changed the order of the functions like did 2,1,3,4. Can I louse
    up MD5 so that it won't be effective in producing unique digest (Q3.
    Yes or No)? What about if changed or scrambled what x, y, z like kept
    the calculations the same like made my z into x, y into z, and x
    became y? (Q4. Rearrange the letters hurts MD5?).

    I'm guessing I could really louse up MD5 -- if I really got in there
    and started playing around with it. As I do gaze into the future I do
    like authentication over cookies although the latter will probably win/
    has won out. I don't need a 128 characters of output. If I could
    shorten that maybe do some different operations within the MD5
    algorihtmn and not compromise it I might be able to make good use of
    it.

    Sorry, this inevitably brings up a debate between the use of cookies
    and authentication to maintain state in HTML documents. I do admit
    that cookies are better and they will win :<, but please don't blame a
    guy for trying or thinking. In my brief of your site sounds like this
    is what you folks do anyway.
    JuiceMan, Aug 20, 2007
    #1
    1. Advertising

  2. JuiceMan <> (07-08-19 17:07:22):

    > From what I've heard and gotten as feedback I think I'm isolated from
    > the vulnerabilties of MD5? (Q1. True or False).


    MD5 is weak regarding collision resistance. Whether this is a threat to
    your application depends on your specific usage scenario. If I
    understood your scenario properly, then you should be safe. Still
    switch to a stronger hash function for future applications.


    > What's even better (I think?) is that I've got a twist on MD5. I'm
    > running the digest 25 times on itself, breaking the digest into 2
    > parts and inserting a known string (like a salt) in between the parts
    > and rerunning it another 25 times all 3 reassembled pieces(original
    > first, my salt, and original second). Am I interfering with the nature
    > MD5 by doing this and losing anything it's giving me, by doing this do
    > I run the risk of not getting a unique value from MD5? (Q2 Yes or No).


    Probably yes. Say, you're running MD5 five times, and let F = G = MD5.
    You're calculating the hash value with the following formula:

    H = (G . G . G . G . F) P

    where P is the plaintext and `.' means function composition. Although F
    and G are the same function, there is a slight difference. While F's
    input space is infinite (the set of all finite messages), G's input
    space is finite and in fact the same as its output space. So if MD5 is
    not a bijection, then you're losing entropy by applying it multiple
    times without changing the intermediate hash values. On the other hand,
    you add complexity by adding computation time, but this may be a small
    benefit.

    Put short: Probably you shouldn't do this.


    > 1. AND(x,y) OR AND(Notx, y)
    > 2. AND(x,z) OR AND(Notz,y)
    > 3. EXOR (X,Y,Z)
    > 4. EXOR(Y, (AND (X, NotZ)
    > Sorry, notation's not the best but hopefully you can decipher. So what
    > if I changed the order of the functions like did 2,1,3,4. Can I louse
    > up MD5 so that it won't be effective in producing unique digest (Q3.
    > Yes or No)?


    MD5 does not, and in fact cannot, guarantee uniqueness. If, like stated
    above, MD5 with the same input space as output space is a bijection,
    then it does guarantee uniqueness for inputs of exactly 128 bits, but we
    don't know that even for the original MD5. So reasoning about how a
    modified version behaves is going to be even more difficult without
    proper research.


    > What about if changed or scrambled what x, y, z like kept the
    > calculations the same like made my z into x, y into z, and x became y?
    > (Q4. Rearrange the letters hurts MD5?).


    Consider that there are reasons for the way MD5 works, you shouldn't
    change anything at all. Your function might be more secure, but history
    has shown that in general such customizations rather lead to weaker
    primitives. If you find MD5 too weak, use another hash function, but I
    urge you to refrain from custom adjustments without background knowledge
    and a lot of research.


    > Sorry, this inevitably brings up a debate between the use of cookies
    > and authentication to maintain state in HTML documents. I do admit
    > that cookies are better and they will win :<, but please don't blame a
    > guy for trying or thinking. In my brief of your site sounds like this
    > is what you folks do anyway.


    There are many other methods of maintaining state in query-response
    protocols. For non-public sites I like the idea of using SSL
    certificate-based authentication. Unfortunately the average user won't
    be able to comprehend this, and will eventually look for alternatives,
    so this method really isn't suitable for public sites.


    Regards,
    Ertugrul Söylemez.


    --
    Security is the one concept, which makes things in your life stay as
    they are. Otto is a man, who is afraid of changes in his life; so
    naturally he does not employ security.
    Ertugrul Soeylemez, Aug 22, 2007
    #2
    1. Advertising

  3. JuiceMan

    Sebastian G. Guest

    Ertugrul Soeylemez wrote:


    > MD5 does not, and in fact cannot, guarantee uniqueness. If, like stated
    > above, MD5 with the same input space as output space is a bijection,
    > then it does guarantee uniqueness for inputs of exactly 128 bits, but we
    > don't know that even for the original MD5.



    Actually we should assume quite the contrary: If MD5 is a pseudorandom
    function, than it's a pseudorandom mapping and therefore about 1/e of all
    outputs will not occur, and about the same number will be collisions.
    Sebastian G., Aug 22, 2007
    #3
  4. JuiceMan

    JuiceMan Guest

    On Aug 22, 7:53 am, "Sebastian G." <> wrote:
    > Ertugrul Soeylemez wrote:
    > >MD5does not, and in fact cannot, guarantee uniqueness. If, like stated
    > > above,MD5with the same input space as output space is a bijection,
    > > then it does guarantee uniqueness for inputs of exactly 128 bits, but we
    > > don't know that even for the originalMD5.

    >
    > Actually we should assume quite the contrary: IfMD5is a pseudorandom
    > function, than it's a pseudorandom mapping and therefore about 1/e of all
    > outputs will not occur, and about the same number will be collisions.


    Thanks for the feedback. I figured monkeying around with the internals
    of MD5 would not help me.

    However, some alarming points were brought up and I wouldn't mind
    clarification on them.

    It sounds like I got a big NO to my 25 times, split, 25 more times. As
    in this is NOT making MD5 any more secure and in fact might be making
    it less secure.

    Running the Digest on itself say upwards of 1000 times is -- in the
    circles that I've been in kind of an accepted thing to do.

    For example if I take "abc" and run MD5 on it, take the answer run MD5
    on that, and do so a 1000 more times what I get is no more secure then
    if I ran it once?

    If I'm reading correctly it even sounds that what I get utlimately at
    the end of 1000 times might be the same as if it had been done with
    "xyz" 1000 times?

    Is this true?
    JuiceMan, Aug 23, 2007
    #4
  5. JuiceMan <> (07-08-22 19:36:55):

    > > > MD5 does not, and in fact cannot, guarantee uniqueness. If, like
    > > > stated above, MD5 with the same input space as output space is a
    > > > bijection, then it does guarantee uniqueness for inputs of exactly
    > > > 128 bits, but we don't know that even for the original MD5.

    > >
    > > Actually we should assume quite the contrary: If MD5 is a
    > > pseudorandom function, than it's a pseudorandom mapping and
    > > therefore about 1/e of all outputs will not occur, and about the
    > > same number will be collisions.


    But we don't know that. Some attacks were published, which show
    evidence, that MD5 might not be a pseudo-random function.


    > Thanks for the feedback. I figured monkeying around with the internals
    > of MD5 would not help me.


    It wouldn't, unless you're a researcher. But you may use one of the
    established methods, if you really need more strength. However, mostly
    just using another hash function is probably the best method.


    > It sounds like I got a big NO to my 25 times, split, 25 more times. As
    > in this is NOT making MD5 any more secure and in fact might be making
    > it less secure.


    The idea isn't extremely bad, but you shouldn't try to invent your own
    strengthening function, without putting a lot of research effort on it.
    Basically you're doing something quite similar to HMAC, and probably
    that is part of what you're looking for.

    PBKDF2 may be the answer. You can use it together with HMAC for exactly
    the purpose of strengthening.


    > Running the Digest on itself say upwards of 1000 times is -- in the
    > circles that I've been in kind of an accepted thing to do.


    Yes, but not the way you're doing it. Look up PBKDF2.


    > For example if I take "abc" and run MD5 on it, take the answer run MD5
    > on that, and do so a 1000 more times what I get is no more secure then
    > if I ran it once?
    >
    > If I'm reading correctly it even sounds that what I get utlimately at
    > the end of 1000 times might be the same as if it had been done with
    > "xyz" 1000 times?


    Yes, this is true. But it's not related to the fact that you're
    applying MD5 multiple times. However, doing that directly as you do it
    _might_ increase the probablility for this to happen.


    Regards,
    Ertugrul Söylemez.


    --
    Security is the one concept, which makes things in your life stay as
    they are. Otto is a man, who is afraid of changes in his life; so
    naturally he does not employ security.
    Ertugrul Soeylemez, Aug 26, 2007
    #5
  6. JuiceMan

    Unruh Guest

    JuiceMan <> writes:

    >On Aug 22, 7:53 am, "Sebastian G." <> wrote:
    >> Ertugrul Soeylemez wrote:
    >> >MD5does not, and in fact cannot, guarantee uniqueness. If, like stated
    >> > above,MD5with the same input space as output space is a bijection,
    >> > then it does guarantee uniqueness for inputs of exactly 128 bits, but we
    >> > don't know that even for the originalMD5.

    >>
    >> Actually we should assume quite the contrary: IfMD5is a pseudorandom
    >> function, than it's a pseudorandom mapping and therefore about 1/e of all
    >> outputs will not occur, and about the same number will be collisions.


    >Thanks for the feedback. I figured monkeying around with the internals
    >of MD5 would not help me.


    Help you with what? MD5 is a cryptographic hash. Its primary purpose is to
    make finding the preimage from the output very very hard. Its purpose is
    NOT to prevent output collisions. What are you suing it for that you would
    worry about the collisions.



    >However, some alarming points were brought up and I wouldn't mind
    >clarification on them.


    >It sounds like I got a big NO to my 25 times, split, 25 more times. As
    >in this is NOT making MD5 any more secure and in fact might be making
    >it less secure.


    >Running the Digest on itself say upwards of 1000 times is -- in the
    >circles that I've been in kind of an accepted thing to do.


    Really bad idea. On each level, you loose about as he says, 1/e outputs.
    Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
    outputs. (I am sure it is more than that, but lets stick with the random
    mapping assumption).





    >For example if I take "abc" and run MD5 on it, take the answer run MD5
    >on that, and do so a 1000 more times what I get is no more secure then
    >if I ran it once?



    Probably much less secure. What are you trying to do?



    >If I'm reading correctly it even sounds that what I get utlimately at
    >the end of 1000 times might be the same as if it had been done with
    >"xyz" 1000 times?

    No idea what this means.



    >Is this true?
    Unruh, Aug 28, 2007
    #6
  7. JuiceMan

    Sebastian G. Guest

    Unruh wrote:


    >> Running the Digest on itself say upwards of 1000 times is -- in the
    >> circles that I've been in kind of an accepted thing to do.

    >
    > Really bad idea. On each level, you loose about as he says, 1/e outputs.



    This would mean you loose about 0.66 bits per iteration, so after 194
    iterations you would have created a unique collision...

    > Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
    > outputs.



    You assume that the iterations are independent. However, for MD5 you're very
    likely not not stumble upon an additional collision after the first
    iteration. This is a property inherited by the design.

    > (I am sure it is more than that, but lets stick with the random
    > mapping assumption).



    Be careful about your assumptions.
    Sebastian G., Aug 28, 2007
    #7
  8. JuiceMan

    Unruh Guest

    "Sebastian G." <> writes:

    >Unruh wrote:



    >>> Running the Digest on itself say upwards of 1000 times is -- in the
    >>> circles that I've been in kind of an accepted thing to do.

    >>
    >> Really bad idea. On each level, you loose about as he says, 1/e outputs.



    >This would mean you loose about 0.66 bits per iteration, so after 194
    >iterations you would have created a unique collision...


    >> Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
    >> outputs.



    >You assume that the iterations are independent. However, for MD5 you're very
    >likely not not stumble upon an additional collision after the first
    >iteration. This is a property inherited by the design.


    >> (I am sure it is more than that, but lets stick with the random
    >> mapping assumption).



    >Be careful about your assumptions.


    AGreed, but what makes you think that there are far MD5(MD5(random)) has
    fewer collisions than MD5(random)?

    Ie, what makes you think that the output of MD5 does not act like a random
    input to MD5?
    Unruh, Sep 5, 2007
    #8
  9. Unruh <> (07-09-04 23:12:55):

    > > > (I am sure it is more than that, but lets stick with the random
    > > > mapping assumption).

    >
    > > Be careful about your assumptions.

    >
    > AGreed, but what makes you think that there are far MD5(MD5(random))
    > has fewer collisions than MD5(random)?


    You must have misunderstood something.


    > Ie, what makes you think that the output of MD5 does not act like a
    > random input to MD5?


    Actually it's exactly the fact that MD5 output _does_ look like random
    input to MD5. Read again carefully.


    Regards,
    Ertugrul Söylemez.


    --
    Security is the one concept, which makes things in your life stay as
    they are. Otto is a man, who is afraid of changes in his life; so
    naturally he does not employ security.
    Ertugrul Soeylemez, Sep 5, 2007
    #9
  10. JuiceMan

    Sebastian G. Guest

    Unruh wrote:


    > AGreed, but what makes you think that there are far MD5(MD5(random)) has
    > fewer collisions than MD5(random)?



    Because MD5 has been designed to resist this kind of attack. As I described,
    it would be trivial to create a collision if it behaved otherwise.

    > Ie, what makes you think that the output of MD5 does not act like a random
    > input to MD5?


    A design decision.
    Sebastian G., Sep 5, 2007
    #10
  11. JuiceMan

    Sebastian G. Guest

    Ertugrul Soeylemez wrote:


    >> Ie, what makes you think that the output of MD5 does not act like a
    >> random input to MD5?

    >
    > Actually it's exactly the fact that MD5 output _does_ look like random
    > input to MD5. Read again carefully.


    And maybe you should read again as well. If it really behaved like this,
    then 192 iterations would be sufficient to create a collision in MD5 with
    probability 1/e. So about 8 bit bit of security. That's clearly not a
    cryptographic hash.
    Sebastian G., Sep 5, 2007
    #11
  12. "Sebastian G." <> (07-09-05 06:09:16):

    > > > Ie, what makes you think that the output of MD5 does not act like
    > > > a random input to MD5?

    > >
    > > Actually it's exactly the fact that MD5 output _does_ look like
    > > random input to MD5. Read again carefully.

    >
    > And maybe you should read again as well. If it really behaved like
    > this, then 192 iterations would be sufficient to create a collision in
    > MD5 with probability 1/e. So about 8 bit bit of security. That's
    > clearly not a cryptographic hash.


    Uhm, just a typo. Add the "not".


    Regards,
    Ertugrul Söylemez.


    --
    Security is the one concept, which makes things in your life stay as
    they are. Otto is a man, who is afraid of changes in his life; so
    naturally he does not employ security.
    Ertugrul Soeylemez, Sep 5, 2007
    #12
  13. JuiceMan

    Unruh Guest

    "Sebastian G." <> writes:

    >Unruh wrote:



    >> AGreed, but what makes you think that there are far MD5(MD5(random)) has
    >> fewer collisions than MD5(random)?



    >Because MD5 has been designed to resist this kind of attack. As I described,
    >it would be trivial to create a collision if it behaved otherwise.


    No idea what you are talking about. It IS trivial to create collisions.
    Take 10^64 inputs and the probablility is high that there will be a
    collision (Birthday calculation). That helps one not at all in any
    security application.



    >> Ie, what makes you think that the output of MD5 does not act like a random
    >> input to MD5?


    >A design decision.


    Exactly which design decision are you talking about?
    Unruh, Sep 6, 2007
    #13
  14. JuiceMan

    Sebastian G. Guest

    Unruh wrote:

    > "Sebastian G." <> writes:
    >
    >> Unruh wrote:

    >
    >
    >>> AGreed, but what makes you think that there are far MD5(MD5(random)) has
    >>> fewer collisions than MD5(random)?

    >
    >
    >> Because MD5 has been designed to resist this kind of attack. As I described,
    >> it would be trivial to create a collision if it behaved otherwise.

    >
    > No idea what you are talking about. It IS trivial to create collisions.
    > Take 10^64 inputs and the probablility is high that there will be a
    > collision (Birthday calculation). That helps one not at all in any
    > security application.



    1. 2^64 inputs are already sufficient, and still 2^64 are not so trivial at all.
    2. According to the iterated attack you described, 2^8 (!) inputs would be
    sufficient.

    >>> Ie, what makes you think that the output of MD5 does not act like a random
    >>> input to MD5?

    >
    >> A design decision.

    >
    > Exactly which design decision are you talking about?


    Collision resistance on iteration
    Sebastian G., Sep 7, 2007
    #14
  15. JuiceMan

    ciphex Guest

    On 7 Sep, 17:45, "Sebastian G." <> wrote:
    > Unruh wrote:
    > > "Sebastian G." <> writes:

    >
    > >> Unruh wrote:

    >
    > >>> AGreed, but what makes you think that there are far MD5(MD5(random)) has
    > >>> fewer collisions than MD5(random)?

    >
    > >> Because MD5 has been designed to resist this kind of attack. As I described,
    > >> it would be trivial to create a collision if it behaved otherwise.

    >
    > > No idea what you are talking about. It IS trivial to create collisions.
    > > Take 10^64 inputs and the probablility is high that there will be a
    > > collision (Birthday calculation). That helps one not at all in any
    > > security application.

    >
    > 1. 2^64 inputs are already sufficient, and still 2^64 are not so trivial at all.
    > 2. According to the iterated attack you described, 2^8 (!) inputs would be
    > sufficient.
    >
    > >>> Ie, what makes you think that the output of MD5 does not act like a random
    > >>> input to MD5?

    >
    > >> A design decision.

    >
    > > Exactly which design decision are you talking about?

    >
    > Collision resistance on iteration


    The only reason i can think of for 'fiddling' with MD5 in order to
    protect
    authentication code in an online application is to defeat reverse
    lookup
    databases. In this case it would be better to mangle the input before
    hashing in some way that doesn't decrease the randomness of the
    initial input text - for example, a simple substitution/transposition
    scheme might work. The cracker would then have a problem, there is
    less chance that the resulting hash would appear in the database
    (assuming the resulting input is long enough).

    On the whole though, if you don't understand the implications of this
    kind of step then I'd avoid it and, as many have said, use a stronger
    hashing algorithm.
    ciphex, Sep 20, 2007
    #15
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ed Pacheco

    MD5 security probems on a 2620

    Ed Pacheco, Feb 6, 2004, in forum: Cisco
    Replies:
    3
    Views:
    391
    Hansang Bae
    Feb 10, 2004
  2. speeder

    Where can I get MD5 hash of system files?

    speeder, May 19, 2005, in forum: Computer Security
    Replies:
    5
    Views:
    8,069
    Michael Pelletier
    May 20, 2005
  3. Kenny

    What is MD5 Hash?

    Kenny, Nov 18, 2003, in forum: Computer Information
    Replies:
    3
    Views:
    6,008
    davitz38
    Feb 18, 2010
  4. Zak

    Does MD5 include the file name?

    Zak, Sep 12, 2006, in forum: Computer Security
    Replies:
    9
    Views:
    2,352
    Todd H.
    Sep 14, 2006
  5. Jag Chan

    md5.txt files

    Jag Chan, Nov 21, 2006, in forum: Computer Support
    Replies:
    2
    Views:
    3,322
    Bucky Breeder
    Nov 21, 2006
Loading...

Share This Page