making a typing speed tester

 
 
tavspamnofwd@googlemail.com
      11-14-2007
Referred here from the tutor list.

> I'm trying to write a program to test someone's typing speed and show
> them their mistakes. However, I'm getting weird results when looking
> for the differences in longer (than 100 chars) strings:
>
> import difflib
>
> # a tape measure string (just makes it easier to locate a given index)
> a = ('1-3-5-7-9-12-15-18-21-24-27-30-33-36-39-42-45-48-51-54-57-60-63-66-69'
>      '-72-75-78-81-84-87-90-93-96-99-103-107-111-115-119-123-127-131-135-139'
>      '-143-147-151-155-159-163-167-171-175-179-183-187-191-195--200')
>
> # now with a few mistakes
> b = ('1-3-5-7-'
>      'l-12-15-18-21-24-27-30-33-36-39o42-45-48-51-54-57-60-63-66-69-72-75-78'
>      '-81-84-8k-90-93-96-9l-103-107-111-115-119-12b-1v7-131-135-139-143-147-'
>      '151-m55-159-163-167-a71-175j179-183-187-191-195--200')
>
> s = difflib.SequenceMatcher(None, a, b)
> ms = s.get_matching_blocks()
>
> print ms
>
> >>> [(0, 0, 8), (200, 200, 0)]

>
> Have I made a mistake or is this function designed to give up when the
> input strings get too long? If so what could I use instead to compute
> the mistakes in a typed text?


---------- Forwarded message ----------
From: Evert Rol

Hi Tom,

Ok, I wasn't on the list last year, but I was a few days ago, so
persistence pays off; partly, as I don't have a full answer.

I got curious and looked at the source of difflib. There's a method
__chain_b() which sets up the b2j variable, which contains the
occurrences of characters in string b. So cutting b to 199
characters, it looks like this:
b2j= 19 {'a': [168], 'b': [122], 'm': [152], 'k': [86], 'v':
[125], '-': [1, 3, 5, 7, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 42,
45, 48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93,
96, 99, 103, 107, 111, 115, 119, 123, 127, 131, 135, 139, 143, 147,
151, 155, 159, 163, 167, 171, 179, 183, 187, 191, 195, 196], 'l': [8,
98], 'o': [39], 'j': [175], '1': [0, 10, 13, 16, 20, 50, 80, 100,
104, 108, 109, 110, 112, 113, 116, 117, 120, 124, 128, 130, 132, 136,
140, 144, 148, 150, 156, 160, 164, 170, 172, 176, 180, 184, 188, 190,
192], '0': [29, 59, 89, 101, 105, 198], '3': [2, 28, 31, 32, 34, 37,
62, 92, 102, 129, 133, 137, 142, 162, 182], '2': [11, 19, 22, 25, 41,
71, 121, 197], '5': [4, 14, 44, 49, 52, 55, 74, 114, 134, 149, 153,
154, 157, 174, 194], '4': [23, 40, 43, 46, 53, 83, 141, 145], '7':
[6, 26, 56, 70, 73, 76, 106, 126, 146, 166, 169, 173, 177, 186], '6':
[35, 58, 61, 64, 65, 67, 95, 161, 165], '9': [38, 68, 88, 91, 94, 97,
118, 138, 158, 178, 189, 193], '8': [17, 47, 77, 79, 82, 85, 181,
185]}

This little detour is because of how b2j is built. Here's a part from
the comments of __chain_b():

# Before the tricks described here, __chain_b was by far the most
# time-consuming routine in the whole module! If anyone sees
# Jim Roskind, thank him again for profile.py -- I never would
# have guessed that.

And the part of the actual code reads:
    b = self.b
    n = len(b)
    self.b2j = b2j = {}
    populardict = {}
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            if n >= 200 and len(indices) * 100 > n: # <--- !!
                populardict[elt] = 1
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]

So you're right: there is a cutoff at the (somewhat arbitrary) limit of
200 characters. How exactly that works, I don't know (it needs more
delving into the code), though it looks like an element also needs to
have a lot of indices (len(indices) * 100 > n) before it kicks in; I
guess that's caused in your strings by the dashes, '1's and '0's (that's
why I printed the b2j dictionary).
If you feel safe enough and are on a fast platform, you can probably
raise that limit (or even make it an optional variable somewhere in the
code, which I would think is generally better).
I'm not sure who the author of the module is (it isn't listed in the file
itself), but perhaps you can find out and email him/her to see what
can be altered.
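
To see which characters the check in the loop quoted above would flag, the loop can be replayed outside difflib. This is only a sketch written for Python 3: `popular_elements`, `min_len` and `percent` are made-up names for illustration (they stand in for the hard-coded 200 and 100), and nothing here is part of difflib's API.

```python
def popular_elements(b, min_len=200, percent=100):
    """Replay the popularity check from the quoted loop (hypothetical helper)."""
    n = len(b)
    b2j = {}
    popular = set()
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            # Same test as in the excerpt: only active for n >= 200, and only
            # once an element already has more than n/100 recorded positions.
            if n >= min_len and len(indices) * percent > n:
                popular.add(elt)
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]
    return popular


# The mistyped string from the question (200 characters).
b = ('1-3-5-7-'
     'l-12-15-18-21-24-27-30-33-36-39o42-45-48-51-54-57-60-63-66-69-72-75-78'
     '-81-84-8k-90-93-96-9l-103-107-111-115-119-12b-1v7-131-135-139-143-147-'
     '151-m55-159-163-167-a71-175j179-183-187-191-195--200')

print(sorted(popular_elements(b)))        # full 200-character string
print(sorted(popular_elements(b[:199])))  # one character shorter
```

With the full string, the dash and every digit should come out as popular, which would leave only the mistyped letters to match on; with 199 characters the check never fires.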

Hope that helps.

Evert

 
kyosohma@gmail.com
      11-14-2007

I would use the time module to "time" the user. Then you should be
able to compare the original string with the user inputted string
using cmp.

<code>
# untested
import time

start = time.time()
print 'some complicated long string'

# you should use a GUI toolkit's textbox rather than
# using a variable
user_string = raw_input('Please type the string above as quickly and '
                        'accurately as you can:\n\n')
end = time.time()
print 'amount of time to complete: %s seconds' % (end - start)

# do the comparison here
# which I am not sure how to do right now
</code>

See the following for ideas on comparing similar strings/iterables:

http://www.velocityreviews.com/forum...r-strings.html
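
The timing idea above can be turned into a speed figure. The following is a rough sketch for Python 3, not a tested tool: `words_per_minute` and `timed_input` are hypothetical helper names, and the "five characters per word" rule is just a common convention assumed here. The fake reader and clock exist only so that the example runs unattended.

```python
import time


def words_per_minute(typed, seconds, chars_per_word=5):
    """Gross typing speed; chars_per_word=5 is an assumed convention."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return len(typed) * 60.0 / (chars_per_word * seconds)


def timed_input(prompt, read=input, clock=time.perf_counter):
    """Return (typed_text, elapsed_seconds); read and clock can be swapped out."""
    start = clock()
    typed = read(prompt)
    return typed, clock() - start


# Stand-ins for a real user and a real clock: 19 characters "typed" in 12 s.
fake_times = iter([0.0, 12.0])
text, elapsed = timed_input(
    "Please type the string above:\n",
    read=lambda prompt: "the quick brown fox",
    clock=lambda: next(fake_times),
)
print("%.1f seconds, %.1f wpm" % (elapsed, words_per_minute(text, elapsed)))
```

In a real session, leave `read` and `clock` at their defaults so the prompt waits for the user.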

Mike

 
Gabriel Genellina
      11-15-2007
On Wed, 14 Nov 2007 14:56:25 -0300, <(E-Mail Removed)> wrote:

>> Have I made a mistake or is this function designed to give up when the
>> input strings get too long? If so what could I use instead to compute
>> the mistakes in a typed text?


Yes, there are some limitations on how SequenceMatcher works.

> ---------- Forwarded message ----------
> From: Evert Rol
> [...]
> And the part of the actual code reads:


>             if n >= 200 and len(indices) * 100 > n: # <--- !!
>                 populardict[elt] = 1
>                 del indices[:]
>             else:
>                 indices.append(i)


> So you're right: it has a stop at the (somewhat arbitrary) limit of
> 200 characters. [...] If you feel safe enough and on a fast platform,
> you can probably up that limit (or even put it somewhere as an optional
> variable in the code, which I would think is generally better).


If you try with a slightly shorter text (190 chars, for example) you get
the expected result, pretty fast:

py> s = difflib.SequenceMatcher(None, a[:190], b[:190])
py> ms = s.get_matching_blocks()
py> print ms
[(0, 0, 8), (9, 9, 30), (40, 40, 46), (87, 87, 11), (99, 99, 23),
 (123, 123, 2), (126, 126, 26), (153, 153, 15), (169, 169, 6),
 (176, 176, 14), (190, 190, 0)]

So it appears that your strings are hitting that (arbitrary) limit. From
the algorithm's point of view, your strings are a rather degenerate case:
there are so many '-', '0' and '1' characters to match.
Try increasing that 200 to something somewhat larger than the length of
your strings.
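
For readers on a newer interpreter: as far as I know, later Python releases (3.2 onward) accept an `autojunk` argument to `SequenceMatcher` that switches the popularity heuristic off without editing difflib.py. A minimal sketch under that assumption, written for Python 3; `find_mistakes` is a hypothetical helper name, not something difflib provides.

```python
import difflib

# The two strings from the question.
a = ('1-3-5-7-9-12-15-18-21-24-27-30-33-36-39-42-45-48-51-54-57-60-63-66-69'
     '-72-75-78-81-84-87-90-93-96-99-103-107-111-115-119-123-127-131-135-139'
     '-143-147-151-155-159-163-167-171-175-179-183-187-191-195--200')
b = ('1-3-5-7-'
     'l-12-15-18-21-24-27-30-33-36-39o42-45-48-51-54-57-60-63-66-69-72-75-78'
     '-81-84-8k-90-93-96-9l-103-107-111-115-119-12b-1v7-131-135-139-143-147-'
     '151-m55-159-163-167-a71-175j179-183-187-191-195--200')


def find_mistakes(expected, typed):
    """Return (tag, expected_part, typed_part, index) for each non-equal span."""
    # autojunk=False disables the "popular element" shortcut for long inputs.
    matcher = difflib.SequenceMatcher(None, expected, typed, autojunk=False)
    mistakes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            mistakes.append((tag, expected[i1:i2], typed[j1:j2], i1))
    return mistakes


for tag, want, got, index in find_mistakes(a, b):
    print("%3d: %-7s expected %r, got %r" % (index, tag, want, got))
```

For the strings above this should list nine one-character replacements, the first at index 8. Turning the heuristic off costs some speed on long, repetitive inputs, which is presumably why the limit exists in the first place.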

--
Gabriel Genellina

 