googling for fun (and profit...? naah!-)

 
 
Alex Martelli
 
      10-31-2003

(Note: you need to download & install Mark Pilgrim's pygoogle, see
http://diveintomark.org/projects/pygoogle/ , get a personal license to the
google api, see http://www.google.com/apis/ , save it in a file such as
"googlekey.txt" in your home directory [pygoogle looks in several places,
see http://diveintomark.org/projects/pygoogle/readme.txt for the list]).

So, a little script such as...:



#! /usr/local/bin/python2.3
# programming languages popularity web-survey

import google
import time

def quoter(xs): return ['"%s"'%x for x in xs]
langs = '''
python ruby perl caml java haskell lisp eiffel sml scheme
fortran ada forth apl javascript ecmascript vbscript vba sql
bash awk tcsh csh zsh ksh autolisp elisp occam intercal basic
abc algol applescript assembly befunge beta chill cobol dylan
erlang pascal delphi idl limbo smalltalk squeak m4 matlab logo
foxpro turing tcl snobol simula setl self rexx rebol postscript
php oz modula ml miranda mercury mumps oberon sather stackless
functional procedural parallel hpf agile extreme database
relational rpg
'''.split() + quoter([
'visual basic', 'object pascal', 'objective c', 'c++', 'c#', 'c',
'stackless python', 'object oriented',
])

# ensure all duplications are removed
langs = dict.fromkeys(langs).keys()

print 'examining %d terms' % len(langs)
results = []
for i, lang in enumerate(langs):
    while True:
        print '%2d: %20s' % (i, lang.strip('"'), ),
        try: data = google.doGoogleSearch(lang + ' programming')
        except Exception:
            print "... likely internal server error, we wait & retry... "
            time.sleep(0.5)
        else:
            results.append((data.meta.estimatedTotalResultsCount, lang))
            print '%9d' % data.meta.estimatedTotalResultsCount
            break
results.sort()
results.reverse()
print
print
print '%20s %9s' % ("Language", "# of hits")
print

for numb, lang in results:
    print '%20s %9d' % (lang.strip('"'), numb)


Gives me the following results:

            Language # of hits

                   c   4980000
            database   3750000
               basic   3750000
                java   3320000
                self   2000000
                 php   1880000
                 c++   1860000
                perl   1640000
                 sql   1150000
                logo   1070000
            parallel   1030000
          javascript   1030000
          functional    997000
     object oriented    944000
        visual basic    847000
                beta    745000
              python    729000
              scheme    693000
            assembly    687000
               forth    591000
             extreme    572000
                  c#    506000
          relational    377000
              delphi    354000
             fortran    344000
              pascal    329000
          postscript    297000
                 tcl    277000
                 abc    259000
                lisp    220000
          procedural    204000
                  ml    201000
                 ada    196000
            vbscript    181000
               cobol    171000
              foxpro    137000
                 vba    123000
              matlab    111000
           smalltalk    101000
                ruby     97900
                bash     87400
             mercury     86800
                 rpg     81600
                  oz     78500
              turing     72200
                rexx     66100
               agile     62700
              eiffel     58300
                 idl     58100
             haskell     55100
                 awk     53100
               mumps     49800
               chill     47600
         objective c     44900
              modula     39000
                 apl     38800
                 csh     31700
               dylan     31500
              simula     30600
              erlang     29900
                  m4     28000
              squeak     24400
             miranda     24300
         applescript     24000
       object pascal     23900
               algol     21000
                 ksh     17900
                tcsh     17600
                 sml     16000
              oberon     15400
                caml     15300
                 hpf     11900
               limbo     11400
               rebol     10800
               occam     10300
               elisp      8780
          ecmascript      7080
                 zsh      5640
            autolisp      5430
              sather      4260
              snobol      3900
            intercal      2700
                setl      2010
           stackless      1040
             befunge       951
    stackless python       431

Of course there are quite a few anomalies here -- e.g. I think there is
no automatic way to "clean" the C hit count from the hits for objective c,
c++, c# -- basic from visual basic -- and so on. But then, this is for
fun, not a scientific query, which is why I've mixed other catchwords
with the programming languages as I thought of them.

Doing some "eyeball cleanup" we can see that c, net of c++, c# etc, must
be a little below Java; basic, net of visual basic, ditto. 'self', alas,
is unlikely to refer, in most of its hits, to that little-known though
interesting language; similarly for 'logo', 'beta', ... -- and 'sql' is
likely to be mixed up with many other languages too.
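
The "eyeball cleanup" above can be spelled out as a little arithmetic. The sketch below is in Python 3 rather than the Python 2.3 of the script, and the names `hits` and `net_hits` are made up for this illustration. It naively assumes that every page counted for 'c++', 'c#' or 'objective c' is also counted for 'c' (and likewise for 'visual basic' and 'basic'), which Google's estimated counts do not guarantee, so treat the results as rough.

```python
# Hit counts copied from the results table above.
hits = {
    'c': 4980000, 'c++': 1860000, 'c#': 506000, 'objective c': 44900,
    'basic': 3750000, 'visual basic': 847000,
    'java': 3320000,
}

def net_hits(general, specifics):
    """Subtract the hits of the more specific terms from the general term.

    Assumption: every hit for a specific term is also a hit for the
    general one, which is only roughly true of Google's estimates.
    """
    return hits[general] - sum(hits[s] for s in specifics)

net_c = net_hits('c', ['c++', 'c#', 'objective c'])
net_basic = net_hits('basic', ['visual basic'])

print('c net of c++, c#, objective c: %9d' % net_c)
print('basic net of visual basic:     %9d' % net_basic)
print('java:                          %9d' % hits['java'])
```

Under that assumption both net figures land below java's count, which fits the cleanup described above. The subtraction is too crude to settle how c and basic rank against each other.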

So, I think the top ten places, in order, for actual languages, are really:
java
c (not objective/c++/c#)
basic (not visual)
php
c++
perl
javascript
visual basic
python
scheme

Not too surprising, I guess. One could explore a bit more of course
(e.g. specifically look for 'basic -visual' etc etc) but I'm running
a bit short of my daily 1000 searches so I'm gonna leave that fun to
you, o readers. Points to ponder: the preponderance of visual basic
over python, and of python over scheme, is really small; the latter
may perhaps be explained by some occurrences of 'scheme' as an ordinary
word rather than the language name, and the former by the fact that the
typical web usage of many visual basic programmers is unlikely to include
writing websites about VB, compared to the web usage of Pythonistas.
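
The 'basic -visual' style of follow-up query mentioned above could be generated mechanically. This is a minimal sketch in Python 3, not part of the original script: `build_query` and `EXCLUSIONS` are hypothetical names, and the mapping holds only illustrative entries. It builds query strings using Google's `-word` exclusion syntax and appends ' programming' as the script does. It never calls the API, so each string would still cost one of the daily searches when actually sent.

```python
# Hypothetical mapping: general term -> words to exclude from the search.
EXCLUSIONS = {
    'basic': ['visual'],
    'pascal': ['object'],
}

def build_query(term, excluded=(), suffix='programming'):
    """Return a search string such as 'basic -visual programming'."""
    parts = [term]
    parts.extend('-' + word for word in excluded)
    parts.append(suffix)
    return ' '.join(parts)

# Sort the terms so the queries come out in a predictable order.
queries = [build_query(term, words)
           for term, words in sorted(EXCLUSIONS.items())]
for q in queries:
    print(q)
```

Each resulting string could be passed to `google.doGoogleSearch` in place of `lang + ' programming'`, inside the same retry loop.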

If scheme's apparent popularity does turn out to be an artefact, then
forth (or is it an artefact from "go forth" etc...?-), assembly (but IS
that used in the programming sense...?), and C# are the other possible
contenders for the coveted tenth place. After the contenders for the
top places we have a (to me!) somewhat surprising bunch -- delphi,
fortran, pascal, postscript (!), tcl, abc (!?), lisp, ml, ada, and
vbscript in this order. Wow -- how are the mighty fallen! -- cobol
is BELOW this second bunch...!

Coming to buzzwords that aren't programming languages, other
surprises await: "functional" edges out "object oriented", "extreme"
is WAY more popular than "procedural" (yeah, right), and "agile"
programming isn't as popular a term as I'd have thought (but still,
more than eiffel...).

Plenty of other food for flamewars here -- can mercury AND oz
really be THAT much more popular than haskell, erlang, caml -- the
latter badly outscored even by OLD miranda -- and ML so WAY more
popular than ALL other pure functional languages & dialects (and
indeed even more than ada, vbscript, cobol, foxpro, vba, matlab,
smalltalk, ruby, bash...)...?!

googling sure IS plenty of fun!!!-)


Alex

 