SPSS
 
 
Mike Schwab
12-02-2005

A piece of software that badly needs to be written is an easy-to-use
statistics package. Apparently, in the social sciences the industry
standard is this program called SPSS, which is a chore to use, quite
expensive, and suffers from all sorts of vestiges that no modern
program should.

The essential functions are importing (and 'massaging') data, running
regressions, and making pretty graphs. When I got thinking about this
idea I realized it would be really cool as a web app, because then
datasets would be public, and different researchers could use them for
different purposes. The essence of social science is asking people to
answer series of questions, correlating their answers and attempting to
make causal inferences. While people have their theories and biases
that lead them to expect certain causal culprits, their data can speak
for themselves (in cases where the wording of their survey questions is
smart enough to let them). Perhaps if other sets of eyes happen across
these datasets, with the regressions ready to go, and other datasets
from similar surveys are readily available, a less biased researcher
can come along, aggregate the data and find a deeper truth about human
nature. These studies always have the numbers working against them;
the cost per subject is high so they don't get as many datapoints as
they want, but a ton of datapoints is exactly what they need to move
past deceptive ('confounded') results and find the subtler stuff that
isn't trivial or outright false.

So I feel that a browser-based data cruncher would be cool because it
would give people the freedom to work on and show off their findings
from any computer, and it would force them to make their data public for
the betterment of social science, as explained above. Tags could also
indicate which datasets might be compatible for aggregation, and making
all this stuff browsable would help inspire new questions in readers and
new directions for research. The only other 'cool' idea I've had thus
far is to have the thing 'automatically' run all the logical
regressions and succinctly inform the user which ones are significant.
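
To make the 'run all the regressions' idea a bit more concrete, here is a rough Ruby sketch. Everything in it is an assumption for illustration only: the method names, the hash-of-columns data layout, the made-up numbers, and the choice of a permutation test to decide what counts as significant. It regresses each pair of columns with ordinary least squares and reports the pairs whose slope survives a corrected threshold.

```ruby
# Hypothetical sketch: names, data layout and thresholds are assumptions.

# Ordinary least squares slope of y regressed on x (simple linear regression).
# Assumes x is not constant.
def ols_slope(x, y)
  n = x.size.to_f
  mean_x = x.sum / n
  mean_y = y.sum / n
  sxy = x.zip(y).sum { |xi, yi| (xi - mean_x) * (yi - mean_y) }
  sxx = x.sum { |xi| (xi - mean_x)**2 }
  sxy / sxx
end

# Two-sided permutation p-value for the slope: shuffle y repeatedly and count
# how often a shuffled slope is at least as extreme as the observed one.
def permutation_p_value(x, y, rounds: 2000, rng: Random.new(42))
  observed = ols_slope(x, y).abs
  hits = rounds.times.count { ols_slope(x, y.shuffle(random: rng)).abs >= observed }
  (hits + 1).to_f / (rounds + 1)
end

# Regress every pair of columns and keep the pairs that pass a
# Bonferroni-corrected threshold (alpha divided by the number of pairs tested).
def significant_pairs(columns, alpha: 0.05)
  pairs = columns.keys.combination(2).to_a
  threshold = alpha / pairs.size
  tested = pairs.map do |a, b|
    [a, b, permutation_p_value(columns[a], columns[b])]
  end
  tested.select { |_, _, p_value| p_value < threshold }
end

# Made-up survey columns, purely for illustration.
survey = {
  "hours" => [1, 2, 3, 4, 5, 6, 7, 8],
  "score" => [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1],
  "coin"  => [1, -1, -1, 1, 1, -1, -1, 1]
}

significant_pairs(survey).each do |a, b, p_value|
  puts format("%s ~ %s: p = %.4f", a, b, p_value)
end
```

The Bonferroni correction is my own addition to the idea: testing every pair of columns inflates the number of false positives, so some adjustment for multiple comparisons seems necessary before telling the user that a result is 'significant'. A real tool would presumably hand the statistics to an established package rather than rely on a hand-rolled test like this one.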

So I'm looking for someone to shoot ideas back and forth, someone who
may have experience that could be applicable to this sort of project,
or who may have a mature understanding of how statistics are used in
the real world. I don't exactly have time to get cracking on this yet,
but I do want to be actively planning it.

-Mike


 
Edwin van Leeuwen
12-02-2005
michael.schwab wrote:
> So I'm looking for someone to shoot ideas back and forth, someone who
> may have experience that could be applicable to this sort of project,
> or who may have a mature understanding of how statistics are used in
> the real world. I don't exactly have time to get cracking on this yet,
> but I do want to be actively planning it.
>
> -Mike


SPSS is used a lot in the biological sciences too, though I have only
limited experience with it myself. If you are really serious about this,
I would use the R project as the backbone for all the statistical tests.
I think there are some Ruby bindings for R available, but they all seem
too limited, unmaintained or undocumented (someone please correct me if
I'm wrong). So getting working bindings might be the first step to take.

I am somewhat hesitant about the whole statistical package thing. I know
that in the biological sciences statistics are often badly understood and
people mostly use these packages wrongly. I know that some statisticians
looked at a couple of papers and found that in a large share of them
(80%?) the statistical methods used were completely wrong. You can't
really blame SPSS for this, but the fact is that people will get answers
from SPSS even if they don't understand what they are actually doing. If
you have to use a higher-level language, you are forced to learn about
what you are doing, which means fewer mistakes.
On the other hand, it is about time that we had a good open source
statistics package that is also available on Linux.

I would also be hesitant about forcing people to open up their data. I
myself would be reluctant to share all my data before I had analysed it
or published an article about it.



Zed A. Shaw
12-03-2005
One letter: R

http://www.r-project.org/

(I think that's it. Anyway, does everything you want and is also a
sort of nice scripting language).

Zed A. Shaw
http://www.zedshaw.com/


On Fri, 2 Dec 2005 17:45:46 +0900
Mike Schwab wrote:

> A piece of software that badly needs to be written is an easy-to-use
> statistics package. Apparently, in the social sciences the industry
> standard is this program called SPSS, which is a chore to use, quite
> expensive, and suffers from all sorts of vestiges that no modern
> program should.
>



 