Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Ruby > CSV import to hash to compare with database

Reply
Thread Tools

CSV import to hash to compare with database

 
 
John Mcleod
Guest
Posts: n/a
 
      08-17-2009
Hello All,
Is it practical to import data from a csv file, via FasteCSV to a
temporary data structure, like a hash, for editing purposes?
I'm looking to read a file, send the data to a temp location, then
compare the file with similar files in a database table, if any. Then
import the edited data to the database table.

Thank you for any help.

JohnM
--
Posted via http://www.ruby-forum.com/.

 
Reply With Quote
 
 
 
 
James Gray
Guest
Posts: n/a
 
      08-17-2009
On Aug 17, 2009, at 10:54 AM, John Mcleod wrote:

> Hello All,


Hello.

> Is it practical to import data from a csv file, via FasteCSV to a
> temporary data structure, like a hash, for editing purposes?


I don't see why it wouldn't be. FasterCSV::Row has a to_hash()
method. I use this regularly to pump CSV rows into ActiveRecord's
create!() method to add records to my database.

James Edward Gray II


 
Reply With Quote
 
 
 
 
John Mcleod
Guest
Posts: n/a
 
      08-17-2009
Mr. Gray,
Thank you for the quick reply.
I'm a PHP guy with 3 weeks in Ruby on Rails (Department changed platform
3 months after they hired me)
Could you please point me to a good 'FasterCSV' resource with some
examples?
Thank you very much.
JohnM


James Gray wrote:
> On Aug 17, 2009, at 10:54 AM, John Mcleod wrote:
>
>> Hello All,

>
> Hello.
>
>> Is it practical to import data from a csv file, via FasteCSV to a
>> temporary data structure, like a hash, for editing purposes?

>
> I don't see why it wouldn't be. FasterCSV::Row has a to_hash()
> method. I use this regularly to pump CSV rows into ActiveRecord's
> create!() method to add records to my database.
>
> James Edward Gray II


--
Posted via http://www.ruby-forum.com/.

 
Reply With Quote
 
James Gray
Guest
Posts: n/a
 
      08-17-2009
On Aug 17, 2009, at 11:56 AM, John Mcleod wrote:

> Mr. Gray,


You can stick with James. I'm not that old yet.

> Thank you for the quick reply.


Sure. Happy to help.

> Could you please point me to a good 'FasterCSV' resource with some
> examples?


The documentation is pretty thorough:

http://fastercsv.rubyforge.org/

and the tests show usage:

http://fastercsv.rubyforge.org/svn/trunk/test/

If your needs are simple though, maybe this code is all you need:

FCSV.foreach( path, :headers => true,
:header_converters => :symbol ) do |row|
YourARModelClass.create!(row.to_hash)
end

Hopefully that at least gets you started.

James Edward Gray II


 
Reply With Quote
 
John Mcleod
Guest
Posts: n/a
 
      08-18-2009
James,
Thank you for the example code.
It's funny, I found an old post of yours from 2005.

h = Hash.new(0)
file = FasterCSV.read(filename)[1..-1].each { |row| h[row[0]] += 1 }

I've tested it and it seems to work.
I will however, check out the documentation and your test code.

This is step 1 though. Next, populate a hash with data from the
database comparing data from the CSV file and making a editable CSV grid
(post to rails forum).

I'm not done yet.

Thanks again.

JohnM

James Gray wrote:
> On Aug 17, 2009, at 11:56 AM, John Mcleod wrote:
>
>> Mr. Gray,

>
> You can stick with James. I'm not that old yet.
>
>> Thank you for the quick reply.

>
> Sure. Happy to help.
>
>> Could you please point me to a good 'FasterCSV' resource with some
>> examples?

>
> The documentation is pretty thorough:
>
> http://fastercsv.rubyforge.org/
>
> and the tests show usage:
>
> http://fastercsv.rubyforge.org/svn/trunk/test/
>
> If your needs are simple though, maybe this code is all you need:
>
> FCSV.foreach( path, :headers => true,
> :header_converters => :symbol ) do |row|
> YourARModelClass.create!(row.to_hash)
> end
>
> Hopefully that at least gets you started.
>
> James Edward Gray II


--
Posted via http://www.ruby-forum.com/.

 
Reply With Quote
 
Robert Klemme
Guest
Posts: n/a
 
      08-18-2009
John,

if I understand you correctly you have data in CSV files and you have
a relational database. You want to compare the content of those files
with data already present in the database and edit the CSV data before
importing it into the DB.

I do not know what your schema looks like nor what "comparing" in your
case means, but did you consider first loading CSV data into the
database using some staging table, doing the comparison and editing
there and then copying the data over to the target location? That way
you can use the full power of SQL for comparison purposes. And there
are a lot of tools that allow to edit database data in tabular
representation.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 
Reply With Quote
 
John Mcleod
Guest
Posts: n/a
 
      08-18-2009
Robert,
Yes, you understand my dilemma.
I want to thank you for your keen insight. I didn't even thought of
that.

When I say "compare", I mean a similarity comparison not an exact
comparison.
I installed 'amatch' gem. It has several comparison features that I
believe will benefit me. In particular, the Levenshtein distance.

Thank you again for the advice.

JohnM

Robert Klemme wrote:
> John,
>
> if I understand you correctly you have data in CSV files and you have
> a relational database. You want to compare the content of those files
> with data already present in the database and edit the CSV data before
> importing it into the DB.
>
> I do not know what your schema looks like nor what "comparing" in your
> case means, but did you consider first loading CSV data into the
> database using some staging table, doing the comparison and editing
> there and then copying the data over to the target location? That way
> you can use the full power of SQL for comparison purposes. And there
> are a lot of tools that allow to edit database data in tabular
> representation.
>
> Kind regards
>
> robert


--
Posted via http://www.ruby-forum.com/.

 
Reply With Quote
 
Robert Klemme
Guest
Posts: n/a
 
      08-18-2009
2009/8/18 John Mcleod <(E-Mail Removed)>:
> Yes, you understand my dilemma.
> I want to thank you for your keen insight. I didn't even thought of
> that.


It may not be a viable solution in your case but it could be
worthwhile to ponder.

> When I say "compare", I mean a similarity comparison not an exact
> comparison.
> I installed 'amatch' gem. It has several comparison features that I
> believe will benefit me. In particular, the Levenshtein distance.


Two things come to mind: depending on the RDBMS you are using it may
have similar tools (functions) for text processing and analysis.
Depending on your requirements you can even write a database function
that calculates Levenshtein Distance.

Second, you could extract the data from the database via any of Ruby's
database interfaces, do the comparison in Ruby code and write out
results. Then, edit them and import the result back into the DB.

> Thank you again for the advice.


Your welcome!

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 
Reply With Quote
 
John Mcleod
Guest
Posts: n/a
 
      08-18-2009
Robert,

>Two things come to mind: depending on the RDBMS you are using it may
> have similar tools (functions) for text processing and analysis.


Currently, my development database is SQLite, but my supervisor's plan
is to have Oracle as a production database.

Once again, I'll be in learning mode. (knowledge of Oracle = 0) I'm
experienced in mySQL mostly.

> Second, you could extract the data from the database via any of Ruby's
> database interfaces, ...


Are the "interfaces" easy to work with?

>Then, edit them and import the result back into the DB.


I've been looking at jquery jgrid for rails. It looks easy to implement
(that's a topic for another forum)

Thanks again.

John

Robert Klemme wrote:
> 2009/8/18 John Mcleod <(E-Mail Removed)>:
>> Yes, you understand my dilemma.
>> I want to thank you for your keen insight. I didn't even thought of
>> that.

>
> It may not be a viable solution in your case but it could be
> worthwhile to ponder.
>
>> When I say "compare", I mean a similarity comparison not an exact
>> comparison.
>> I installed 'amatch' gem. It has several comparison features that I
>> believe will benefit me. In particular, the Levenshtein distance.

>
> Two things come to mind: depending on the RDBMS you are using it may
> have similar tools (functions) for text processing and analysis.
> Depending on your requirements you can even write a database function
> that calculates Levenshtein Distance.
>
> Second, you could extract the data from the database via any of Ruby's
> database interfaces, do the comparison in Ruby code and write out
> results. Then, edit them and import the result back into the DB.
>
>> Thank you again for the advice.

>
> Your welcome!
>
> Kind regards
>
> robert


--
Posted via http://www.ruby-forum.com/.

 
Reply With Quote
 
Robert Klemme
Guest
Posts: n/a
 
      08-19-2009
2009/8/18 John Mcleod <(E-Mail Removed)>:
> Robert,
>
>>Two things come to mind: depending on the RDBMS you are using it may
>> have similar tools (functions) for text processing and analysis.

>
> Currently, my development database is SQLite, but my supervisor's plan
> is to have Oracle as a production database.
>
> Once again, I'll be in learning mode. =A0(knowledge of Oracle =3D 0) =A0I=

'm
> experienced in mySQL mostly.
>
>> Second, you could extract the data from the database via any of Ruby's
>> database interfaces, ...

>
> Are the "interfaces" easy to work with?


DBI is pretty straightforward - once you have it installed IIRC. It's
a while since I used it.

But you can as easily export data as CSV with SQL Plus - example for
the SCOTT demo schema that comes with the database (if it is
installed):

c:\Temp>sqlplus -S scott/PWD @emp
7369;"SMITH";"CLERK"
7499;"ALLEN";"SALESMAN"
7521;"WARD";"SALESMAN"
7566;"JONES";"MANAGER"
7654;"MARTIN";"SALESMAN"
7698;"BLAKE";"MANAGER"
7782;"CLARK";"MANAGER"
7788;"SCOTT";"ANALYST"
7839;"KING";"PRESIDENT"
7844;"TURNER";"SALESMAN"
7876;"ADAMS";"CLERK"
7900;"JAMES";"CLERK"
7902;"FORD";"ANALYST"
7934;"MILLER";"CLERK"

c:\Temp>

emp.sql:

set pagesize 0 linesize 200
set feedback off
select empno || ';"' || ename || '";"' || job || '"'
from emp
order by empno
/
exit

You can find all the docs over at http://tahiti.oracle.com/

Kind regards

robert

--=20
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Database Database Database Database scott93727@gmail.com Computer Information 0 09-27-2012 02:43 AM
DataBase DataBase DataBase DataBase scott93727@gmail.com Computer Information 0 09-26-2012 09:40 AM
hash of hash of hash of hash in c++ rp C++ 1 11-10-2011 04:45 PM
Hash#select returns an array but Hash#reject returns a hash... Srijayanth Sridhar Ruby 19 07-02-2008 12:49 PM
JSP Import CSV into database Dave Java 4 10-23-2006 12:32 AM



Advertisments