Velocity Reviews - Computer Hardware Reviews



Need Estimate of Programming Effort

 
 
Jeff Sheffel
 
      03-06-2007
I'm looking for a simple estimate of a "level of effort", for a Perl
programming task.

The estimate should be in hours. (Maybe a range of programming hours -
based on level of Perl experience.) Any other additional estimate
information, and design comments are appreciated. (Please do not ask
questions about the requirements, since I did not write them; make any
assumptions necessary.)

Program Requirements:
---------------------
Design a web scraping utility to scrap information from various shopping
sites. The code should be written in Object Oriented Perl, with use strict
and warnings enabled.

The initial sites used should be http://www.shopzilla.com and
http://www.shopping.com. However, the program should be designed in a
manor that will allow other sites to be added in the future.

The minimum requirement for output is: Site scraped from, product name,
short description, low price, high price. For simplicity scrapings can be
limited to 60 items or less from each target site.

Optional features that can be added are throttling and threading.
Throttling will limit the number of hits to a particular site in a giving
time period and threading would allow the program to make several requests
simultaneously.
The program should be fully documented and run without warnings.
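For illustration only, here is a bare-bones sketch of the pluggable, object-oriented shape these requirements seem to describe. Every class and method name below is invented, and no actual fetching or parsing is performed; a real subclass would pull pages with LWP::UserAgent or WWW::Mechanize and parse them with CPAN modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base class: the common interface every site-specific scraper honors.
package Scraper::Site;

sub new {
    my ($class, %args) = @_;
    my $self = {
        name      => $args{name}      || 'unknown',
        max_items => $args{max_items} || 60,   # per-site cap from the spec
    };
    return bless $self, $class;
}

sub name { $_[0]->{name} }

# Subclasses must override this with site-specific fetch/parse logic.
sub scrape {
    my ($self, $term) = @_;
    die ref($self) . " does not implement scrape()\n";
}

# One hypothetical subclass; the name and behavior are placeholders.
package Scraper::Shopzilla;
our @ISA = ('Scraper::Site');

sub new {
    my ($class, %args) = @_;
    return $class->SUPER::new(name => 'shopzilla.com', %args);
}

sub scrape {
    my ($self, $term) = @_;
    # Placeholder record in the required output shape.
    return [ {
        site        => $self->name,
        product     => $term,
        description => '(short description)',
        low_price   => 0,
        high_price  => 0,
    } ];
}

package main;

# Adding another site later means writing one more subclass;
# this driver loop never changes.
my @sites = ( Scraper::Shopzilla->new );
for my $site (@sites) {
    my $items = $site->scrape('digital camera');
    printf "%s: %d item(s)\n", $site->name, scalar @$items;
}
```

The point of the shape: each future site is one new subclass, and the driver stays untouched.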

 
Uri Guttman
 
      03-06-2007
>>>>> "JS" == Jeff Sheffel <(E-Mail Removed)> writes:

JS> I'm looking for a simple estimate of a "level of effort", for a Perl
JS> programming task.

you need to hire someone just for this task alone.

JS> The estimate should be in hours. (Maybe a range of programming hours -
JS> based on level of Perl experience.) Any other additional estimate
JS> information, and design comments are appreciated. (Please do not ask
JS> questions about the requirements, since I did not write them; make any
JS> assumptions necessary.)

you can't do that. i have written crawlers before and the client will
ALWAYS make many changes as it is developed. these projects cannot be
properly estimated without a very clear and precise spec. you are asking
for a world of trouble otherwise and this comes from deep experience.

JS> Design a web scraping utility to scrap information from various
JS> shopping sites. The code should be written in Object Oriented
JS> Perl, with use strict and warnings enabled.

oh boy! strict and warnings add many hours to any project. a stupid
requirement which doesn't help at all with estimates. so many design
questions will need to be asked and answered. this is not a toy.

JS> The initial sites used should be http://www.shopzilla.com and
JS> http://www.shopping.com. However, the program should be designed in a
JS> manor that will allow other sites to be added in the future.

s/manor/manner/

and you can't crawl those sites as is. they are shopping search engines
so you would need to know the product names/etc to locate them.

JS> The minimum requirement for output is: Site scraped from, product name,
JS> short description, low price, high price. For simplicity scrapings can be
JS> limited to 60 items or less from each target site.

which 60 items? is there a list? will it grow? more unasked questions.

JS> Optional features that can be added are throttling and threading.
JS> Throttling will limit the number of hits to a particular site in a
JS> giving time period and threading would allow the program to make
JS> several requests simultaneously.

parallel requests can be done without threading and in several
ways. threading is a design issue and not a requirement. throttling is a
requirement and if you didn't do it, any decent site will notice and
block you. are these 'optional' features to be designed in now or bolted
on (poorly) later? again, crawling large scale is not for
kiddies. prototype crawlers will not scale unless they are designed for
it from the beginning. so you have a major requirements conflict here
about whether this is a kiddie toy or a professional scalable crawler.
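fwiw a toy throttle needs nothing beyond core perl plus Time::HiRes. a minimal sketch (the per-host interval is an arbitrary number, and throttle() is a made-up helper, not a module):

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # sub-second time and sleep

# simple per-host throttle: guarantee at least $min_interval seconds
# between consecutive requests to the same host.
{
    my %last_hit;   # host => epoch time of last request

    sub throttle {
        my ($host, $min_interval) = @_;
        my $elapsed = time() - ($last_hit{$host} || 0);
        sleep($min_interval - $elapsed) if $elapsed < $min_interval;
        $last_hit{$host} = time();
    }
}

# first call for a host goes straight through, the second waits
throttle('www.shopping.com', 0.25);
throttle('www.shopping.com', 0.25);   # sleeps ~0.25s here
```

you would call throttle() right before each fetch; anything fancier (token buckets, per-robots.txt delays) is a design question, which is the point.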

JS> The program should be fully documented and run without warnings.

they want documentation too? unheard of!! how about some properly
written requirements first?

if you really need professional help (and i think you do) have them
contact me directly as i have actually created 2 major crawler systems
and can at least ask the right questions. but there is no way in hell i
would provide a time estimate on such a frivolous set of
requirements. you can't make assumptions as this could be a week long
kiddie thing or 6-12 man-months which is a pretty wide range of
estimates.

i await the call from your client (or yourself). (not holding my
breath).

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
 
 
 
 
Jeff Sheffel
 
      03-06-2007
Uri,
Thank you for your time. Your comments are valuable, and I agree with the
points you're making.

Your summary of programming effort, i.e. 1 week (= 40 man hours?) minimum,
is the quick answer I was looking for.

I should have clarified that this is not for a client, but a
"homework exercise" used by a potential employer for employment screening.
So creative and rapid programming disciplines are being called for here.

I don't see why you state that "the shopping sites can't be crawled."
I think they can, but not easily, and each site will require specific
methods. The product names ARE the search terms (expressed by the user on
the command line). They (i.e. the client) used the term threading loosely,
as a design requirement for parallelism.

I don't think I'm up for this test...
aren't there plenty of Perl jobs today?
Jeff

Uri Guttman
 
      03-06-2007
>>>>> "JS" == Jeff Sheffel <(E-Mail Removed)> writes:

JS> Your summary of programming effort, i.e. 1 week (= 40 man hours?)
JS> minimum, is the quick answer I was looking for.

JS> I should have clarified, that, this is not for a client, but a
JS> "homework exercise" used by a potential employer for employment
JS> screening. So, creative and rapid programming disciplines are
JS> being called for, here.

why didn't you say that to begin with? that is the most important info
in the whole story.

JS> I don't see why you state, that, "the shopping sites can't be
JS> crawled." I think they can, but not easily, and each site will
JS> require specific methods. The product names ARE the search terms
JS> (expressed by the user on the command line). They (i.e. the
JS> client) used the term threading, loosely, as a design requirement
JS> for parallelism.

this sounds like a very large homework assignment. i would be wary of
working for them if they require such projects to apply for a
job. unless they are expecting a toy which can be done in little time if
you don't care about scaling. there are dinky crawlers on cpan and if
you just drive them with some search terms the rest is parsing the web
pages (also cpan) and various amounts of driver and glue code.

JS> I don't think I'm up for this test...
JS> aren't their plenty of Perl jobs today?

ever look at jobs.perl.org?

and you should tell this employer to also post there.

and i do some perl job placement as well.

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
 
Charlton Wilbur
 
      03-06-2007
>>>>> "JS" == Jeff Sheffel <(E-Mail Removed)> writes:

JS> I should have clarified, that, this is not for a client, but a
JS> "homework exercise" used by a potential employer for
JS> employment screening. So, creative and rapid programming
JS> disciplines are being called for, here.

Someone who wants to employ *you* asks *you* this question, and you
pass it on to Usenet to do *your* homework for you?

Charlton



--
Charlton Wilbur
(E-Mail Removed)
 
 
Charlton Wilbur
 
      03-06-2007
>>>>> "UG" == Uri Guttman <(E-Mail Removed)> writes:

UG> this sounds like a very large homework assignment. i would be
UG> wary of working for them if they require such projects to
UG> apply for a job.

My impression is that the employer wanted a back-of-the-envelope
estimate as homework, not that they wanted the whole crawler.

Charlton


--
Charlton Wilbur
(E-Mail Removed)
 
 
Jens Thoms Toerring
 
      03-07-2007
Uri Guttman <(E-Mail Removed)> wrote:
> >>>>> "JS" == Jeff Sheffel <(E-Mail Removed)> writes:

> JS> Design a web scraping utility to scrap information from various
> JS> shopping sites. The code should be written in Object Oriented
> JS> Perl, with use strict and warnings enabled.


> oh boy! strict and warnings add many hours to any project. a stupid
> requirement which doesn't help at all with estimates.


Awfully sorry for chiming in like that but I am labouring under the
impression that using strict and warnings actually saves me a lot
of time since it helps to catch my more stupid errors. Can you help
me to see the error of my ways and write a bit more about
why you think it would "add many hours to any project" instead?
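For example, the kind of error I mean: a misspelled variable name runs
silently without strict but is rejected at compile time with it. A small
self-contained demonstration (the variable names are of course made up):

```perl
use strict;
use warnings;

# The same buggy snippet, with $price misspelled as $prise on use.
my $buggy = 'my $price = 42; my $total = $prise; 1';

# Without strict the typo compiles and runs, silently using undef.
my $lax_ok = eval "no strict; no warnings; $buggy";

# With strict the same typo is refused at compile time.
my $strict_ok  = eval "use strict; $buggy";
my $strict_err = $@;

print "without strict: ", ($lax_ok    ? "ran silently\n" : "failed\n");
print "with strict:    ", ($strict_ok ? "ran\n" : "rejected: $strict_err");
```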

> Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com


This might be a temporary problem, but when I try to go to the URL
at the end I get a "403 Forbidden".

Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
 
Uri Guttman
 
      03-07-2007
>>>>> "JTT" == Jens Thoms Toerring <(E-Mail Removed)> writes:

JTT> Uri Guttman <(E-Mail Removed)> wrote:
>> >>>>> "JS" == Jeff Sheffel <(E-Mail Removed)> writes:

JS> Design a web scraping utility to scrap information from various
JS> shopping sites. The code should be written in Object Oriented
JS> Perl, with use strict and warnings enabled.

>> oh boy! strict and warnings add many hours to any project. a stupid
>> requirement which doesn't help at all with estimates.


JTT> Awfully sorry for chiming in like that but I am labouring under the
JTT> impression that using strict and warnings actually saves me a lot
JTT> of time since it helps to catch my more stupid errors. Can you help
JTT> me to find out about the errors of may ways and write a bit more about
JTT> why you think it would "add many hours to any project" instead?

i was being sarcastic about estimating a project schedule when strict
and warnings are enabled. of course i endorse their use all the time but
it was silly for the requirements to specify them. and considering that
this was job application homework it is even sillier. how would using
strict and warnings affect project time estimation?


>> Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com


JTT> This might be a temporary problem but when I try to go to the URL
JTT> at the end I got "403 Forbidden".

it is down until i can redo the site as its hosting was moved. i should
put up an under construction thing already. an interview i did for
perlcast last summer was just broadcast last week and i got some emails
about it being down.

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 