Velocity Reviews - Computer Hardware Reviews

Regular expression for different date formats in Python

 
 
undesputed.hackerz@gmail.com
      11-26-2012
Hello Developers,

I am a beginner in Python and need help writing a regular expression to fetch dates and times from some HTML documents. In the following code I walk through the HTML files in a folder called event and print the h1 headings using BeautifulSoup. These HTML pages also contain dates and times in different formats, and I want to fetch and display this information as well. The date formats in these HTML documents are:

21 - 27 Nov 2012
1 Dec 2012
30 Nov - 2 Dec 2012
26 Nov 2012

Can someone help me out with fetching these formats from these html documents ?
Here is my code for walking through the files and fetching h1 from those html files:


Code:


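import re
import os
from bs4 import BeautifulSoup

# Walk the event folder and parse every file found in it
for subdir, dirs, files in os.walk("/home/himanshu/event/"):
    for fle in files:
        path = os.path.join(subdir, fle)
        # Close each file after parsing and name the parser explicitly
        with open(path) as f:
            soup = BeautifulSoup(f, "html.parser")

        # Print the text of the first h1 heading
        print(soup.h1.string)

        # Date and Time detection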

 
Michael Torrie
      11-26-2012
On 11/26/2012 06:15 AM, (E-Mail Removed) wrote:
> I am a beginner in python and need help with writing a regular
> expression for date and time to be fetched from some html documents.


Would the "parser" module from the third-party dateutil package work for you?

http://pypi.python.org/pypi/python-dateutil
http://labix.org/python-dateutil#hea...69557ec2c3ecd2

I don't believe the library is updated for Python 3 yet, sadly. But I
bet it could be ported fairly easily. I think it's pure python.
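
A minimal sketch of what that could look like, offered as an illustration rather than a tested recipe for these pages. dateutil.parser.parse is the library's real entry point; the helper name parse_single_date and the standard-library fallback (used only when dateutil is not installed) are assumptions added here. It handles a single date; a range such as "21 - 27 Nov 2012" contains two dates and would likely need to be split up first.

```python
from datetime import datetime

try:
    # python-dateutil is third-party: pip install python-dateutil
    from dateutil import parser as date_parser
except ImportError:
    date_parser = None


def parse_single_date(text):
    """Parse one date such as '1 Dec 2012' into a datetime (hypothetical helper)."""
    if date_parser is not None:
        return date_parser.parse(text)
    # Fallback (assumption): day, abbreviated English month name, four-digit year
    return datetime.strptime(text, "%d %b %Y")


print(parse_single_date("1 Dec 2012"))   # 2012-12-01 00:00:00
print(parse_single_date("26 Nov 2012"))  # 2012-11-26 00:00:00
```

The strptime fallback depends on the locale's month abbreviations, so it is only a stand-in for the more flexible dateutil parser.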


 
Vlastimil Brom
      11-26-2012
2012/11/26 <(E-Mail Removed)>:
> Hello Developers,
>
> I am a beginner in Python and need help writing a regular expression to fetch dates and times from some HTML documents. In the following code I walk through the HTML files in a folder called event and print the h1 headings using BeautifulSoup. These HTML pages also contain dates and times in different formats, and I want to fetch and display this information as well. The date formats in these HTML documents are:
>
> 21 - 27 Nov 2012
> 1 Dec 2012
> 30 Nov - 2 Dec 2012
> 26 Nov 2012
>
> Can someone help me out with fetching these formats from these html documents ?
> Here is my code for walking through the files and fetching h1 from those html files:
>
>
> Code:
>
>
> import re
> import os
> from bs4 import BeautifulSoup
>
> for subdir, dirs, files in os.walk("/home/himanshu/event/"):
>     for fle in files:
>         path = os.path.join(subdir, fle)
>         soup = BeautifulSoup(open(path))
>
>         print (soup.h1.string)
>
> #Date and Time detection
>


Hi,
the following pattern seems to match all of your examples,

(\d{1,2} )?(Nov|Dec)?( ?- )?(\d{1,2}) (Nov|Dec) (\d{4})

however, it doesn't look very robust - of course, you have to add
the remaining months' abbreviations and test it on the (parts of the)
HTML documents you are interested in.

hth,
vbr
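
One way to extend the pattern above to all twelve month abbreviations and turn each match into a pair of dates, as a sketch only. The names MONTHS, DATE_RE and find_date_ranges are made up for this example. It assumes English three-letter month abbreviations and that both ends of a range fall in the stated year, and it uses only the standard library.

```python
import re
from datetime import date

# English three-letter abbreviations (assumption about the page content)
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
MONTH_ALT = "|".join(MONTHS)

# Optional "<day> [<month>] - " prefix, then a mandatory "<day> <month> <year>"
DATE_RE = re.compile(
    r"\b(?:(?P<d1>\d{1,2})(?: (?P<m1>" + MONTH_ALT + r"))? ?- ?)?"
    r"(?P<d2>\d{1,2}) (?P<m2>" + MONTH_ALT + r") (?P<year>\d{4})\b"
)


def find_date_ranges(text):
    """Return a list of (start, end) date pairs found in text."""
    results = []
    for m in DATE_RE.finditer(text):
        year = int(m.group("year"))
        try:
            end = date(year, MONTHS.index(m.group("m2")) + 1, int(m.group("d2")))
            if m.group("d1") is None:
                # A single date: start and end are the same day
                start = end
            else:
                # The start month defaults to the end month ("21 - 27 Nov 2012")
                start_month = m.group("m1") or m.group("m2")
                start = date(year, MONTHS.index(start_month) + 1, int(m.group("d1")))
        except ValueError:
            # Skip impossible dates such as "31 Nov 2012"
            continue
        results.append((start, end))
    return results


for sample in ["21 - 27 Nov 2012", "1 Dec 2012", "30 Nov - 2 Dec 2012", "26 Nov 2012"]:
    print(sample, "->", find_date_ranges(sample))
```

In the loop from the original post, the text to search could come from soup.get_text(). The pattern will still pick up false positives such as a number followed by a hyphen just before a date, so it is worth checking against the real pages.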
 
Miki Tebeka
      11-26-2012
On Monday, November 26, 2012 8:34:22 AM UTC-8, Michael Torrie wrote:
> http://pypi.python.org/pypi/python-dateutil
> ...
> I don't believe the library is updated for Python 3 yet, sadly.

dateutil has supported Python 3.x since version 2.0.
 