Velocity Reviews

Mike Copeland 11-08-2012 03:42 PM

Parsing Numeric Data
 
The function below (parseNum) seems convoluted and possibly
faulty...although it seems to work. In the code invocation (far below)
the data is real-world, and I wish to parse only the first 6 numeric
values. The number of values to be parsed varies, but there is always a
"termination value" of some alphabetic value or end-of-line. Thus, I
want this logic to act as though it's a variable-value "scanf".
Please advise if there's a "cleaner" way to do this. TIA


typedef vector<string> TOKENS1; // parsing structures
TOKENS1 tokArray;

size_t parseNum(string line) // Parse numeric value(s)
{
    string tok1, tok2;
    istringstream iss1(line);
    tokArray.clear();
    while(getline(iss1, tok1, ' '))
    {
        if(tok1.find(' ') != string::npos)
        {
            istringstream iss1(tok1);
            while(getline(iss1, tok2, ' '))
            {
                if(!tok2.empty()) tokArray.push_back(tok2);
            } // while
        } // if
        else
        {
            if(tok1 == "") continue;
            if(isdigit(tok1.at(0))) tokArray.push_back(tok1);
            else return tokArray.size();
        }
    } // while
    return tokArray.size();
} // parseNum

char m1[] = " 326 500 11 3900 11 3900 stop 10/29/2011 ";
size_t ii = parseNum(m1);
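
One way to get the "variable-value scanf" behaviour asked for above is to let formatted stream extraction do the tokenising. This is only a minimal sketch, not a drop-in replacement: the name parseLeadingNumbers and the maxCount parameter are hypothetical, and it assumes the values fit in a long. It also differs from parseNum in one respect: a token such as "10/29/2011" would contribute the number 10 instead of being stored whole as a string.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect up to maxCount leading integers from line.
// Extraction stops at the first token that does not start with a number
// (for example "stop") or at end of line.
std::vector<long> parseLeadingNumbers(const std::string& line,
                                      std::size_t maxCount = 6)
{
    std::vector<long> values;
    std::istringstream iss(line);
    long n = 0;
    // operator>> skips leading whitespace and sets failbit on
    // non-numeric input, which ends the loop.
    while (values.size() < maxCount && iss >> n)
        values.push_back(n);
    return values;
}

// Usage with the sample line from the post.
void exampleParseLeadingNumbers()
{
    const std::vector<long> v =
        parseLeadingNumbers(" 326 500 11 3900 11 3900 stop 10/29/2011 ");
    assert(v.size() == 6);
    assert(v.front() == 326 && v.back() == 3900);
}
```

Returning the vector by value avoids the global tokArray, and taking the line by const reference avoids a copy.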

Chris Gordon-Smith 11-08-2012 08:12 PM

Re: Parsing Numeric Data
 
On Thu, 08 Nov 2012 08:42:35 -0700, Mike Copeland wrote:

> The function below (parseNum) seems convoluted and possibly
> faulty...although it seems to work. In the code invocation (far below)
> the data is real-world, and I wish to parse only the first 6 numeric
> values. The number of values to be parsed varies, but there is always a
> "termination value" of some alphabetic value or end-of-line. Thus, I
> want this logic to act as though it's a variable-value "scanf".
> Please advise if there's a "cleaner" way to do this. TIA
>
>


I had to solve a similar problem a while back. I've included the code I
came up with below. It seems that rather than using getline(), I had an
istream called Input_Stream and a string called Token, and tokenised
records in a loop containing the following construct:
Input_Stream >> Token

The Tokens are pushed onto a list of strings called Input_Record.

It works, although I have no doubt there are many things that could be
done much better.

Cout is a threadsafe wrapper for cout.

The code is available at
http://code.google.com/p/simsoup/sou...k/simsoup/src/Persistent_Data_Manager/Input_Record.cpp

I've included an example of the text parsed at the end.

bool Input_Record::Read_Record(istream& Input_Stream, bool& EOF_Flag,
                               string& Error_Text)

// Read the input record into a list of strings. A record is
// terminated by a semicolon. Comments start with "//" and are
// terminated by end of line

{
    TRACE;
    string SemiColon(";");
    string Token = "";
    EOF_Flag = false;
    bool End_Of_Record_Flag = false;

    while (not End_Of_Record_Flag)
    {
        if (not (Input_Stream >> Token))
        {
            EOF_Flag = true;
            if (not Token.empty())
            {
                Error_Text = Error_Text
                    + "Incomplete record at end of file - last token is "
                    + String_In_Quotes(Token);
                return false;
            }
            else
            {
                return true;
            }
        }

        // Echo comment text but otherwise ignore
        if ((Token.size() > 1)
            and ((Token.substr(0,2) == "//") || Token.substr(0,2) == "/*"))
        {
            ostringstream OutStream;
            OutStream << Token;
            char Text = ' ';
            while ((Text not_eq '\n') and (not Input_Stream.eof()))
            {
                Text = Input_Stream.get();
                OutStream << Text;
            }
            Cout::Get_Pt()->Write(OutStream);
            Token.clear();
        }
        else
        {
            // Detect end of record
            if (Token.substr(Token.size() -1,1) not_eq SemiColon)
            {
                Input_Record.push_back(Token);
            }
            else
            {
                End_Of_Record_Flag = true;
                Token.erase(Token.size() -1,1);
                if(not Token.empty())
                {
                    Input_Record.push_back(Token);
                }
            }
        }
    }
    Input_Record_For_Print = Input_Record;
    return true;
}

// Bond Types for Designed Atom Types
// ----------------------------------

// Assemblite
Add_BondType @Time 2 @Atom1 a @Atom2 a @Order 1 @Enthalpy 1000;
Add_BondType @Time 2 @Atom1 a @Atom2 h @Order 1 @Enthalpy 1000;
Add_BondType @Time 2 @Atom1 a @Atom2 j @Order 1 @Enthalpy 1000;
Add_BondType @Time 2 @Atom1 a @Atom2 l @Order 1 @Enthalpy 1000;
Add_BondType @Time 2 @Atom1 a @Atom2 m @Order 1 @Enthalpy 10000;
Add_BondType @Time 2 @Atom1 a @Atom2 p @Order 1 @Enthalpy 1000;
Add_BondType @Time 2 @Atom1 a @Atom2 s @Order 1 @Enthalpy 1000;
Add_BondType @Time 2 @Atom1 a @Atom2 t @Order 1 @Enthalpy 1000;
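
Reduced to its core, the "Input_Stream >> Token" loop described above might look like the sketch below. The function name readRecord is hypothetical, and this is a simplified stand-in, not the code from Input_Record.cpp: it leaves out comment handling, the EOF flag and error reporting, and returns a vector instead of filling a member list.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for Read_Record.
// Reads whitespace-separated tokens until one ends in ';'.
std::vector<std::string> readRecord(std::istream& in)
{
    std::vector<std::string> record;
    std::string token;
    while (in >> token) {   // a successful >> never yields an empty token
        const bool last = token[token.size() - 1] == ';';
        if (last)
            token.erase(token.size() - 1);   // drop the terminating semicolon
        if (!token.empty())
            record.push_back(token);
        if (last)
            break;
    }
    return record;
}

// Usage with two of the sample records shown above.
void exampleReadRecord()
{
    std::istringstream in(
        "Add_BondType @Time 2 @Atom1 a @Atom2 a @Order 1 @Enthalpy 1000;\n"
        "Add_BondType @Time 2 @Atom1 a @Atom2 h @Order 1 @Enthalpy 1000;\n");
    const std::vector<std::string> first = readRecord(in);
    assert(first.size() == 11);
    assert(first.front() == "Add_BondType" && first.back() == "1000");
}
```

Because operator>> treats any whitespace as a separator, a record may span several lines and only the semicolon ends it.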

Chris Gordon-Smith
www.simsoup.info

Jorgen Grahn 11-12-2012 11:15 PM

Re: Parsing Numeric Data
 
On Thu, 2012-11-08, Mike Copeland wrote:
> The function below (parseNum) seems convoluted and possibly
> faulty...although it seems to work. In the code invocation (far below)
> the data is real-world, and I wish to parse only the first 6 numeric
> values. The number of values to be parsed varies, but there is always a
> "termination value" of some alphabetic value or end-of-line. Thus, I
> want this logic to act as though it's a variable-value "scanf".
> Please advise if there's a "cleaner" way to do this. TIA
>
>
> typedef vector<string> TOKENS1; // parsing structures
> TOKENS1 tokArray;
>
> size_t parseNum(string line) // Parse numeric value(s)
> {


Why not just return a vector of numbers, and why not pass the line as
const reference?

I have a feeling I posted this the other week, but anyway ... this is
untested and probably not correct, but it's not much more complicated
than this. You can fix it up. Don't neglect to read the strtoul()
documentation carefully.

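vector<unsigned> parseNum(const string& line)
{
    const char* p = line.c_str();
    vector<unsigned> acc;

    while(1) {
        char* end;
        // strtoul() skips leading whitespace and sets end past the digits
        unsigned n = strtoul(p, &end, 10);
        if(end==p) break; // nothing converted: not a number
        acc.push_back(n);
        // stop at end of string or at a non-space terminator
        if(!*end || !isspace(static_cast<unsigned char>(*end))) break;
        p = end;
    }

    return acc;
}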
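
As for reading the strtoul() documentation carefully: two details that are easy to miss are that strtoul() accepts a leading sign (so "-1" converts to a very large unsigned value) and that overflow is reported only through errno being set to ERANGE. The sketch below guards against both. The name parseUnsigned is hypothetical, and it makes one design choice the code above does not: a number counts only if it is followed by whitespace or the end of the string, so "10/29/2011" contributes nothing.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical variant of parseNum with stricter checks around strtoul().
std::vector<unsigned long> parseUnsigned(const std::string& line)
{
    std::vector<unsigned long> acc;
    const char* p = line.c_str();
    for (;;) {
        // Skip whitespace ourselves so the first significant
        // character can be inspected before conversion.
        while (std::isspace(static_cast<unsigned char>(*p))) ++p;
        // Require a digit: this rejects "-1" and "+1", which strtoul()
        // would otherwise accept, and also stops at the end of the string.
        if (!std::isdigit(static_cast<unsigned char>(*p))) break;
        char* end = nullptr;
        errno = 0;
        const unsigned long n = std::strtoul(p, &end, 10);
        if (end == p || errno == ERANGE) break;   // no digits or overflow
        // Accept the value only if it is a whole token.
        if (*end != '\0' && !std::isspace(static_cast<unsigned char>(*end)))
            break;
        acc.push_back(n);
        p = end;
    }
    return acc;
}

// Usage with the sample line from the original post.
void exampleParseUnsigned()
{
    const std::vector<unsigned long> v =
        parseUnsigned(" 326 500 11 3900 11 3900 stop 10/29/2011 ");
    assert(v.size() == 6);
    assert(v.front() == 326 && v.back() == 3900);
}
```

The casts to unsigned char matter because passing a negative char value to isspace() or isdigit() is undefined behaviour.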

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

