Parsing Numeric Data
The function below (parseNum) seems convoluted and possibly faulty...although it seems to work. In the code invocation (far below) the data is real-world, and I wish to parse only the first 6 numeric values. The number of values to be parsed varies, but there is always a "termination value" of some alphabetic value or end-of-line. Thus, I want this logic to act as though it's a variable-value "scanf".
Please advise if there's a "cleaner" way to do this. TIA

    typedef vector<string> TOKENS1; // parsing structures
    TOKENS1 tokArray;

    size_t parseNum(string line) // Parse numeric value(s)
    {
        string tok1, tok2;
        istringstream iss1(line);
        tokArray.clear();
        while(getline(iss1, tok1, ' '))
        {
            if(tok1.find(' ') != string::npos)
            {
                istringstream iss1(tok1);
                while(getline(iss1, tok2, ' '))
                {
                    if(!tok2.empty()) tokArray.push_back(tok2);
                } // while
            } // if
            else
            {
                if(tok1 == "") continue;
                if(isdigit(tok1.at(0))) tokArray.push_back(tok1);
                else return tokArray.size();
            }
        } // while
        return tokArray.size();
    } // parseNum

    char m1[] = " 326 500 11 3900 11 3900 stop 10/29/2011 ";
    size_t ii = parseNum(m1);
Re: Parsing Numeric Data
On Thu, 08 Nov 2012 08:42:35 -0700, Mike Copeland wrote:
> The function below (parseNum) seems convoluted and possibly
> faulty...although it seems to work. In the code invocation (far below)
> the data is real-world, and I wish to parse only the first 6 numeric
> values. The number of values to be parsed varies, but there is always a
> "termination value" of some alphabetic value or end-of-line. Thus, I
> want this logic to act as though it's a variable-value "scanf".
> Please advise if there's a "cleaner" way to do this. TIA
>
>

I had to solve a similar problem a while back. I've included the code I came up with below. It seems that rather than using getline(), I had an istream called Input_Stream and a string called Token, and tokenised records in a loop containing the following construct:

    Input_Stream >> Token

The Tokens are pushed onto a list of strings called Input_Record. It works, although I have no doubt there are many things that could be done much better. Cout is a threadsafe wrapper for cout.

The code is available at http://code.google.com/p/simsoup/sou...k/simsoup/src/Persistent_Data_Manager/Input_Record.cpp

I've included an example of the text parsed at the end.

    bool Input_Record::Read_Record(istream& Input_Stream, bool& EOF_Flag, string& Error_Text)
    // Read the input record into a list of strings. A record is
    // terminated by a semicolon. Comments start with "//" and are
    // terminated by end of line
    {
        TRACE;
        string SemiColon(";");
        string Token = "";
        EOF_Flag = false;
        bool End_Of_Record_Flag = false;

        while (not End_Of_Record_Flag)
        {
            if (not (Input_Stream >> Token))
            {
                EOF_Flag = true;
                if (not Token.empty())
                {
                    Error_Text = Error_Text + "Incomplete record at end of file - last token is " + String_In_Quotes(Token);
                    return false;
                }
                else
                {
                    return true;
                }
            }

            // Echo comment text but otherwise ignore
            if ((Token.size() > 1) and ((Token.substr(0,2) == "//") || Token.substr(0,2) == "/ *"))
            {
                ostringstream OutStream;
                OutStream << Token;
                char Text = ' ';
                while ((Text not_eq '\n') and (not Input_Stream.eof()))
                {
                    Text = Input_Stream.get();
                    OutStream << Text;
                }
                Cout::Get_Pt()->Write(OutStream);
                Token.clear();
            }
            else
            {
                // Detect end of record
                if (Token.substr(Token.size() -1,1) not_eq SemiColon)
                {
                    Input_Record.push_back(Token);
                }
                else
                {
                    End_Of_Record_Flag = true;
                    Token.erase(Token.size() -1,1);
                    if(not Token.empty())
                    {
                        Input_Record.push_back(Token);
                    }
                }
            }
        }
        Input_Record_For_Print = Input_Record;
        return true;
    }

Example of the text parsed:

    // Bond Types for Designed Atom Types
    // ----------------------------------
    // Assemblite
    Add_BondType @Time 2 @Atom1 a @Atom2 a @Order 1 @Enthalpy 1000;
    Add_BondType @Time 2 @Atom1 a @Atom2 h @Order 1 @Enthalpy 1000;
    Add_BondType @Time 2 @Atom1 a @Atom2 j @Order 1 @Enthalpy 1000;
    Add_BondType @Time 2 @Atom1 a @Atom2 l @Order 1 @Enthalpy 1000;
    Add_BondType @Time 2 @Atom1 a @Atom2 m @Order 1 @Enthalpy 10000;
    Add_BondType @Time 2 @Atom1 a @Atom2 p @Order 1 @Enthalpy 1000;
    Add_BondType @Time 2 @Atom1 a @Atom2 s @Order 1 @Enthalpy 1000;
    Add_BondType @Time 2 @Atom1 a @Atom2 t @Order 1 @Enthalpy 1000;

Chris Gordon-Smith
www.simsoup.info
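Applied to the numeric line in the original question, the `Input_Stream >> Token` idiom might look roughly like the sketch below. The function name parseLeadingNumbers is hypothetical and does not come from the thread, and the rule "stop at the first token that does not begin with a digit" is an assumption taken from the original parseNum.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): collect whitespace-separated
// tokens until one does not start with a digit, or the line ends.
std::vector<std::string> parseLeadingNumbers(const std::string& line)
{
    std::vector<std::string> tokens;
    std::istringstream iss(line);
    std::string tok;

    // operator>> skips any run of whitespace, so empty tokens never appear
    // and no special handling of repeated spaces is needed.
    while (iss >> tok) {
        // Assumed termination rule: first character is not a digit.
        if (!std::isdigit(static_cast<unsigned char>(tok[0])))
            break;
        tokens.push_back(tok);
    }
    return tokens;
}
```

Because extraction with >> never produces an empty token, the nested getline() loop and the empty-string test from the original function are not needed here. Returning the vector instead of filling a global is a design choice of this sketch, not something the poster asked for.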
Re: Parsing Numeric Data
On Thu, 2012-11-08, Mike Copeland wrote:
> The function below (parseNum) seems convoluted and possibly
> faulty...although it seems to work. In the code invocation (far below)
> the data is real-world, and I wish to parse only the first 6 numeric
> values. The number of values to be parsed varies, but there is always a
> "termination value" of some alphabetic value or end-of-line. Thus, I
> want this logic to act as though it's a variable-value "scanf".
> Please advise if there's a "cleaner" way to do this. TIA
>
>
> typedef vector<string> TOKENS1; // parsing structures
> TOKENS1 tokArray;
>
> size_t parseNum(string line) // Parse numeric value(s)
> {

Why not just return a vector of numbers, and why not pass the line as const reference?

I have a feeling I posted this the other week, but anyway ... this is untested and probably not correct, but it's not much more complicated than this. You can fix it up. Don't neglect to read the strtoul() documentation carefully.

    vector<unsigned> parseNum(const string& line)
    {
        const char* p = line.c_str();
        vector<unsigned> acc;
        while(1) {
            char* end;
            unsigned n = strtoul(p, &end, 10);
            if(end==p) break;
            acc.push_back(n);
            if(!*end || !isspace(*end)) break;
            p = end;
        }
        return acc;
    }

/Jorgen

--
  // Jorgen Grahn <grahn@  Oo  o.   .     .
\X/     snipabacken.se>   O  o   .
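On the advice to read the strtoul() documentation: strtoul() accepts an optional leading sign, and on overflow it returns ULONG_MAX and sets errno to ERANGE. A sketch that guards against both might look like the following. The name parseNumChecked and the decision to reject a token such as "10/29/2011" outright (rather than keep its leading 10) are assumptions of this sketch, not part of the posted code.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical variant of the strtoul() approach (name and checks are assumptions).
std::vector<unsigned long> parseNumChecked(const std::string& line)
{
    std::vector<unsigned long> acc;
    const char* p = line.c_str();

    for (;;) {
        // Skip whitespace ourselves so the digit check sees the first real character.
        while (std::isspace(static_cast<unsigned char>(*p))) ++p;

        // strtoul() would accept "+5" or "-5"; require a digit instead.
        if (!std::isdigit(static_cast<unsigned char>(*p))) break;

        char* end = nullptr;
        errno = 0;
        unsigned long n = std::strtoul(p, &end, 10);
        if (end == p || errno == ERANGE) break;   // nothing parsed, or out of range

        // Assumed rule: a number must be followed by whitespace or end of line.
        if (*end != '\0' && !std::isspace(static_cast<unsigned char>(*end))) break;

        acc.push_back(n);
        p = end;
    }
    return acc;
}
```

The result type is unsigned long here so that no narrowing happens when the strtoul() return value is stored. With the sample line from the original post, the loop stops at "stop" after collecting six values.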