making an istream from a char array

 
 
John Salmon
      12-30-2006

I'm working with two libraries. One, written
in old-school C, returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other, written in more modern C++,
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used. Here's a short program demonstrating
what I've tried. Is there any way to get "inside"
the istringstream and tell it to use the 'chunk'
directly, rather than insisting on making a copy?

Thanks,
John Salmon

[jsalmon@river c++]$ cat chararraytostream.cpp
#include <string>
#include <sstream>
#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <unistd.h> // getpid()
using namespace std;

char *getLotsOfBytes();
istream& streamParser(istream &s);
void linuxChkMem(const char *msg);

void withImplicitString(){
    linuxChkMem("Before getLotsOfBytes: ");
    char *chunk = getLotsOfBytes();
    linuxChkMem("After getLotsOfBytes():");
    {
        istringstream iss(chunk);
        linuxChkMem("After iss(p): ");
        streamParser(iss);
        linuxChkMem("After streamParser(iss): ");
    }
    linuxChkMem("After iss goes out of scope: ");
    free(chunk);
    linuxChkMem("After free(p): ");
}

void withExplicitString(){
    linuxChkMem("Before getLotsOfBytes: ");
    char *chunk = getLotsOfBytes();
    linuxChkMem("After getLotsOfBytes():");
    {
        string s(chunk);
        linuxChkMem("After s(chunk): ");
        free(chunk);
        linuxChkMem("After free(p): ");
        istringstream iss(s);
        linuxChkMem("After iss(s): ");
        streamParser(iss);
        linuxChkMem("After streamParser(iss): ");
    }
    linuxChkMem("After iss goes out of scope: ");
}

int main(int argc, char **argv){
    printf("with an implicit string constructor\n");
    withImplicitString();
    printf("\nwith an explicit string constructor\n");
    withExplicitString();
    return 0;
}

// On linux, tell us how much data space we're using
// in the VM.
void linuxChkMem(const char *msg){
    printf("%s", msg);
    fflush(stdout);
    char cmd[50];
    sprintf(cmd, "grep VmData /proc/%d/status", getpid());
    system(cmd);
}

static const int SZ = 100*1024*1024;
// A rough approximation to getLotsOfBytes. In the
// real application, getLotsOfBytes has these characteristics:
// - it returns a malloced pointer to a NUL-terminated array of chars.
// - it is out of my control. E.g., I can't rewrite it in a way
//   that might be more friendly to C++ streams.
char *getLotsOfBytes(){
    char *p = (char *)malloc(SZ);
    memset(p, ' ', SZ);
    strcpy(p+SZ-50, "3.1415 2.718 1.414");
    return p;
}

// A rough approximation to streamParser. In the real
// application, streamParser takes a ref to an istream
// and does what it does. Again, I can't easily redefine
// the interface.
istream& streamParser(istream& s){
    double x, y, z;
    s >> x >> y >> z;
    printf("x: %f y: %f z: %f\n", x, y, z);
    return s;
}

[jsalmon@river c++]$ g++ -O3 chararraytostream.cpp
[jsalmon@river c++]$ a.out
with an implicit string constructor
Before getLotsOfBytes: VmData: 40 kB
After getLotsOfBytes():VmData: 102444 kB
After iss(p): VmData: 204848 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 102576 kB
After free(p): VmData: 172 kB

with an explicit string constructor
Before getLotsOfBytes: VmData: 172 kB
After getLotsOfBytes():VmData: 102576 kB
After s(chunk): VmData: 204980 kB
After free(p): VmData: 102576 kB
After iss(s): VmData: 204980 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 172 kB
[jsalmon@river c++]$
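
The roughly doubled VmData figures above are what one would expect if the istringstream holds its own copy of the data. A minimal sketch of that behaviour follows; the function name istringstream_owns_copy is made up for illustration and is not part of the program above. It overwrites the source array after constructing the stream and checks that the stream still yields the original text.

```cpp
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original program): returns true if the
// stream still yields the original text after the source array has been
// overwritten, which can only happen if the stream holds its own copy.
bool istringstream_owns_copy() {
    char chunk[] = "3.1415 2.718";
    // A temporary std::string is built from chunk and its contents are
    // copied into the stream's internal buffer.
    std::istringstream iss(chunk);
    // Scribble over the source, leaving the terminating NUL in place.
    std::memset(chunk, '9', sizeof(chunk) - 1);
    std::string first;
    iss >> first;
    return first == "3.1415";
}
```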

 
Denise Kleingeist
      12-30-2006
Hello John!
John Salmon wrote:
> My attempts to create an istringstream from the
> chunk of data all seem to at least double the
> amount of VM used.


std::istringstream takes a std::string. Creating this
std::string from a char array makes one copy, and the
std::istringstream then makes its own copy of that string.
For this purpose, you probably don't want to use a
std::istringstream. Instead, you could use a simple
homegrown stream buffer (see the code below).

Good luck, Denise!
--- CUT HERE ---
#include <algorithm> // std::find
#include <istream>
#include <iostream>
#include <streambuf>
#include <string>
#include <string.h>

// Placeholder for whatever returns the NUL-terminated data.
char* get_huge_buffer_with_data();

// Stream buffer that reads directly from [b, e) without copying.
struct membuf:
    std::streambuf
{
    membuf(char* b, char* e) { this->setg(b, b, e); }
};

int main()
{
    char* buffer = get_huge_buffer_with_data();
    membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
    std::istream in(&sbuf);
    for (std::string line; std::getline(in, line); )
        std::cout << "line: " << line << "\n";
}

 
Gianni Mariani
      12-30-2006
John Salmon wrote:
> I'm working with two libraries, one written
> in old school C, that returns a very large
> chunk of data in the form of a C-style,
> NUL-terminated string.
>
> The other written in a more modern C++
> is a parser for the chunk of bytes returned by
> the first. It expects a reference to a
> std::istream as its argument.
>
> The chunk of data is very large.
> I'd like to feed the output of the first to
> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.


The "without making a copy" might be a little tricky with istringstream.

I'm no expert on c++ streams but something like this might work.

#include <istream>
#include <streambuf>

// std::streambuf is listed first so that it is fully constructed
// before its address is handed to the std::istream constructor.
class Xistream
    : public std::streambuf,
      public std::istream
{
public:
    Xistream( const char * begin, const char * end )
        : std::istream( this )
    {
        setg( const_cast<char *>(begin), const_cast<char *>(begin),
              const_cast<char *>(end) );
    }
};

#include <iostream>

int main()
{
    const char xx[] = "1 22 33";

    Xistream xi( xx, xx + sizeof(xx) -1);

    int i;
    xi >> i;

    std::cout << i << "\n";

    xi >> i;

    std::cout << i << "\n";

}
 
John Salmon
      12-30-2006
>>>>> "Denise" == Denise Kleingeist <> writes:

Denise> Hello John!
Denise> John Salmon wrote:
>> My attempts to create an istringstream from the
>> chunk of data all seem to at least double the
>> amount of VM used.


Denise> std::istringstream takes a std::string. For creating this
Denise> std::string from a char array, a copy is created. This copy
Denise> is then copied into the std::istringstream. For this purpose,
Denise> you probably don't want to use an std::istringstream. Instead,
Denise> you could use a simple homegrown stream buffer (code see
Denise> below).

Denise> Good luck, Denise!
Denise> --- CUT HERE ---
Denise> #include <istream>
Denise> #include <iostream>
Denise> #include <streambuf>
Denise> #include <string>
Denise> #include <string.h>

Denise> struct membuf:
Denise> std::streambuf
Denise> {
Denise> membuf(char* b, char* e) { this->setg(b, b, e); }
Denise> };

Denise> int main()
Denise> {
Denise> char* buffer = get_huge_buffer_with_data();
Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
Denise> std::istream in(&sbuf);
Denise> for (std::string line; std::getline(in, line); )
Denise> std::cout << "line: " << line << "\n";
Denise> }

Thanks! This is exactly what I needed.

One question - what's the point of the std::find()?

I don't see how std::find(buffer, buffer+strlen(buffer), 0);
could ever be different from buffer+strlen(buffer)??

Cheers,
John Salmon
 
Denise Kleingeist
      12-30-2006
Hello John!
John Salmon wrote:
> >>>>> "Denise" == Denise Kleingeist <> writes:

> Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
> One question - what's the point of the std::find()?
>
> I don't see how std::find(buffer, buffer+strlen(buffer), 0);
> could ever be different from buffer+strlen(buffer)??


You are right: it is a leftover from a discarded attempt to use
std::find() instead of strlen()! Just use buffer + strlen(buffer)
instead.

Sorry for any confusion caused, Denise!
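
With that correction applied, the stream-buffer approach might look like the sketch below. The membuf struct is the one from the earlier post; parse_three is a hypothetical helper standing in for the real parser, which is assumed only to take a std::istream&. The buffer is read in place and must outlive the stream.

```cpp
#include <cstring>
#include <istream>
#include <streambuf>

// Read-only stream buffer over memory owned by someone else; no copy is made.
struct membuf : std::streambuf {
    membuf(char* begin, char* end) { setg(begin, begin, end); }
};

// Hypothetical helper: reads three doubles straight out of a NUL-terminated
// buffer. Returns true only if all three extractions succeed.
bool parse_three(char* chunk, double& x, double& y, double& z) {
    // The get area ends at the terminating NUL, found with strlen alone.
    membuf sbuf(chunk, chunk + std::strlen(chunk));
    std::istream in(&sbuf);
    return static_cast<bool>(in >> x >> y >> z);
}
```

Because membuf only sets the get area, reading past the end reports end-of-file through the default underflow(), and seeking is not supported unless seekoff() and seekpos() are also overridden.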

 
P.J. Plauger
      12-30-2006
"John Salmon" <> wrote in message
news:...

> I'm working with two libraries, one written
> in old school C, that returns a very large
> chunk of data in the form of a C-style,
> NUL-terminated string.
>
> The other written in a more modern C++
> is a parser for the chunk of bytes returned by
> the first. It expects a reference to a
> std::istream as its argument.
>
> The chunk of data is very large.
> I'd like to feed the output of the first to
> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.
>
> My attempts to create an istringstream from the
> chunk of data all seem to at least double the
> amount of VM used. Here's a short program demonstrating
> what I've tried. Is there any way to get "inside"
> the istringstream and tell it to use the 'chunk'
> directly, rather than insisting on making a copy?


See the header <strstream>. It does exactly what you want,
and it's part of the C++ Standard (albeit a bit
old-fashioned).
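
A minimal sketch of that suggestion, assuming a toolchain that still ships <strstream>: the header has been deprecated since C++98, some compilers warn when it is included, and to my knowledge it is removed in C++26. The helper name parse_three_strstream is made up for illustration.

```cpp
#include <istream>
#include <strstream> // deprecated header; may trigger a compiler warning

// Hypothetical helper: reads three doubles from a NUL-terminated buffer
// through std::istrstream, which reads the array in place up to the NUL.
bool parse_three_strstream(const char* chunk, double& x, double& y, double& z) {
    std::istrstream in(chunk);
    return static_cast<bool>(in >> x >> y >> z);
}
```

Since std::istrstream derives from std::istream, the object can be passed directly to a parser that takes a std::istream&.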

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 
John Salmon
      12-30-2006
>>>>> "PJ" == P J Plauger <> writes:

PJ> "John Salmon" <> wrote in message
PJ> news:...

>> I'm working with two libraries, one written
>> in old school C, that returns a very large
>> chunk of data in the form of a C-style,
>> NUL-terminated string.
>>
>> The other written in a more modern C++
>> is a parser for the chunk of bytes returned by
>> the first. It expects a reference to a
>> std::istream as its argument.
>>
>> The chunk of data is very large.
>> I'd like to feed the output of the first to
>> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.
>>
>> My attempts to create an istringstream from the
>> chunk of data all seem to at least double the
>> amount of VM used. Here's a short program demonstrating
>> what I've tried. Is there any way to get "inside"
>> the istringstream and tell it to use the 'chunk'
>> directly, rather than insisting on making a copy?


PJ> See the header <strstream>. It does exactly what you want,
PJ> and it's part of the C++ Standard (albeit a bit old
PJ> fashioned).

Thanks to Usenet, I now have two workable solutions.

Googling for strstream turns up lots of warnings that "strstream is
deprecated", with dire warnings that it may be removed from future
versions of the standard. OTOH, an istrstream does exactly what I
want, without any extra custom machinery ( struct membuf : public
streambuf ).

Other than simplicity and possible compatibility with future
standards, is there any reason to prefer one approach over the
other?

Cheers,
John Salmon


 
P.J. Plauger
      12-30-2006
"John Salmon" <> wrote in message
news:...

>>>>>> "PJ" == P J Plauger <> writes:

>
> PJ> "John Salmon" <> wrote in message
> PJ> news:...
>
>>> I'm working with two libraries, one written
>>> in old school C, that returns a very large
>>> chunk of data in the form of a C-style,
>>> NUL-terminated string.
>>>
>>> The other written in a more modern C++
>>> is a parser for the chunk of bytes returned by
>>> the first. It expects a reference to a
>>> std::istream as its argument.
>>>
>>> The chunk of data is very large.
>>> I'd like to feed the output of the first to
>>> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.
>>>
>>> My attempts to create an istringstream from the
>>> chunk of data all seem to at least double the
>>> amount of VM used. Here's a short program demonstrating
>>> what I've tried. Is there any way to get "inside"
>>> the istringstream and tell it to use the 'chunk'
>>> directly, rather than insisting on making a copy?

>
> PJ> See the header <strstream>. It does exactly what you want,
> PJ> and it's part of the C++ Standard (albeit a bit old
> PJ> fashioned).
>
> Thanks to Usenet, I now have two workable solutions.
>
> Googling for strstream turns up lots of warnings that "strstream is
> deprecated", with dire warnings that it may be removed from future
> versions of the standard. OTOH, an istrstream does exactly what I
> want, without any extra custom machinery ( struct membuf : public
> streambuf ).
>
> Other than simplicity and possible compatibility with future
> standards, is there any reason to prefer one approach over the
> other?


You should prefer strstream because:

1) it's exactly what you need

2) it's still part of the C++ Standard

3) there's no reason to believe it'll become nonstandard anytime
soon, despite the dire warnings

4) even if it does officially go away, there's not a sane vendor
who'll stop supporting it for the next decade

So what the hell.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 