
Re: [leafnode-list] C++-ifying leafnode



Timo Geusch <timo@xxxxxxxxxxxxxxxx> writes:

> There's an initial hit but once you're over that the code tends to
> grow by similar factors to C code if you know what you're
> doing. Not using the C++ standard library would be a mistake IMO
> because it's well understood and decent implementations are available
> on most platforms. And one can always download STLport if it turns out
> that the compiler's standard library isn't up to scratch (like the one
> in VC6 that has a lot of known bugs).

Frankly: I don't care about the latter. Leafnode may work on NTFS file
systems, but I myself won't do the port.

> I've had a quick look, looks quite interesting but my objection would
> be that any potential contributor who knows C++ would also have to
> learn PTypes. Not sure if that's worth it.

I may already lose past contributors; some of them have declared they
don't know C++. I'd rather not implement all the network stuff as
iostreams myself, though. If someone's already done the job, why do it
again?

For a short while, I had also considered using Ada95, but that's
something I'd have to learn myself which I don't really have time for
ATM.

> Also, I'm not too impressed by the container types either - looks like
> one is trading the fully typesafe approach for the 'derived from
> common base class' approach like you find it in Java. 
>
>> The stages you mention aren't very much to my intentions.
>
> Well, one has to start somewhere, plus if you want to make proper
> use of the standard C++ datatypes like strings, you're better off
> using the other parts of the C++ library as well, otherwise you'll be
> forever converting C++ data into C data and back. Kind of defeats the
> objective IMO.

One of the nicer things about C++ is function overloading. I could pass
char * as well as mastr * into functions by defining a conversion
operator, so that's not a big issue.
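
To illustrate the conversion-operator idea, here is a minimal sketch.
The Mastr class below is a hypothetical stand-in, not leafnode's real
mastr type, and header_length() is an invented example function; the
only point is that code written against const char * keeps working
unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for leafnode's mastr; NOT the real type.
class Mastr {
public:
    explicit Mastr(const std::string &s) : data_(s) {}

    // Conversion operator: lets a Mastr be passed wherever a
    // const char * is expected, without an explicit accessor call.
    operator const char *() const { return data_.c_str(); }

private:
    std::string data_;
};

// Invented example of a function written against plain C strings.
std::size_t header_length(const char *s) {
    assert(s != nullptr);
    return std::strlen(s);
}
```

With this, header_length("abc") and header_length(Mastr("abc")) both
compile and call the same function. The usual caveat applies: implicit
conversions can hide lifetime problems if the pointer outlives the
object it came from.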

> You have missed the other argument, which was 'type-safe input/
> output'.

Well, yes, but my compiler and splint do a good job at identifying those
sore spots.

>> With respect to the library: that's the thing I'd bother about
>> converting last. If C++ allowed me to simplify code, I'd do it,
>
> You mean, like turning the current getaline()/getline()/_getline()
> into something like this:
>
> bool getaline(FILE *stream, std::string &line) {
>   std::ostringstream line_buffer;
>   int c;
>   char last_char = 0;
>
>   while (((c = getc(stream)) != EOF) && (c != '\n')) {
>     if (last_char != 0)
>       line_buffer << last_char;
>     last_char = static_cast<char>(c);
>   }
>
>   if (last_char != 0 && last_char != '\r')  /* nothing to add for an empty line */
>     line_buffer << last_char;
>
>   if (ferror(stream))
>     return false;
>
>   line = line_buffer.str();
>
>   if (debugmode & DEBUG_IO)
>     ln_log(LNLOG_SDEBUG, LNLOG_CTOP, "<%s", line.c_str());/* FIXME: CTOP? */
>
>   return c != EOF;
> }

Hum... Well, the C++ standard library already has a getline function
(std::getline).

Other than that, the getaline function is plain ugly and overdue for
replacement. It needs to handle local line endings (LF on Unix) and NNTP
protocol line endings (CRLF). Leafnode currently munches bare LFs, which
it shouldn't, and turns them into CRLF on output. That's unnecessary,
and according to RFC 2822, it contributes to interoperability problems.

There need not be a 1:1 translation; a functional equivalent (like
looping over getline or something) is fine as well.
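
A rough sketch of such a functional equivalent, built on std::getline.
The names read_line and split_lines are made up for illustration. Note
that this version accepts both LF and CRLF and does not tell them
apart, which suits local files; a strict NNTP reader would have to
treat a bare LF differently.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one line from the stream. std::getline removes the LF; a single
// trailing CR is stripped here so CRLF-terminated input works too.
// Returns false once nothing more could be read.
bool read_line(std::istream &in, std::string &line) {
    if (!std::getline(in, line))
        return false;
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    return true;
}

// Convenience wrapper for text that is already in memory:
// loops over read_line and collects the lines.
std::vector<std::string> split_lines(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (read_line(in, line))
        lines.push_back(line);
    return lines;
}
```

A final line without a terminator is still returned, since std::getline
only fails when it could not extract any characters at all.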

Looking at the code you wrote above, the ln_log interface is one that
should offer operator<< and maybe manipulators, or return a filtering
subclass that offers this stream insertion operator, and error handling
needs to be rethought. There are places that would benefit from
exceptions; low-level I/O is one of them. If the server disconnects
suddenly, there's no way we can continue with that server, so a "throw"
all the way up to the cleanup section doesn't look wrong to me.
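
For the logging half of that, one possible shape is a small temporary
object that collects everything inserted with operator<< and emits one
complete line when it is destroyed. This is only a sketch: LogLine and
log_sink are hypothetical names, the severity is a plain int rather
than the real LNLOG_* constants, and the sink is an in-memory vector
where a real version would forward to ln_log().

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sink for the sketch; a real version would call ln_log().
std::vector<std::string> &log_sink() {
    static std::vector<std::string> lines;
    return lines;
}

// Collects inserted values and emits one line on destruction.
class LogLine {
public:
    explicit LogLine(int severity) : severity_(severity) {}
    LogLine(const LogLine &) = delete;
    LogLine &operator=(const LogLine &) = delete;

    // The complete message is handed to the sink when the object dies,
    // i.e. at the end of the full expression for a temporary.
    ~LogLine() {
        log_sink().push_back(std::to_string(severity_) + ": " + buf_.str());
    }

    // Anything std::ostream can print can be inserted.
    template <typename T>
    LogLine &operator<<(const T &value) {
        buf_ << value;
        return *this;
    }

private:
    int severity_;
    std::ostringstream buf_;
};
```

A call site would then read something like
LogLine(7) << "<" << line; with no format string to get wrong.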

> Same functionality (OK, the interface is slightly different because I
> wanted to make sure that I don't hide the 'C' getaline function), no
> explicit memory allocation, plus it appears to be at least as fast as
> the "C" implementation on my FBSD server, a humble 400MHz
> P2. Unfortunately, it *does* make use of the standard library.

The trivial stuff isn't necessarily slower in C++; it depends on whether
the global optimizer sees all the scopes and can optimize. Too many
indirections are bad. In fact, the current GNU libstdc++ seems to
produce smaller code than libc, at least the cout << "Hello, world!\n";
stuff compiles to a smaller executable than its C equivalent
puts("Hello, world!");

>> Going into more detail: I've considered storing articles in wire format
>> (CRLF) for a long time now so that the server can use sendfile or just
>> mmap and write large chunks,
>
> sendfile sounds good to me, but from reading the man page here it
> appears to assume that you're talking to a socket. Not sure how well
> it would handle talking to stdout, even if stdout is connected to a
> socket.

That's one thing that needs to be abstracted (abstract base class or
something), because sendfile() isn't universally available, and we need
mmap()+write() or read()+write() in any case (although I'd go with just
sendfile and mmap/write first; those with other systems can then send in
patches).
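
As a sketch of what that abstraction might look like, assuming POSIX:
ArticleSender, ReadWriteSender and MmapSender are invented names, and a
sendfile() backend is left out on purpose because its signature differs
between systems. It would simply be one more subclass.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Writes the whole buffer, retrying on short writes and EINTR.
static bool write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        buf += n;
        len -= static_cast<size_t>(n);
    }
    return true;
}

// Hypothetical interface: copy everything readable from in_fd to out_fd.
class ArticleSender {
public:
    virtual ~ArticleSender() {}
    virtual bool send(int in_fd, int out_fd) = 0;
};

// Portable fallback: read() + write() through a fixed buffer.
class ReadWriteSender : public ArticleSender {
public:
    bool send(int in_fd, int out_fd) override {
        char buf[8192];
        for (;;) {
            ssize_t n = read(in_fd, buf, sizeof buf);
            if (n == 0) return true;          // end of file
            if (n < 0) {
                if (errno == EINTR) continue;
                return false;
            }
            if (!write_all(out_fd, buf, static_cast<size_t>(n)))
                return false;
        }
    }
};

// mmap() + write(): map the file once, then write it in large chunks.
class MmapSender : public ArticleSender {
public:
    bool send(int in_fd, int out_fd) override {
        struct stat st;
        if (fstat(in_fd, &st) != 0) return false;
        if (st.st_size == 0) return true;     // mmap rejects length 0
        size_t len = static_cast<size_t>(st.st_size);
        void *p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, in_fd, 0);
        if (p == MAP_FAILED) return false;
        bool ok = write_all(out_fd, static_cast<const char *>(p), len);
        munmap(p, len);
        return ok;
    }
};
```

The caller only ever sees ArticleSender, so which backend gets used can
be decided once at startup or at configure time.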

Speaking of sockets, I've considered extracting parts of fetchnews so I
have either library code or a module or whatever that deals with a
single server. Ultimately, I want fetchnews to contact several different
servers in parallel. For security and portability reasons, I'd prefer a
fork()-based model over threads.
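
The fork()-based model could look roughly like this: one child per
server, with the parent only collecting exit statuses. This is a sketch
under assumptions, not fetchnews code; fetch_in_parallel and the
ServerJob callback are hypothetical names, and the callback stands in
for whatever the single-server module would end up being.

```cpp
#include <cerrno>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical per-server job; returns true on success.
typedef bool (*ServerJob)(const std::string &server);

// Forks one child per server and waits for all of them.
// Returns the number of children that exited with status 0.
int fetch_in_parallel(const std::vector<std::string> &servers,
                      ServerJob job) {
    std::vector<pid_t> children;
    for (const std::string &server : servers) {
        pid_t pid = fork();
        if (pid == 0) {
            // Child: handle exactly one server, then leave via _exit()
            // so the parent's atexit handlers and buffers are not run
            // or flushed a second time.
            _exit(job(server) ? 0 : 1);
        }
        if (pid > 0)
            children.push_back(pid);
        // pid < 0: fork failed; that server counts as not fetched.
    }

    int succeeded = 0;
    for (pid_t pid : children) {
        int status = 0;
        pid_t r;
        do {
            r = waitpid(pid, &status, 0);
        } while (r < 0 && errno == EINTR);
        if (r == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ++succeeded;
    }
    return succeeded;
}
```

A crash in one child takes down only that server's fetch, which is much
of the appeal over threads. The price is that the children cannot share
in-memory state, so anything they all update on disk needs locking.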

> TBH I think if I would continue to work on this then we should come to
> some agreement *before* as to what is going to be done.

That's why I spoke up.

> I don't exactly have ample spare time so I'm not going to work for the
> bit bucket, sorry.

That's fine, neither would I.

> I also have no interest in forking the project by maintaining a
> separate "C++ version" so unless we can work out some kind of C++
> roadmap I don't think it's worth continuing on the off-chance that
> you'd like to include a snippet here or there.

Some things that would need to be cleaned up or reconsidered:

1. locking. leafnode has a global lock for the programs (or their parts)
   that modify the groupinfo ("active file"). Maybe a group-specific
   lock and keeping the active file on disk with short-lived locks is a
   better approach; it would allow texpire and fetchnews to run
   concurrently, or fetchnews to poll several servers at the same time.

2. article store. I mentioned wire format; maybe the overview format
   needs to be rethought as well.
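
For item 1, a short-lived group-specific lock could be sketched as an
RAII wrapper around a POSIX fcntl() record lock. GroupLock is an
invented name, and the one-lock-file-per-group layout is an assumption
for illustration, not how leafnode currently does it.

```cpp
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Holds an exclusive advisory lock on one group's lock file for as long
// as the object lives. The constructor blocks until the lock is granted.
class GroupLock {
public:
    explicit GroupLock(const std::string &lockfile) : fd_(-1) {
        int fd = open(lockfile.c_str(), O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return;
        struct flock fl = {};
        fl.l_type = F_WRLCK;      // exclusive lock
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;             // 0 means "to end of file": whole file
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            fd_ = fd;
        else
            close(fd);
    }
    GroupLock(const GroupLock &) = delete;
    GroupLock &operator=(const GroupLock &) = delete;

    // Closing the descriptor releases the lock.
    ~GroupLock() {
        if (fd_ >= 0)
            close(fd_);
    }

    // False if the file could not be opened or locked.
    bool held() const { return fd_ >= 0; }

private:
    int fd_;
};
```

The lock is held only for the scope of the object, so each program
takes it around a single update and drops it again. One known wrinkle
of fcntl() locks: they belong to the process, and closing any
descriptor for the same file releases them, so the lock file should not
be opened elsewhere in the same process.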

>> Having said all that, I'll of course appreciate voluntary help with
>> leafnode, and my pondering about "C++ or Python" may have been
>> premature, because it didn't mention plans for when that would happen
>> or how.
>
> Well it would've been handy to know a bit more about your plans for
> leafnode as at least my crystal ball was just in for a service...

OK. I'll have a look at your current code and see what has been done
already and how we can negotiate on common terms, because I'd like to
save those parts of it that are worth saving. During a rewrite or
maintenance, any opportunity for cleanup should be taken, though; there
is much code duplication.

When we can agree on some terms, we should also consider using some
version control system. I am currently using CVS on my home system, but
I dislike the ugly CVS branch support. I have used BitKeeper (not open
source though) to some extent; it has some really nifty features to
support distributed/disconnected development. I've also considered
using Subversion (SVN) or arch, but haven't yet met people who have used
either in production.

-- 
Matthias Andree
