[leafnode-list] lockfile_exists disaster, analysis of 1.9.19 and 2.0b8 (was: Reproducibly truncate groupinfo)


Joerg Dietrich <joerg@xxxxxxxxxxxx> writes:

> On Wed, Jun 13, 2001 at 01:18:29PM +0200, Matthias Andree wrote:
> > I presume this is a logging issue.
> Interesting that groupinfo truncation also occurs in the 1.9.x
> tree. Is the completely different locking method there also
> buggy, or is it a completely different problem?

Also broken, slightly different problem.

----------------------------------------------------------------------
----------------------------------------------------------------------

1.9.19 has a TOC-TOU race (time of check to time of use). If a context
switch occurs between the read-open and the write-open of the lock file,
two processes can both hold the lock: each of them has seen "no lock
file there" (or possibly "removed stale lock file"), and fopen(...,"w")
does not set O_EXCL on open(2), so the check and the creation are not
one atomic step.
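
For illustration only, here is a minimal sketch of how the check and the
creation can be folded into one atomic step with open(2) and O_EXCL. This
is not code from leafnode; the helper name try_create_lockfile and its
return convention are made up for this example, a local file system is
assumed, and stale lock detection is left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not leafnode code.
 * Returns 0 if we created the lock file, 1 if it already existed,
 * -1 on any other error (errno is set). */
static int try_create_lockfile(const char *path)
{
    char buf[32];
    int len, saved;
    /* O_CREAT | O_EXCL: the existence check and the creation happen in
     * one system call, so two processes cannot both succeed. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd < 0)
        return errno == EEXIST ? 1 : -1;

    /* Record our PID so that a stale lock can be identified later. */
    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (write(fd, buf, (size_t)len) != (ssize_t)len) {
        saved = errno;
        (void)close(fd);
        (void)unlink(path);
        errno = saved;
        return -1;
    }
    if (close(fd) != 0) {
        saved = errno;
        (void)unlink(path);
        errno = saved;
        return -1;
    }
    return 0;
}
```

A caller that gets 1 back knows someone else holds the lock and can wait
or give up; there is no window between "looked" and "created" for a
second process to slip through.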

It's visibly unsafe at first glance, so I did no deeper analysis; I
haven't really cared for 1.9.x for a long time anyway.

----------------------------------------------------------------------
----------------------------------------------------------------------

2.0b8 has a different problem I caught with this approach:

strace -tt -i -fefcntl,unlink,open \
/usr/local/bin/tcpserver -DHRv 119 \
/usr/local/sbin/leafnode \
2>&1 | egrep 'SETLK|fetchnews.lck|SIG'

Here's what happened, in a simplified scheme (P# = process, LF = lock file):

1 P1 starts, opens and locks LF
2 P2 starts, opens LF, blocks in fcntl(F_SETLKW) because P1 holds the lock
3 P1 resumes, completes and unlinks LF
4 P2 wakes up from fcntl, holds a lock on a deleted file, resumes its operation
5 P2 is interrupted after its time slice has passed
6 P3 starts, opens and locks LF; this succeeds because the open creates a
     new file under the same name, not the deleted one P2 has locked
7 P2 and P3 operate concurrently
---- locking scheme failed
8 P2 completes, unlinks LF
9 P3 completes, fails to unlink LF, but this goes unnoticed since the
     unlink return code is unchecked
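
One common way to close the hole in step 4 is to verify, after fcntl
returns, that the locked descriptor still refers to the file the path
names, and to start over if it does not. The sketch below is only an
illustration under that assumption, not the fix that went into leafnode;
lock_file_checked is a hypothetical name. It relies on the holder
unlinking LF while still holding the lock, as in the scheme above.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not leafnode code.
 * Returns a descriptor holding a write lock on the file that `path`
 * currently names, or -1 on error (errno is set). */
static int lock_file_checked(const char *path)
{
    for (;;) {
        struct flock fl;
        struct stat by_fd, by_name;
        int saved;
        int fd = open(path, O_RDWR | O_CREAT, 0644);

        if (fd < 0)
            return -1;

        memset(&fl, 0, sizeof fl);
        fl.l_type = F_WRLCK;    /* exclusive lock */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;           /* 0 means: the whole file */

        if (fcntl(fd, F_SETLKW, &fl) < 0) {
            saved = errno;
            (void)close(fd);
            if (saved == EINTR)
                continue;       /* interrupted by a signal, retry */
            errno = saved;
            return -1;
        }

        if (fstat(fd, &by_fd) != 0) {
            saved = errno;
            (void)close(fd);
            errno = saved;
            return -1;
        }
        if (stat(path, &by_name) != 0) {
            saved = errno;
            (void)close(fd);
            if (saved == ENOENT)
                continue;       /* file was unlinked while we waited */
            errno = saved;
            return -1;
        }
        /* Same device and inode: we locked the file the name refers to. */
        if (by_fd.st_dev == by_name.st_dev && by_fd.st_ino == by_name.st_ino)
            return fd;

        /* The name now refers to a different file; our lock is worthless. */
        (void)close(fd);
    }
}
```

With this check, P2 in step 4 would notice that it holds a lock on a
deleted file, drop it, and compete with P3 for the new LF instead of
running alongside it.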

As to #9: Quote from the "Ten Commandments for C Programmers":

|  6. If a function be advertised to return an error code in the event
|  of difficulties, thou shalt check for that code, yea, even though the
|  checks triple the size of thy code and produce aches in thy typing
|  fingers, for if thou thinkest 'it cannot happen to me', the gods
|  shall surely punish thee for thy arrogance.

My WIP tree has a log_unlink operation which logs the result in case of
problems, even if the caller casts the result to void (discards it).
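
The WIP code itself is not shown here, so the following is only a guess
at what such a wrapper might look like. The signature, the use of
syslog(3) and the message text are assumptions made for this sketch.

```c
#include <errno.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>

/* Assumed shape of a logging unlink wrapper; the real log_unlink in the
 * WIP tree may differ. Returns what unlink(2) returned and preserves
 * errno, so callers that do check the result lose nothing. */
static int log_unlink(const char *path)
{
    int r = unlink(path);

    if (r != 0) {
        int saved = errno;
        /* The failure is logged here, so it is recorded even if the
         * caller throws the return value away. */
        syslog(LOG_ERR, "cannot unlink %s: %s", path, strerror(saved));
        errno = saved;
    }
    return r;
}
```

A caller that does not care about the result can still write
(void)log_unlink(lockfile); and the failure in step 9 would at least
show up in the log.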

-- 
Matthias Andree

leafnode-list@xxxxxxxxxxxxxxxxxxxxxxxxxxxx -- mailing list for leafnode