[asterisk-dev] Potential race in reload_queues - read modify write unlocked
Dave WOOLLEY
dave.woolley at bts.co.uk
Tue Mar 4 11:32:54 CST 2014
Whilst trying to debug a probable race causing "module reload app_queue" to lose a queue when it clashed with offering a call to an agent channel, I found a potential race condition that still seems to exist in trunk. Unfortunately this race condition seems to fail towards not setting the queue to dead, rather than leaving it dead when it isn't, so it doesn't explain our problem. Nonetheless, I think it needs recording.
mark_dead_and_unfound executes the following with no lock on the queue:
q->dead = 1;
The problem with this is that "dead" is a bit field. In particular, it shares a byte with "wrapped", which is a bit field that does get updated in normal operation. This means the assignment actually compiles as Load, Or, Store on the shared byte. If another thread's update of "wrapped" spans our Store (it loads the byte before our Store and stores it back afterwards), our Store gets wiped out, leaving "dead" unchanged. Similarly, this code could negate an update of "wrapped".
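To make the interleaving concrete, here is a minimal single-threaded sketch that replays the Load, Or, Store steps of two writers by hand. The struct "flags" and the function "interleave" are hypothetical stand-ins for illustration only; they are not the real struct call_queue layout or any app_queue code, and copying the whole struct stands in for the load and store of the shared storage unit.

```c
/* Hypothetical stand-in for the two bit fields that share storage. */
struct flags {
    unsigned int dead:1;
    unsigned int wrapped:1;
};

/*
 * Replays two unlocked read-modify-write sequences on the same storage.
 * Writer A sets "dead" (as mark_dead_and_unfound does); writer B sets
 * "wrapped". Both load before either stores, so whichever store lands
 * last overwrites the other writer's bit with the stale value it loaded.
 */
struct flags interleave(struct flags shared, int a_stores_last)
{
    struct flags a = shared;    /* A: Load */
    struct flags b = shared;    /* B: Load, before A has stored */

    a.dead = 1;                 /* A: Or */
    b.wrapped = 1;              /* B: Or */

    if (a_stores_last) {
        shared = b;             /* B: Store */
        shared = a;             /* A: Store, loses B's "wrapped" */
    } else {
        shared = a;             /* A: Store */
        shared = b;             /* B: Store, loses A's "dead" */
    }
    return shared;
}
```

Starting from both bits clear, interleave(init, 0) returns with "wrapped" set and "dead" still clear, which is the case where the queue is not marked dead; interleave(init, 1) shows the opposite loss.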
Note that this failure mode does not depend on the processor performing memory accesses out of order, so memory fences won't help. Similarly, "volatile" won't help.
The exact bit allocations will vary between our version and the trunk one. In our version the two fields definitely share a byte. Based on bit counting, I would expect the same for trunk.
--
David Woolley
BTS Holdings Plc
Tel: +44 (0)20 8401 9000 Fax: +44 (0)20 8401 9100
http://www.bts.co.uk
BTS Holdings PLC - Registered office: BTS House, Manor Road, Wallington, SM6 0DD - Registered in England: 1517630