[asterisk-dev] Lockups in Asterisk 11

Steve Davies davies147 at gmail.com
Mon Nov 16 09:30:03 CST 2015


Hi,

I believe there is a known issue in Postgres where if multiple threads
call pgsql_exec(), then it can cause threads to get stuck inside Postgres -
This is why I suggested that Postgres patch, which puts Asterisk locks
around the call to pgsql_exec().

OTOH, I don't think you mentioned what DB you are using, so that might not
apply...

Regards,
Steve

On Sat, 14 Nov 2015 at 15:11 Mark Murawski <markm-lists at intellasoft.net>
wrote:

> Hi Jaco,
>
> I do know what to look for with deadlocks, and the odd thing is that
> what I pasted in the earlier post is not your run of the mill mutex
> based deadlock.  Four locks are currently held and something inside each
> thread is just not finishing.  None of the threads are waiting for
> cross-thread locks.
>
> We have netlock, a single sip private lock, an ao2 lock, and a dblock.
> None of these threads are waiting for additional locks.  It seems
> whatever they are doing is simply not finishing.
>
>
>
> On 11/13/15 05:59, Jaco Kroon wrote:
> > Hi Mark,
> >
> > You're right.  That looks like something else.  do_monitor() relates to
> > MWI.  This could be a completely different issue, if I need to guess I'd
> > say we're looking at a lock-ordering deadlock.  Eg:
> >
> > Thread 1 grabs lock A.
> > Thread 2 grabs lock B.
> > Thread 1 blocks trying to lock B.
> > Thread 2 blocks trying to lock A.
> >
> > I'm not seeing any direct changes in chan_sip from 11.19 => 11.20 that
> > would do anything with respect to netlock.  Also no changes with respect
> > to dblock (and ast_db_put is so trivial that there is almost no way that
> > it can be wrong unless the unlock call fails - which I think is
> impossible).
> >
> > Out of my depth.  Sorry.
> >
> > Kind Regards,
> > Jaco
> >
> > On 12/11/2015 07:07, markm-lists at intellasoft.net wrote:
> >> Hi Jaco,
> >>
> >> I've tried this tmpfs workaround on asterisk 11 and I'm running into
> >> the following (very very long duration) locks that block sip activity
> >> completely.  My core issue is definitely not disk-io.
> >>
> >>
> >> === <pending> <lock#> (<file>): <lock type> <line num> <function>
> >> <lock name> <lock addr> (times locked)
> >> ===
> >> === Thread ID: 0xb524fb70 (do_monitor           started at [29517]
> >> chan_sip.c restart_monitor())
> >> === ---> Lock #0 (chan_sip.c): MUTEX 28923 handle_request_do &netlock
> >> 0xb630b440 (1)
> >>          main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19)
> >>          main/lock.c:258 __ast_pthread_mutex_lock() (0x813864f+85)
> >>          channels/chan_sip.c:28926 handle_request_do()
> >>          channels/chan_sip.c:28885 sipsock_read()
> >>          main/io.c:292 ast_io_wait() (0x8132d50+175)
> >>          channels/chan_sip.c:29484 do_monitor()
> >>          main/utils.c:1223 dummy_start()
> >>          :0 start_thread()
> >>          libc.so.6 clone() (0xb77270d0+5E)
> >> === ---> Lock #1 (chan_sip.c): MUTEX 8959 sip_pvt_lock_full pvt
> >> 0x84b5660 (1)
> >>          main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19)
> >>          main/lock.c:258 __ast_pthread_mutex_lock() (0x813864f+85)
> >>          main/astobj2.c:198 __ao2_lock() (0x80906ad+7C)
> >>          channels/chan_sip.c:8960 sip_pvt_lock_full()
> >>          channels/chan_sip.c:28939 handle_request_do()
> >>          channels/chan_sip.c:28885 sipsock_read()
> >>          main/io.c:292 ast_io_wait() (0x8132d50+175)
> >>          channels/chan_sip.c:29484 do_monitor()
> >>          main/utils.c:1223 dummy_start()
> >>          :0 start_thread()
> >>          libc.so.6 clone() (0xb77270d0+5E)
> >> === ---> Lock #2 (chan_sip.c): MUTEX 17047 register_verify peer
> >> 0x86df2b8 (1)
> >>          main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19)
> >>          main/lock.c:258 __ast_pthread_mutex_lock() (0x813864f+85)
> >>          main/astobj2.c:198 __ao2_lock() (0x80906ad+7C)
> >>          channels/chan_sip.c:17048 register_verify()
> >>          channels/chan_sip.c:28477 handle_request_register()
> >>          channels/chan_sip.c:28785 handle_incoming()
> >>          channels/chan_sip.c:28953 handle_request_do()
> >>          channels/chan_sip.c:28885 sipsock_read()
> >>          main/io.c:292 ast_io_wait() (0x8132d50+175)
> >>          channels/chan_sip.c:29484 do_monitor()
> >>          main/utils.c:1223 dummy_start()
> >>          :0 start_thread()
> >>          libc.so.6 clone() (0xb77270d0+5E)
> >> === ---> Lock #3 (db.c): MUTEX 329 ast_db_put &dblock 0x8214f60 (1)
> >>          main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19)
> >>          main/lock.c:380 __ast_pthread_mutex_trylock() (0x81389b7+85)
> >>          main/db.c:329 ast_db_put() (0x80eeec4+124)
> >>          channels/chan_sip.c:16242 parse_register_contact()
> >>          channels/chan_sip.c:17069 register_verify()
> >>          channels/chan_sip.c:28477 handle_request_register()
> >>          channels/chan_sip.c:28785 handle_incoming()
> >>          channels/chan_sip.c:28953 handle_request_do()
> >>          channels/chan_sip.c:28885 sipsock_read()
> >>          main/io.c:292 ast_io_wait() (0x8132d50+175)
> >>          channels/chan_sip.c:29484 do_monitor()
> >>          main/utils.c:1223 dummy_start()
> >>          :0 start_thread()
> >>          libc.so.6 clone() (0xb77270d0+5E)
> >> === -------------------------------------------------------------------
> >> ===
> >> =======================================================================
> >>
> >>
> >> On 11.11.2015 10:03, Jaco Kroon wrote:
> >>> Hi Mark,
> >>>
> >>> I suspect the following relates:
> >>>
> >>>
> >>>
> http://jkroon.blogs.uls.co.za/it/voip/asterisk-massively-speeding-up-those-register-requests
> >>>
> >>>
> >>> That should explain the underlying problem for you, and potentially
> >>> provide you with a fix depending on your situation.  We've been using
> >>> the proposed "workaround" there since the time of writing with no ill
> >>> effect, however, you need to take heed of the warnings posted there.
> >>>
> >>> We had to modify code on at least some instances to avoid DB() dialplan
> >>> use, in our case that wasn't difficult, but still, you need to be aware
> >>> of the risks.
> >>>
> >>> Kind Regards,
> >>> Jaco
> >>>
> >>
> >
> >
> >
>
>
> --
> _____________________________________________________________________
> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>
> asterisk-dev mailing list
> To UNSUBSCRIBE or update options visit:
>    http://lists.digium.com/mailman/listinfo/asterisk-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20151116/19f58f65/attachment.html>


More information about the asterisk-dev mailing list