[asterisk-dev] func_odbc auto commit at answer time

Jaco Kroon jaco at uls.co.za
Fri Feb 19 05:03:07 CST 2021


Hi Dennis,

Thank you for taking the time.

On 2021/02/18 16:48, Dennis Buteyn wrote:

> On 2/18/21 12:53 PM, Jaco Kroon wrote:
>> Hi All,
>>
>> So I've been updating some code to use func_odbc vs previous solutions
>> and have bumped into a few concerns (1 I'd appreciate feedback on the
>> review, 2 is prelude to 3, which is the main issue currently):
> <snip>
>> 2.  Too many queries resulting in the database server (which is set for
>> maximum persistence) being unable to keep up, so enters
>> ODBC(transaction,mydb)=meh + ODBC(forcecommit)=1.  This sorts out the
>> COMMIT rate since now things are getting grouped into transactions.
>
> <snip>
>
> Nothing is stopping you from using more than one database server. With
> ODBC its pretty easy to spread the load over several machines. But
> aside from throwing more metal at the problem, you may also want to
> review how you are using your database. Perhaps some queries are extra
> slow, maybe some queries can be combined, or perhaps a different
> storage engine is more suitable, etc. Transactions help prevent
> concurrent queries from stepping on each other's toes as well as
> allowing the database to delay disk flushes. Transactions however may
> also cause delays while waiting on row and table locks, potentially
> deadlocking (especially long-running ones).

The specific ODBC connection here is merely used for inserting "traces"
into the database, so exactly this:

[trace]
dsn=ulsdb
synopsis=Add a trace onto an existing call.
syntax=[<level>]=<message>
writesql=INSERT INTO calltrace(uniqueid, message, level)
VALUES('${SQL_ESC(${UNIQUEID})}', '${SQL_ESC(${VALUE})}',
'${IF($[${ARGC}>0]?${SQL_ESC(${ARG1})}:user)}')

(Relies on patch mentioned in my point 1 for ARG1 to work correctly when
not supplied)

The table itself:

CREATE TABLE `calltrace` (
  `uniqueid` varchar(64) NOT NULL,
  `level` enum('user','admin','system') NOT NULL DEFAULT 'user',
  `tracetime` datetime(3) NOT NULL DEFAULT current_timestamp(3),
  `message` varchar(1024) NOT NULL,
  KEY `uniqueid` (`uniqueid`),
  KEY `tracetime` (`tracetime`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

So we'll do stuff like Set(ODBC_trace(user)=Received call from
${CALLERID(all)} to ${EXTEN}) in the dialplan for critical decision
points.  The other option is to log this to a file and to post-load it
into the database.  Really doesn't matter that much, and if it goes
lost, it's not the end of the world, but the above just makes it simpler
and more real time, allows us to use concurrent inserts.

For any given uniqueid it's seldom that there are more than 30 entries
generated, around 20 or so on average, we currently dealt with this by
reducing the amount of logging, so whilst not all of it is strictly
required, it's useful stuff to be able to enable support personnel to
not have to bug the system administrators, and even to users to not need
to log stuff to support to begin with.

The other func odbc queries are almost all select queries, and are very
well cached, in critical cases into astdb too.  Those never cause
issues, and even when the INSERTs are problematic continue to operate
correctly.  The above basically just starts slowing down because of the
implicit COMMIT, with the above as a transaction it works very well
overall, the transaction connection just gets held up for the duration
of the call, but it does lower the commit rate from 50-100 (and higher
in extreme cases) to 5-10 average.  We can start seeing the effects of
the forced disk IO around 75 odd COMMIT/s, and annoying slowdown beyond
that.

One do need to be aware that READ queries are issued on handles
different than the WRITE queries, since the the _read function doesn't
take res_odbc_transaction into consideration (which arguably it should
but that has zero effect for my use case, in fact, as you point out it
may actually be bad for my use case).

>
> Without diving deep into transactions, it sounds like your dialplan is
> doing lots of small modifications during the call. Databases like
> doing things in bulk, so perhaps collecting changes in channel
> variables before submitting them with a hangup handler will be more
> effective?

That's possible, or just group them into a transaction, it's not going
to reduce the overall number of queries much, but yes, since this is
MariaDB one could use the "bulk insert syntax" if we really wanted to
(but that's not standard SQL and may well fail if for some reason
someone some years from now decide we need to swap out MariaDB for
postgres), but that also complicates the overall process significantly. 
Our dialplan processing is usually well under a second in total, with
average ring time on the Dial() around 12 seconds *usually* in ANSWER
cases, and significantly shorter in NO ANSWER case.  This means that at
a call setup rate of 10 calls/second we need a concurrency of 150 to
have a slight safety factor, but if the COMMIT only happens when the
channel gets destroyed (and average call duration of 73 seconds, or
overall 85 seconds from channel creation to channel destruction, that
goes up to 850 with little to no safety, 900-1000 with, and that's in
both cases cutting it finer than we like).

Don't think the discussion here is how to make the database
better/faster, but rather focus on how asterisk can make more convenient
mechanisms available in order to enable more effective and efficient use
of the database engine, and by implication the hardware.

You raise a valid point though:  what if the commit could take up
significant (>100ms) amount of time?  Should we then have an option
in/for res_odbc_transaction to fork a separate thread for COMMIT before
returning the ODBC handle to the pool?  That actually makes some sense,
so of the various 3.X options that involve res_odbc_transaction if there
is then some way to "parallel commit", which would be fine for my use
case, but may be terrible for others, eg, a sequence like:

START TRANSACTION;
INSERT INTO foo(a,b,c,d) VALUES(1,2,3,4);
COMMIT; -- implicitly releases handle back to pool.

SELECT * FROM foo WHERE a=1;

If the COMMIT is parallel this opens a race towards the SELECT (which
may end up using a different handle, and with default and sane isolation
levels would result in a race condition even if both "transactions" are
handled from the same dialplan "thread").  See above about func_odbc
using different handles for transactional queries and read queries anyway.

Kind Regards,
Jaco





More information about the asterisk-dev mailing list