[asterisk-dev] PJSIP realtime scalability problem

Matthew Jordan mjordan at digium.com
Sun Oct 18 14:20:37 CDT 2015


On Sat, Oct 17, 2015 at 12:54 PM, Michael Ulitskiy <mulitskiy at acedsl.com> wrote:
> Matthew,
>
> First of all, I apologize if my tone sounded too harsh. I didn't mean to
> offend anyone.
>
> I didn't mean to just say "it sucks". I wish to point out, though, that,
> again, unless I'm missing something, the current behaviour of pjsip
> realtime is not scalable, and I believe it's a departure from what has
> been known as "dynamic realtime" for a long time.

No worries - just want to make sure we get to the bottom of the issues
you're experiencing!

>> Which problems?
>
>
>
> The problem here isn't actually related to the caching implementation, but
> to the way pjsip matches endpoints.
>
> Whenever a SIP request arrives, pjsip initially performs a lookup for
> 'username@domain' and, if that fails, falls back to a lookup by username
> only.
>
> It results in 2 queries:
>
> SELECT * FROM pjsip_endpoints_v WHERE id = 'ep1@domain';
> SELECT * FROM pjsip_endpoints_v WHERE id = 'ep1';
>
> Now in my environment only the 2nd one will succeed and will be cached. So
> for every SIP request my asterisk will be issuing
>
> SELECT * FROM pjsip_endpoints_v WHERE id = 'ep1@domain';
>
> which will never succeed, followed by retrieving 'ep1' from cache.
>
> Basically I'd like to have a way to suppress the lookup for
> 'username@domain' or at least to cache the negative results.
>

I think that makes sense. The code from
res_pjsip_endpoint_identifier_user that does the endpoint
identification by user name portion is shown below:

    /* Attempt to find the endpoint given the name and domain provided */
    snprintf(id, sizeof(id), "%s@%s", endpoint_name, domain_name);
    if ((endpoint = ast_sorcery_retrieve_by_id(ast_sip_get_sorcery(), "endpoint", id))) {
        goto done;
    }

    /* See if an alias exists for the domain provided */
    if ((alias = ast_sorcery_retrieve_by_id(ast_sip_get_sorcery(), "domain_alias", domain_name))) {
        snprintf(id, sizeof(id), "%s@%s", endpoint_name, alias->domain);
        if ((endpoint = ast_sorcery_retrieve_by_id(ast_sip_get_sorcery(), "endpoint", id))) {
            goto done;
        }
    }

    /* See if the transport this came in on has a provided domain */
    if ((transports = ast_sorcery_retrieve_by_fields(ast_sip_get_sorcery(), "transport",
            AST_RETRIEVE_FLAG_MULTIPLE | AST_RETRIEVE_FLAG_ALL, NULL)) &&
        (transport = ao2_callback(transports, 0, find_transport_in_use, rdata)) &&
        !ast_strlen_zero(transport->domain)) {
        snprintf(id, sizeof(id), "%s@%s", endpoint_name, transport->domain);
        if ((endpoint = ast_sorcery_retrieve_by_id(ast_sip_get_sorcery(), "endpoint", id))) {
            goto done;
        }
    }

    /* Fall back to no domain */
    endpoint = ast_sorcery_retrieve_by_id(ast_sip_get_sorcery(), "endpoint", endpoint_name);

As you can see, we have a number of lookups that can occur:
 (1) First, we attempt to look up the endpoint by user@domain
 (2) If that fails, we grab any domain aliases that may exist, and do
     a lookup by user@domainalias
 (3) If that fails, we look at the transports to see if they have any
     domains. If so, we do a lookup by user@transportdomain
 (4) Finally, if that fails, we do a lookup by user name portion only

A simple solution would be to have a configuration option that skips
some of those checks.
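
As a rough sketch of what such an option might look like (this is not
res_pjsip code: the flag names are hypothetical, only two of the four
steps are modelled, and mock_retrieve() merely stands in for
ast_sorcery_retrieve_by_id() so the lookups can be counted):

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical option flags; nothing like this exists in res_pjsip today. */
enum ident_flags {
    IDENT_USER_AT_DOMAIN = 1 << 0, /* try "user@domain" first */
    IDENT_USER_ONLY      = 1 << 1, /* fall back to "user" */
};

/* Number of backend lookups performed so far. */
static int lookups;

/* Stand-in for ast_sorcery_retrieve_by_id(): only the id "ep1" exists. */
static const char *mock_retrieve(const char *id)
{
    lookups++;
    return strcmp(id, "ep1") == 0 ? "ep1" : NULL;
}

/* Identify an endpoint, performing only the lookups enabled in 'flags'. */
static const char *identify(unsigned int flags, const char *user, const char *domain)
{
    char id[128];
    const char *endpoint;

    if (flags & IDENT_USER_AT_DOMAIN) {
        snprintf(id, sizeof(id), "%s@%s", user, domain);
        if ((endpoint = mock_retrieve(id))) {
            return endpoint;
        }
    }

    /* The domain alias and transport domain steps would be gated the same way. */

    if (flags & IDENT_USER_ONLY) {
        return mock_retrieve(user);
    }

    return NULL;
}
```

With only IDENT_USER_ONLY set, a request for 'ep1' costs one lookup
instead of two, which is the saving you are after.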

It's probably worth opening an issue for that as well. Although it is
technically an improvement, it would skip one to three sorcery lookups
on every endpoint request, which would reduce contention both on the
cache and - when the cache is empty - on the database. For larger
systems, as you've pointed out, the current behaviour is practically a
bug.
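
On your other suggestion, caching negative results: I have not checked
whether the sorcery memory cache can do this, so the following is only
an illustration of the idea and not a description of how Asterisk
works. All names here are made up. It is a small fixed-size table that
remembers ids that were recently not found, with an expiry time:

```c
#include <string.h>
#include <time.h>

#define NEG_SLOTS 64   /* table size, chosen arbitrarily for the sketch */
#define NEG_ID_LEN 128 /* longest id that can be remembered */

struct neg_entry {
    char id[NEG_ID_LEN];
    time_t expires;
};

/* Zero-initialized, so every slot starts out expired. */
static struct neg_entry neg_cache[NEG_SLOTS];

/* Simple string hash (djb2) mapped onto a slot index. */
static unsigned int neg_hash(const char *id)
{
    unsigned int h = 5381;

    while (*id) {
        h = h * 33 + (unsigned char) *id++;
    }
    return h % NEG_SLOTS;
}

/* Record that 'id' was not found; remember it for 'ttl' seconds after 'now'. */
static void neg_cache_add(const char *id, time_t now, int ttl)
{
    struct neg_entry *e = &neg_cache[neg_hash(id)];

    strncpy(e->id, id, sizeof(e->id) - 1);
    e->id[sizeof(e->id) - 1] = '\0';
    e->expires = now + ttl;
}

/* Return 1 if 'id' is known to be missing and the entry has not expired. */
static int neg_cache_hit(const char *id, time_t now)
{
    const struct neg_entry *e = &neg_cache[neg_hash(id)];

    return e->expires > now && strcmp(e->id, id) == 0;
}
```

A colliding id simply overwrites the slot, which only costs an extra
database query later. A real implementation would also need locking and
a way to invalidate an entry when the endpoint is created.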

>> > with ongoing load it has nothing to do with initial load that is still
>> > done in the extremely inefficient way I described in my original email.
>>
>> I'm not sure why that would be the case. You'll need to be more
>> specific, and provide your sorcery.conf configuration as well as the
>> specific operations/times when there are issues.
>
>
>
> sorcery.conf:
>
> [res_pjsip]
> endpoint=config,pjsip.conf,criteria=type=endpoint
> endpoint/cache=memory_cache,expire_on_reload=yes,object_lifetime_maximum=600,object_lifetime_stale=300
> endpoint=realtime,ps_endpoints
> aor=config,pjsip.conf,criteria=type=aor
> aor/cache=memory_cache,expire_on_reload=yes,object_lifetime_maximum=600,object_lifetime_stale=300
> aor=realtime,ps_aors
>
> extconfig.conf:
>
> ps_endpoints => pgsql,users,pjsip_endpoints_v
> ps_aors => pgsql,users,pjsip_aors_v
>
> When asterisk starts up and loads pjsip it does the following:
>
> SELECT * FROM pjsip_aors_v WHERE id LIKE '%' ORDER BY id
> SELECT * FROM pjsip_endpoints_v WHERE id LIKE '%' ORDER BY id
>
> thus loading all endpoints and AORs into memory. Then, the worst part: it
> follows on by loading all endpoints and AORs individually with queries
> like this:
>
> SELECT * FROM pjsip_aors_v WHERE id = 'ep1'
> SELECT * FROM pjsip_aors_v WHERE id = 'ep2'
> ...
> SELECT * FROM pjsip_aors_v WHERE id = 'epN'
>
> then
>
> SELECT * FROM pjsip_endpoints_v WHERE id = 'ep1'
> SELECT * FROM pjsip_endpoints_v WHERE id = 'ep2'
> ...
> SELECT * FROM pjsip_endpoints_v WHERE id = 'epN'
>
> With 10K endpoints it results in 20K queries to the db at asterisk
> startup. Now imagine multiple asterisk servers. This is the biggest
> problem.
>
> Also, to my surprise, this initial loading doesn't populate the cache.
> Right after asterisk startup I do "sorcery memory cache dump
> res_pjsip/endpoint" and it's empty, therefore causing additional db
> lookups as asterisk starts to serve SIP requests.
>

One question I have on this point (as your later e-mail noted that the
cache was getting populated on initial load) - what ARA backend are
you using for realtime? While nothing has moved forward yet with it,
we are aware of some improvements that could be made with the ODBC
backend specifically around performance.

Initial loading is probably always going to be something of a problem,
as endpoints often will get populated during qualify/registration. The
only way I can think to address that is to either not qualify the
endpoints, qualify the endpoints with a larger range of qualification
times, or to address any bottlenecks in the realtime drivers
themselves.
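
To illustrate what I mean by spreading qualification over a larger
range of times: one possible approach (purely a sketch, not how
res_pjsip schedules qualifies today, and the function names are
hypothetical) is to derive each endpoint's first qualify delay from a
hash of its id, so that the same endpoint always lands in the same slot
of a configurable window:

```c
#include <stdint.h>

/* FNV-1a hash of the endpoint id. */
static uint32_t id_hash(const char *id)
{
    uint32_t h = 2166136261u;

    while (*id) {
        h ^= (unsigned char) *id++;
        h *= 16777619u;
    }
    return h;
}

/*
 * Delay in seconds before the first qualify of endpoint 'id', in the
 * range [0, window_secs). A window of 0 means "qualify immediately".
 */
static unsigned int first_qualify_delay(const char *id, unsigned int window_secs)
{
    return window_secs ? id_hash(id) % window_secs : 0;
}
```

With a 60 second window, the per-endpoint lookups that follow startup
would be spread over roughly a minute rather than arriving in one
burst.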

>
>> > Caching also doesn't help at all with CLI commands like "pjsip show
>> > endpoints" in which case asterisk reloads the whole list from db
>> > instead of showing what it has in-memory.
>>
>> That actually is by design.
>>
>> Say we are caching endpoints. The cache only contains the n most
>> recently requested endpoints, *not* every endpoint that you may have
>> in your system. Hence, if you ask for all endpoints, we have to bypass
>> the cache and get all endpoints in order to accurately fulfill the
>> request.
>>
>> Given that this is a human interaction and not a run-time machine
>> interaction, the fact that requesting all endpoints results in
>> going out to the database is not unreasonable.
>
>
>
> Well I see your point. The thing is that in a system where endpoints are
> dynamically spread over multiple asterisk systems I never want to see
> all the endpoints, only those that have been served by this asterisk and
> cached.
>
> Maybe it's worth having a command that shows only cached endpoints?
>
> Basically I was happy with how chan_sip worked in that regard - only
> loading endpoints on-demand and only showing those endpoints that are
> loaded in memory.
>

There actually was an e-mail on the users list about a similar notion
[1] - namely, a 'pjsip show endpoints like' command. Are the endpoints
on a particular Asterisk system named in such a fashion that you know
- from the name - which ones you generally want to see?

[1] http://lists.digium.com/pipermail/asterisk-users/2015-October/287781.html

It is possible as well to 'dump the cache', as you've noticed from the
CLI command. A PJSIP variant of that could probably be written as
well, although I haven't investigated exactly what level of access
PJSIP has to what is in the cache and what is not. (Generally, the
fact that something is coming from a cache should be transparent to
the consumer.)

>
>> > Also I've noticed another very awkward problem. If I type "pjsip show
>> > endpoint" in the console and then press "Tab" then asterisk hangs for
>> > over a minute and I register over 300 queries like this in the db log:
>>
>> So, first, you are asking for name completion against 10k endpoints.
>> Regardless of the number of database queries, that's a large set to
>> complete against. Granted, there's no reason to go get the dataset on
>> every single entry...
>>
>> > SELECT * FROM pjsip_endpoints_v WHERE id LIKE '%' ORDER BY id
>>
>> ... which does appear as if that is what we are doing. In pjsip_cli:
>>
>>     while ((object = ao2_t_iterator_next(&i, "iterate thru endpoints table"))) {
>>         const char *id = formatter_entry->get_id(object);
>>         if (!strncasecmp(word, id, wordlen)
>>             && ++which > state) {
>>             result = ast_strdup(id);
>>         }
>>         ao2_t_ref(object, -1, "toss iterator endpoint ptr before break");
>>         if (result) {
>>             break;
>>         }
>>     }
>>
>> Since the endpoint formatter_entry only has a 'get by id' callback:
>>
>>     static void *cli_endpoint_retrieve_by_id(const char *id)
>>     {
>>         return ast_sorcery_retrieve_by_id(ast_sip_get_sorcery(), "endpoint", id);
>>     }
>>
>> That means that for every partial match that you have on an endpoint,
>> we do a separate lookup.
>>
>> Alternatively, we could go pull a partial match in a single query,
>> then iterate over the returned set of matches. Clearly that would be a
>> lot better in this case.
>>
>> > Why would asterisk need to load the whole list of endpoints more than
>> > 300 times is just completely beyond me.
>>
>> Hyperbole aside, it's because PJSIP chose a sane, maintainable method
>> to interact with its storage backends and uses a data abstraction
>> layer above its SQL statements - unlike chan_sip, which just embeds
>> the statements willy-nilly in the codebase. The downside of this is
>> that sometimes - in some specific cases - we aren't as efficient as we
>> should be.
>>
>> That's fixable however. Please do file a specific issue for the tab
>> completion case, as that should be improved.
>
>
>
> First of all, again, I'd prefer that completion be performed not against
> all the endpoints in db, but only those loaded and cached.
>
> Second, my test environment doesn't have 10K endpoints, but currently
> only 173. I imagine that if I did it against all 10K endpoints it would
> never finish.
>
> Sure I'll open an issue for that.
>

Thanks!
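
For that issue, the "fetch once, then iterate" approach could look
roughly like the following. This is a standalone sketch rather than
pjsip_cli code: complete_from_snapshot() is a hypothetical helper, and
it assumes the candidate ids have already been retrieved in a single
query and are held in a plain array. The match test mirrors the
strncasecmp() check in the existing loop.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/*
 * Return a malloc'd copy of the (state + 1)-th id in 'ids' that starts
 * with 'word' (case-insensitive), or NULL if there is no such match.
 * The caller frees the result. No backend lookups happen in here.
 */
static char *complete_from_snapshot(const char *const *ids, size_t count,
    const char *word, int state)
{
    size_t wordlen = strlen(word);
    int which = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (!strncasecmp(word, ids[i], wordlen) && ++which > state) {
            size_t len = strlen(ids[i]) + 1;
            char *result = malloc(len);

            if (result) {
                memcpy(result, ids[i], len);
            }
            return result;
        }
    }

    return NULL;
}
```

The CLI calls the completion handler once per candidate, with 'state'
increasing each time, so the snapshot would have to be kept between
calls for this to save any queries.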

If I boil your issues down, it sounds like the problems are mostly around:
(1) CLI commands (or AMI actions, for that matter) that attempt to
    operate on all endpoints. Generally, options that access only the
    subset of endpoints present in the cache would be desirable.
(2) Initial loading of endpoints can take a while and put a lot of
    contention on the database.

Does that sound right?

Matt

-- 
Matthew Jordan
Digium, Inc. | Director of Technology
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://digium.com & http://asterisk.org


