[asterisk-dev] [Code Review] Initial review request for CCSS architecture and generic implementations

Mon Oct 19 16:38:03 CDT 2009

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/410/
-----------------------------------------------------------

Review request for Asterisk Developers, Russell Bryant, Kevin Fleming, and rmudgett.

Summary
-------

This is the first attempt at call completion code for Asterisk. It follows the architecture document which was sent to the dev list for the most part, but there is one significant change. That is, instead of modifying the Hangup() dialplan application to take an argument stating whether to offer CC, the decision is now made based on the cc_agent_policy of the calling channel. If it is set to anything other than "never," then CC will be offered.

There is an opaque structure used for defining CC configuration parameters and an API for getting and setting each one. There is also an API call to set a parameter's value based on the name of the parameter and a passed-in value. Thus far, the SIP channel driver is the only channel driver which has had the necessary code changes to be able to configure CC parameters within the channel driver's configuration file. Other channel types can still get and set CC parameters by using the CALLCOMPLETION dialplan function, though.

Now onto the process used.

An incoming channel has a datastore attached to it which contains information about the CC-capable interfaces dialed. This information is stored in a linked-list (with metadata that allows it to act as a "tree"). In addition, a "core_id" is assigned to the call. The core_id is used as a unique identifier for the CC transaction. When dialing is completed, a cc_core_instance structure and an ast_cc_agent structure are created for the call. The interface tree on the channel's datastore also resides on the agent structure.

The caller can request CC using the CallCompletionRequest dialplan application. This, plus all other core state machine changes (see the architecture document for information about the state machine) are handled by a taskprocessor. The reason for this is that it allows for the core to handle all CC-related requests one at a time and not have to worry about corruption of data due to poor locking techniques.

The core will create an ast_cc_monitor for each outbound interface on which CC has been requested. The monitors form a tree-like structure with ast_cc_monitor_links being used as edges. In addition, the monitors are stored in an ao2 container so that an individual monitor may be found without needing to traverse the tree. All operations that work on the monitor tree happen in the core's taskprocessor thread.

The generic monitor works by using a device state subscription. When a monitored device becomes available, the original caller is called back. When the original caller answers, a new datastore is created on the caller's channel which contains the same interface tree that was stored in the original call and on the agent structure. This is used so that the CC_INTERFACES channel variable can be properly populated at each step of the dialplan.

Currently, the code is in a state where it works and it works well. However, it is lacking a lot of bells and whistles, and the code itself could benefit from some more work being done on it. Here are some of my ideas for improving the code.

1) There are several XXX comments throughout the code. Rather than reiterate them here in the review request, I'll let them speak for themselves.
2) There currently are no CLI commands to do things such as checking on the current state of CC requests.
3) There are no manager events yet.
4) There are a bunch of LOG_NOTICE messages currently in the code that will need to be changed before going live. The easy way to go is to change them to debug messages, but what I'd really like to do is to create a custom CC log level instead for them.
5) ast_cc_request_state_change takes a debug string that will be printed so that you can see why a state change occurred. I want to change this to a printf-style argument list so that it will be easier to write more specific information.
6) There are two very similar functions in ccss.c, cc_device_tree_item_init and cc_extension_tree_item_init, which should probably be combined into a single function.
7) Despite my efforts to group like functions together, I still feel that ccss.c is a total mess and is hard to follow. I would like to fix this by doing one of two things which are very unAsterisk-like. Either a) separate the functions into separate files, or b) place function prototypes for every internal function at the top of ccss.c so that I can place things into an order where I don't have to make sure that I only call static functions defined above the one I'm currently in.

If any reviewers find other areas which can use improvement, or find any places where I'm obviously doing things incorrectly, then I would appreciate it.

Diffs
-----

  /trunk/apps/app_dial.c 224615 
  /trunk/channels/chan_local.c 224615 
  /trunk/channels/chan_sip.c 224615 
  /trunk/configs/ccss.conf.sample PRE-CREATION 
  /trunk/doc/tex/asterisk.tex 224615 
  /trunk/doc/tex/ccss.tex PRE-CREATION 
  /trunk/funcs/func_callcompletion.c PRE-CREATION 
  /trunk/include/asterisk/ccss.h PRE-CREATION 
  /trunk/include/asterisk/channel.h 224615 
  /trunk/include/asterisk/channelstate.h PRE-CREATION 
  /trunk/include/asterisk/devicestate.h 224615 
  /trunk/include/asterisk/frame.h 224615 
  /trunk/main/asterisk.c 224615 
  /trunk/main/ccss.c PRE-CREATION 
  /trunk/main/channel.c 224615 

Diff: https://reviewboard.asterisk.org/r/410/diff

Testing
-------

1) Configuration testing.
- chan_sip correctly reads and sets CC configuration settings set in chan_sip.conf
- A channel driver (chan_console in my case) that does not have CC configuration settings gets the default settings.
- The CALLCOMPLETION function correctly sets parameters and overwrites those set in the channel driver configuration settings.
- The CALLCOMPLETION function correctly sets parameters for channel drivers which cannot parse channel driver configuration settings.

2) Simple Call from one SIP phone to one other SIP phone.
- CCBS and CCNR are offered to the caller at proper times
- The tree of interfaces on the incoming channel's datastore is correct.
- The proper CC service is requested of the outbound interface
- The CallCompletionRequest application operates properly
- The CallCompletionCancel application operates properly
- The monitor tree formed from the interface tree is correctly structured and its data is correct
- The core properly detects when the called party has become available and calls the caller back
- The configured callback macro is executed on the caller's channel when he answers.
- The proper extension is dialed on the CC recall
- The CC_INTERFACES channel variable has the proper interfaces on it (to include Local channel specifics such as /b and /n)

3) Same test as 2), but the calling phone is busy when the called phone becomes available
- All points from 2) hold true
- The called phone's monitoring is properly suspended and the caller's phone is monitored
- When the caller becomes available again, the caller's phone is properly called back.
- Tested back-and-forth situations (i.e. the callee becomes available but the caller is busy, then the caller becomes available but the callee is busy) and they worked fine.

4) Ran tests 2) and 3) but with several layers of Local channels used before dialing an endpoint
- All points from 2) and 3) hold true.

5) Ran tests 2), 3), and 4) but dialed two SIP phones instead of just one.
- All points from 2), 3), and 4) hold true.
- Specifically, the core picks up when the first callee becomes available and calls the caller back.

6) Ran test 2) and had the phone forward the call
- Confirmed that CCNR was offered to the caller
- Confirmed that on recall, the forwarding phone was called back

7) Ran test where two Dials were made in a single extension
- Confirmed that CC was only offered for interfaces dialed in the first instance of Dial

8) Tested testable failure points to be sure that nothing catastrophic happened.
- Called CallCompletionRequest under atypical settings (e.g. Call when there was no previous failed call, call after cancelling CC, call when the CC process has progressed further)
- Called CallCompletionCancel when there was no CC request to cancel.
- Allowed both offer and available timers to expire.
- Caller hung up during execution of the callback macro.

9) Ran all of the above tests with MALLOC_DEBUG enabled and under valgrind
- Confirmed that the memory used by ccss.c before and after all tests was the same.
- Found that no invalid read or invalid writes were reported by valgrind.

Untested:

- Limitations, such as cc_max_requests, cc_max_agents, and cc_max_monitors.

Thanks,

Mark