[asterisk-dev] Time for a bug fix phase?

Sat May 31 14:52:53 CDT 2008

On Sat, 2008-05-31 at 16:00 +0100, Grey Man wrote:
> On Sat, May 31, 2008 at 3:00 PM, Steve Murphy <murf at digium.com> wrote:
> >
> > Which version of Asterisk did you do this in? The stuff I did in 1.4
> > is different than what I did in trunk.
> 
> 1.4.19 but I'd be pretty confident the CDR code in res_features.c
> hasn't changed in later 1.4 versions.
> 
> > In general, the CDR merging was necessary to combine info that was
> > captured in either the channel or peer CDR's.
> 
> What sort of information? To me it looks like information is only
> going to be lost by merging. For example if the channels on either
> side of the bridge happen to have different accountcodes then since
> there is only one accountcode per CDR a combined CDR has to choose one
> of the accountcodes. Which happens to be the crux of the problem when
> an attended transfer takes place.
> 

Well, it was pretty random as to what information exactly.
The main fields that really concerned me were the start,
answer, and end times, and the call disposition.

As to disambiguation, you are right, and it *is* a problem
when both CDRs have a chunk of info in their respective
structs and the two contradict one another. In that case, we
*try* to pick the one that makes the most sense, somehow.
Usually, the channel's CDR (as opposed to the peer's) would have
been the one to go out; for times, the earlier start time is
probably the best, as is the later end time. And so on.
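
To make that pick-the-sensible-value rule concrete, here is a rough
sketch in Python (not Asterisk's C, and not the actual merge code).
The Cdr class and merge() are hypothetical names invented for
illustration. How the answer time and disposition are combined is
an assumption; only the start/end rule and the preference for the
channel's values come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cdr:
    # Hypothetical, simplified record; not Asterisk's struct ast_cdr.
    start: float                      # epoch seconds
    end: float
    answer: Optional[float] = None    # None if the call was never answered
    disposition: str = "NO ANSWER"
    accountcode: str = ""


def merge(chan: Cdr, peer: Cdr) -> Cdr:
    """Merge a channel CDR and a peer CDR into a single record."""
    answers = [t for t in (chan.answer, peer.answer) if t is not None]
    return Cdr(
        start=min(chan.start, peer.start),   # earlier start time wins
        end=max(chan.end, peer.end),         # later end time wins
        # Assumption: take the earliest answer time seen on either side.
        answer=min(answers) if answers else None,
        # Assumption: answered if either side answered, else the channel's value.
        disposition="ANSWERED" if answers else chan.disposition,
        # Prefer the channel's value; the peer's is kept only as a fallback.
        accountcode=chan.accountcode or peer.accountcode,
    )


chan = Cdr(start=100.0, end=160.0, answer=105.0,
           disposition="ANSWERED", accountcode="sales")
peer = Cdr(start=102.0, end=161.0, answer=105.5,
           disposition="ANSWERED", accountcode="support")
merged = merge(chan, peer)
```

With these inputs the merged record keeps the channel's accountcode
and drops the peer's, which is exactly the loss of information you
are describing.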

Of course, in the 1.2 world, the data in the peer CDR never even
got set-- because the peer's CDR wouldn't be allocated. So, you
just had to settle for what you got.

> > And also, in general, I find that the CDR system is pretty much smeared
> > across the entire code base, and innocuous changes in one part can
> > solve one problem, but create a cascade of problems in other areas.
> 
> I don't doubt you there. That's one reason why I think the problem
> needs some thought put into the design instead of just jumping in. The
> starting point for a CDR design that works correctly should be to
> create an option that allows the whole bridged/merged CDR approach to
> be turned off and instead get a CDR per channel.
> 

I agree that we do need to think about how Asterisk should evolve its
CDR approach.

And you are, in a kind of way, pointing out a good approach, in 
that changes should be optional to the community.

But it does get a bit ugly in regards to pleasing the community
wrt CDRs... And here's what worries me: I have no clear perception
that the community would be happy with one single CDR approach.
What I'm perceiving is quite the opposite: different 'users'
have quite different concepts of what they want the CDR system
to report.

For instance, Brian was quite unhappy with the direction I took
in 1.4, which was to produce (or attempt to, at least) one CDR per
bridge.
He's more interested in 'logical' conversations, and the amount
of time they take. I just have trouble coming to grips with what
that 'logical' conversation consists of... 

So, CEL seemed the perfect answer to me. You basically drop
the CDR concept entirely, and go down a level to the individual
events from which CDR's are usually compiled. You get a timely
notification of every channel creation (start), answer, hangup (end),
every transfer request, park, unpark, conference join, and so on,
and from those, YOU decide how and what to do with each event.

At first, I thought just databasing these and running a post-processor
on the data would give most people what they need. But Brian pointed
out that such a database would not be that useful. Queries for common
and useful chunks of info would be exceedingly complex if possible at
all. So Brian proposed that a CEL -> CDR converter would be in order,
which I thought was a BRILLIANT idea. His needs later pointed him
in other directions... but the idea remains.

Why do I think it's so brilliant? Well, right now CDR code is spread
out all over Asterisk. With a CEL-> CDR converter, all the code 
to collect data and emit CDR's would be in a single source file
(or a set of them, depending on the code size and organization level
of the author). True, the CEL calls would be spread out over all
creation, but that's OK.
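
As a rough illustration of the converter idea, here is a sketch in
Python (not Asterisk's C). The event tuple shape, the event names
CHAN_START, ANSWER, HANGUP and PARK_START, and the events_to_cdrs()
function are all assumptions made up for this example, not an
actual CEL interface. It implements one possible policy: one CDR
per channel.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical event shape: (timestamp, channel name, event type).
Event = Tuple[float, str, str]


@dataclass
class Cdr:
    channel: str
    start: float
    answer: Optional[float] = None
    end: Optional[float] = None

    @property
    def billsec(self) -> float:
        """Seconds between answer and hangup; 0 if never answered."""
        if self.answer is None or self.end is None:
            return 0.0
        return self.end - self.answer


def events_to_cdrs(events: Iterable[Event]) -> List[Cdr]:
    """Fold a stream of channel events into one CDR per channel."""
    open_cdrs: Dict[str, Cdr] = {}
    done: List[Cdr] = []
    for ts, chan, kind in sorted(events):    # process in time order
        if kind == "CHAN_START":
            open_cdrs[chan] = Cdr(channel=chan, start=ts)
        elif kind == "ANSWER" and chan in open_cdrs:
            open_cdrs[chan].answer = ts
        elif kind == "HANGUP" and chan in open_cdrs:
            cdr = open_cdrs.pop(chan)
            cdr.end = ts
            done.append(cdr)                 # CDR is emitted at hangup
        # Any other event (park, transfer, ...) is ignored by this policy.
    return done


events = [
    (0.0, "SIP/a", "CHAN_START"),
    (2.0, "SIP/a", "ANSWER"),
    (5.0, "SIP/a", "PARK_START"),
    (12.0, "SIP/a", "HANGUP"),
    (1.0, "SIP/b", "CHAN_START"),
    (3.0, "SIP/b", "HANGUP"),
]
cdrs = events_to_cdrs(events)
```

All of the CDR policy lives in one function here; a different camp
could swap in a different fold over the same events without
touching the code that emits them.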

CEL calls look a bit like Manager event calls in the source, and 
in a lot of cases, are called from the same places. I thought about
just using manager calls, but they have a sufficiently different
set of requirements, to justify a separate interface (at least, 
for the time being). The Manager interface is a socket with text
messaging, the manager events cover a much larger scope of events,
and so on. Perhaps in the future, we can unify the undercarriages
of Manager and CEL.

So, with CEL, there's no need to get together and decide on policies.
The only problem that the different CDR camps could have with CEL is
that it may not provide some camps with enough events. That should
be no problem-- those who don't need to know about some events can
simply ignore them.

You see, the CDR system was meant to be a source of TRUTH, from which
all could put in a spoon and scoop out the facts that they need for
billing. But as it turns out, it isn't such a great oracle: it's
not only stating some facts (but not all), it's also giving us
interpretations of the facts. Sometimes not useful interpretations.

Everyone thinks that CDRs alone can give us a perfect CALEA memory
of the Truth of what happened. But they can't. And everyone expects
that, with a little (hopefully no) post-processing, they can
generate billing statements directly from the CDR's: every CDR is
a billable statement of some kind. But that's not true, either.

Forming a CDR is a result of individual needs and choices. 
If all that everybody did was to pick up a phone, dial a number,
talk, and hang up, well, that's a no-brainer, CDR's cover this
situation 100% and that's that. But if we rang a phone and no-one
answered, do we want a CDR for that? If we rang a dozen phones, do
we want to know about only the phone that got answered first?
If we park someone, do we care how long they were parked? Do we
care how long people were kept on hold?  If we forward a call
and participate in 3-way conferences, does anyone care about that?
If we forward to an extension in the PBX, do we care? If we forward
to a long-distance number across the world, who gets billed for that?
Or do we care? What events are we legally required to track? If
we use a local channel to play a recording, do we charge for that?

My dream is to produce a general purpose CEL->CDR converter, for
which perhaps you could use a config file to describe which
kinds of events you want CDRs to describe, and how... I'm sure it
could have more options than the Dial app.
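
Purely as a sketch of what such a config-driven policy might look
like (in Python, not Asterisk's C): the [general] section, the
option names unanswered and include_hold_time, and the per-call
summary dictionaries are all invented for this example. No such
config file exists.

```python
import configparser

# Hypothetical config; section and option names are invented for this sketch.
SAMPLE_CONF = """
[general]
unanswered = no
include_hold_time = yes
"""


def load_policy(text):
    """Parse the policy options, falling back to defaults when absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["general"]
    return {
        "unanswered": general.getboolean("unanswered", fallback=True),
        "include_hold_time": general.getboolean("include_hold_time",
                                                fallback=False),
    }


def apply_policy(calls, policy):
    """Turn per-call summaries into CDR rows according to the policy."""
    rows = []
    for call in calls:
        if not call["answered"] and not policy["unanswered"]:
            continue  # policy says unanswered calls produce no CDR
        billsec = call["talk_secs"]
        if policy["include_hold_time"]:
            billsec += call["hold_secs"]  # policy says hold time is billable
        rows.append({"channel": call["channel"], "billsec": billsec})
    return rows


policy = load_policy(SAMPLE_CONF)
calls = [
    {"channel": "SIP/a", "answered": True, "talk_secs": 60, "hold_secs": 15},
    {"channel": "SIP/b", "answered": False, "talk_secs": 0, "hold_secs": 0},
]
rows = apply_policy(calls, policy)
```

Each question in the list above (unanswered calls, hold time,
parking, and so on) would become one more option of this kind,
which is why the option count could grow quickly.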

> > I have been targeting trunk for changes to the CDR system, mainly
> > because of all the complaints about changes I've made in 1.4. It
> > disturbs apps they've worked hard to develop. I've pretty much
> > written off 1.4, for this reason. There's hardly ANYTHING I can do
> > without mucking somebody up.
> 
> I'd agree as well but it's worth noting that the current CDR system is
> pretty broken, a lot of us are in a real fix about providing a
> billable service and transfer functionality with Asterisk. The format
> of the CDR records does not need to be altered and that would avoid
> the most disruptive type of change. If the option to switch from the
> current broken merged CDR behaviour to correct per channel CDR
> behaviour was made into a config file switch that should further
> minimise any pain.

You are right; there are certain fixes that could be made, and
I'll do my best. But the trouble is that someone, by now, probably
depends on the erroneous behavior. If every fix I make
means some sort of config file option to turn it on, then I 
swiftly get into a morass of options I have to maintain.

> 
> I'd still contend that this is a major bug and should be fixed in 1.4
> but I suspect that may be a lonely view.

Hah, that's the magic of Asterisk. We have no way of telling the 
number of users who share in our pain. No way to track down the
number of users/implementers, and no way to judge who will care or
not, beyond the IRC messages and the email mailing lists. I know
the political people assign 'weights' to callers, to try and estimate
how much of the general public sympathize with you. They usually 
won't know for sure until the votes are counted at the end of the
season. I can't even give you a weight. Maybe you are worth 20 people,
maybe you are worth 2000. Hard to tell!

But, you can file bugs, and asterisk developers will evaluate and
triage them, and maybe even fix them. If a thousand folks signed
on to monitor a bug, that's a very good indication that it's one
very hot bug!

But I think that I can safely say that it's not purely a voting 
process. We use our own common sense and our own feelings about
bugs. We have to. We don't always even know we'll get people mad
until we get heated email either directly or spark a fire-fight
in the email lists. Some of us have better flame-resistant 
suits than others!

murf

-- 
Steve Murphy
Software Developer
Digium