1. Brekeke Product Name and Version: 3.13.0.0/552
2. Java version:11.0.16
3. OS type and the version: RHEL7
4. UA (phone), gateway or other hardware/software involved: Avaya & Genesys SIP Server
5. Your problem:
Using the failover module in Advanced Server, we want to fail over if and only if the primary SIP deploy target for the rule does not respond at all. For 3xx-6xx responses, we do NOT want to fail over but just end the session with the error response as though not using failover module.
I have already searched the forums and found a few older responses, but none that answer this succinctly. The ambiguity seems to come from the defaults, and from what happens when a parameter is not set at all. I think the answer is to set &failover.timer.inviting to something other than the default. But would setting it to the same value as the default have any effect beyond what the default would do anyway? Would there be any difference between setting &failover.timer.inviting = 10 explicitly versus just not setting it at all (thus using the default)?
Further, &failover.pattern.response could be used to specify which responses failover should follow, but it obviously cannot test for a negative (to not failover on any response).
Could I set &failover.pattern.response = ^$ to test for a null string? Would that effectively ignore failover for all responses, since no response would ever match an empty string? Or would it be better to match only all 3xx responses for the failover pattern, given that &failover.redirection defaults to true?
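A quick way to sanity-check the two candidate patterns is to run them against typical status codes. The sketch below uses Python's `re` module and assumes the failover module tests the pattern against the bare three-digit status code. How Brekeke actually applies the pattern (full match versus substring search, status code versus whole status line) is not confirmed here, so both match styles are shown, and the list of sample codes is only illustrative.

```python
import re

# Illustrative final response codes a SIP target might return.
STATUS_CODES = ["300", "302", "404", "481", "486", "500", "503", "603"]


def matching_codes(pattern, codes=STATUS_CODES, full=True):
    """Return the codes that the pattern matches.

    full=True requires the whole code to match the pattern.
    full=False lets the pattern match anywhere inside the code.
    Which of the two Brekeke uses is an assumption, so both are offered.
    """
    matcher = re.fullmatch if full else re.search
    return [code for code in codes if matcher(pattern, code)]


if __name__ == "__main__":
    for pattern in ("^$", "3.."):
        for full in (True, False):
            style = "full match" if full else "search"
            print(f"{pattern!r} ({style}): {matching_codes(pattern, full=full)}")
```

Under these assumptions `^$` matches none of the codes, and `3..` matches only 300 and 302, with either match style. If the pattern were instead applied to a longer string such as the full status line, `3..` could match in other places, so that is worth confirming in a test.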
Again, it seems to me like just setting &failover.timer.inviting to something really short like 3 or 4 seconds is probably what is needed.
For testing how this fails over, I was thinking I could test no response by just deploying to an unused IP and have the failover be the actual destination, to see it fail over. Then to test for failure responses to not fail over, swap the secondary and the primary in the deploy pattern - because we want to see it get an error code and then not fail over because it did get a response.
Thank you.
Restricting failover to only occur due to NO response.
Moderator: Brekeke Support Team
-
- Posts: 31
- Joined: Wed Oct 17, 2018 2:21 pm
> Would there be any difference between setting &failover.timer.inviting = 10 explicitly versus just not setting it at all (thus using the default)?
They are the same, because 10 is the default value.
> Could I set &failover.pattern.response = ^$ to test for a null string? Would that effectively ignore failover for all responses,
It might work. Please try it.
-
- Posts: 31
- Joined: Wed Oct 17, 2018 2:21 pm
Today an opportunity presented to be able to test this, and yes setting &failover.pattern.response = ^$ DOES in fact work for this!
I received a bit more information, and the destination is actually a pair of Avaya PBXes sitting in front of the Genesys (not sure if the forum will allow me to update my original post). That makes a difference, because how they told me it would respond and how it actually responds are not the same.
Tomorrow I hope to be able to test, at least partly, using &failover.pattern.response = 3.. and see if that works as well. I would rather use 3.. if that works, since at least then I'm matching on a SIP response we would actually expect to see. Unfortunately, I have no way of really testing with that Avaya, so I do not truly know whether, in a real failover situation, the primary Avaya would just not respond at all, give an error message, or actually respond with a 3xx redirect.
-
- Posts: 31
- Joined: Wed Oct 17, 2018 2:21 pm
> Do you want to let the Brekeke SIP Server handle 3xx redirection responses without forwarding them back to the call originator?
Yes, that is exactly what I want to happen IF we actually get a 3xx redirect response. What I do not want is for failover to happen if we get literally any other response whatsoever.
The intention is to have it fail over only for 2 cases:
1) no initial response to the INVITE
2) 3xx redirect responses (IF this even happens)
For 4xx, 5xx, and 6xx, we want the session to end with that given error response.
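The intended policy above can be written down as a small decision function, which is handy as a checklist when reading packet captures. This is only a model of the desired behavior, not Brekeke's implementation; the function name and the use of `None` to represent "no response before the timer expired" are assumptions made for illustration.

```python
from typing import Optional


def should_fail_over(status_code: Optional[int]) -> bool:
    """Model the desired failover policy for the primary target.

    status_code is the final SIP response code from the primary,
    or None if the INVITE received no response at all.
    """
    if status_code is None:
        return True  # case 1: no initial response to the INVITE
    if 300 <= status_code <= 399:
        return True  # case 2: 3xx redirect response
    # Everything else (including 4xx, 5xx, 6xx) ends the session
    # with that response instead of failing over.
    return False


if __name__ == "__main__":
    for code in (None, 302, 404, 481, 503, 603):
        print(code, should_fail_over(code))
```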
When a 3xx redirect occurs, is there any indication of that in the Brekeke logs? Is there anywhere in the Brekeke logs to see the failover occurring? What I find in the logs is that when a failover occurs, the only indication is that the destination IP is the one specified in the failover parameters. I don't see anything in the logs to indicate what actually caused the failover to occur. So far I've been relying on packet captures from the server where Brekeke is running to find out what really happened.
We tested further today with mixed results: leaving the same rule in place as yesterday (using &failover.pattern.response = ^$), we had different results from the same Avaya. What we had found previously (2017/2018 timeframe) was that for the Avaya, the SIP trunks MUST be set to disable "Layer 3 Test" on the Avaya side. Otherwise, session reuse may be attempted, causing intermittent rejections from the Avaya - which only further complicates this matter. We have a call scheduled for next week, during which they can verify how the Avaya SIP trunks are configured.
Furthermore, for testing purposes we also tried setting TO pointed at something we know would fail, and then having the primary (where it should answer) be in the failover parameter. We did not expect for that to also fail. So if we send to the Avaya's primary initially, the session gets set up normally. If we force a fail over to that same Avaya primary's IP, the Avaya rejects it. I'm still researching that one with packet captures, because in both cases it is effectively the "same" INVITE to TCP port 5060 of that same Avaya IP with two completely different responses. Admittedly it isn't the "same" INVITE message, but they are very similar. There are a number of details there that I simply won't have until that meeting next week, because I can only see what is happening from the perspective of the Brekeke SIP servers.
> When a 3xx redirect occurs, is there any indication of that in the Brekeke logs?
The 3xx SIP response packet will be shown in the sv.xxxx.log file.
> Is there anywhere in the Brekeke logs to see the failover occurring?
A "failover" log line will be shown in the sv.xxxx.log file.
> What I find in the logs is when a failover occurs the only indication is that the destination IP is that specified in the failover parameters.
Is that the sv.xxxx.log file contained in the zip file downloaded from the [Diagnostics]->[Debug Logs] page?
> We tested further today with mixed results: leaving the same rule in place as yesterday (using &failover.pattern.response = ^$)
To handle 3xx responses at the Brekeke SIP Server without forwarding them,
set &failover.pattern.response = 3..
If you have a support account with Brekeke, I recommend sending them a log file for review.
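Once the relevant logging is enabled, the "failover" lines mentioned above can be pulled out of the sv.*.log files with a short script. This is a minimal sketch that only does a case-insensitive substring search; the exact format of Brekeke's log lines is not known here, and the function names are hypothetical.

```python
import glob
from typing import Iterable, List, Tuple


def find_failover_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line that mentions 'failover'."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if "failover" in line.lower()
    ]


def scan_logs(pattern: str = "sv.*.log") -> None:
    """Print matching lines from every file that fits the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        # errors="replace" avoids a crash on unexpected bytes in the log.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, text in find_failover_lines(handle):
                print(f"{path}:{number}: {text}")


if __name__ == "__main__":
    scan_logs()
```

Run it from the directory that holds the extracted log files, or pass a different glob pattern to `scan_logs`.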
-
- Posts: 31
- Joined: Wed Oct 17, 2018 2:21 pm
That sounds great, but by default all those categories under "SIP Server log settings" are unchecked. Which means none of that is enabled by default. So all I find in those sv.*.log files are general information like startup time, Java memory information, session and packet counts.
I'll have to experiment a bit with the resource impact of enabling those debug log options before considering them for production. In particular, I'm not sure how much log data each session would generate on average with all those debug log options enabled. I might want to split the logs off into their own filesystem, as I don't know how Brekeke behaves if it fills the filesystem with its own logs (and I really don't want to find out in production).
As for the follow-up on the &failover.pattern.response = 3.. testing, that didn't work as expected. The Avaya team doesn't have a test system, so we didn't really have any 3xx messages to test with. I did get some weird 404 and 481 responses that were not expected. I switched back to using &failover.pattern.response = ^$ and retested successfully. I'm not convinced yet that it isn't an Avaya quirk.
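For the log-volume question, a rough back-of-the-envelope estimate can be scripted once the average log size per session has been measured on a test system. The sketch below is plain arithmetic; the numbers in the example run are placeholders rather than measurements, and it ignores any log rotation or cleanup.

```python
import shutil


def estimated_daily_log_bytes(sessions_per_day: int, bytes_per_session: int) -> int:
    """Estimate the log bytes written per day."""
    return sessions_per_day * bytes_per_session


def days_until_full(free_bytes: int, daily_bytes: int) -> float:
    """Estimate how many days the free space lasts at the given daily rate."""
    if daily_bytes <= 0:
        return float("inf")
    return free_bytes / daily_bytes


if __name__ == "__main__":
    # Placeholder values: replace with figures measured on a test system.
    daily = estimated_daily_log_bytes(sessions_per_day=50_000, bytes_per_session=20_000)
    free = shutil.disk_usage("/").free  # point this at the log filesystem
    print(f"about {daily / 1e9:.2f} GB/day, {days_until_full(free, daily):.1f} days until full")
```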
If you are concerned about the impact of logging, I recommend enabling SIP session logging for only certain SIP calls by using the Dial Plan.
Refer to the "Logging SIP packets through a certain dial plan" section in the wiki topic below.
https://docs.brekeke.com/sip/how-to-log-sip-packets
Adding "&net.sip.loglevel.file = 255" to the Deploy Patterns writes the SIP session log, and also the failover log, only when the Matching Patterns are fulfilled, so this approach does not consume much disk space.
If you enable the "SIP Session" log category on the [Debug Logs] page, all SIP sessions will be logged, so the log file will grow quickly.
In any case, having a log will make the issue much easier to diagnose.