* [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding @ 2008-11-28 13:35 Jaska Uimonen 2008-11-28 14:18 ` Marcel Holtmann 2008-12-02 20:15 ` Jim Carter 0 siblings, 2 replies; 35+ messages in thread From: Jaska Uimonen @ 2008-11-28 13:35 UTC (permalink / raw) To: linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 925 bytes --] Hi All, I did some testing on the current 8 band fixed point encoding and it seems to attenuate frequencies below 800Hz and above 18kHz. There might be some other stuff happening also, because at least to me the bass seemed to lack some "definition". I didn't quite understand how the current tables are calculated and how the filtering works so I wrote a new filtering function and calculated new filter tables for it. It is written using 16 bit fixed point without any platform specific optimizations. I only unrolled some loops etc. I tried to follow the flow chart in MPEG-1 annex c. With this new filtering the low and high frequencies are there, but I haven't done any more thorough testing. At least it sounds a little bit better to my ears :) br, Jaska Uimonen P.S. I could have done some discussion with the list members about the current implementation, but I kind of got carried away. Sorry about that. [-- Attachment #2: 0001-New-function-and-coefficients-for-8-band-fixed-point.patch --] [-- Type: application/mbox, Size: 7292 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-11-28 13:35 [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Jaska Uimonen @ 2008-11-28 14:18 ` Marcel Holtmann 2008-11-28 14:24 ` Jelle de Jong 2008-11-28 15:14 ` Jaska Uimonen 2008-12-02 20:15 ` Jim Carter 1 sibling, 2 replies; 35+ messages in thread From: Marcel Holtmann @ 2008-11-28 14:18 UTC (permalink / raw) To: Jaska Uimonen; +Cc: linux-bluetooth Hi Jaska, > I did some testing on the current 8 band fixed point > encoding and it seems to attenuate frequencies below 800Hz > and above 18kHz. There might be some other stuff happening > also, because at least to me the bass seemed to lack some > "definition". > > I didn't quite understand how the current tables are calculated > and how the filtering works so I wrote a new filtering function > and calculated new filter tables for it. It is written > using 16 bit fixed point without any platform specific optimizations. > I only unrolled some loops etc. I tried to follow the > flow chart in MPEG-1 annex c. > > With this new filtering the low and high frequencies are there, but > I haven't done any more thorough testing. At least it sounds > a little bit better to my ears :) thanks for looking at it. I am seriously lost when it comes to audio codecs and my ears normally don't count for much. So do you think we should throw all away any you start over providing a correct implementation with fixed point integer and then we start optimizing step by step (while testing against SBC conformance) or how should we continue. For sure we have to fix our codec. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-11-28 14:18 ` Marcel Holtmann @ 2008-11-28 14:24 ` Jelle de Jong 2008-11-28 15:20 ` Jaska Uimonen 2008-11-28 15:14 ` Jaska Uimonen 1 sibling, 1 reply; 35+ messages in thread From: Jelle de Jong @ 2008-11-28 14:24 UTC (permalink / raw) To: Marcel Holtmann; +Cc: Jaska Uimonen, linux-bluetooth Marcel Holtmann wrote: > Hi Jaska, > >> I did some testing on the current 8 band fixed point >> encoding and it seems to attenuate frequencies below 800Hz >> and above 18kHz. There might be some other stuff happening >> also, because at least to me the bass seemed to lack some >> "definition". >> >> I didn't quite understand how the current tables are calculated >> and how the filtering works so I wrote a new filtering function >> and calculated new filter tables for it. It is written >> using 16 bit fixed point without any platform specific optimizations. >> I only unrolled some loops etc. I tried to follow the >> flow chart in MPEG-1 annex c. >> >> With this new filtering the low and high frequencies are there, but >> I haven't done any more thorough testing. At least it sounds >> a little bit better to my ears :) > > thanks for looking at it. I am seriously lost when it comes to audio > codecs and my ears normally don't count for much. > > So do you think we should throw all away any you start over providing a > correct implementation with fixed point integer and then we start > optimizing step by step (while testing against SBC conformance) or how > should we continue. For sure we have to fix our codec. > > Regards > > Marcel Hi Jaska, Thank you so much for improving the codecs :-D it was on my long standing wish list. You want to receive a few Dutch stroopwafels for your efforts :-D Is the patch you provided working against the latest git? Marcel would you be willing to review the patch for code style, hidden issues etcetera. I would love to test this patch this weekend and apply it to the latest git. I am an heavy stereo bluetooth user and will notice glitches and quality distortions on my bluetooth speakers and headsets. Best regards, Jelle ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-11-28 14:24 ` Jelle de Jong @ 2008-11-28 15:20 ` Jaska Uimonen 2008-11-28 18:13 ` David Sainty 0 siblings, 1 reply; 35+ messages in thread From: Jaska Uimonen @ 2008-11-28 15:20 UTC (permalink / raw) To: ext Jelle de Jong; +Cc: linux-bluetooth Hi, I did the git clone this morning and made the patch on top of that. Johan already came to tell me that I have whitespace instead of tabs in the changed parts. Sorry about that, I'll fix it. He said it will take 1 minutes to fix that but it was at least 1/2 hour with emacs :) br, Jaska On Fri, 2008-11-28 at 15:24 +0100, ext Jelle de Jong wrote: > Marcel Holtmann wrote: > > Hi Jaska, > > > >> I did some testing on the current 8 band fixed point > >> encoding and it seems to attenuate frequencies below 800Hz > >> and above 18kHz. There might be some other stuff happening > >> also, because at least to me the bass seemed to lack some > >> "definition". > >> > >> I didn't quite understand how the current tables are calculated > >> and how the filtering works so I wrote a new filtering function > >> and calculated new filter tables for it. It is written > >> using 16 bit fixed point without any platform specific optimizations. > >> I only unrolled some loops etc. I tried to follow the > >> flow chart in MPEG-1 annex c. > >> > >> With this new filtering the low and high frequencies are there, but > >> I haven't done any more thorough testing. At least it sounds > >> a little bit better to my ears :) > > > > thanks for looking at it. I am seriously lost when it comes to audio > > codecs and my ears normally don't count for much. > > > > So do you think we should throw all away any you start over providing a > > correct implementation with fixed point integer and then we start > > optimizing step by step (while testing against SBC conformance) or how > > should we continue. For sure we have to fix our codec. > > > > Regards > > > > Marcel > > Hi Jaska, Thank you so much for improving the codecs :-D it was on my > long standing wish list. You want to receive a few Dutch stroopwafels > for your efforts :-D > > Is the patch you provided working against the latest git? Marcel would > you be willing to review the patch for code style, hidden issues etcetera. > > I would love to test this patch this weekend and apply it to the latest git. > > I am an heavy stereo bluetooth user and will notice glitches and quality > distortions on my bluetooth speakers and headsets. > > Best regards, > > Jelle ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-11-28 15:20 ` Jaska Uimonen @ 2008-11-28 18:13 ` David Sainty 0 siblings, 0 replies; 35+ messages in thread From: David Sainty @ 2008-11-28 18:13 UTC (permalink / raw) To: Jaska Uimonen; +Cc: ext Jelle de Jong, linux-bluetooth Jaska Uimonen wrote: > Hi, > > I did the git clone this morning and made the patch on top of that. > > Johan already came to tell me that I have whitespace instead > of tabs in the changed parts. Sorry about that, I'll fix it. > He said it will take 1 minutes to fix that but it was > at least 1/2 hour with emacs :) > > br, > Jaska > > M-x tabify ? ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-11-28 14:18 ` Marcel Holtmann 2008-11-28 14:24 ` Jelle de Jong @ 2008-11-28 15:14 ` Jaska Uimonen 1 sibling, 0 replies; 35+ messages in thread From: Jaska Uimonen @ 2008-11-28 15:14 UTC (permalink / raw) To: ext Marcel Holtmann; +Cc: linux-bluetooth Hi, So in general I would just like to hear comments about the quality on this one. So does it fix the quality issues or is there still some problems. If someone just has the measurement system ready and would be able to run some sound through... This is all 16 bit fixed point and we have an option to go to 32 bit if necessary. About the optimizations... I myself use a vectorizing compiler, which optimizes this kind of C-code very well. So we could keep the "reference" fix point in the code base (if it actually works now...) and then do optimizations to different platform if one wishes. But I'm really open to how you would like to do it. br, Jaska On Fri, 2008-11-28 at 15:18 +0100, ext Marcel Holtmann wrote: > Hi Jaska, > > > I did some testing on the current 8 band fixed point > > encoding and it seems to attenuate frequencies below 800Hz > > and above 18kHz. There might be some other stuff happening > > also, because at least to me the bass seemed to lack some > > "definition". > > > > I didn't quite understand how the current tables are calculated > > and how the filtering works so I wrote a new filtering function > > and calculated new filter tables for it. It is written > > using 16 bit fixed point without any platform specific optimizations. > > I only unrolled some loops etc. I tried to follow the > > flow chart in MPEG-1 annex c. > > > > With this new filtering the low and high frequencies are there, but > > I haven't done any more thorough testing. At least it sounds > > a little bit better to my ears :) > > thanks for looking at it. I am seriously lost when it comes to audio > codecs and my ears normally don't count for much. > > So do you think we should throw all away any you start over providing a > correct implementation with fixed point integer and then we start > optimizing step by step (while testing against SBC conformance) or how > should we continue. For sure we have to fix our codec. > > Regards > > Marcel > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-11-28 13:35 [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Jaska Uimonen 2008-11-28 14:18 ` Marcel Holtmann @ 2008-12-02 20:15 ` Jim Carter 2008-12-12 17:14 ` Siarhei Siamashka 1 sibling, 1 reply; 35+ messages in thread From: Jim Carter @ 2008-12-02 20:15 UTC (permalink / raw) To: Jaska Uimonen; +Cc: linux-bluetooth On Fri, 28 Nov 2008, Jaska Uimonen wrote: > I did some testing on the current 8 band fixed point > encoding and it seems to attenuate frequencies below 800Hz > and above 18kHz. There might be some other stuff happening > also, because at least to me the bass seemed to lack some > "definition". I have the same subjective impression: on (inexpensive) Motorola HT-820 phones, A2DP is noticeably anemic compared to the same phones with the provided wire. > I didn't quite understand how the current tables are calculated > and how the filtering works so I wrote a new filtering function > and calculated new filter tables for it. It is written > using 16 bit fixed point without any platform specific optimizations. > I only unrolled some loops etc. I tried to follow the > flow chart in MPEG-1 annex c. This is very exciting! But it's been about 1.5 years since I read the A2DP spec so I can't remember: does the headphone require standard band breakpoints? Or are the breakpoints implicit in the encoding, so if you adjust them to make better use of the Bluetooth bandwidth, the headphone will automatically go along? James F. Carter Voice 310 825 2897 FAX 310 206 6673 UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555 Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key) ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-02 20:15 ` Jim Carter @ 2008-12-12 17:14 ` Siarhei Siamashka 2008-12-12 19:19 ` Brad Midgley 0 siblings, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-12 17:14 UTC (permalink / raw) To: Jaska Uimonen; +Cc: linux-bluetooth On Fri, 28 Nov 2008, Jaska Uimonen wrote: > I did some testing on the current 8 band fixed point > encoding and it seems to attenuate frequencies below 800Hz > and above 18kHz. There might be some other stuff happening > also, because at least to me the bass seemed to lack some > "definition". > I didn't quite understand how the current tables are calculated > and how the filtering works so I wrote a new filtering function > and calculated new filter tables for it. It is written > using 16 bit fixed point without any platform specific optimizations. > I only unrolled some loops etc. I tried to follow the > flow chart in MPEG-1 annex c. > +/* > + * to produce this Q15 format table: > + * > + * Get the filter coeffs from the spec and multiply them by 2^15. > + */ > +static const signed short _sbc_proto_fixed8[80] = { > + 0, 5, 11, 18, 26, 37, 48, 58, 65, 68, > + 65, 52, 29, -5, -54, -114, 185, 263, 342, 417, > + 480, 521, 531, 501, 423, 290, 95, -161, -479, -855, > + -1280, -1742, 2228, 2719, 3197, 3643, 4039, 4366, 4612, 4764, > + 4815, 4764, 4612, 4366, 4039, 3643, 3197, 2719, -2228, -1742, > + -1280, -855, -479, -161, 95, 290, 423, 501, 531, 521, > + 480, 417, 342, 263, -185, -114, -54, -5, 29, 52, > + 65, 68, 65, 58, 48, 37, 26, 18, 11, 5 > +}; Just remembered to check this. Precision and audio quality should be a bit better if the original floating values are not truncated, but rounded when put into tables. For example, the fifth element is 26 in _sbc_proto_fixed8 table, but it was 26.998194372608 after multiplication and before conversion to integer. Using 27 would have been a bit more correct here. -- Best regards, Siarhei Siamashka ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-12 17:14 ` Siarhei Siamashka @ 2008-12-12 19:19 ` Brad Midgley 2008-12-15 12:54 ` Siarhei Siamashka 0 siblings, 1 reply; 35+ messages in thread From: Brad Midgley @ 2008-12-12 19:19 UTC (permalink / raw) To: Siarhei Siamashka; +Cc: Jaska Uimonen, linux-bluetooth Guys One mistake we made was not keeping track of what functions we were applying to the published tables as we worked things to look less and less like the pseudocode in the spec. So if you start again from one of the tables in the spec, keep a comment like /* ourtable(x) = (int32)(table1(x) << 16) */ Which in this case would mean that in ourtable we've shifted the original table1 float value left 16 bits and truncated it to an int32. I also combined tables or split tables to simplify our loop logic or eliminate operations; an explanation in a comment would have been appropriate. Brad On Fri, Dec 12, 2008 at 10:14 AM, Siarhei Siamashka <siarhei.siamashka@nokia.com> wrote: > On Fri, 28 Nov 2008, Jaska Uimonen wrote: >> I did some testing on the current 8 band fixed point >> encoding and it seems to attenuate frequencies below 800Hz >> and above 18kHz. There might be some other stuff happening >> also, because at least to me the bass seemed to lack some >> "definition". >> I didn't quite understand how the current tables are calculated >> and how the filtering works so I wrote a new filtering function >> and calculated new filter tables for it. It is written >> using 16 bit fixed point without any platform specific optimizations. >> I only unrolled some loops etc. I tried to follow the >> flow chart in MPEG-1 annex c. > >> +/* >> + * to produce this Q15 format table: >> + * >> + * Get the filter coeffs from the spec and multiply them by 2^15. >> + */ >> +static const signed short _sbc_proto_fixed8[80] = { >> + 0, 5, 11, 18, 26, 37, 48, 58, 65, 68, >> + 65, 52, 29, -5, -54, -114, 185, 263, 342, 417, >> + 480, 521, 531, 501, 423, 290, 95, -161, -479, -855, >> + -1280, -1742, 2228, 2719, 3197, 3643, 4039, 4366, 4612, 4764, >> + 4815, 4764, 4612, 4366, 4039, 3643, 3197, 2719, -2228, -1742, >> + -1280, -855, -479, -161, 95, 290, 423, 501, 531, 521, >> + 480, 417, 342, 263, -185, -114, -54, -5, 29, 52, >> + 65, 68, 65, 58, 48, 37, 26, 18, 11, 5 >> +}; > > Just remembered to check this. Precision and audio quality should be a bit > better if the original floating values are not truncated, but rounded when put > into tables. > > For example, the fifth element is 26 in _sbc_proto_fixed8 table, but it was > 26.998194372608 after multiplication and before conversion to integer. > Using 27 would have been a bit more correct here. > > -- > Best regards, > Siarhei Siamashka > -- > To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Brad Midgley ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-12 19:19 ` Brad Midgley @ 2008-12-15 12:54 ` Siarhei Siamashka 2008-12-15 15:16 ` Brad Midgley 0 siblings, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-15 12:54 UTC (permalink / raw) To: ext Brad Midgley; +Cc: Jaska Uimonen, linux-bluetooth On Friday 12 December 2008 21:19:20 ext Brad Midgley wrote: > Guys > > One mistake we made was not keeping track of what functions we were > applying to the published tables as we worked things to look less and > less like the pseudocode in the spec. So if you start again from one > of the tables in the spec, keep a comment like > > /* ourtable(x) = (int32)(table1(x) << 16) */ This part can be probably done just by having a macro, something like #define F_TO_Q15(x) (int16_t)((x>0) ? ((x)*(1<<15)+0.5) : ((x)*(1<<15)-0.5)) And then using plain floating point numbers from the SBC specification in the table, wrapped into this macro. Though I wonder if it is possible to use such conditional expression in the static table initializer list with all versions of gcc/other compilers. > Which in this case would mean that in ourtable we've shifted the > original table1 float value left 16 bits and truncated it to an int32. > > I also combined tables or split tables to simplify our loop logic or > eliminate operations; an explanation in a comment would have been > appropriate. Yes, any transformations or simplifications should be extensively commented. So that it will be always possible to reproduce them or verify their correctness. Can anybody try to remember/explain what transformations were applied to the existing fixed point implementation? -- Best regards, Siarhei Siamashka ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-15 12:54 ` Siarhei Siamashka @ 2008-12-15 15:16 ` Brad Midgley 2008-12-16 22:37 ` Siarhei Siamashka 0 siblings, 1 reply; 35+ messages in thread From: Brad Midgley @ 2008-12-15 15:16 UTC (permalink / raw) To: Siarhei Siamashka; +Cc: Jaska Uimonen, linux-bluetooth Siarhei I like your idea of using a macro with the original floating point tables, as long as we know it is done at compile time, not runtime :) > Can anybody try to remember/explain what transformations were applied to > the existing fixed point implementation? it was done by several people and the only record we have is in cvs. (part of it is in the old btsco project's cvs) -- Brad Midgley ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-15 15:16 ` Brad Midgley @ 2008-12-16 22:37 ` Siarhei Siamashka 2008-12-17 8:16 ` Jaska Uimonen 2008-12-19 22:12 ` Siarhei Siamashka 0 siblings, 2 replies; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-16 22:37 UTC (permalink / raw) To: ext Brad Midgley; +Cc: Jaska Uimonen, linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 1667 bytes --] On Monday 15 December 2008 17:16:58 ext Brad Midgley wrote: > I like your idea of using a macro with the original floating point > tables, as long as we know it is done at compile time, not runtime :) What about something like this modification to Jaska's patch? It contains floating point constants wrapped into a macro. This version is using 16-bit multiplications only (additional natural change would be just to convert 'sbc_encoder_state->' to int16_t because it does not need to be int32_t), which is good for performance for the platforms with fast 16-bit integer multiplication. But it is also flexible enough to be changed to use 32x32->64 multiplications just by replacing FIXED_A and FIXED_T types to int64_t and int32_t respectively (for better precision or experiments with conformance testing). > > Can anybody try to remember/explain what transformations were applied to > > the existing fixed point implementation? > > it was done by several people and the only record we have is in cvs. > (part of it is in the old btsco project's cvs) Regarding the code optimizations. Looking at the tables, It can be seen that 'cos_table_fixed_8[0+hop]' is always equal to 'cos_table_fixed_8[8+hop]'. The same is true for 'cos_table_fixed_8[1+hop]' and 'cos_table_fixed_8[7+hop]' So it is possible to join 't1[0] + t1[8]', 't1[1]+ t1[7]' and the other such pairs, effectively halving the number of counters. This looks very much like the optimization that was applied to the current fixed point code :) But now it would be very interesting to see if the conformance tests pass rate is better with the new filtering function. Best regards, Siarhei Siamashka [-- Attachment #2: analyze_eight_modified.diff --] [-- Type: text/x-diff, Size: 13939 bytes --] diff --git a/sbc/sbc.c b/sbc/sbc.c index 7072673..656abe5 100644 --- a/sbc/sbc.c +++ b/sbc/sbc.c @@ -40,6 +40,7 @@ #include <string.h> #include <stdlib.h> #include <sys/types.h> +#include <limits.h> #include "sbc_math.h" #include "sbc_tables.h" @@ -742,124 +743,107 @@ static inline void sbc_analyze_four(struct sbc_encoder_state *state, static inline void _sbc_analyze_eight(const int32_t *in, int32_t *out) { - sbc_fixed_t t[8], s[8]; - - t[0] = SCALE8_STAGE1( /* Q10 */ - MULA(_sbc_proto_8[0], (in[16] - in[64]), /* Q18 = Q18 * Q0 */ - MULA(_sbc_proto_8[1], (in[32] - in[48]), - MULA(_sbc_proto_8[2], in[4], - MULA(_sbc_proto_8[3], in[20], - MULA(_sbc_proto_8[4], in[36], - MUL( _sbc_proto_8[5], in[52]))))))); - - t[1] = SCALE8_STAGE1( - MULA(_sbc_proto_8[6], in[2], - MULA(_sbc_proto_8[7], in[18], - MULA(_sbc_proto_8[8], in[34], - MULA(_sbc_proto_8[9], in[50], - MUL(_sbc_proto_8[10], in[66])))))); - - t[2] = SCALE8_STAGE1( - MULA(_sbc_proto_8[11], in[1], - MULA(_sbc_proto_8[12], in[17], - MULA(_sbc_proto_8[13], in[33], - MULA(_sbc_proto_8[14], in[49], - MULA(_sbc_proto_8[15], in[65], - MULA(_sbc_proto_8[16], in[3], - MULA(_sbc_proto_8[17], in[19], - MULA(_sbc_proto_8[18], in[35], - MULA(_sbc_proto_8[19], in[51], - MUL( _sbc_proto_8[20], in[67]))))))))))); - - t[3] = SCALE8_STAGE1( - MULA( _sbc_proto_8[21], in[5], - MULA( _sbc_proto_8[22], in[21], - MULA( _sbc_proto_8[23], in[37], - MULA( _sbc_proto_8[24], in[53], - MULA( _sbc_proto_8[25], in[69], - MULA(-_sbc_proto_8[15], in[15], - MULA(-_sbc_proto_8[14], in[31], - MULA(-_sbc_proto_8[13], in[47], - MULA(-_sbc_proto_8[12], in[63], - MUL( -_sbc_proto_8[11], in[79]))))))))))); - - t[4] = SCALE8_STAGE1( - MULA( _sbc_proto_8[26], in[6], - MULA( _sbc_proto_8[27], in[22], - MULA( _sbc_proto_8[28], in[38], - MULA( _sbc_proto_8[29], in[54], - MULA( _sbc_proto_8[30], in[70], - MULA(-_sbc_proto_8[10], in[14], - MULA(-_sbc_proto_8[9], in[30], - MULA(-_sbc_proto_8[8], in[46], - MULA(-_sbc_proto_8[7], in[62], - MUL( -_sbc_proto_8[6], in[78]))))))))))); - - t[5] = SCALE8_STAGE1( - MULA( _sbc_proto_8[31], in[7], - MULA( _sbc_proto_8[32], in[23], - MULA( _sbc_proto_8[33], in[39], - MULA( _sbc_proto_8[34], in[55], - MULA( _sbc_proto_8[35], in[71], - MULA(-_sbc_proto_8[20], in[13], - MULA(-_sbc_proto_8[19], in[29], - MULA(-_sbc_proto_8[18], in[45], - MULA(-_sbc_proto_8[17], in[61], - MUL( -_sbc_proto_8[16], in[77]))))))))))); - - t[6] = SCALE8_STAGE1( - MULA( _sbc_proto_8[36], (in[8] + in[72]), - MULA( _sbc_proto_8[37], (in[24] + in[56]), - MULA( _sbc_proto_8[38], in[40], - MULA(-_sbc_proto_8[39], in[12], - MULA(-_sbc_proto_8[5], in[28], - MULA(-_sbc_proto_8[4], in[44], - MULA(-_sbc_proto_8[3], in[60], - MUL( -_sbc_proto_8[2], in[76]))))))))); - - t[7] = SCALE8_STAGE1( - MULA( _sbc_proto_8[35], in[9], - MULA( _sbc_proto_8[34], in[25], - MULA( _sbc_proto_8[33], in[41], - MULA( _sbc_proto_8[32], in[57], - MULA( _sbc_proto_8[31], in[73], - MULA(-_sbc_proto_8[25], in[11], - MULA(-_sbc_proto_8[24], in[27], - MULA(-_sbc_proto_8[23], in[43], - MULA(-_sbc_proto_8[22], in[59], - MUL( -_sbc_proto_8[21], in[75]))))))))))); - - s[0] = MULA( _anamatrix8[0], t[0], - MUL( _anamatrix8[1], t[6])); - s[1] = MUL( _anamatrix8[7], t[1]); - s[2] = MULA( _anamatrix8[2], t[2], - MULA( _anamatrix8[3], t[3], - MULA( _anamatrix8[4], t[5], - MUL( _anamatrix8[5], t[7])))); - s[3] = MUL( _anamatrix8[6], t[4]); - s[4] = MULA( _anamatrix8[3], t[2], - MULA(-_anamatrix8[5], t[3], - MULA(-_anamatrix8[2], t[5], - MUL( -_anamatrix8[4], t[7])))); - s[5] = MULA( _anamatrix8[4], t[2], - MULA(-_anamatrix8[2], t[3], - MULA( _anamatrix8[5], t[5], - MUL( _anamatrix8[3], t[7])))); - s[6] = MULA( _anamatrix8[1], t[0], - MUL( -_anamatrix8[0], t[6])); - s[7] = MULA( _anamatrix8[5], t[2], - MULA(-_anamatrix8[4], t[3], - MULA( _anamatrix8[3], t[5], - MUL( -_anamatrix8[2], t[7])))); - - out[0] = SCALE8_STAGE2( s[0] + s[1] + s[2] + s[3]); - out[1] = SCALE8_STAGE2( s[1] - s[3] + s[4] + s[6]); - out[2] = SCALE8_STAGE2( s[1] - s[3] + s[5] - s[6]); - out[3] = SCALE8_STAGE2(-s[0] + s[1] + s[3] + s[7]); - out[4] = SCALE8_STAGE2(-s[0] + s[1] + s[3] - s[7]); - out[5] = SCALE8_STAGE2( s[1] - s[3] - s[5] - s[6]); - out[6] = SCALE8_STAGE2( s[1] - s[3] - s[4] + s[6]); - out[7] = SCALE8_STAGE2( s[0] + s[1] - s[2] + s[3]); + FIXED_A t1[16]; + FIXED_T t2[16]; + int i = 0, hop = 0, R = 0; + + /* rounding coefficient */ + R = 1 << (SBC_PROTO_FIXED8_SCALE-1); + + /* low pass polyphase filter */ + t1[0] = (FIXED_A)in[0] * _sbc_proto_fixed8[0]; + t1[1] = (FIXED_A)in[1] * _sbc_proto_fixed8[1]; + t1[2] = (FIXED_A)in[2] * _sbc_proto_fixed8[2]; + t1[3] = (FIXED_A)in[3] * _sbc_proto_fixed8[3]; + t1[4] = (FIXED_A)in[4] * _sbc_proto_fixed8[4]; + t1[5] = (FIXED_A)in[5] * _sbc_proto_fixed8[5]; + t1[6] = (FIXED_A)in[6] * _sbc_proto_fixed8[6]; + t1[7] = (FIXED_A)in[7] * _sbc_proto_fixed8[7]; + t1[8] = (FIXED_A)in[8] * _sbc_proto_fixed8[8]; + t1[9] = (FIXED_A)in[9] * _sbc_proto_fixed8[9]; + t1[10] = (FIXED_A)in[10] * _sbc_proto_fixed8[10]; + t1[11] = (FIXED_A)in[11] * _sbc_proto_fixed8[11]; + /* t1[12] = (FIXED_A)in[12] * _sbc_proto_fixed8[12]; */ + t1[13] = (FIXED_A)in[13] * _sbc_proto_fixed8[13]; + t1[14] = (FIXED_A)in[14] * _sbc_proto_fixed8[14]; + t1[15] = (FIXED_A)in[15] * _sbc_proto_fixed8[15]; + + hop = 16; + for (i = 0; i < 4; i++) { + t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed8[hop]; + t1[1] += (FIXED_A)in[hop + 1] * _sbc_proto_fixed8[hop + 1]; + t1[2] += (FIXED_A)in[hop + 2] * _sbc_proto_fixed8[hop + 2]; + t1[3] += (FIXED_A)in[hop + 3] * _sbc_proto_fixed8[hop + 3]; + t1[4] += (FIXED_A)in[hop + 4] * _sbc_proto_fixed8[hop + 4]; + t1[5] += (FIXED_A)in[hop + 5] * _sbc_proto_fixed8[hop + 5]; + t1[6] += (FIXED_A)in[hop + 6] * _sbc_proto_fixed8[hop + 6]; + t1[7] += (FIXED_A)in[hop + 7] * _sbc_proto_fixed8[hop + 7]; + t1[8] += (FIXED_A)in[hop + 8] * _sbc_proto_fixed8[hop + 8]; + t1[9] += (FIXED_A)in[hop + 9] * _sbc_proto_fixed8[hop + 9]; + t1[10] += (FIXED_A)in[hop + 10] * _sbc_proto_fixed8[hop + 10]; + t1[11] += (FIXED_A)in[hop + 11] * _sbc_proto_fixed8[hop + 11]; + /* t1[12] += (FIXED_A)in[hop + 12] * _sbc_proto_fixed8[hop + 12]; */ + t1[13] += (FIXED_A)in[hop + 13] * _sbc_proto_fixed8[hop + 13]; + t1[14] += (FIXED_A)in[hop + 14] * _sbc_proto_fixed8[hop + 14]; + t1[15] += (FIXED_A)in[hop + 15] * _sbc_proto_fixed8[hop + 15]; + + hop += 16; + } + + /* scaling */ + t2[0] = (t1[0] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[1] = (t1[1] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[2] = (t1[2] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[3] = (t1[3] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[4] = (t1[4] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[5] = (t1[5] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[6] = (t1[6] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[7] = (t1[7] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[8] = (t1[8] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[9] = (t1[9] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[10] = (t1[10] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[11] = (t1[11] + R) >> SBC_PROTO_FIXED8_SCALE; + /* t2[12] = (t1[12] + R) >> SBC_PROTO_FIXED8_SCALE; */ + t2[13] = (t1[13] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[14] = (t1[14] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[15] = (t1[15] + R) >> SBC_PROTO_FIXED8_SCALE; + + R = 1 << (SBC_COS_TABLE_FIXED8_SCALE-1); + + /* do the cos transform */ + hop = 0; + for (i = 0; i < 8; i++) { + t1[i] = (FIXED_A)t2[0] * cos_table_fixed_8[0 + hop]; + t1[i] += (FIXED_A)t2[1] * cos_table_fixed_8[1 + hop]; + t1[i] += (FIXED_A)t2[2] * cos_table_fixed_8[2 + hop]; + t1[i] += (FIXED_A)t2[3] * cos_table_fixed_8[3 + hop]; + /* cos_table_fixed_8[4 + hop] = 1.0 */ + t1[i] += (FIXED_A)t2[4] << SBC_COS_TABLE_FIXED8_SCALE; + t1[i] += (FIXED_A)t2[5] * cos_table_fixed_8[5 + hop]; + t1[i] += (FIXED_A)t2[6] * cos_table_fixed_8[6 + hop]; + t1[i] += (FIXED_A)t2[7] * cos_table_fixed_8[7 + hop]; + t1[i] += (FIXED_A)t2[8] * cos_table_fixed_8[8 + hop]; + t1[i] += (FIXED_A)t2[9] * cos_table_fixed_8[9 + hop]; + t1[i] += (FIXED_A)t2[10] * cos_table_fixed_8[10 + hop]; + t1[i] += (FIXED_A)t2[11] * cos_table_fixed_8[11 + hop]; + /* cos_table_fixed_8[12 + hop] = 0.0 */ + /* t1[i] += (FIXED_A)t2[12] * cos_table_fixed_8[12 + hop]; */ + t1[i] += (FIXED_A)t2[13] * cos_table_fixed_8[13 + hop]; + t1[i] += (FIXED_A)t2[14] * cos_table_fixed_8[14 + hop]; + t1[i] += (FIXED_A)t2[15] * cos_table_fixed_8[15 + hop]; + + hop += 16; + } + + /* scaling */ + out[0] = (t1[0] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[1] = (t1[1] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[2] = (t1[2] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[3] = (t1[3] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[4] = (t1[4] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[5] = (t1[5] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[6] = (t1[6] + R) >> SBC_COS_TABLE_FIXED8_SCALE; + out[7] = (t1[7] + R) >> SBC_COS_TABLE_FIXED8_SCALE; } static inline void sbc_analyze_eight(struct sbc_encoder_state *state, diff --git a/sbc/sbc_tables.h b/sbc/sbc_tables.h index f5daaa7..6e63f48 100644 --- a/sbc/sbc_tables.h +++ b/sbc/sbc_tables.h @@ -166,3 +166,81 @@ static const int32_t synmatrix8[16][8] = { { SN8(0xf9592678), SN8(0x018f8b84), SN8(0x07d8a5f0), SN8(0x0471ced0), SN8(0xfb8e3130), SN8(0xf8275a10), SN8(0xfe70747c), SN8(0x06a6d988) } }; + +/* data type for fixed point accumulator */ +#define FIXED_A int32_t +/* data type for fixed point constants */ +#define FIXED_T int16_t + +/* A2DP specification: Section 12.8 Tables */ +#define SBC_PROTO_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT - 1) +#define F(x) (FIXED_T)((x)*((FIXED_A)1<<SBC_PROTO_FIXED8_SCALE)+0.5) +static const FIXED_T _sbc_proto_fixed8[80] = { + F(0.00000000E+00), F(1.56575398E-04), F(3.43256425E-04), F(5.54620202E-04), + F(8.23919506E-04), F(1.13992507E-03), F(1.47640169E-03), F(1.78371725E-03), + F(2.01182542E-03), F(2.10371989E-03), F(1.99454554E-03), F(1.61656283E-03), + F(9.02154502E-04),-F(1.78805361E-04),-F(1.64973098E-03),-F(3.49717454E-03), + F(5.65949473E-03), F(8.02941163E-03), F(1.04584443E-02), F(1.27472335E-02), + F(1.46525263E-02), F(1.59045603E-02), F(1.62208471E-02), F(1.53184106E-02), + F(1.29371806E-02), F(8.85757540E-03), F(2.92408442E-03),-F(4.91578024E-03), + -F(1.46404076E-02),-F(2.61098752E-02),-F(3.90751381E-02),-F(5.31873032E-02), + F(6.79989431E-02), F(8.29847578E-02), F(9.75753918E-02), F(1.11196689E-01), + F(1.23264548E-01), F(1.33264415E-01), F(1.40753505E-01), F(1.45389847E-01), + F(1.46955068E-01), F(1.45389847E-01), F(1.40753505E-01), F(1.33264415E-01), + F(1.23264548E-01), F(1.11196689E-01), F(9.75753918E-02), F(8.29847578E-02), + -F(6.79989431E-02),-F(5.31873032E-02),-F(3.90751381E-02),-F(2.61098752E-02), + -F(1.46404076E-02),-F(4.91578024E-03), F(2.92408442E-03), F(8.85757540E-03), + F(1.29371806E-02), F(1.53184106E-02), F(1.62208471E-02), F(1.59045603E-02), + F(1.46525263E-02), F(1.27472335E-02), F(1.04584443E-02), F(8.02941163E-03), + -F(5.65949473E-03),-F(3.49717454E-03),-F(1.64973098E-03),-F(1.78805361E-04), + F(9.02154502E-04), F(1.61656283E-03), F(1.99454554E-03), F(2.10371989E-03), + F(2.01182542E-03), F(1.78371725E-03), F(1.47640169E-03), F(1.13992507E-03), + F(8.23919506E-04), F(5.54620202E-04), F(3.43256425E-04), F(1.56575398E-04), +}; +#undef F + +/* + * To produce this cosine matrix in Octave: + * + * b = zeros(8, 16); + * for i = 0:7 for j = 0:15 b(i+1, j+1) = cos( (i + 0.5) * (j - 4) * (pi/8) ) endfor endfor; + * printf("%.10f, ", b'); + * + */ +#define SBC_COS_TABLE_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT - 1) +#define F(x) (FIXED_T)((x)*((FIXED_A)1<<SBC_COS_TABLE_FIXED8_SCALE)+0.5) +static const FIXED_T cos_table_fixed_8[128] = { + F(0.7071067812), F(0.8314696123), F(0.9238795325), F(0.9807852804), + F(1.0000000000), F(0.9807852804), F(0.9238795325), F(0.8314696123), + F(0.7071067812), F(0.5555702330), F(0.3826834324), F(0.1950903220), + F(0.0000000000),-F(0.1950903220),-F(0.3826834324),-F(0.5555702330), + -F(0.7071067812),-F(0.1950903220), F(0.3826834324), F(0.8314696123), + F(1.0000000000), F(0.8314696123), F(0.3826834324),-F(0.1950903220), + -F(0.7071067812),-F(0.9807852804),-F(0.9238795325),-F(0.5555702330), + -F(0.0000000000), F(0.5555702330), F(0.9238795325), F(0.9807852804), + -F(0.7071067812),-F(0.9807852804),-F(0.3826834324), F(0.5555702330), + F(1.0000000000), F(0.5555702330),-F(0.3826834324),-F(0.9807852804), + -F(0.7071067812), F(0.1950903220), F(0.9238795325), F(0.8314696123), + F(0.0000000000),-F(0.8314696123),-F(0.9238795325),-F(0.1950903220), + F(0.7071067812),-F(0.5555702330),-F(0.9238795325), F(0.1950903220), + F(1.0000000000), F(0.1950903220),-F(0.9238795325),-F(0.5555702330), + F(0.7071067812), F(0.8314696123),-F(0.3826834324),-F(0.9807852804), + -F(0.0000000000), F(0.9807852804), F(0.3826834324),-F(0.8314696123), + F(0.7071067812), F(0.5555702330),-F(0.9238795325),-F(0.1950903220), + F(1.0000000000),-F(0.1950903220),-F(0.9238795325), F(0.5555702330), + F(0.7071067812),-F(0.8314696123),-F(0.3826834324), F(0.9807852804), + F(0.0000000000),-F(0.9807852804), F(0.3826834324), F(0.8314696123), + -F(0.7071067812), F(0.9807852804),-F(0.3826834324),-F(0.5555702330), + F(1.0000000000),-F(0.5555702330),-F(0.3826834324), F(0.9807852804), + -F(0.7071067812),-F(0.1950903220), F(0.9238795325),-F(0.8314696123), + -F(0.0000000000), F(0.8314696123),-F(0.9238795325), F(0.1950903220), + -F(0.7071067812), F(0.1950903220), F(0.3826834324),-F(0.8314696123), + F(1.0000000000),-F(0.8314696123), F(0.3826834324), F(0.1950903220), + -F(0.7071067812), F(0.9807852804),-F(0.9238795325), F(0.5555702330), + -F(0.0000000000),-F(0.5555702330), F(0.9238795325),-F(0.9807852804), + F(0.7071067812),-F(0.8314696123), F(0.9238795325),-F(0.9807852804), + F(1.0000000000),-F(0.9807852804), F(0.9238795325),-F(0.8314696123), + F(0.7071067812),-F(0.5555702330), F(0.3826834324),-F(0.1950903220), + -F(0.0000000000), F(0.1950903220),-F(0.3826834324), F(0.5555702330), +}; +#undef F ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-16 22:37 ` Siarhei Siamashka @ 2008-12-17 8:16 ` Jaska Uimonen 2008-12-19 22:12 ` Siarhei Siamashka 1 sibling, 0 replies; 35+ messages in thread From: Jaska Uimonen @ 2008-12-17 8:16 UTC (permalink / raw) To: Siarhei Siamashka; +Cc: ext Brad Midgley, linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 2905 bytes --] Hello All, Sorry about the silence, I was away from the office for couple of weeks. Anyway here is the original patch modified to suit the bluez coding conventions. There might still be something (I'm quite bad with these issues...) I also added Siarhei's rounding stuff to the filter tables. Good point Siarhei! Concerning Brad's comments I think the calculation of tables should now be quite clear. There are comments on the tables how to produce them in Octave and the filtering function is very basic fixed point Q15 calculation. I agree that including the original float values to the code would probably be a good idea. Then we could just use some macro at compile time to modify them to suitable Q format and original values would always be there to be viewed and compared to. I would still be careful about the optimization of the filtering, so that we don't obfuscate the code too much. The filter and cosine table are quite small so we should consider how much we save in computation. Anyway if the savings are significant let's just add them there. br, Jaska Uimonen On Wed, 2008-12-17 at 00:37 +0200, Siarhei Siamashka wrote: > On Monday 15 December 2008 17:16:58 ext Brad Midgley wrote: > > I like your idea of using a macro with the original floating point > > tables, as long as we know it is done at compile time, not runtime :) > > What about something like this modification to Jaska's patch? It contains > floating point constants wrapped into a macro. > > This version is using 16-bit multiplications only (additional natural change > would be just to convert 'sbc_encoder_state->' to int16_t because it does not > need to be int32_t), which is good for performance for the platforms with fast > 16-bit integer multiplication. But it is also flexible enough to be changed to > use 32x32->64 multiplications just by replacing FIXED_A and FIXED_T types > to int64_t and int32_t respectively (for better precision or experiments with > conformance testing). > > > > Can anybody try to remember/explain what transformations were applied to > > > the existing fixed point implementation? > > > > it was done by several people and the only record we have is in cvs. > > (part of it is in the old btsco project's cvs) > > Regarding the code optimizations. Looking at the tables, It can be seen that > 'cos_table_fixed_8[0+hop]' is always equal to 'cos_table_fixed_8[8+hop]'. > The same is true for 'cos_table_fixed_8[1+hop]' and 'cos_table_fixed_8[7+hop]' > So it is possible to join 't1[0] + t1[8]', 't1[1]+ t1[7]' and the other such > pairs, effectively halving the number of counters. This looks very much like > the optimization that was applied to the current fixed point code :) > > But now it would be very interesting to see if the conformance tests pass > rate is better with the new filtering function. > > > Best regards, > Siarhei Siamashka [-- Attachment #2: 0001-New-function-and-tables-for-8-band-fixed-point-analy.patch --] [-- Type: application/mbox, Size: 7221 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-16 22:37 ` Siarhei Siamashka 2008-12-17 8:16 ` Jaska Uimonen @ 2008-12-19 22:12 ` Siarhei Siamashka 2008-12-22 23:30 ` Siarhei Siamashka 1 sibling, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-19 22:12 UTC (permalink / raw) To: ext Brad Midgley; +Cc: Jaska Uimonen, linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 3524 bytes --] On Wednesday 17 December 2008 00:37:48 ext Siarhei Siamashka wrote: > On Monday 15 December 2008 17:16:58 ext Brad Midgley wrote: > > I like your idea of using a macro with the original floating point > > tables, as long as we know it is done at compile time, not runtime :) > > What about something like this modification to Jaska's patch? It contains > floating point constants wrapped into a macro. > > This version is using 16-bit multiplications only (additional natural > change would be just to convert 'sbc_encoder_state->' to int16_t because it > does not need to be int32_t), which is good for performance for the > platforms with fast 16-bit integer multiplication. But it is also flexible > enough to be changed to use 32x32->64 multiplications just by replacing > FIXED_A and FIXED_T types to int64_t and int32_t respectively (for better > precision or experiments with conformance testing). > > > > Can anybody try to remember/explain what transformations were applied > > > to the existing fixed point implementation? > > > > it was done by several people and the only record we have is in cvs. > > (part of it is in the old btsco project's cvs) > > Regarding the code optimizations. Looking at the tables, It can be seen > that 'cos_table_fixed_8[0+hop]' is always equal to > 'cos_table_fixed_8[8+hop]'. The same is true for 'cos_table_fixed_8[1+hop]' > and 'cos_table_fixed_8[7+hop]' So it is possible to join 't1[0] + t1[8]', > 't1[1]+ t1[7]' and the other such pairs, effectively halving the number of > counters. This looks very much like the optimization that was applied to > the current fixed point code :) > > But now it would be very interesting to see if the conformance tests pass > rate is better with the new filtering function. Here is one more attempt at improving filtering function. Now I tried to get the best possible audio quality (using 32-bit fixed point). 16-bit version of filtering function can be enabled by just commenting out '#define SBC_HIGH_PRECISION' line The improvements include fixing a problem in scalefactors processing code. Here we don't want to use the absolute value just because it is possible to encode more negative values than positive values with the same number of bits: - while (scalefactor[ch][sb] < fabs(frame->sb_sample_f[blk][ch][sb])) { + while ((scalefactor[ch][sb] << SCALE_OUT_BITS) <= neginv(frame->sb_sample_f[blk][ch][sb])) { Another quality improvement is achieved by keeping more bits in the output of filtering function, thus avoiding unnecessary precision loss on quantizing stage. Both of these changes also naturally improve audio quality for the 16-bit variant. We had a talk with Jaska Uimonen here, and now I'm kind of delegated to finish the work on this filtering function for SBC encoder (including the final addition of ARM assembly optimizations). He provided me with his last variant of code, which contains some more optimizations to reduce the number of operations and also loops unrolling. I will add his changes to the patch on next iteration. Now the question is how to best integrate a fixed filtering function to git repository? If I just continue adding changes to the patch in order to make it a faster, it will be also not so obvious to see how we got to these code transformations just from the commit log. I intentionally keep posting work-in-progress variants just to keep track of the history at least in this mailing list archive :) As always, feedback is very much welcome. Best regards, Siarhei Siamashka [-- Attachment #2: analyze_eight_modified_32bit.diff --] [-- Type: text/x-diff, Size: 17214 bytes --] diff --git a/sbc/sbc.c b/sbc/sbc.c index 5411893..873c370 100644 --- a/sbc/sbc.c +++ b/sbc/sbc.c @@ -40,6 +40,7 @@ #include <string.h> #include <stdlib.h> #include <sys/types.h> +#include <limits.h> #include "sbc_math.h" #include "sbc_tables.h" @@ -742,124 +743,108 @@ static inline void sbc_analyze_four(struct sbc_encoder_state *state, static inline void _sbc_analyze_eight(const int32_t *in, int32_t *out) { - sbc_fixed_t t[8], s[8]; - - t[0] = SCALE8_STAGE1( /* Q10 */ - MULA(_sbc_proto_8[0], (in[16] - in[64]), /* Q18 = Q18 * Q0 */ - MULA(_sbc_proto_8[1], (in[32] - in[48]), - MULA(_sbc_proto_8[2], in[4], - MULA(_sbc_proto_8[3], in[20], - MULA(_sbc_proto_8[4], in[36], - MUL( _sbc_proto_8[5], in[52]))))))); - - t[1] = SCALE8_STAGE1( - MULA(_sbc_proto_8[6], in[2], - MULA(_sbc_proto_8[7], in[18], - MULA(_sbc_proto_8[8], in[34], - MULA(_sbc_proto_8[9], in[50], - MUL(_sbc_proto_8[10], in[66])))))); - - t[2] = SCALE8_STAGE1( - MULA(_sbc_proto_8[11], in[1], - MULA(_sbc_proto_8[12], in[17], - MULA(_sbc_proto_8[13], in[33], - MULA(_sbc_proto_8[14], in[49], - MULA(_sbc_proto_8[15], in[65], - MULA(_sbc_proto_8[16], in[3], - MULA(_sbc_proto_8[17], in[19], - MULA(_sbc_proto_8[18], in[35], - MULA(_sbc_proto_8[19], in[51], - MUL( _sbc_proto_8[20], in[67]))))))))))); - - t[3] = SCALE8_STAGE1( - MULA( _sbc_proto_8[21], in[5], - MULA( _sbc_proto_8[22], in[21], - MULA( _sbc_proto_8[23], in[37], - MULA( _sbc_proto_8[24], in[53], - MULA( _sbc_proto_8[25], in[69], - MULA(-_sbc_proto_8[15], in[15], - MULA(-_sbc_proto_8[14], in[31], - MULA(-_sbc_proto_8[13], in[47], - MULA(-_sbc_proto_8[12], in[63], - MUL( -_sbc_proto_8[11], in[79]))))))))))); - - t[4] = SCALE8_STAGE1( - MULA( _sbc_proto_8[26], in[6], - MULA( _sbc_proto_8[27], in[22], - MULA( _sbc_proto_8[28], in[38], - MULA( _sbc_proto_8[29], in[54], - MULA( _sbc_proto_8[30], in[70], - MULA(-_sbc_proto_8[10], in[14], - MULA(-_sbc_proto_8[9], in[30], - MULA(-_sbc_proto_8[8], in[46], - MULA(-_sbc_proto_8[7], in[62], - MUL( -_sbc_proto_8[6], in[78]))))))))))); - - t[5] = SCALE8_STAGE1( - MULA( _sbc_proto_8[31], in[7], - MULA( _sbc_proto_8[32], in[23], - MULA( _sbc_proto_8[33], in[39], - MULA( _sbc_proto_8[34], in[55], - MULA( _sbc_proto_8[35], in[71], - MULA(-_sbc_proto_8[20], in[13], - MULA(-_sbc_proto_8[19], in[29], - MULA(-_sbc_proto_8[18], in[45], - MULA(-_sbc_proto_8[17], in[61], - MUL( -_sbc_proto_8[16], in[77]))))))))))); - - t[6] = SCALE8_STAGE1( - MULA( _sbc_proto_8[36], (in[8] + in[72]), - MULA( _sbc_proto_8[37], (in[24] + in[56]), - MULA( _sbc_proto_8[38], in[40], - MULA(-_sbc_proto_8[39], in[12], - MULA(-_sbc_proto_8[5], in[28], - MULA(-_sbc_proto_8[4], in[44], - MULA(-_sbc_proto_8[3], in[60], - MUL( -_sbc_proto_8[2], in[76]))))))))); - - t[7] = SCALE8_STAGE1( - MULA( _sbc_proto_8[35], in[9], - MULA( _sbc_proto_8[34], in[25], - MULA( _sbc_proto_8[33], in[41], - MULA( _sbc_proto_8[32], in[57], - MULA( _sbc_proto_8[31], in[73], - MULA(-_sbc_proto_8[25], in[11], - MULA(-_sbc_proto_8[24], in[27], - MULA(-_sbc_proto_8[23], in[43], - MULA(-_sbc_proto_8[22], in[59], - MUL( -_sbc_proto_8[21], in[75]))))))))))); - - s[0] = MULA( _anamatrix8[0], t[0], - MUL( _anamatrix8[1], t[6])); - s[1] = MUL( _anamatrix8[7], t[1]); - s[2] = MULA( _anamatrix8[2], t[2], - MULA( _anamatrix8[3], t[3], - MULA( _anamatrix8[4], t[5], - MUL( _anamatrix8[5], t[7])))); - s[3] = MUL( _anamatrix8[6], t[4]); - s[4] = MULA( _anamatrix8[3], t[2], - MULA(-_anamatrix8[5], t[3], - MULA(-_anamatrix8[2], t[5], - MUL( -_anamatrix8[4], t[7])))); - s[5] = MULA( _anamatrix8[4], t[2], - MULA(-_anamatrix8[2], t[3], - MULA( _anamatrix8[5], t[5], - MUL( _anamatrix8[3], t[7])))); - s[6] = MULA( _anamatrix8[1], t[0], - MUL( -_anamatrix8[0], t[6])); - s[7] = MULA( _anamatrix8[5], t[2], - MULA(-_anamatrix8[4], t[3], - MULA( _anamatrix8[3], t[5], - MUL( -_anamatrix8[2], t[7])))); - - out[0] = SCALE8_STAGE2( s[0] + s[1] + s[2] + s[3]); - out[1] = SCALE8_STAGE2( s[1] - s[3] + s[4] + s[6]); - out[2] = SCALE8_STAGE2( s[1] - s[3] + s[5] - s[6]); - out[3] = SCALE8_STAGE2(-s[0] + s[1] + s[3] + s[7]); - out[4] = SCALE8_STAGE2(-s[0] + s[1] + s[3] - s[7]); - out[5] = SCALE8_STAGE2( s[1] - s[3] - s[5] - s[6]); - out[6] = SCALE8_STAGE2( s[1] - s[3] - s[4] + s[6]); - out[7] = SCALE8_STAGE2( s[0] + s[1] - s[2] + s[3]); + FIXED_A t1[16]; + FIXED_T t2[16]; + FIXED_A R; + int i, hop; + + /* rounding coefficient */ + R = (FIXED_A)1 << (SBC_PROTO_FIXED8_SCALE-1); + + /* low pass polyphase filter */ + t1[0] = (FIXED_A)in[0] * _sbc_proto_fixed8[0]; + t1[1] = (FIXED_A)in[1] * _sbc_proto_fixed8[1]; + t1[2] = (FIXED_A)in[2] * _sbc_proto_fixed8[2]; + t1[3] = (FIXED_A)in[3] * _sbc_proto_fixed8[3]; + t1[4] = (FIXED_A)in[4] * _sbc_proto_fixed8[4]; + t1[5] = (FIXED_A)in[5] * _sbc_proto_fixed8[5]; + t1[6] = (FIXED_A)in[6] * _sbc_proto_fixed8[6]; + t1[7] = (FIXED_A)in[7] * _sbc_proto_fixed8[7]; + t1[8] = (FIXED_A)in[8] * _sbc_proto_fixed8[8]; + t1[9] = (FIXED_A)in[9] * _sbc_proto_fixed8[9]; + t1[10] = (FIXED_A)in[10] * _sbc_proto_fixed8[10]; + t1[11] = (FIXED_A)in[11] * _sbc_proto_fixed8[11]; + /* t1[12] = (FIXED_A)in[12] * _sbc_proto_fixed8[12]; */ + t1[13] = (FIXED_A)in[13] * _sbc_proto_fixed8[13]; + t1[14] = (FIXED_A)in[14] * _sbc_proto_fixed8[14]; + t1[15] = (FIXED_A)in[15] * _sbc_proto_fixed8[15]; + + hop = 16; + for (i = 0; i < 4; i++) { + t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed8[hop]; + t1[1] += (FIXED_A)in[hop + 1] * _sbc_proto_fixed8[hop + 1]; + t1[2] += (FIXED_A)in[hop + 2] * _sbc_proto_fixed8[hop + 2]; + t1[3] += (FIXED_A)in[hop + 3] * _sbc_proto_fixed8[hop + 3]; + t1[4] += (FIXED_A)in[hop + 4] * _sbc_proto_fixed8[hop + 4]; + t1[5] += (FIXED_A)in[hop + 5] * _sbc_proto_fixed8[hop + 5]; + t1[6] += (FIXED_A)in[hop + 6] * _sbc_proto_fixed8[hop + 6]; + t1[7] += (FIXED_A)in[hop + 7] * _sbc_proto_fixed8[hop + 7]; + t1[8] += (FIXED_A)in[hop + 8] * _sbc_proto_fixed8[hop + 8]; + t1[9] += (FIXED_A)in[hop + 9] * _sbc_proto_fixed8[hop + 9]; + t1[10] += (FIXED_A)in[hop + 10] * _sbc_proto_fixed8[hop + 10]; + t1[11] += (FIXED_A)in[hop + 11] * _sbc_proto_fixed8[hop + 11]; + /* t1[12] += (FIXED_A)in[hop + 12] * _sbc_proto_fixed8[hop + 12]; */ + t1[13] += (FIXED_A)in[hop + 13] * _sbc_proto_fixed8[hop + 13]; + t1[14] += (FIXED_A)in[hop + 14] * _sbc_proto_fixed8[hop + 14]; + t1[15] += (FIXED_A)in[hop + 15] * _sbc_proto_fixed8[hop + 15]; + + hop += 16; + } + + /* scaling */ + t2[0] = (t1[0] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[1] = (t1[1] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[2] = (t1[2] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[3] = (t1[3] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[4] = (t1[4] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[5] = (t1[5] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[6] = (t1[6] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[7] = (t1[7] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[8] = (t1[8] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[9] = (t1[9] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[10] = (t1[10] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[11] = (t1[11] + R) >> SBC_PROTO_FIXED8_SCALE; + /* t2[12] = (t1[12] + R) >> SBC_PROTO_FIXED8_SCALE; */ + t2[13] = (t1[13] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[14] = (t1[14] + R) >> SBC_PROTO_FIXED8_SCALE; + t2[15] = (t1[15] + R) >> SBC_PROTO_FIXED8_SCALE; + + R = (FIXED_A)1 << (SBC_COS_TABLE_FIXED8_SCALE-1-SCALE_OUT_BITS); + + /* do the cos transform */ + hop = 0; + for (i = 0; i < 8; i++) { + t1[i] = (FIXED_A)t2[0] * cos_table_fixed_8[0 + hop]; + t1[i] += (FIXED_A)t2[1] * cos_table_fixed_8[1 + hop]; + t1[i] += (FIXED_A)t2[2] * cos_table_fixed_8[2 + hop]; + t1[i] += (FIXED_A)t2[3] * cos_table_fixed_8[3 + hop]; + /* cos_table_fixed_8[4 + hop] = 1.0 */ + t1[i] += (FIXED_A)t2[4] << (sizeof(FIXED_T)*CHAR_BIT-1); + t1[i] += (FIXED_A)t2[5] * cos_table_fixed_8[5 + hop]; + t1[i] += (FIXED_A)t2[6] * cos_table_fixed_8[6 + hop]; + t1[i] += (FIXED_A)t2[7] * cos_table_fixed_8[7 + hop]; + t1[i] += (FIXED_A)t2[8] * cos_table_fixed_8[8 + hop]; + t1[i] += (FIXED_A)t2[9] * cos_table_fixed_8[9 + hop]; + t1[i] += (FIXED_A)t2[10] * cos_table_fixed_8[10 + hop]; + t1[i] += (FIXED_A)t2[11] * cos_table_fixed_8[11 + hop]; + /* cos_table_fixed_8[12 + hop] = 0.0 */ + /* t1[i] += (FIXED_A)t2[12] * cos_table_fixed_8[12 + hop]; */ + t1[i] += (FIXED_A)t2[13] * cos_table_fixed_8[13 + hop]; + t1[i] += (FIXED_A)t2[14] * cos_table_fixed_8[14 + hop]; + t1[i] += (FIXED_A)t2[15] * cos_table_fixed_8[15 + hop]; + + hop += 16; + } + + /* scaling */ + out[0] = (t1[0] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[1] = (t1[1] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[2] = (t1[2] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[3] = (t1[3] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[4] = (t1[4] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[5] = (t1[5] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[6] = (t1[6] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[7] = (t1[7] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); } static inline void sbc_analyze_eight(struct sbc_encoder_state *state, @@ -1006,7 +991,7 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) frame->scale_factor[ch][sb] = 0; scalefactor[ch][sb] = 2; for (blk = 0; blk < frame->blocks; blk++) { - while (scalefactor[ch][sb] < fabs(frame->sb_sample_f[blk][ch][sb])) { + while ((scalefactor[ch][sb] << SCALE_OUT_BITS) <= neginv(frame->sb_sample_f[blk][ch][sb])) { frame->scale_factor[ch][sb]++; scalefactor[ch][sb] *= 2; } @@ -1040,11 +1025,11 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) frame->sb_sample_f[blk][1][sb]) >> 1; /* calculate scale_factor_j and scalefactor_j for joint case */ - while (scalefactor_j[0] < fabs(sb_sample_j[blk][0])) { + while ((scalefactor_j[0] << SCALE_OUT_BITS) <= neginv(sb_sample_j[blk][0])) { scale_factor_j[0]++; scalefactor_j[0] *= 2; } - while (scalefactor_j[1] < fabs(sb_sample_j[blk][1])) { + while ((scalefactor_j[1] << SCALE_OUT_BITS) <= neginv(sb_sample_j[blk][1])) { scale_factor_j[1]++; scalefactor_j[1] *= 2; } @@ -1100,11 +1085,11 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) for (ch = 0; ch < frame->channels; ch++) { for (sb = 0; sb < frame->subbands; sb++) { if (levels[ch][sb] > 0) { - audio_sample = - (uint16_t) (((((int64_t)frame->sb_sample_f[blk][ch][sb]*levels[ch][sb]) >> - (frame->scale_factor[ch][sb] + 1)) + - levels[ch][sb]) >> 1); - PUT_BITS(audio_sample & levels[ch][sb], bits[ch][sb]); + int32_t sample = frame->sb_sample_f[blk][ch][sb]; + int32_t s_shift = (frame->scale_factor[ch][sb] + 1 + SCALE_OUT_BITS); + int32_t ls = levels[ch][sb]; + audio_sample = ((((int64_t)1 << s_shift) + sample) * ls) >> (s_shift + 1); + PUT_BITS(audio_sample, bits[ch][sb]); } } } diff --git a/sbc/sbc_math.h b/sbc/sbc_math.h index b3d87a6..b53e3d1 100644 --- a/sbc/sbc_math.h +++ b/sbc/sbc_math.h @@ -23,12 +23,14 @@ * */ -#define fabs(x) ((x) < 0 ? -(x) : (x)) +#define neginv(x) ((x) < 0 ? ~(x) : (x)) /* C does not provide an explicit arithmetic shift right but this will always be correct and every compiler *should* generate optimal code */ #define ASR(val, bits) ((-2 >> 1 == -1) ? \ ((int32_t)(val)) >> (bits) : ((int32_t) (val)) / (1 << (bits))) +#define SCALE_OUT_BITS 14 + #define SCALE_PROTO4_TBL 15 #define SCALE_ANA4_TBL 17 #define SCALE_PROTO8_TBL 16 @@ -38,7 +40,7 @@ #define SCALE_NPROTO4_TBL 11 #define SCALE_NPROTO8_TBL 11 #define SCALE4_STAGE1_BITS 15 -#define SCALE4_STAGE2_BITS 16 +#define SCALE4_STAGE2_BITS (16-SCALE_OUT_BITS) #define SCALE4_STAGED1_BITS 15 #define SCALE4_STAGED2_BITS 16 #define SCALE8_STAGE1_BITS 15 diff --git a/sbc/sbc_tables.h b/sbc/sbc_tables.h index f5daaa7..eeea7b7 100644 --- a/sbc/sbc_tables.h +++ b/sbc/sbc_tables.h @@ -166,3 +166,88 @@ static const int32_t synmatrix8[16][8] = { { SN8(0xf9592678), SN8(0x018f8b84), SN8(0x07d8a5f0), SN8(0x0471ced0), SN8(0xfb8e3130), SN8(0xf8275a10), SN8(0xfe70747c), SN8(0x06a6d988) } }; + +#define SBC_HIGH_PRECISION + +#ifdef SBC_HIGH_PRECISION +# define FIXED_A int64_t /* data type for fixed point accumulator */ +# define FIXED_T int32_t /* data type for fixed point constants */ +# define SBC_FIXED8_EXTRA_BITS 15 +#else +# define FIXED_A int32_t /* data type for fixed point accumulator */ +# define FIXED_T int16_t /* data type for fixed point constants */ +# define SBC_FIXED8_EXTRA_BITS 0 +#endif + +/* A2DP specification: Section 12.8 Tables */ +#define SBC_PROTO_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT-1-SBC_FIXED8_EXTRA_BITS) +#define F(x) (FIXED_T)(FIXED_A)((x)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5) +static const FIXED_T _sbc_proto_fixed8[80] = { + F(0.00000000E+00), F(1.56575398E-04), F(3.43256425E-04), F(5.54620202E-04), + F(8.23919506E-04), F(1.13992507E-03), F(1.47640169E-03), F(1.78371725E-03), + F(2.01182542E-03), F(2.10371989E-03), F(1.99454554E-03), F(1.61656283E-03), + F(9.02154502E-04),-F(1.78805361E-04),-F(1.64973098E-03),-F(3.49717454E-03), + F(5.65949473E-03), F(8.02941163E-03), F(1.04584443E-02), F(1.27472335E-02), + F(1.46525263E-02), F(1.59045603E-02), F(1.62208471E-02), F(1.53184106E-02), + F(1.29371806E-02), F(8.85757540E-03), F(2.92408442E-03),-F(4.91578024E-03), + -F(1.46404076E-02),-F(2.61098752E-02),-F(3.90751381E-02),-F(5.31873032E-02), + F(6.79989431E-02), F(8.29847578E-02), F(9.75753918E-02), F(1.11196689E-01), + F(1.23264548E-01), F(1.33264415E-01), F(1.40753505E-01), F(1.45389847E-01), + F(1.46955068E-01), F(1.45389847E-01), F(1.40753505E-01), F(1.33264415E-01), + F(1.23264548E-01), F(1.11196689E-01), F(9.75753918E-02), F(8.29847578E-02), + -F(6.79989431E-02),-F(5.31873032E-02),-F(3.90751381E-02),-F(2.61098752E-02), + -F(1.46404076E-02),-F(4.91578024E-03), F(2.92408442E-03), F(8.85757540E-03), + F(1.29371806E-02), F(1.53184106E-02), F(1.62208471E-02), F(1.59045603E-02), + F(1.46525263E-02), F(1.27472335E-02), F(1.04584443E-02), F(8.02941163E-03), + -F(5.65949473E-03),-F(3.49717454E-03),-F(1.64973098E-03),-F(1.78805361E-04), + F(9.02154502E-04), F(1.61656283E-03), F(1.99454554E-03), F(2.10371989E-03), + F(2.01182542E-03), F(1.78371725E-03), F(1.47640169E-03), F(1.13992507E-03), + F(8.23919506E-04), F(5.54620202E-04), F(3.43256425E-04), F(1.56575398E-04), +}; +#undef F + +/* + * To produce this cosine matrix in Octave: + * + * b = zeros(8, 16); + * for i = 0:7 for j = 0:15 b(i+1, j+1) = cos( (i + 0.5) * (j - 4) * (pi/8) ) endfor endfor; + * printf("%.10f, ", b'); + * + */ +#define SBC_COS_TABLE_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT-1+SBC_FIXED8_EXTRA_BITS) +#define F(x) (FIXED_T)(FIXED_A)((x)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5) +static const FIXED_T cos_table_fixed_8[128] = { + F(0.7071067812), F(0.8314696123), F(0.9238795325), F(0.9807852804), + F(1.0000000000), F(0.9807852804), F(0.9238795325), F(0.8314696123), + F(0.7071067812), F(0.5555702330), F(0.3826834324), F(0.1950903220), + F(0.0000000000),-F(0.1950903220),-F(0.3826834324),-F(0.5555702330), + -F(0.7071067812),-F(0.1950903220), F(0.3826834324), F(0.8314696123), + F(1.0000000000), F(0.8314696123), F(0.3826834324),-F(0.1950903220), + -F(0.7071067812),-F(0.9807852804),-F(0.9238795325),-F(0.5555702330), + -F(0.0000000000), F(0.5555702330), F(0.9238795325), F(0.9807852804), + -F(0.7071067812),-F(0.9807852804),-F(0.3826834324), F(0.5555702330), + F(1.0000000000), F(0.5555702330),-F(0.3826834324),-F(0.9807852804), + -F(0.7071067812), F(0.1950903220), F(0.9238795325), F(0.8314696123), + F(0.0000000000),-F(0.8314696123),-F(0.9238795325),-F(0.1950903220), + F(0.7071067812),-F(0.5555702330),-F(0.9238795325), F(0.1950903220), + F(1.0000000000), F(0.1950903220),-F(0.9238795325),-F(0.5555702330), + F(0.7071067812), F(0.8314696123),-F(0.3826834324),-F(0.9807852804), + -F(0.0000000000), F(0.9807852804), F(0.3826834324),-F(0.8314696123), + F(0.7071067812), F(0.5555702330),-F(0.9238795325),-F(0.1950903220), + F(1.0000000000),-F(0.1950903220),-F(0.9238795325), F(0.5555702330), + F(0.7071067812),-F(0.8314696123),-F(0.3826834324), F(0.9807852804), + F(0.0000000000),-F(0.9807852804), F(0.3826834324), F(0.8314696123), + -F(0.7071067812), F(0.9807852804),-F(0.3826834324),-F(0.5555702330), + F(1.0000000000),-F(0.5555702330),-F(0.3826834324), F(0.9807852804), + -F(0.7071067812),-F(0.1950903220), F(0.9238795325),-F(0.8314696123), + -F(0.0000000000), F(0.8314696123),-F(0.9238795325), F(0.1950903220), + -F(0.7071067812), F(0.1950903220), F(0.3826834324),-F(0.8314696123), + F(1.0000000000),-F(0.8314696123), F(0.3826834324), F(0.1950903220), + -F(0.7071067812), F(0.9807852804),-F(0.9238795325), F(0.5555702330), + -F(0.0000000000),-F(0.5555702330), F(0.9238795325),-F(0.9807852804), + F(0.7071067812),-F(0.8314696123), F(0.9238795325),-F(0.9807852804), + F(1.0000000000),-F(0.9807852804), F(0.9238795325),-F(0.8314696123), + F(0.7071067812),-F(0.5555702330), F(0.3826834324),-F(0.1950903220), + -F(0.0000000000), F(0.1950903220),-F(0.3826834324), F(0.5555702330), +}; +#undef F ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-19 22:12 ` Siarhei Siamashka @ 2008-12-22 23:30 ` Siarhei Siamashka 2008-12-23 1:00 ` Marcel Holtmann 0 siblings, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-22 23:30 UTC (permalink / raw) To: ext Brad Midgley; +Cc: Jaska Uimonen, linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 1086 bytes --] On Saturday 20 December 2008 00:12:08 ext Siarhei Siamashka wrote: [...] > We had a talk with Jaska Uimonen here, and now I'm kind of delegated to > finish the work on this filtering function for SBC encoder (including the > final addition of ARM assembly optimizations). He provided me with his > last variant of code, which contains some more optimizations to reduce the > number of operations and also loops unrolling. I will add his changes to > the patch on next iteration. > > Now the question is how to best integrate a fixed filtering function to git > repository? If I just continue adding changes to the patch in order to make > it a faster, it will be also not so obvious to see how we got to these code > transformations just from the commit log. Next iteration done. Added support for 4 subbands, number of arithmetic operations reduced (but without loop unrolling for better code readability), precision improved for both 16-bit and 32-bit fixed point, 'neginv' macro is now more portable and faster. The rest is in the code comments. Best regards, Siarhei Siamashka [-- Attachment #2: sbc_analyze_modified.diff --] [-- Type: text/x-diff, Size: 26435 bytes --] diff --git a/sbc/sbc.c b/sbc/sbc.c index 5411893..3d6a412 100644 --- a/sbc/sbc.c +++ b/sbc/sbc.c @@ -40,6 +40,7 @@ #include <string.h> #include <stdlib.h> #include <sys/types.h> +#include <limits.h> #include "sbc_math.h" #include "sbc_tables.h" @@ -93,7 +94,7 @@ struct sbc_decoder_state { struct sbc_encoder_state { int subbands; int position[2]; - int32_t X[2][160]; + int16_t X[2][160]; }; /* @@ -656,75 +657,56 @@ static void sbc_encoder_init(struct sbc_encoder_state *state, state->position[0] = state->position[1] = 9 * frame->subbands; } -static inline void _sbc_analyze_four(const int32_t *in, int32_t *out) +static inline void _sbc_analyze_four(const int16_t *in, int32_t *out) { - sbc_fixed_t t[8], s[5]; - - t[0] = SCALE4_STAGE1( /* Q8 */ - MULA(_sbc_proto_4[0], in[8] - in[32], /* Q18 */ - MUL( _sbc_proto_4[1], in[16] - in[24]))); - - t[1] = SCALE4_STAGE1( - MULA(_sbc_proto_4[2], in[1], - MULA(_sbc_proto_4[3], in[9], - MULA(_sbc_proto_4[4], in[17], - MULA(_sbc_proto_4[5], in[25], - MUL( _sbc_proto_4[6], in[33])))))); - - t[2] = SCALE4_STAGE1( - MULA(_sbc_proto_4[7], in[2], - MULA(_sbc_proto_4[8], in[10], - MULA(_sbc_proto_4[9], in[18], - MULA(_sbc_proto_4[10], in[26], - MUL( _sbc_proto_4[11], in[34])))))); - - t[3] = SCALE4_STAGE1( - MULA(_sbc_proto_4[12], in[3], - MULA(_sbc_proto_4[13], in[11], - MULA(_sbc_proto_4[14], in[19], - MULA(_sbc_proto_4[15], in[27], - MUL( _sbc_proto_4[16], in[35])))))); - - t[4] = SCALE4_STAGE1( - MULA(_sbc_proto_4[17], in[4] + in[36], - MULA(_sbc_proto_4[18], in[12] + in[28], - MUL( _sbc_proto_4[19], in[20])))); - - t[5] = SCALE4_STAGE1( - MULA(_sbc_proto_4[16], in[5], - MULA(_sbc_proto_4[15], in[13], - MULA(_sbc_proto_4[14], in[21], - MULA(_sbc_proto_4[13], in[29], - MUL( _sbc_proto_4[12], in[37])))))); - - /* don't compute t[6]... this term always multiplies - * with cos(pi/2) = 0 */ - - t[7] = SCALE4_STAGE1( - MULA(_sbc_proto_4[6], in[7], - MULA(_sbc_proto_4[5], in[15], - MULA(_sbc_proto_4[4], in[23], - MULA(_sbc_proto_4[3], in[31], - MUL( _sbc_proto_4[2], in[39])))))); - - s[0] = MUL( _anamatrix4[0], t[0] + t[4]); - s[1] = MUL( _anamatrix4[2], t[2]); - s[2] = MULA(_anamatrix4[1], t[1] + t[3], - MUL(_anamatrix4[3], t[5])); - s[3] = MULA(_anamatrix4[3], t[1] + t[3], - MUL(_anamatrix4[1], -t[5] + t[7])); - s[4] = MUL( _anamatrix4[3], t[7]); - - out[0] = SCALE4_STAGE2( s[0] + s[1] + s[2] + s[4]); /* Q0 */ - out[1] = SCALE4_STAGE2(-s[0] + s[1] + s[3]); - out[2] = SCALE4_STAGE2(-s[0] + s[1] - s[3]); - out[3] = SCALE4_STAGE2( s[0] + s[1] - s[2] - s[4]); + FIXED_A t1[4]; + FIXED_T t2[4]; + int i = 0, hop = 0; + + /* rounding coefficient */ + + t1[0] = t1[1] = t1[2] = t1[3] = + (FIXED_A)1 << (SBC_PROTO_FIXED4_SCALE-1); + + /* low pass polyphase filter */ + for (hop = 0; hop < 40; hop += 8) { + t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed4[hop]; + t1[1] += (FIXED_A)in[hop + 1] * _sbc_proto_fixed4[hop + 1]; + t1[2] += (FIXED_A)in[hop + 2] * _sbc_proto_fixed4[hop + 2]; + t1[1] += (FIXED_A)in[hop + 3] * _sbc_proto_fixed4[hop + 3]; + t1[0] += (FIXED_A)in[hop + 4] * _sbc_proto_fixed4[hop + 4]; + t1[3] += (FIXED_A)in[hop + 5] * _sbc_proto_fixed4[hop + 5]; + t1[3] += (FIXED_A)in[hop + 7] * _sbc_proto_fixed4[hop + 7]; + } + + /* scaling */ + t2[0] = t1[0] >> SBC_PROTO_FIXED4_SCALE; + t2[1] = t1[1] >> SBC_PROTO_FIXED4_SCALE; + t2[2] = t1[2] >> SBC_PROTO_FIXED4_SCALE; + t2[3] = t1[3] >> SBC_PROTO_FIXED4_SCALE; + + /* do the cos transform */ + for (i = 0, hop = 0; i < 4; hop += 8, i++) { + /* rounding coefficient */ + t1[i] = (FIXED_A)1 << (SBC_COS_TABLE_FIXED4_SCALE-1-SCALE_OUT_BITS); + + t1[i] += (FIXED_A)t2[0] * cos_table_fixed_4[0 + hop]; + t1[i] += (FIXED_A)t2[1] * cos_table_fixed_4[1 + hop]; + t1[i] += (FIXED_A)t2[2] * cos_table_fixed_4[2 + hop]; + t1[i] += (FIXED_A)t2[3] * cos_table_fixed_4[5 + hop]; + } + + /* scaling */ + out[0] = t1[0] >> (SBC_COS_TABLE_FIXED4_SCALE-SCALE_OUT_BITS); + out[1] = t1[1] >> (SBC_COS_TABLE_FIXED4_SCALE-SCALE_OUT_BITS); + out[2] = t1[2] >> (SBC_COS_TABLE_FIXED4_SCALE-SCALE_OUT_BITS); + out[3] = t1[3] >> (SBC_COS_TABLE_FIXED4_SCALE-SCALE_OUT_BITS); } static inline void sbc_analyze_four(struct sbc_encoder_state *state, struct sbc_frame *frame, int ch, int blk) { - int32_t *x = &state->X[ch][state->position[ch]]; + int16_t *x = &state->X[ch][state->position[ch]]; int16_t *pcm = &frame->pcm_sample[ch][blk * 4]; /* Input 4 Audio Samples */ @@ -740,133 +722,80 @@ static inline void sbc_analyze_four(struct sbc_encoder_state *state, state->position[ch] = 36; } -static inline void _sbc_analyze_eight(const int32_t *in, int32_t *out) +static inline void _sbc_analyze_eight(const int16_t *in, int32_t *out) { - sbc_fixed_t t[8], s[8]; - - t[0] = SCALE8_STAGE1( /* Q10 */ - MULA(_sbc_proto_8[0], (in[16] - in[64]), /* Q18 = Q18 * Q0 */ - MULA(_sbc_proto_8[1], (in[32] - in[48]), - MULA(_sbc_proto_8[2], in[4], - MULA(_sbc_proto_8[3], in[20], - MULA(_sbc_proto_8[4], in[36], - MUL( _sbc_proto_8[5], in[52]))))))); - - t[1] = SCALE8_STAGE1( - MULA(_sbc_proto_8[6], in[2], - MULA(_sbc_proto_8[7], in[18], - MULA(_sbc_proto_8[8], in[34], - MULA(_sbc_proto_8[9], in[50], - MUL(_sbc_proto_8[10], in[66])))))); - - t[2] = SCALE8_STAGE1( - MULA(_sbc_proto_8[11], in[1], - MULA(_sbc_proto_8[12], in[17], - MULA(_sbc_proto_8[13], in[33], - MULA(_sbc_proto_8[14], in[49], - MULA(_sbc_proto_8[15], in[65], - MULA(_sbc_proto_8[16], in[3], - MULA(_sbc_proto_8[17], in[19], - MULA(_sbc_proto_8[18], in[35], - MULA(_sbc_proto_8[19], in[51], - MUL( _sbc_proto_8[20], in[67]))))))))))); - - t[3] = SCALE8_STAGE1( - MULA( _sbc_proto_8[21], in[5], - MULA( _sbc_proto_8[22], in[21], - MULA( _sbc_proto_8[23], in[37], - MULA( _sbc_proto_8[24], in[53], - MULA( _sbc_proto_8[25], in[69], - MULA(-_sbc_proto_8[15], in[15], - MULA(-_sbc_proto_8[14], in[31], - MULA(-_sbc_proto_8[13], in[47], - MULA(-_sbc_proto_8[12], in[63], - MUL( -_sbc_proto_8[11], in[79]))))))))))); - - t[4] = SCALE8_STAGE1( - MULA( _sbc_proto_8[26], in[6], - MULA( _sbc_proto_8[27], in[22], - MULA( _sbc_proto_8[28], in[38], - MULA( _sbc_proto_8[29], in[54], - MULA( _sbc_proto_8[30], in[70], - MULA(-_sbc_proto_8[10], in[14], - MULA(-_sbc_proto_8[9], in[30], - MULA(-_sbc_proto_8[8], in[46], - MULA(-_sbc_proto_8[7], in[62], - MUL( -_sbc_proto_8[6], in[78]))))))))))); - - t[5] = SCALE8_STAGE1( - MULA( _sbc_proto_8[31], in[7], - MULA( _sbc_proto_8[32], in[23], - MULA( _sbc_proto_8[33], in[39], - MULA( _sbc_proto_8[34], in[55], - MULA( _sbc_proto_8[35], in[71], - MULA(-_sbc_proto_8[20], in[13], - MULA(-_sbc_proto_8[19], in[29], - MULA(-_sbc_proto_8[18], in[45], - MULA(-_sbc_proto_8[17], in[61], - MUL( -_sbc_proto_8[16], in[77]))))))))))); - - t[6] = SCALE8_STAGE1( - MULA( _sbc_proto_8[36], (in[8] + in[72]), - MULA( _sbc_proto_8[37], (in[24] + in[56]), - MULA( _sbc_proto_8[38], in[40], - MULA(-_sbc_proto_8[39], in[12], - MULA(-_sbc_proto_8[5], in[28], - MULA(-_sbc_proto_8[4], in[44], - MULA(-_sbc_proto_8[3], in[60], - MUL( -_sbc_proto_8[2], in[76]))))))))); - - t[7] = SCALE8_STAGE1( - MULA( _sbc_proto_8[35], in[9], - MULA( _sbc_proto_8[34], in[25], - MULA( _sbc_proto_8[33], in[41], - MULA( _sbc_proto_8[32], in[57], - MULA( _sbc_proto_8[31], in[73], - MULA(-_sbc_proto_8[25], in[11], - MULA(-_sbc_proto_8[24], in[27], - MULA(-_sbc_proto_8[23], in[43], - MULA(-_sbc_proto_8[22], in[59], - MUL( -_sbc_proto_8[21], in[75]))))))))))); - - s[0] = MULA( _anamatrix8[0], t[0], - MUL( _anamatrix8[1], t[6])); - s[1] = MUL( _anamatrix8[7], t[1]); - s[2] = MULA( _anamatrix8[2], t[2], - MULA( _anamatrix8[3], t[3], - MULA( _anamatrix8[4], t[5], - MUL( _anamatrix8[5], t[7])))); - s[3] = MUL( _anamatrix8[6], t[4]); - s[4] = MULA( _anamatrix8[3], t[2], - MULA(-_anamatrix8[5], t[3], - MULA(-_anamatrix8[2], t[5], - MUL( -_anamatrix8[4], t[7])))); - s[5] = MULA( _anamatrix8[4], t[2], - MULA(-_anamatrix8[2], t[3], - MULA( _anamatrix8[5], t[5], - MUL( _anamatrix8[3], t[7])))); - s[6] = MULA( _anamatrix8[1], t[0], - MUL( -_anamatrix8[0], t[6])); - s[7] = MULA( _anamatrix8[5], t[2], - MULA(-_anamatrix8[4], t[3], - MULA( _anamatrix8[3], t[5], - MUL( -_anamatrix8[2], t[7])))); - - out[0] = SCALE8_STAGE2( s[0] + s[1] + s[2] + s[3]); - out[1] = SCALE8_STAGE2( s[1] - s[3] + s[4] + s[6]); - out[2] = SCALE8_STAGE2( s[1] - s[3] + s[5] - s[6]); - out[3] = SCALE8_STAGE2(-s[0] + s[1] + s[3] + s[7]); - out[4] = SCALE8_STAGE2(-s[0] + s[1] + s[3] - s[7]); - out[5] = SCALE8_STAGE2( s[1] - s[3] - s[5] - s[6]); - out[6] = SCALE8_STAGE2( s[1] - s[3] - s[4] + s[6]); - out[7] = SCALE8_STAGE2( s[0] + s[1] - s[2] + s[3]); + FIXED_A t1[8]; + FIXED_T t2[8]; + int i, hop; + + /* rounding coefficient */ + t1[0] = t1[1] = t1[2] = t1[3] = t1[4] = t1[5] = t1[6] = t1[7] = + (FIXED_A)1 << (SBC_PROTO_FIXED8_SCALE-1); + + /* low pass polyphase filter */ + for (hop = 0; hop < 80; hop += 16) { + t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed8[hop]; + t1[1] += (FIXED_A)in[hop + 1] * _sbc_proto_fixed8[hop + 1]; + t1[2] += (FIXED_A)in[hop + 2] * _sbc_proto_fixed8[hop + 2]; + t1[3] += (FIXED_A)in[hop + 3] * _sbc_proto_fixed8[hop + 3]; + + t1[4] += (FIXED_A)in[hop + 4] * _sbc_proto_fixed8[hop + 4]; + + t1[0] += (FIXED_A)in[hop + 8] * _sbc_proto_fixed8[hop + 8]; + t1[1] += (FIXED_A)in[hop + 7] * _sbc_proto_fixed8[hop + 7]; + t1[2] += (FIXED_A)in[hop + 6] * _sbc_proto_fixed8[hop + 6]; + t1[3] += (FIXED_A)in[hop + 5] * _sbc_proto_fixed8[hop + 5]; + + t1[5] += (FIXED_A)in[hop + 9] * _sbc_proto_fixed8[hop + 9]; + t1[6] += (FIXED_A)in[hop + 10] * _sbc_proto_fixed8[hop + 10]; + t1[7] += (FIXED_A)in[hop + 11] * _sbc_proto_fixed8[hop + 11]; + + t1[5] += (FIXED_A)in[hop + 15] * _sbc_proto_fixed8[hop + 15]; + t1[6] += (FIXED_A)in[hop + 14] * _sbc_proto_fixed8[hop + 14]; + t1[7] += (FIXED_A)in[hop + 13] * _sbc_proto_fixed8[hop + 13]; + } + + /* scaling */ + t2[0] = t1[0] >> SBC_PROTO_FIXED8_SCALE; + t2[1] = t1[1] >> SBC_PROTO_FIXED8_SCALE; + t2[2] = t1[2] >> SBC_PROTO_FIXED8_SCALE; + t2[3] = t1[3] >> SBC_PROTO_FIXED8_SCALE; + t2[4] = t1[4] >> SBC_PROTO_FIXED8_SCALE; + t2[5] = t1[5] >> SBC_PROTO_FIXED8_SCALE; + t2[6] = t1[6] >> SBC_PROTO_FIXED8_SCALE; + t2[7] = t1[7] >> SBC_PROTO_FIXED8_SCALE; + + /* do the cos transform */ + for (i = 0, hop = 0; i < 8; hop += 16, i++) { + /* rounding coefficient */ + t1[i] = (FIXED_A)1 << (SBC_COS_TABLE_FIXED8_SCALE-1-SCALE_OUT_BITS); + + t1[i] += (FIXED_A)t2[0] * cos_table_fixed_8[0 + hop]; + t1[i] += (FIXED_A)t2[1] * cos_table_fixed_8[1 + hop]; + t1[i] += (FIXED_A)t2[2] * cos_table_fixed_8[2 + hop]; + t1[i] += (FIXED_A)t2[3] * cos_table_fixed_8[3 + hop]; + t1[i] += (FIXED_A)t2[4] * cos_table_fixed_8[4 + hop]; + t1[i] += (FIXED_A)t2[5] * cos_table_fixed_8[9 + hop]; + t1[i] += (FIXED_A)t2[6] * cos_table_fixed_8[10 + hop]; + t1[i] += (FIXED_A)t2[7] * cos_table_fixed_8[11 + hop]; + } + + /* scaling */ + out[0] = t1[0] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[1] = t1[1] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[2] = t1[2] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[3] = t1[3] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[4] = t1[4] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[5] = t1[5] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[6] = t1[6] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); + out[7] = t1[7] >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS); } static inline void sbc_analyze_eight(struct sbc_encoder_state *state, struct sbc_frame *frame, int ch, int blk) { - int32_t *x = &state->X[ch][state->position[ch]]; + int16_t *x = &state->X[ch][state->position[ch]]; int16_t *pcm = &frame->pcm_sample[ch][blk * 8]; /* Input 8 Audio Samples */ @@ -1006,7 +935,7 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) frame->scale_factor[ch][sb] = 0; scalefactor[ch][sb] = 2; for (blk = 0; blk < frame->blocks; blk++) { - while (scalefactor[ch][sb] < fabs(frame->sb_sample_f[blk][ch][sb])) { + while ((scalefactor[ch][sb] << SCALE_OUT_BITS) <= neginv(frame->sb_sample_f[blk][ch][sb])) { frame->scale_factor[ch][sb]++; scalefactor[ch][sb] *= 2; } @@ -1040,11 +969,11 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) frame->sb_sample_f[blk][1][sb]) >> 1; /* calculate scale_factor_j and scalefactor_j for joint case */ - while (scalefactor_j[0] < fabs(sb_sample_j[blk][0])) { + while ((scalefactor_j[0] << SCALE_OUT_BITS) <= neginv(sb_sample_j[blk][0])) { scale_factor_j[0]++; scalefactor_j[0] *= 2; } - while (scalefactor_j[1] < fabs(sb_sample_j[blk][1])) { + while ((scalefactor_j[1] << SCALE_OUT_BITS) <= neginv(sb_sample_j[blk][1])) { scale_factor_j[1]++; scalefactor_j[1] *= 2; } @@ -1100,11 +1029,11 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) for (ch = 0; ch < frame->channels; ch++) { for (sb = 0; sb < frame->subbands; sb++) { if (levels[ch][sb] > 0) { - audio_sample = - (uint16_t) (((((int64_t)frame->sb_sample_f[blk][ch][sb]*levels[ch][sb]) >> - (frame->scale_factor[ch][sb] + 1)) + - levels[ch][sb]) >> 1); - PUT_BITS(audio_sample & levels[ch][sb], bits[ch][sb]); + int32_t sample = frame->sb_sample_f[blk][ch][sb]; + int32_t s_shift = (frame->scale_factor[ch][sb] + 1 + SCALE_OUT_BITS); + int32_t ls = levels[ch][sb]; + audio_sample = ((((int64_t)1 << s_shift) + sample) * ls) >> (s_shift + 1); + PUT_BITS(audio_sample, bits[ch][sb]); } } } diff --git a/sbc/sbc_math.h b/sbc/sbc_math.h index b3d87a6..f3937ce 100644 --- a/sbc/sbc_math.h +++ b/sbc/sbc_math.h @@ -23,37 +23,30 @@ * */ -#define fabs(x) ((x) < 0 ? -(x) : (x)) /* C does not provide an explicit arithmetic shift right but this will always be correct and every compiler *should* generate optimal code */ #define ASR(val, bits) ((-2 >> 1 == -1) ? \ ((int32_t)(val)) >> (bits) : ((int32_t) (val)) / (1 << (bits))) -#define SCALE_PROTO4_TBL 15 -#define SCALE_ANA4_TBL 17 -#define SCALE_PROTO8_TBL 16 -#define SCALE_ANA8_TBL 17 +#define neginv(x) ((-2 >> 1 == -1) ? \ + ((((int32_t)(x)) >> 31) ^ (int32_t)(x)) : \ + ((x >= 0) ? (x) : -(x)-1)) + +#define SCALE_OUT_BITS 14 + #define SCALE_SPROTO4_TBL 12 #define SCALE_SPROTO8_TBL 14 #define SCALE_NPROTO4_TBL 11 #define SCALE_NPROTO8_TBL 11 -#define SCALE4_STAGE1_BITS 15 -#define SCALE4_STAGE2_BITS 16 #define SCALE4_STAGED1_BITS 15 #define SCALE4_STAGED2_BITS 16 -#define SCALE8_STAGE1_BITS 15 -#define SCALE8_STAGE2_BITS 15 #define SCALE8_STAGED1_BITS 15 #define SCALE8_STAGED2_BITS 16 typedef int32_t sbc_fixed_t; -#define SCALE4_STAGE1(src) ASR(src, SCALE4_STAGE1_BITS) -#define SCALE4_STAGE2(src) ASR(src, SCALE4_STAGE2_BITS) #define SCALE4_STAGED1(src) ASR(src, SCALE4_STAGED1_BITS) #define SCALE4_STAGED2(src) ASR(src, SCALE4_STAGED2_BITS) -#define SCALE8_STAGE1(src) ASR(src, SCALE8_STAGE1_BITS) -#define SCALE8_STAGE2(src) ASR(src, SCALE8_STAGE2_BITS) #define SCALE8_STAGED1(src) ASR(src, SCALE8_STAGED1_BITS) #define SCALE8_STAGED2(src) ASR(src, SCALE8_STAGED2_BITS) diff --git a/sbc/sbc_tables.h b/sbc/sbc_tables.h index f5daaa7..9c96732 100644 --- a/sbc/sbc_tables.h +++ b/sbc/sbc_tables.h @@ -40,40 +40,11 @@ static const int sbc_offset8[4][8] = { }; -#define SP4(val) (((int32_t)(val))/17658) /* Used to be #define SP4(val) ASR(val, SCALE_PROTO4_TBL) but causes wrong gain */ -#define SA4(val) ASR(val, SCALE_ANA4_TBL) -#define SP8(val) (((int32_t)(val))/57740) /* Used to be #define SP8(val) ASR(val, SCALE_PROTO8_TBL) but causes wrong gain */ -#define SA8(val) ASR(val, SCALE_ANA8_TBL) #define SS4(val) ASR(val, SCALE_SPROTO4_TBL) #define SS8(val) ASR(val, SCALE_SPROTO8_TBL) #define SN4(val) ASR(val, SCALE_NPROTO4_TBL) #define SN8(val) ASR(val, SCALE_NPROTO8_TBL) -static const int32_t _sbc_proto_4[20] = { - SP4(0x02cb3e8c), SP4(0x22b63dc0), SP4(0x002329cc), SP4(0x053b7548), - SP4(0x31eab940), SP4(0xec1f5e60), SP4(0xff3773a8), SP4(0x0061c5a7), - SP4(0x07646680), SP4(0x3f239480), SP4(0xf89f23a8), SP4(0x007a4737), - SP4(0x00b32807), SP4(0x083ddc80), SP4(0x4825e480), SP4(0x0191e578), - SP4(0x00ff11ca), SP4(0x00fb7991), SP4(0x069fdc58), SP4(0x4b584000) -}; - -static const int32_t _anamatrix4[4] = { - SA4(0x2d413cc0), SA4(0x3b20d780), SA4(0x40000000), SA4(0x187de2a0) -}; - -static const int32_t _sbc_proto_8[40] = { - SP8(0x02e5cd20), SP8(0x22d0c200), SP8(0x006bfe27), SP8(0x07808930), - SP8(0x3f1c8800), SP8(0xf8810d70), SP8(0x002cfdc6), SP8(0x055acf28), - SP8(0x31f566c0), SP8(0xebfe57e0), SP8(0xff27c437), SP8(0x001485cc), - SP8(0x041c6e58), SP8(0x2a7cfa80), SP8(0xe4c4a240), SP8(0xfe359e4c), - SP8(0x0048b1f8), SP8(0x0686ce30), SP8(0x38eec5c0), SP8(0xf2a1b9f0), - SP8(0xffe8904a), SP8(0x0095698a), SP8(0x0824a480), SP8(0x443b3c00), - SP8(0xfd7badc8), SP8(0x00d3e2d9), SP8(0x00c183d2), SP8(0x084e1950), - SP8(0x4810d800), SP8(0x017f43fe), SP8(0x01056dd8), SP8(0x00e9cb9f), - SP8(0x07d7d090), SP8(0x4a708980), SP8(0x0488fae8), SP8(0x0113bd20), - SP8(0x0107b1a8), SP8(0x069fb3c0), SP8(0x4b3db200), SP8(0x00763f48) -}; - static const int32_t sbc_proto_4_40m0[] = { SS4(0x00000000), SS4(0xffa6982f), SS4(0xfba93848), SS4(0x0456c7b8), SS4(0x005967d1), SS4(0xfffb9ac7), SS4(0xff589157), SS4(0xf9c2a8d8), @@ -116,11 +87,6 @@ static const int32_t sbc_proto_8_80m1[] = { SS8(0x0d9daee0), SS8(0xeac182c0), SS8(0xfdf1c8d4), SS8(0xfff5bd1a) }; -static const int32_t _anamatrix8[8] = { - SA8(0x3b20d780), SA8(0x187de2a0), SA8(0x3ec52f80), SA8(0x3536cc40), - SA8(0x238e7680), SA8(0x0c7c5c20), SA8(0x2d413cc0), SA8(0x40000000) -}; - static const int32_t synmatrix4[8][4] = { { SN4(0x05a82798), SN4(0xfa57d868), SN4(0xfa57d868), SN4(0x05a82798) }, { SN4(0x030fbc54), SN4(0xf89be510), SN4(0x07641af0), SN4(0xfcf043ac) }, @@ -166,3 +132,169 @@ static const int32_t synmatrix8[16][8] = { { SN8(0xf9592678), SN8(0x018f8b84), SN8(0x07d8a5f0), SN8(0x0471ced0), SN8(0xfb8e3130), SN8(0xf8275a10), SN8(0xfe70747c), SN8(0x06a6d988) } }; + +//#define SBC_HIGH_PRECISION + +#ifdef SBC_HIGH_PRECISION +# define FIXED_A int64_t /* data type for fixed point accumulator */ +# define FIXED_T int32_t /* data type for fixed point constants */ +# define SBC_FIXED8_EXTRA_BITS 16 +#else +# define FIXED_A int32_t /* data type for fixed point accumulator */ +# define FIXED_T int16_t /* data type for fixed point constants */ +# define SBC_FIXED8_EXTRA_BITS 0 +#endif + +/* A2DP specification: Section 12.8 Tables + * + * Original values are premultiplied by 4 for better precision (that is the + * maximum which is possible without overflows) + * + * Note: in each block of 16 numbers sign was changed for elements 4, 13, 14, 15 + * in order to compensate the same change applied to cos_table_fixed_8 + */ +#define SBC_PROTO_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT-1-SBC_FIXED8_EXTRA_BITS+2) +#define F(x) (FIXED_A)((x*4)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5) +static const FIXED_T _sbc_proto_fixed8[80] = { + F(0.00000000E+00), F(1.56575398E-04), F(3.43256425E-04), F(5.54620202E-04), + -F(8.23919506E-04), F(1.13992507E-03), F(1.47640169E-03), F(1.78371725E-03), + F(2.01182542E-03), F(2.10371989E-03), F(1.99454554E-03), F(1.61656283E-03), + F(9.02154502E-04), F(1.78805361E-04), F(1.64973098E-03), F(3.49717454E-03), + + F(5.65949473E-03), F(8.02941163E-03), F(1.04584443E-02), F(1.27472335E-02), + -F(1.46525263E-02), F(1.59045603E-02), F(1.62208471E-02), F(1.53184106E-02), + F(1.29371806E-02), F(8.85757540E-03), F(2.92408442E-03),-F(4.91578024E-03), + -F(1.46404076E-02), F(2.61098752E-02), F(3.90751381E-02), F(5.31873032E-02), + + F(6.79989431E-02), F(8.29847578E-02), F(9.75753918E-02), F(1.11196689E-01), + -F(1.23264548E-01), F(1.33264415E-01), F(1.40753505E-01), F(1.45389847E-01), + F(1.46955068E-01), F(1.45389847E-01), F(1.40753505E-01), F(1.33264415E-01), + F(1.23264548E-01),-F(1.11196689E-01),-F(9.75753918E-02),-F(8.29847578E-02), + + -F(6.79989431E-02),-F(5.31873032E-02),-F(3.90751381E-02),-F(2.61098752E-02), + F(1.46404076E-02),-F(4.91578024E-03), F(2.92408442E-03), F(8.85757540E-03), + F(1.29371806E-02), F(1.53184106E-02), F(1.62208471E-02), F(1.59045603E-02), + F(1.46525263E-02),-F(1.27472335E-02),-F(1.04584443E-02),-F(8.02941163E-03), + + -F(5.65949473E-03),-F(3.49717454E-03),-F(1.64973098E-03),-F(1.78805361E-04), + -F(9.02154502E-04), F(1.61656283E-03), F(1.99454554E-03), F(2.10371989E-03), + F(2.01182542E-03), F(1.78371725E-03), F(1.47640169E-03), F(1.13992507E-03), + F(8.23919506E-04),-F(5.54620202E-04),-F(3.43256425E-04),-F(1.56575398E-04), +}; +#undef F + +/* + * To produce this cosine matrix in Octave: + * + * b = zeros(8, 16); + * for i = 0:7 for j = 0:15 b(i+1, j+1) = cos( (i + 0.5) * (j - 4) * (pi/8) ) endfor endfor; + * printf("%.10f, ", b'); + * + * Note: in each block of 16 numbers sign was changed for elements 4, 13, 14, 15. + * Change of sign for element 4 allows to replace constant 1.0 (not representable + * in Q15 format) with -1.0 (fine with Q15). + * Changed signs for elements 13, 14, 15 allow to have more similar constants + * and simplify subband filter function code. + */ +#define SBC_COS_TABLE_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT-1+SBC_FIXED8_EXTRA_BITS) +#define F(x) (FIXED_A)((x)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5) +static const FIXED_T cos_table_fixed_8[128] = { + F(0.7071067812), F(0.8314696123), F(0.9238795325), F(0.9807852804), + -F(1.0000000000), F(0.9807852804), F(0.9238795325), F(0.8314696123), + F(0.7071067812), F(0.5555702330), F(0.3826834324), F(0.1950903220), + F(0.0000000000), F(0.1950903220), F(0.3826834324), F(0.5555702330), + + -F(0.7071067812),-F(0.1950903220), F(0.3826834324), F(0.8314696123), + -F(1.0000000000), F(0.8314696123), F(0.3826834324),-F(0.1950903220), + -F(0.7071067812),-F(0.9807852804),-F(0.9238795325),-F(0.5555702330), + -F(0.0000000000),-F(0.5555702330),-F(0.9238795325),-F(0.9807852804), + + -F(0.7071067812),-F(0.9807852804),-F(0.3826834324), F(0.5555702330), + -F(1.0000000000), F(0.5555702330),-F(0.3826834324),-F(0.9807852804), + -F(0.7071067812), F(0.1950903220), F(0.9238795325), F(0.8314696123), + F(0.0000000000), F(0.8314696123), F(0.9238795325), F(0.1950903220), + + F(0.7071067812),-F(0.5555702330),-F(0.9238795325), F(0.1950903220), + -F(1.0000000000), F(0.1950903220),-F(0.9238795325),-F(0.5555702330), + F(0.7071067812), F(0.8314696123),-F(0.3826834324),-F(0.9807852804), + -F(0.0000000000),-F(0.9807852804),-F(0.3826834324), F(0.8314696123), + + F(0.7071067812), F(0.5555702330),-F(0.9238795325),-F(0.1950903220), + -F(1.0000000000),-F(0.1950903220),-F(0.9238795325), F(0.5555702330), + F(0.7071067812),-F(0.8314696123),-F(0.3826834324), F(0.9807852804), + F(0.0000000000), F(0.9807852804),-F(0.3826834324),-F(0.8314696123), + + -F(0.7071067812), F(0.9807852804),-F(0.3826834324),-F(0.5555702330), + -F(1.0000000000),-F(0.5555702330),-F(0.3826834324), F(0.9807852804), + -F(0.7071067812),-F(0.1950903220), F(0.9238795325),-F(0.8314696123), + -F(0.0000000000),-F(0.8314696123), F(0.9238795325),-F(0.1950903220), + + -F(0.7071067812), F(0.1950903220), F(0.3826834324),-F(0.8314696123), + -F(1.0000000000),-F(0.8314696123), F(0.3826834324), F(0.1950903220), + -F(0.7071067812), F(0.9807852804),-F(0.9238795325), F(0.5555702330), + -F(0.0000000000), F(0.5555702330),-F(0.9238795325), F(0.9807852804), + + F(0.7071067812),-F(0.8314696123), F(0.9238795325),-F(0.9807852804), + -F(1.0000000000),-F(0.9807852804), F(0.9238795325),-F(0.8314696123), + F(0.7071067812),-F(0.5555702330), F(0.3826834324),-F(0.1950903220), + -F(0.0000000000),-F(0.1950903220), F(0.3826834324),-F(0.5555702330), +}; +#undef F + +/* A2DP specification: Section 12.8 Tables + * + * Original values are premultiplied by 2 for better precision (that is the + * maximum which is possible without overflows) + * + * Note: in each block of 8 numbers sign was changed for elements 2 and 7 + * in order to compensate the same change applied to cos_table_fixed_4 + */ +#define SBC_PROTO_FIXED4_SCALE (sizeof(FIXED_T)*CHAR_BIT-1-SBC_FIXED8_EXTRA_BITS+1) +#define F(x) (FIXED_A)((x*2)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5) +static const FIXED_T _sbc_proto_fixed4[40] = { + F(0.00000000E+00), F(5.36548976E-04),-F(1.49188357E-03), F(2.73370904E-03), + F(3.83720193E-03), F(3.89205149E-03), F(1.86581691E-03), F(3.06012286E-03), + + F(1.09137620E-02), F(2.04385087E-02),-F(2.88757392E-02), F(3.21939290E-02), + F(2.58767811E-02), F(6.13245186E-03),-F(2.88217274E-02), F(7.76463494E-02), + + F(1.35593274E-01), F(1.94987841E-01),-F(2.46636662E-01), F(2.81828203E-01), + F(2.94315332E-01), F(2.81828203E-01), F(2.46636662E-01),-F(1.94987841E-01), + + -F(1.35593274E-01),-F(7.76463494E-02), F(2.88217274E-02), F(6.13245186E-03), + F(2.58767811E-02), F(3.21939290E-02), F(2.88757392E-02),-F(2.04385087E-02), + + -F(1.09137620E-02),-F(3.06012286E-03),-F(1.86581691E-03), F(3.89205149E-03), + F(3.83720193E-03), F(2.73370904E-03), F(1.49188357E-03),-F(5.36548976E-04), +}; +#undef F + +/* + * To produce this cosine matrix in Octave: + * + * b = zeros(4, 8); + * for i = 0:3 for j = 0:7 b(i+1, j+1) = cos( (i + 0.5) * (j - 2) * (pi/4) ) endfor endfor; + * printf("F(%.10f), ", b'); + * + * Note: in each block of 8 numbers sign was changed for elements 2 and 7. + * Change of sign for element 2 allows to replace constant 1.0 (not representable + * in Q15 format) with -1.0 (fine with Q15). + * Changed sign for element 7 allows to have more similar constants + * and simplify subband filter function code. + */ +#define SBC_COS_TABLE_FIXED4_SCALE (sizeof(FIXED_T)*CHAR_BIT-1+SBC_FIXED8_EXTRA_BITS) +#define F(x) (FIXED_A)((x)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5) +static const FIXED_T cos_table_fixed_4[32] = { + F(0.7071067812), F(0.9238795325),-F(1.0000000000), F(0.9238795325), + F(0.7071067812), F(0.3826834324), F(0.0000000000), F(0.3826834324), + + -F(0.7071067812), F(0.3826834324),-F(1.0000000000), F(0.3826834324), + -F(0.7071067812),-F(0.9238795325),-F(0.0000000000),-F(0.9238795325), + + -F(0.7071067812),-F(0.3826834324),-F(1.0000000000),-F(0.3826834324), + -F(0.7071067812), F(0.9238795325), F(0.0000000000), F(0.9238795325), + + F(0.7071067812),-F(0.9238795325),-F(1.0000000000),-F(0.9238795325), + F(0.7071067812),-F(0.3826834324),-F(0.0000000000),-F(0.3826834324), +}; +#undef F ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-22 23:30 ` Siarhei Siamashka @ 2008-12-23 1:00 ` Marcel Holtmann 2008-12-23 8:20 ` Jaska.Uimonen 2008-12-23 10:45 ` Siarhei Siamashka 0 siblings, 2 replies; 35+ messages in thread From: Marcel Holtmann @ 2008-12-23 1:00 UTC (permalink / raw) To: Siarhei Siamashka; +Cc: ext Brad Midgley, Jaska Uimonen, linux-bluetooth Hi Siarhei, > > We had a talk with Jaska Uimonen here, and now I'm kind of delegated to > > finish the work on this filtering function for SBC encoder (including the > > final addition of ARM assembly optimizations). He provided me with his > > last variant of code, which contains some more optimizations to reduce the > > number of operations and also loops unrolling. I will add his changes to > > the patch on next iteration. > > > > Now the question is how to best integrate a fixed filtering function to git > > repository? If I just continue adding changes to the patch in order to make > > it a faster, it will be also not so obvious to see how we got to these code > > transformations just from the commit log. > > Next iteration done. Added support for 4 subbands, number of arithmetic > operations reduced (but without loop unrolling for better code readability), > precision improved for both 16-bit and 32-bit fixed point, 'neginv' macro is > now more portable and faster. The rest is in the code comments. I don't mind having patches as attachment, but this makes it hard to review and comment on them. Especially when it comes to stuff like coding style (since I have no ideas about the rest). + t1[0] = t1[1] = t1[2] = t1[3] = + (FIXED_A)1 << (SBC_PROTO_FIXED4_SCALE-1); Should be like this: (FIXED_A) 1 << (SBC_PROTO_FIXED4_SCALE - 1); Also do you need the (FIXED_A) cast? + for (hop = 0; hop < 40; hop += 8) { + t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed4[hop]; Same here. There has to be a space after every case. + t1[i] = (FIXED_A)1 << (SBC_COS_TABLE_FIXED4_SCALE-1-SCALE_OUT_BITS); And between every operation there has to be a space: SCALE - 1 - SCALE. In this case I would actually put the - 1 at the end, but that is pure cosmetics and not a coding style violation. Please fix all of these. There at least 8 or so. +#define neginv(x) ((-2 >> 1 == -1) ? \ + ((((int32_t)(x)) >> 31) ^ (int32_t)(x)) : \ + ((x >= 0) ? (x) : -(x)-1)) Space after cast. Space before and after operator. +#ifdef SBC_HIGH_PRECISION +# define FIXED_A int64_t /* data type for fixed point accumulator */ +# define FIXED_T int32_t /* data type for fixed point constants */ +# define SBC_FIXED8_EXTRA_BITS 16 +#else +# define FIXED_A int32_t /* data type for fixed point accumulator */ +# define FIXED_T int16_t /* data type for fixed point constants */ +# define SBC_FIXED8_EXTRA_BITS 0 +#endif No space between # and define. I know that this is meant to improve readability, but I don't see it. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-23 1:00 ` Marcel Holtmann @ 2008-12-23 8:20 ` Jaska.Uimonen 2008-12-23 11:14 ` Siarhei Siamashka 2008-12-23 10:45 ` Siarhei Siamashka 1 sibling, 1 reply; 35+ messages in thread From: Jaska.Uimonen @ 2008-12-23 8:20 UTC (permalink / raw) To: marcel, siarhei.siamashka; +Cc: bmidgley, linux-bluetooth Hi Guys, As Siarhei said, we had a talk last week about=20 The optimizations to the code.=20 I added the modifications to the code, which reduce=20 The amount of operations quite a lot. The problem=20 With this modification is that it also reduces the=20 Readability of the code.=20 So because of the redundancy in the cosine transform=20 it is possible to reduce the number of variables=20 and operations in the preceding filtering.=20 This is very hard to explain with comments in the code=20 (so you would write something like "here should be t[12], but=20 The table goes only to 8 because of redundancy in the following=20 Cosine transform). To really make the stuff=20 Understandable one should write some small=20 Wikipage or extensive comments to the code.=20 So this is the path the previous code also took, but=20 At some point there was a mistake. I really don't still=20 Get how the anamatrix stuff was calculated, but as=20 You see it takes time to reverse engineer :)=20 But I suggest me and Siarhei clean the code internally=20 And try really hard to follow the conventions. It starts=20 To get little bit messy, because we both have many=20 Versions of the code. Br, Jaska >-----Original Message----- >From: ext Marcel Holtmann [mailto:marcel@holtmann.org]=20 >Sent: 23 December, 2008 03:00 >To: Siamashka Siarhei (Nokia-D/Helsinki) >Cc: ext Brad Midgley; Uimonen Jaska (Nokia-D/Helsinki);=20 >linux-bluetooth@vger.kernel.org >Subject: Re: [RFC/PATCH] sbc: new filtering function for 8=20 >band fixed point encoding > >Hi Siarhei, > >> > We had a talk with Jaska Uimonen here, and now I'm kind of=20 >delegated=20 >> > to finish the work on this filtering function for SBC encoder=20 >> > (including the final addition of ARM assembly optimizations). He=20 >> > provided me with his last variant of code, which contains=20 >some more=20 >> > optimizations to reduce the number of operations and also loops=20 >> > unrolling. I will add his changes to the patch on next iteration. >> > >> > Now the question is how to best integrate a fixed=20 >filtering function=20 >> > to git repository? If I just continue adding changes to=20 >the patch in=20 >> > order to make it a faster, it will be also not so obvious=20 >to see how=20 >> > we got to these code transformations just from the commit log. >>=20 >> Next iteration done. Added support for 4 subbands, number of=20 >> arithmetic operations reduced (but without loop unrolling for better=20 >> code readability), precision improved for both 16-bit and=20 >32-bit fixed=20 >> point, 'neginv' macro is now more portable and faster. The=20 >rest is in the code comments. > >I don't mind having patches as attachment, but this makes it=20 >hard to review and comment on them. Especially when it comes=20 >to stuff like coding style (since I have no ideas about the rest). > >+ t1[0] =3D t1[1] =3D t1[2] =3D t1[3] =3D >+ (FIXED_A)1 << (SBC_PROTO_FIXED4_SCALE-1); > >Should be like this: (FIXED_A) 1 << (SBC_PROTO_FIXED4_SCALE - 1); > >Also do you need the (FIXED_A) cast? > >+ for (hop =3D 0; hop < 40; hop +=3D 8) { >+ t1[0] +=3D (FIXED_A)in[hop] * _sbc_proto_fixed4[hop]; > >Same here. There has to be a space after every case. > >+ t1[i] =3D (FIXED_A)1 <<=20 >+ (SBC_COS_TABLE_FIXED4_SCALE-1-SCALE_OUT_BITS); > >And between every operation there has to be a space: SCALE - 1 - SCALE. >In this case I would actually put the - 1 at the end, but that=20 >is pure cosmetics and not a coding style violation. > >Please fix all of these. There at least 8 or so. > >+#define neginv(x) ((-2 >> 1 =3D=3D -1) ? \ >+ ((((int32_t)(x)) >> 31) ^ (int32_t)(x)) : \ >+ ((x >=3D 0) ? (x) : -(x)-1)) > >Space after cast. Space before and after operator. > >+#ifdef SBC_HIGH_PRECISION >+# define FIXED_A int64_t /* data type for fixed point=20 >accumulator */ #=20 >+define FIXED_T int32_t /* data type for fixed point constants */ #=20 >+define SBC_FIXED8_EXTRA_BITS 16 #else # define FIXED_A=20 >int32_t /* data=20 >+type for fixed point accumulator */ # define FIXED_T int16_t /* data=20 >+type for fixed point constants */ # define SBC_FIXED8_EXTRA_BITS 0=20 >+#endif > >No space between # and define. I know that this is meant to=20 >improve readability, but I don't see it. > >Regards > >Marcel > > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-23 8:20 ` Jaska.Uimonen @ 2008-12-23 11:14 ` Siarhei Siamashka 0 siblings, 0 replies; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-23 11:14 UTC (permalink / raw) To: Uimonen Jaska (Nokia-D/Helsinki) Cc: ext Marcel Holtmann, ext Brad Midgley, linux-bluetooth On Tuesday 23 December 2008 10:20:25 Uimonen Jaska (Nokia-D/Helsinki) wrote: > Hi Guys, > > As Siarhei said, we had a talk last week about > The optimizations to the code. > > I added the modifications to the code, which reduce > The amount of operations quite a lot. Yes, thank you very much for doing this part of work, I included these modifications into the last revision of my patch. And of course, your initial patch to fix the SBC quality issues was invaluable. I hope that you will not be forgotten to be credited properly when the filtering functions are complete and committed. > The problem > With this modification is that it also reduces the > Readability of the code. Yes, that's one of the reasons why I'm trying to post different revisions here, so all the history of modifications can be tracked if somebody is interested. > So because of the redundancy in the cosine transform > it is possible to reduce the number of variables > and operations in the preceding filtering. > This is very hard to explain with comments in the code > (so you would write something like "here should be t[12], but > The table goes only to 8 because of redundancy in the following > Cosine transform). To really make the stuff > Understandable one should write some small > Wikipage or extensive comments to the code. > > So this is the path the previous code also took, but > At some point there was a mistake. I really don't still > Get how the anamatrix stuff was calculated, but as > You see it takes time to reverse engineer :) > > But I suggest me and Siarhei clean the code internally > And try really hard to follow the conventions. It starts > To get little bit messy, because we both have many > Versions of the code. I think that my last revision of patch is more or less complete and needs only minor tweaks and cosmetic changes. It's not too obfuscated yet, and the logic still can be seen (hopefully). I intentionally decided not to include loops unrolling part as it actually makes code harder to optimize further (for example add SIMD optimizations). The idea is to have some kind of "reference" implementation, which is reasonably compact and simple and also configurable for having high precision mode (very useful for testing purposes). Implementations, optimized for various platforms can be derived from it: 1. Platforms where multiplications are expensive and you want to avoid as many of them as possible (reintroduce 'anamatrix' table, have shifts and additions instead of multiplications, etc) 2. Platforms where the total number of operations is desired to be kept at minimum (tables need to have unused and duplicate elements removed in order to reduce the number of operations for loading constants) 3. Platforms where SIMD optimizations are possible (the code and tables should have a regular structure, suitable for vectorizing, some table elements may have to be be reordered) As one can see, these platform specific optimizations have somewhat contradictory requirements. I tried to arrange the code in such a way, that any of such further optimizations can be easily introduced based on it. -- Best regards, Siarhei Siamashka ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-23 1:00 ` Marcel Holtmann 2008-12-23 8:20 ` Jaska.Uimonen @ 2008-12-23 10:45 ` Siarhei Siamashka 2008-12-23 11:48 ` Marcel Holtmann 2008-12-29 10:46 ` [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Siarhei Siamashka 1 sibling, 2 replies; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-23 10:45 UTC (permalink / raw) To: ext Marcel Holtmann; +Cc: ext Brad Midgley, Jaska Uimonen, linux-bluetooth On Tuesday 23 December 2008 03:00:15 ext Marcel Holtmann wrote: > Hi Siarhei, > > > > We had a talk with Jaska Uimonen here, and now I'm kind of delegated to > > > finish the work on this filtering function for SBC encoder (including > > > the final addition of ARM assembly optimizations). He provided me with > > > his last variant of code, which contains some more optimizations to > > > reduce the number of operations and also loops unrolling. I will add > > > his changes to the patch on next iteration. > > > > > > Now the question is how to best integrate a fixed filtering function to > > > git repository? If I just continue adding changes to the patch in order > > > to make it a faster, it will be also not so obvious to see how we got > > > to these code transformations just from the commit log. > > > > Next iteration done. Added support for 4 subbands, number of arithmetic > > operations reduced (but without loop unrolling for better code > > readability), precision improved for both 16-bit and 32-bit fixed point, > > 'neginv' macro is now more portable and faster. The rest is in the code > > comments. > > I don't mind having patches as attachment, but this makes it hard to > review and comment on them. I thought that as long as the attachments are plain text and 'suggest automatic display' property is set, there should not be much problems reviewing them. I'm sorry for my incompetence in this part, but what do you recommend for posting patches? A link to some guide is sufficient. Well, it is good to always learn some new things. > Especially when it comes to stuff like > coding style (since I have no ideas about the rest). > > + t1[0] = t1[1] = t1[2] = t1[3] = > + (FIXED_A)1 << (SBC_PROTO_FIXED4_SCALE-1); > > Should be like this: (FIXED_A) 1 << (SBC_PROTO_FIXED4_SCALE - 1); Do you have a reference to the coding style standard guide? This particular requirement for casts and a space seems quite unusual and I have never seen it before. > Also do you need the (FIXED_A) cast? Yes, it's good to have it for better portability. We need to make sure that this bit is never shifted outside of the range of int type (constants have int type by default). FIXED_A type can be 64-bit for high precision enabled build. Even if SBC_FIXED8_EXTRA_BITS constant is currently set so that there is no overflow, this kind of precaution is better to have in order not to have any kind of unexpected nasty effect when experimenting with tuning various shift values. > > + for (hop = 0; hop < 40; hop += 8) { > + t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed4[hop]; > > Same here. There has to be a space after every case. > > + t1[i] = (FIXED_A)1 << > (SBC_COS_TABLE_FIXED4_SCALE-1-SCALE_OUT_BITS); > > And between every operation there has to be a space: SCALE - 1 - SCALE. > In this case I would actually put the - 1 at the end, but that is pure > cosmetics and not a coding style violation. This (SBC_COS_TABLE_FIXED4_SCALE-1-SCALE_OUT_BITS) expression can be interpreted as ((SBC_COS_TABLE_FIXED4_SCALE - 1) - SCALE_OUT_BITS). Putting -1 at the end is kind of obfuscation. > Please fix all of these. There at least 8 or so. > > +#define neginv(x) ((-2 >> 1 == -1) ? \ > + ((((int32_t)(x)) >> 31) ^ (int32_t)(x)) : \ > + ((x >= 0) ? (x) : -(x)-1)) > > Space after cast. Space before and after operator. > > +#ifdef SBC_HIGH_PRECISION > +# define FIXED_A int64_t /* data type for fixed point accumulator */ > +# define FIXED_T int32_t /* data type for fixed point constants */ > +# define SBC_FIXED8_EXTRA_BITS 16 > +#else > +# define FIXED_A int32_t /* data type for fixed point accumulator */ > +# define FIXED_T int16_t /* data type for fixed point constants */ > +# define SBC_FIXED8_EXTRA_BITS 0 > +#endif > > No space between # and define. I know that this is meant to improve > readability, but I don't see it. OK, I'll try to fix these. Getting rid of some spaces was done to try fitting some lines into 80 characters (that's also not always achieved yet). Right now I'm also doublechecking correctness of the code to ensure that there are no overflows of other problems related to audio quality. -- Best regards, Siarhei Siamashka ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-23 10:45 ` Siarhei Siamashka @ 2008-12-23 11:48 ` Marcel Holtmann 2008-12-29 9:16 ` Testing SBC filtering functions Christian Hoene 2008-12-29 10:46 ` [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Siarhei Siamashka 1 sibling, 1 reply; 35+ messages in thread From: Marcel Holtmann @ 2008-12-23 11:48 UTC (permalink / raw) To: Siarhei Siamashka; +Cc: ext Brad Midgley, Jaska Uimonen, linux-bluetooth Hi Siarhei, > > > > We had a talk with Jaska Uimonen here, and now I'm kind of delegated to > > > > finish the work on this filtering function for SBC encoder (including > > > > the final addition of ARM assembly optimizations). He provided me with > > > > his last variant of code, which contains some more optimizations to > > > > reduce the number of operations and also loops unrolling. I will add > > > > his changes to the patch on next iteration. > > > > > > > > Now the question is how to best integrate a fixed filtering function to > > > > git repository? If I just continue adding changes to the patch in order > > > > to make it a faster, it will be also not so obvious to see how we got > > > > to these code transformations just from the commit log. > > > > > > Next iteration done. Added support for 4 subbands, number of arithmetic > > > operations reduced (but without loop unrolling for better code > > > readability), precision improved for both 16-bit and 32-bit fixed point, > > > 'neginv' macro is now more portable and faster. The rest is in the code > > > comments. > > > > I don't mind having patches as attachment, but this makes it hard to > > review and comment on them. > > I thought that as long as the attachments are plain text and 'suggest > automatic display' property is set, there should not be much problems > reviewing them. > > I'm sorry for my incompetence in this part, but what do you recommend for > posting patches? A link to some guide is sufficient. > > Well, it is good to always learn some new things. the kernel contains good documentation about how to send inline patches. However as I said, I don't mind that much since I can easily apply them, but for commenting on patches inline is easier. > > Especially when it comes to stuff like > > coding style (since I have no ideas about the rest). > > > > + t1[0] = t1[1] = t1[2] = t1[3] = > > + (FIXED_A)1 << (SBC_PROTO_FIXED4_SCALE-1); > > > > Should be like this: (FIXED_A) 1 << (SBC_PROTO_FIXED4_SCALE - 1); > > Do you have a reference to the coding style standard guide? This > particular > requirement for casts and a space seems quite unusual and I have never > seen > it before. It is the kernel coding style. Check Greg KH's paper from OLS some years ago. > OK, I'll try to fix these. Getting rid of some spaces was done to try fitting > some lines into 80 characters (that's also not always achieved yet). This depends on the code. In normal code you could use continue and break a lot, but within SBC this might not generate good binaries. Don't try to omit whitespaces. Just don't. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* Testing SBC filtering functions 2008-12-23 11:48 ` Marcel Holtmann @ 2008-12-29 9:16 ` Christian Hoene 2008-12-29 10:00 ` Marcel Holtmann 2008-12-29 11:06 ` Testing SBC filtering functions Siarhei Siamashka 0 siblings, 2 replies; 35+ messages in thread From: Christian Hoene @ 2008-12-29 9:16 UTC (permalink / raw) To: 'Marcel Holtmann', 'Siarhei Siamashka' Cc: 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth Hello all, Merry Christmas! I wrote this script to make some tests on the SBC decoder and encoder using the recommended testing procedure with the reference bitstreams, the reference codec and PEAQ. http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/ I got little bit confused with all the latest patches and whether they are included or not. Just send me an email on which patch you like to have tested. Running the tests just takes half an hour; me to answer my emails maybe a bit longer. Greetings Christian ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Testing SBC filtering functions 2008-12-29 9:16 ` Testing SBC filtering functions Christian Hoene @ 2008-12-29 10:00 ` Marcel Holtmann 2008-12-29 10:55 ` Christian Hoene 2008-12-29 11:06 ` Testing SBC filtering functions Siarhei Siamashka 1 sibling, 1 reply; 35+ messages in thread From: Marcel Holtmann @ 2008-12-29 10:00 UTC (permalink / raw) To: hoene Cc: 'Siarhei Siamashka', 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth Hi Christian, > I wrote this script to make some tests on the SBC decoder and encoder using > the recommended testing procedure with the reference bitstreams, the > reference codec and PEAQ. > http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/ > I got little bit confused with all the latest patches and whether they are > included or not. Just send me an email on which patch you like to have > tested. Running the tests just takes half an hour; me to answer my emails > maybe a bit longer. I leave the patches to the guys that actually know what they are doing. So I like the test script, but I would really prefer if we can use sbctester.c program to verify the results. What do you think? Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: Testing SBC filtering functions 2008-12-29 10:00 ` Marcel Holtmann @ 2008-12-29 10:55 ` Christian Hoene 2008-12-29 12:03 ` Marcel Holtmann 0 siblings, 1 reply; 35+ messages in thread From: Christian Hoene @ 2008-12-29 10:55 UTC (permalink / raw) To: 'Marcel Holtmann' Cc: 'Siarhei Siamashka', 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth Hello Marcel, > I leave the patches to the guys that actually know what they are doing. > > So I like the test script, but I would really prefer if we can use > sbctester.c program to verify the results. What do you think? The sbctester.c works fine for the decoder and is used in the testing script. The testing of the encoder might also work with test procedure implemented in sbctester.c. However, the official Bluetooth testing spec requires that "9.2.2.1 SBC Encoder ... [E2] The subjective quality (measured by standardized way or by objective testing methods, see [6]and [7]) shall be equivalent to that of reference encoder." with " [6] Rec. ITU-R BS.1116-1, "METHODS FOR THE SUBJECTIVE ASSESSMENT OF SMALL IMPAIRMENTS IN AUDIO SYSTEMS INCLUDING MULTICHANNEL SOUND SYSTEMS", 1994-1997 [7] Rec. ITU-R BS.1387-1, "METHOD FOR OBJECTIVE MEASUREMENTS OF PERCEIVED AUDIO QUALITY", 2001". Until the encoder works fine, we should use BS.1387-1 (PEAQ). If all errors have been fixed, then sbctester.c might can be used instead of ITU-R BS.1387-1. However, then, we might still need the reference implementations or some pretranslated wav and sbc files. Until the encoder is bug-fixed, it makes sense to work with this intermediate solution. With best regards Christian ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: Testing SBC filtering functions 2008-12-29 10:55 ` Christian Hoene @ 2008-12-29 12:03 ` Marcel Holtmann 2008-12-29 12:31 ` Christian Hoene 0 siblings, 1 reply; 35+ messages in thread From: Marcel Holtmann @ 2008-12-29 12:03 UTC (permalink / raw) To: hoene Cc: 'Siarhei Siamashka', 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth Hi Christian, > > I leave the patches to the guys that actually know what they are doing. > > > > So I like the test script, but I would really prefer if we can use > > sbctester.c program to verify the results. What do you think? > > The sbctester.c works fine for the decoder and is used in the testing > script. The testing of the encoder might also work with test procedure > implemented in sbctester.c. However, the official Bluetooth testing spec > requires that > "9.2.2.1 SBC Encoder ... [E2] The subjective quality (measured by > standardized way or by objective testing > methods, see [6]and [7]) shall be equivalent to that of reference encoder." > with " [6] Rec. ITU-R BS.1116-1, "METHODS FOR THE SUBJECTIVE ASSESSMENT OF > SMALL IMPAIRMENTS IN AUDIO SYSTEMS INCLUDING MULTICHANNEL SOUND SYSTEMS", > 1994-1997 > [7] Rec. ITU-R BS.1387-1, "METHOD FOR OBJECTIVE MEASUREMENTS OF PERCEIVED > AUDIO QUALITY", 2001". > Until the encoder works fine, we should use BS.1387-1 (PEAQ). If all errors > have been fixed, then sbctester.c might can be used instead of ITU-R > BS.1387-1. However, then, we might still need the reference implementations > or some pretranslated wav and sbc files. I really prefer if we have an open source solution available to everybody for the SBC conformance testing. If sbctester.c is not good enough than we have to write one. I also don't mind using the reference encoder/decoder and the reference files. Using wine is not optimal, but I don't really mind it. Another question that I have is how you use the sbc_enc_test_01.wav and sbc_enc_test_02.wav files with Windows reference encoder? It turns into a busy loop for me. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: Testing SBC filtering functions 2008-12-29 12:03 ` Marcel Holtmann @ 2008-12-29 12:31 ` Christian Hoene 2008-12-29 12:41 ` Marcel Holtmann 0 siblings, 1 reply; 35+ messages in thread From: Christian Hoene @ 2008-12-29 12:31 UTC (permalink / raw) To: 'Marcel Holtmann'; +Cc: linux-bluetooth Hello Marcel, > I really prefer if we have an open source solution available to > everybody for the SBC conformance testing. If sbctester.c is not good > enough than we have to write one. Sbctester.c is perfect for the decoder. For the encoder, we might also take it or write something better to replace PEAQ. BTW; then I will have spent 6000 EUR on buying one PEAQ license in vain ;-) > I also don't mind using the reference > encoder/decoder and the reference files. Using wine is not optimal, but > I don't really mind it. Fine. But can it be added to the repository? Or shall a description be given on how to download it from the Bluetooth SIG web page. > > Another question that I have is how you use the sbc_enc_test_01.wav and > sbc_enc_test_02.wav files with Windows reference encoder? It turns into > a busy loop for me. Which file do are you referring too? I usually compress the original.wav file (on the top of the second table). Which line options do you use for the encoder? Greetings Christian ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: Testing SBC filtering functions 2008-12-29 12:31 ` Christian Hoene @ 2008-12-29 12:41 ` Marcel Holtmann 2008-12-29 13:11 ` Christian Hoene 0 siblings, 1 reply; 35+ messages in thread From: Marcel Holtmann @ 2008-12-29 12:41 UTC (permalink / raw) To: hoene; +Cc: linux-bluetooth Hi Christian, > > I really prefer if we have an open source solution available to > > everybody for the SBC conformance testing. If sbctester.c is not good > > enough than we have to write one. > > Sbctester.c is perfect for the decoder. For the encoder, we might also take > it or write something better to replace PEAQ. I am all for that since we need to be able to pass SBC conformance with pure open source tools. Otherwise it becomes painful for some people. > BTW; then I will have spent 6000 EUR on buying one PEAQ license in vain ;-) I know multiple better ways of spending that money ;) > > I also don't mind using the reference > > encoder/decoder and the reference files. Using wine is not optimal, but > > I don't really mind it. > > Fine. But can it be added to the repository? Or shall a description be given > on how to download it from the Bluetooth SIG web page. No we can't add it. I would have to write something up on how to download it and how testing is suppose to work. > > Another question that I have is how you use the sbc_enc_test_01.wav and > > sbc_enc_test_02.wav files with Windows reference encoder? It turns into > > a busy loop for me. > > Which file do are you referring too? I usually compress the original.wav > file (on the top of the second table). Which line options do you use for the > encoder? The official files from the Bluetooth SIG and I don't know which options would make them encode correctly. Just calling the reference encoder with the filename makes it run into a busy loop. With wine and natively running on Windows XP. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: Testing SBC filtering functions 2008-12-29 12:41 ` Marcel Holtmann @ 2008-12-29 13:11 ` Christian Hoene 2008-12-29 13:17 ` Marcel Holtmann 0 siblings, 1 reply; 35+ messages in thread From: Christian Hoene @ 2008-12-29 13:11 UTC (permalink / raw) To: 'Marcel Holtmann'; +Cc: linux-bluetooth Hi Marcel, > The official files from the Bluetooth SIG and I don't know which options > would make them encode correctly. Just calling the reference encoder > with the filename makes it run into a busy loop. With wine and natively > running on Windows XP. No, the 28 official files from the Bluetooth SIG are already compressed. Use the decoder to get the WAV files (or click on my web page on the links in the first table). For testing the encoder, any kind of wav file can be considered. Usually, I run my tests with about 20 test wav files used by the ITU-R. For my web page, I only considered one stereo sound file (any other will do, too). Greetings Christian ^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: Testing SBC filtering functions 2008-12-29 13:11 ` Christian Hoene @ 2008-12-29 13:17 ` Marcel Holtmann 2009-01-01 14:29 ` Testing SBC encoder correctness with sbctester works Christian Hoene 0 siblings, 1 reply; 35+ messages in thread From: Marcel Holtmann @ 2008-12-29 13:17 UTC (permalink / raw) To: hoene; +Cc: linux-bluetooth Hi Christian, > > The official files from the Bluetooth SIG and I don't know which options > > would make them encode correctly. Just calling the reference encoder > > with the filename makes it run into a busy loop. With wine and natively > > running on Windows XP. > > No, the 28 official files from the Bluetooth SIG are already compressed. Use > the decoder to get the WAV files (or click on my web page on the links in > the first table). there are also *.wav files in one of the test specification zip files. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* Testing SBC encoder correctness with sbctester works... 2008-12-29 13:17 ` Marcel Holtmann @ 2009-01-01 14:29 ` Christian Hoene 0 siblings, 0 replies; 35+ messages in thread From: Christian Hoene @ 2009-01-01 14:29 UTC (permalink / raw) To: 'Marcel Holtmann'; +Cc: linux-bluetooth Hallo Marcel, Good news. We can use the sbctester for conformance testing of the encoder using the procedure in my testing.sh script. http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/testing.sh Only for minor optimizations PEAQ might be a benefit. Greetings Christian > -----Original Message----- > From: Marcel Holtmann [mailto:marcel@holtmann.org] > Sent: Monday, December 29, 2008 2:18 PM > To: hoene@uni-tuebingen.de > Cc: linux-bluetooth@vger.kernel.org > Subject: RE: Testing SBC filtering functions > > Hi Christian, > > > > The official files from the Bluetooth SIG and I don't know which options > > > would make them encode correctly. Just calling the reference encoder > > > with the filename makes it run into a busy loop. With wine and natively > > > running on Windows XP. > > > > No, the 28 official files from the Bluetooth SIG are already compressed. Use > > the decoder to get the WAV files (or click on my web page on the links in > > the first table). > > there are also *.wav files in one of the test specification zip files. > > Regards > > Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Testing SBC filtering functions 2008-12-29 9:16 ` Testing SBC filtering functions Christian Hoene 2008-12-29 10:00 ` Marcel Holtmann @ 2008-12-29 11:06 ` Siarhei Siamashka 2008-12-29 12:04 ` Marcel Holtmann 1 sibling, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-29 11:06 UTC (permalink / raw) To: hoene Cc: 'Marcel Holtmann', 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth On Monday 29 December 2008 11:16:23 ext Christian Hoene wrote: > Hello all, > > Merry Christmas! > > I wrote this script to make some tests on the SBC decoder and encoder using > the recommended testing procedure with the reference bitstreams, the > reference codec and PEAQ. > http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/ > I got little bit confused with all the latest patches and whether they are > included or not. Just send me an email on which patch you like to have > tested. Running the tests just takes half an hour; me to answer my emails > maybe a bit longer. Please try the following patch (apply it to the latest git): http://marc.info/?l=linux-bluetooth&m=123054787830678&w=2 You can try the patch "as is", and also with SBC_HIGH_PRECISION define uncommented. High precision mode is naturally more likely to pass the conformance tests. I used my own script for testing with 'tiny_psnr' tool for comparing original file before encoding and the final decoded result (it measures standard deviation and PSNR). It would be interesting to see how our results correlate. According to my tests, encoder should now be OK with this patch. Decoder is definitely in a better shape than the current encoder from git. But the decoder also seems to have bugs which degrade quality significantly in some cases. Best regards, Siarhei Siamashka ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Testing SBC filtering functions 2008-12-29 11:06 ` Testing SBC filtering functions Siarhei Siamashka @ 2008-12-29 12:04 ` Marcel Holtmann 2008-12-29 14:36 ` Siarhei Siamashka 0 siblings, 1 reply; 35+ messages in thread From: Marcel Holtmann @ 2008-12-29 12:04 UTC (permalink / raw) To: Siarhei Siamashka Cc: hoene, 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth Hi Siarhei, > > I wrote this script to make some tests on the SBC decoder and encoder using > > the recommended testing procedure with the reference bitstreams, the > > reference codec and PEAQ. > > http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/ > > I got little bit confused with all the latest patches and whether they are > > included or not. Just send me an email on which patch you like to have > > tested. Running the tests just takes half an hour; me to answer my emails > > maybe a bit longer. > > Please try the following patch (apply it to the latest git): > http://marc.info/?l=linux-bluetooth&m=123054787830678&w=2 > > You can try the patch "as is", and also with SBC_HIGH_PRECISION define > uncommented. High precision mode is naturally more likely to pass the > conformance tests. > > I used my own script for testing with 'tiny_psnr' tool for comparing original file > before encoding and the final decoded result (it measures standard deviation > and PSNR). It would be interesting to see how our results correlate. can you open source tiny_psnr an we merge it into the BlueZ source? Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Testing SBC filtering functions 2008-12-29 12:04 ` Marcel Holtmann @ 2008-12-29 14:36 ` Siarhei Siamashka 2008-12-29 15:04 ` Siarhei Siamashka 0 siblings, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-29 14:36 UTC (permalink / raw) To: ext Marcel Holtmann Cc: hoene, 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 4053 bytes --] On Monday 29 December 2008 14:04:29 ext Marcel Holtmann wrote: > Hi Siarhei, > > > > I wrote this script to make some tests on the SBC decoder and encoder using > > > the recommended testing procedure with the reference bitstreams, the > > > reference codec and PEAQ. > > > http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/ > > > I got little bit confused with all the latest patches and whether they are > > > included or not. Just send me an email on which patch you like to have > > > tested. Running the tests just takes half an hour; me to answer my emails > > > maybe a bit longer. > > > > Please try the following patch (apply it to the latest git): > > http://marc.info/?l=linux-bluetooth&m=123054787830678&w=2 > > > > You can try the patch "as is", and also with SBC_HIGH_PRECISION define > > uncommented. High precision mode is naturally more likely to pass the > > conformance tests. > > > > I used my own script for testing with 'tiny_psnr' tool for comparing original file > > before encoding and the final decoded result (it measures standard deviation > > and PSNR). It would be interesting to see how our results correlate. > > can you open source tiny_psnr an we merge it into the BlueZ source? It is already open source. This very simple tool is used in ffmpeg project regression tests (and is part of ffmpeg distribution). It does not do anything extraordinary, but just analyzes the difference between several audio files and gets standard deviation and PSNR statistics. I just did not feel like reinventing the wheel and used it for estimating quality :) I only use the following patch on top of it (it can do automatic shift detection): http://article.gmane.org/gmane.comp.video.ffmpeg.devel/74597 I find this patch quite useful (especially as it turns out for SBC), but ffmpeg developers do not think so... A crude ruby script (sorry, I don't speak shell scripting language) is attached. I have been using it for testing audio quality. For example, here is the test of a high precision sbcenc build (bitpool=255): ./sbc_encode_test.rb BigBuckBunny-stereo.flac [2, 48000] ["-j -S -s8 -B16 -b255", "-j -l16 -n8 -r1569000"] --- comparing original / sbcenc + sbcdec --- stddev: 108.67 PSNR: 55.60 bytes:114519261/114520000 --- comparing original / sbcenc + sbc_decoder.exe --- stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 --- comparing original / sbc_encoder.exe + sbc_decoder.exe --- stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 --- comparing sbcenc + sbc_decoder.exe / sbc_encoder.exe + sbc_decoder.exe stddev: 0.01 PSNR:130.99 bytes:114519552/114519552 Test of a standard sbcenc build (bitpool=255): ./sbc_encode_test.rb BigBuckBunny-stereo.flac [2, 48000] ["-j -S -s8 -B16 -b255", "-j -l16 -n8 -r1569000"] --- comparing original / sbcenc + sbcdec --- stddev: 108.71 PSNR: 55.59 bytes:114519261/114520000 --- comparing original / sbcenc + sbc_decoder.exe --- stddev: 2.07 PSNR: 89.98 bytes:114519260/114520000 --- comparing original / sbc_encoder.exe + sbc_decoder.exe --- stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 --- comparing sbcenc + sbc_decoder.exe / sbc_encoder.exe + sbc_decoder.exe stddev: 1.77 PSNR: 91.34 bytes:114519552/114519552 Having high bitrate is useful for detecting bugs in analysis filter because its contribution to quality loss is more visible in this configuration and other factors play lesser role. I tested a lot of various files and settings configurations. According to this script, the quality of high precision sbcenc build matches the quality of reference encoder pretty well. Standard 16-bit fixed point sbcenc build introduces a very minor quality loss, noticeable only at extremely high bitrates. By the way, 'sbcdec' seems to introduce quite a noticeable distortion and is orders of magnitude less precise than the encoder. Of course it would be intertesting to see if the quality estimation done by using tiny_psnr is adequate and if it can replace a 6000 EUR tool for this particular purpose. Best regards, Siarhei Siamashka [-- Attachment #2: sbc_encode_test.rb --] [-- Type: application/x-ruby, Size: 2748 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Testing SBC filtering functions 2008-12-29 14:36 ` Siarhei Siamashka @ 2008-12-29 15:04 ` Siarhei Siamashka 0 siblings, 0 replies; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-29 15:04 UTC (permalink / raw) To: ext Marcel Holtmann Cc: hoene, 'ext Brad Midgley', 'Jaska Uimonen', linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 5248 bytes --] On Monday 29 December 2008 16:36:15 ext Siarhei Siamashka wrote: > On Monday 29 December 2008 14:04:29 ext Marcel Holtmann wrote: > > Hi Siarhei, > > > > > > I wrote this script to make some tests on the SBC decoder and encoder using > > > > the recommended testing procedure with the reference bitstreams, the > > > > reference codec and PEAQ. > > > > http://net.cs.uni-tuebingen.de/html/nexgenvoip/html/ > > > > I got little bit confused with all the latest patches and whether they are > > > > included or not. Just send me an email on which patch you like to have > > > > tested. Running the tests just takes half an hour; me to answer my emails > > > > maybe a bit longer. > > > > > > Please try the following patch (apply it to the latest git): > > > http://marc.info/?l=linux-bluetooth&m=123054787830678&w=2 > > > > > > You can try the patch "as is", and also with SBC_HIGH_PRECISION define > > > uncommented. High precision mode is naturally more likely to pass the > > > conformance tests. > > > > > > I used my own script for testing with 'tiny_psnr' tool for comparing original file > > > before encoding and the final decoded result (it measures standard deviation > > > and PSNR). It would be interesting to see how our results correlate. > > > > can you open source tiny_psnr an we merge it into the BlueZ source? > > It is already open source. This very simple tool is used in ffmpeg project > regression tests (and is part of ffmpeg distribution). It does not do > anything extraordinary, but just analyzes the difference between several > audio files and gets standard deviation and PSNR statistics. I just did not > feel like reinventing the wheel and used it for estimating quality :) > > I only use the following patch on top of it (it can do automatic shift detection): > http://article.gmane.org/gmane.comp.video.ffmpeg.devel/74597 > I find this patch quite useful (especially as it turns out for SBC), but ffmpeg > developers do not think so... > > A crude ruby script (sorry, I don't speak shell scripting language) is attached. > I have been using it for testing audio quality. > > For example, here is the test of a high precision sbcenc build (bitpool=255): > > ./sbc_encode_test.rb BigBuckBunny-stereo.flac > [2, 48000] > ["-j -S -s8 -B16 -b255", "-j -l16 -n8 -r1569000"] > --- comparing original / sbcenc + sbcdec --- > stddev: 108.67 PSNR: 55.60 bytes:114519261/114520000 > > --- comparing original / sbcenc + sbc_decoder.exe --- > stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 > > --- comparing original / sbc_encoder.exe + sbc_decoder.exe --- > stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 > > --- comparing sbcenc + sbc_decoder.exe / sbc_encoder.exe + sbc_decoder.exe > stddev: 0.01 PSNR:130.99 bytes:114519552/114519552 > > Test of a standard sbcenc build (bitpool=255): > > ./sbc_encode_test.rb BigBuckBunny-stereo.flac > [2, 48000] > ["-j -S -s8 -B16 -b255", "-j -l16 -n8 -r1569000"] > --- comparing original / sbcenc + sbcdec --- > stddev: 108.71 PSNR: 55.59 bytes:114519261/114520000 > > --- comparing original / sbcenc + sbc_decoder.exe --- > stddev: 2.07 PSNR: 89.98 bytes:114519260/114520000 > > --- comparing original / sbc_encoder.exe + sbc_decoder.exe --- > stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 > > --- comparing sbcenc + sbc_decoder.exe / sbc_encoder.exe + sbc_decoder.exe > stddev: 1.77 PSNR: 91.34 bytes:114519552/114519552 [...] > By the way, 'sbcdec' seems to introduce quite a noticeable distortion and > is orders of magnitude less precise than the encoder. Please disregard this statement. I only tried to add support for sbcdec as the last minute change (I did not pay any attention to the decoder audio quality before today) and forgot to change its output to a normal WAV format before comparing files. Sorry about that. A fixed script is attached and updated results are below. Test of a high precision sbcenc build (bitpool=255): ./sbc_encode_test.rb BigBuckBunny-stereo.flac [2, 48000] ["-j -S -s8 -B16 -b255", "-j -l16 -n8 -r1569000"] --- comparing original / sbcenc + sbcdec --- stddev: 4.49 PSNR: 83.26 bytes:114519260/114520000 --- comparing original / sbcenc + sbc_decoder.exe --- stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 --- comparing original / sbc_encoder.exe + sbc_decoder.exe --- stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 --- comparing sbcenc + sbc_decoder.exe / sbc_encoder.exe + sbc_decoder.exe stddev: 0.01 PSNR:130.99 bytes:114519552/114519552 Test of a standard sbcenc build (bitpool=255): ./sbc_encode_test.rb BigBuckBunny-stereo.flac [2, 48000] ["-j -S -s8 -B16 -b255", "-j -l16 -n8 -r1569000"] --- comparing original / sbcenc + sbcdec --- stddev: 4.85 PSNR: 82.60 bytes:114519260/114520000 --- comparing original / sbcenc + sbc_decoder.exe --- stddev: 2.07 PSNR: 89.98 bytes:114519260/114520000 --- comparing original / sbc_encoder.exe + sbc_decoder.exe --- stddev: 1.09 PSNR: 95.56 bytes:114519260/114520000 --- comparing sbcenc + sbc_decoder.exe / sbc_encoder.exe + sbc_decoder.exe stddev: 1.77 PSNR: 91.34 bytes:114519552/114519552 Still the sbcdec seems to be the major contributor to quality loss in all cases. Best regards, Siarhei Siamashka [-- Attachment #2: sbc_encode_test.rb --] [-- Type: application/x-ruby, Size: 2811 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-23 10:45 ` Siarhei Siamashka 2008-12-23 11:48 ` Marcel Holtmann @ 2008-12-29 10:46 ` Siarhei Siamashka 2008-12-29 11:56 ` Marcel Holtmann 1 sibling, 1 reply; 35+ messages in thread From: Siarhei Siamashka @ 2008-12-29 10:46 UTC (permalink / raw) To: ext Marcel Holtmann; +Cc: ext Brad Midgley, Jaska Uimonen, linux-bluetooth [-- Attachment #1: Type: text/plain, Size: 1462 bytes --] On Tuesday 23 December 2008 12:45:14 ext Siarhei Siamashka wrote: [coding style problems description skipped] > OK, I'll try to fix these. > Right now I'm also doublechecking correctness of the code to ensure that > there are no overflows of other problems related to audio quality. Here is (hopefully) the final revision of patch. I tested it with all the possible files that I could find or generate myself and it seems to work fine in all cases. The code was reverted to use fabs macro. After all, looks like it is the right way to handle scale factors and adheres to the specification. The weird effects on sound quality related to the use of fabs and the modified macro which was tried as a replacement have been apparently the side effects of other bugs. Also I have to admit that the change from http://marc.info/?l=linux-bluetooth&m=122946440507726&w=2 is not needed and the original patch was in fact correct. The output of quantizer is strictly a positive number (unless there are bugs or overflows). This was also changed back. Rounding for the values in the final analysis function output was removed (we already keep a lot of extra bits in output, so it does not matter for precision). Also the change of SCALE_OUT_BITS to 15 has a nice feature that the shifting of output values is not needed at all for 16-bit version (and naturally rounding is not needed here too), this should be good for performance. Best regards, Siarhei Siamashka [-- Attachment #2: 0002-New-SBC-analysis-filter-function-to-replace-current.patch --] [-- Type: text/x-diff, Size: 27301 bytes --] >From a148f08f98a357c37f2f02efe477948e0723345e Mon Sep 17 00:00:00 2001 From: Siarhei Siamashka <siarhei.siamashka@nokia.com> Date: Sun, 28 Dec 2008 03:22:59 +0200 Subject: [PATCH] New SBC analysis filter function to replace current broken code This code is heavily based on the patch submitted by Jaska Uimonen. Additional changes include preserving extra bits in the output of filter function for better precision, support for both 16-bit and 32-bit fixed point implementation. Sign of some table values was changed in order to preserve a regular code structure and have multiply-accumulate oparations only. No additional optimizations were applied as this code is intended to be some kind of "reference" implementation. Platform specific optimizations may require different tricks and can be branched off from this implementation. Some extra information about this code can be found in linux-bluetooth mailing list archive for December 2008. --- sbc/sbc.c | 306 +++++++++++++++++++----------------------------------- sbc/sbc_math.h | 14 +-- sbc/sbc_tables.h | 247 +++++++++++++++++++++++++++++++++++++------ 3 files changed, 323 insertions(+), 244 deletions(-) diff --git a/sbc/sbc.c b/sbc/sbc.c index d3dcd9a..ce52e1e 100644 --- a/sbc/sbc.c +++ b/sbc/sbc.c @@ -40,6 +40,7 @@ #include <string.h> #include <stdlib.h> #include <sys/types.h> +#include <limits.h> #include "sbc_math.h" #include "sbc_tables.h" @@ -93,7 +94,7 @@ struct sbc_decoder_state { struct sbc_encoder_state { int subbands; int position[2]; - int32_t X[2][160]; + int16_t X[2][160]; }; /* @@ -656,75 +657,47 @@ static void sbc_encoder_init(struct sbc_encoder_state *state, state->position[0] = state->position[1] = 9 * frame->subbands; } -static inline void _sbc_analyze_four(const int32_t *in, int32_t *out) +static inline void _sbc_analyze_four(const int16_t *in, int32_t *out) { - sbc_fixed_t t[8], s[5]; - - t[0] = SCALE4_STAGE1( /* Q8 */ - MULA(_sbc_proto_4[0], in[8] - in[32], /* Q18 */ - MUL( _sbc_proto_4[1], in[16] - in[24]))); - - t[1] = SCALE4_STAGE1( - MULA(_sbc_proto_4[2], in[1], - MULA(_sbc_proto_4[3], in[9], - MULA(_sbc_proto_4[4], in[17], - MULA(_sbc_proto_4[5], in[25], - MUL( _sbc_proto_4[6], in[33])))))); - - t[2] = SCALE4_STAGE1( - MULA(_sbc_proto_4[7], in[2], - MULA(_sbc_proto_4[8], in[10], - MULA(_sbc_proto_4[9], in[18], - MULA(_sbc_proto_4[10], in[26], - MUL( _sbc_proto_4[11], in[34])))))); - - t[3] = SCALE4_STAGE1( - MULA(_sbc_proto_4[12], in[3], - MULA(_sbc_proto_4[13], in[11], - MULA(_sbc_proto_4[14], in[19], - MULA(_sbc_proto_4[15], in[27], - MUL( _sbc_proto_4[16], in[35])))))); - - t[4] = SCALE4_STAGE1( - MULA(_sbc_proto_4[17], in[4] + in[36], - MULA(_sbc_proto_4[18], in[12] + in[28], - MUL( _sbc_proto_4[19], in[20])))); - - t[5] = SCALE4_STAGE1( - MULA(_sbc_proto_4[16], in[5], - MULA(_sbc_proto_4[15], in[13], - MULA(_sbc_proto_4[14], in[21], - MULA(_sbc_proto_4[13], in[29], - MUL( _sbc_proto_4[12], in[37])))))); - - /* don't compute t[6]... this term always multiplies - * with cos(pi/2) = 0 */ - - t[7] = SCALE4_STAGE1( - MULA(_sbc_proto_4[6], in[7], - MULA(_sbc_proto_4[5], in[15], - MULA(_sbc_proto_4[4], in[23], - MULA(_sbc_proto_4[3], in[31], - MUL( _sbc_proto_4[2], in[39])))))); - - s[0] = MUL( _anamatrix4[0], t[0] + t[4]); - s[1] = MUL( _anamatrix4[2], t[2]); - s[2] = MULA(_anamatrix4[1], t[1] + t[3], - MUL(_anamatrix4[3], t[5])); - s[3] = MULA(_anamatrix4[3], t[1] + t[3], - MUL(_anamatrix4[1], -t[5] + t[7])); - s[4] = MUL( _anamatrix4[3], t[7]); - - out[0] = SCALE4_STAGE2( s[0] + s[1] + s[2] + s[4]); /* Q0 */ - out[1] = SCALE4_STAGE2(-s[0] + s[1] + s[3]); - out[2] = SCALE4_STAGE2(-s[0] + s[1] - s[3]); - out[3] = SCALE4_STAGE2( s[0] + s[1] - s[2] - s[4]); + FIXED_A t1[4]; + FIXED_T t2[4]; + int i = 0, hop = 0; + + /* rounding coefficient */ + t1[0] = t1[1] = t1[2] = t1[3] = + (FIXED_A) 1 << (SBC_PROTO_FIXED4_SCALE - 1); + + /* low pass polyphase filter */ + for (hop = 0; hop < 40; hop += 8) { + t1[0] += (FIXED_A) in[hop] * _sbc_proto_fixed4[hop]; + t1[1] += (FIXED_A) in[hop + 1] * _sbc_proto_fixed4[hop + 1]; + t1[2] += (FIXED_A) in[hop + 2] * _sbc_proto_fixed4[hop + 2]; + t1[1] += (FIXED_A) in[hop + 3] * _sbc_proto_fixed4[hop + 3]; + t1[0] += (FIXED_A) in[hop + 4] * _sbc_proto_fixed4[hop + 4]; + t1[3] += (FIXED_A) in[hop + 5] * _sbc_proto_fixed4[hop + 5]; + t1[3] += (FIXED_A) in[hop + 7] * _sbc_proto_fixed4[hop + 7]; + } + + /* scaling */ + t2[0] = t1[0] >> SBC_PROTO_FIXED4_SCALE; + t2[1] = t1[1] >> SBC_PROTO_FIXED4_SCALE; + t2[2] = t1[2] >> SBC_PROTO_FIXED4_SCALE; + t2[3] = t1[3] >> SBC_PROTO_FIXED4_SCALE; + + /* do the cos transform */ + for (i = 0, hop = 0; i < 4; hop += 8, i++) { + out[i] = ((FIXED_A) t2[0] * cos_table_fixed_4[0 + hop] + + (FIXED_A) t2[1] * cos_table_fixed_4[1 + hop] + + (FIXED_A) t2[2] * cos_table_fixed_4[2 + hop] + + (FIXED_A) t2[3] * cos_table_fixed_4[5 + hop]) >> + (SBC_COS_TABLE_FIXED4_SCALE - SCALE_OUT_BITS); + } } static inline void sbc_analyze_four(struct sbc_encoder_state *state, struct sbc_frame *frame, int ch, int blk) { - int32_t *x = &state->X[ch][state->position[ch]]; + int16_t *x = &state->X[ch][state->position[ch]]; int16_t *pcm = &frame->pcm_sample[ch][blk * 4]; /* Input 4 Audio Samples */ @@ -740,133 +713,64 @@ static inline void sbc_analyze_four(struct sbc_encoder_state *state, state->position[ch] = 36; } -static inline void _sbc_analyze_eight(const int32_t *in, int32_t *out) +static inline void _sbc_analyze_eight(const int16_t *in, int32_t *out) { - sbc_fixed_t t[8], s[8]; - - t[0] = SCALE8_STAGE1( /* Q10 */ - MULA(_sbc_proto_8[0], (in[16] - in[64]), /* Q18 = Q18 * Q0 */ - MULA(_sbc_proto_8[1], (in[32] - in[48]), - MULA(_sbc_proto_8[2], in[4], - MULA(_sbc_proto_8[3], in[20], - MULA(_sbc_proto_8[4], in[36], - MUL( _sbc_proto_8[5], in[52]))))))); - - t[1] = SCALE8_STAGE1( - MULA(_sbc_proto_8[6], in[2], - MULA(_sbc_proto_8[7], in[18], - MULA(_sbc_proto_8[8], in[34], - MULA(_sbc_proto_8[9], in[50], - MUL(_sbc_proto_8[10], in[66])))))); - - t[2] = SCALE8_STAGE1( - MULA(_sbc_proto_8[11], in[1], - MULA(_sbc_proto_8[12], in[17], - MULA(_sbc_proto_8[13], in[33], - MULA(_sbc_proto_8[14], in[49], - MULA(_sbc_proto_8[15], in[65], - MULA(_sbc_proto_8[16], in[3], - MULA(_sbc_proto_8[17], in[19], - MULA(_sbc_proto_8[18], in[35], - MULA(_sbc_proto_8[19], in[51], - MUL( _sbc_proto_8[20], in[67]))))))))))); - - t[3] = SCALE8_STAGE1( - MULA( _sbc_proto_8[21], in[5], - MULA( _sbc_proto_8[22], in[21], - MULA( _sbc_proto_8[23], in[37], - MULA( _sbc_proto_8[24], in[53], - MULA( _sbc_proto_8[25], in[69], - MULA(-_sbc_proto_8[15], in[15], - MULA(-_sbc_proto_8[14], in[31], - MULA(-_sbc_proto_8[13], in[47], - MULA(-_sbc_proto_8[12], in[63], - MUL( -_sbc_proto_8[11], in[79]))))))))))); - - t[4] = SCALE8_STAGE1( - MULA( _sbc_proto_8[26], in[6], - MULA( _sbc_proto_8[27], in[22], - MULA( _sbc_proto_8[28], in[38], - MULA( _sbc_proto_8[29], in[54], - MULA( _sbc_proto_8[30], in[70], - MULA(-_sbc_proto_8[10], in[14], - MULA(-_sbc_proto_8[9], in[30], - MULA(-_sbc_proto_8[8], in[46], - MULA(-_sbc_proto_8[7], in[62], - MUL( -_sbc_proto_8[6], in[78]))))))))))); - - t[5] = SCALE8_STAGE1( - MULA( _sbc_proto_8[31], in[7], - MULA( _sbc_proto_8[32], in[23], - MULA( _sbc_proto_8[33], in[39], - MULA( _sbc_proto_8[34], in[55], - MULA( _sbc_proto_8[35], in[71], - MULA(-_sbc_proto_8[20], in[13], - MULA(-_sbc_proto_8[19], in[29], - MULA(-_sbc_proto_8[18], in[45], - MULA(-_sbc_proto_8[17], in[61], - MUL( -_sbc_proto_8[16], in[77]))))))))))); - - t[6] = SCALE8_STAGE1( - MULA( _sbc_proto_8[36], (in[8] + in[72]), - MULA( _sbc_proto_8[37], (in[24] + in[56]), - MULA( _sbc_proto_8[38], in[40], - MULA(-_sbc_proto_8[39], in[12], - MULA(-_sbc_proto_8[5], in[28], - MULA(-_sbc_proto_8[4], in[44], - MULA(-_sbc_proto_8[3], in[60], - MUL( -_sbc_proto_8[2], in[76]))))))))); - - t[7] = SCALE8_STAGE1( - MULA( _sbc_proto_8[35], in[9], - MULA( _sbc_proto_8[34], in[25], - MULA( _sbc_proto_8[33], in[41], - MULA( _sbc_proto_8[32], in[57], - MULA( _sbc_proto_8[31], in[73], - MULA(-_sbc_proto_8[25], in[11], - MULA(-_sbc_proto_8[24], in[27], - MULA(-_sbc_proto_8[23], in[43], - MULA(-_sbc_proto_8[22], in[59], - MUL( -_sbc_proto_8[21], in[75]))))))))))); - - s[0] = MULA( _anamatrix8[0], t[0], - MUL( _anamatrix8[1], t[6])); - s[1] = MUL( _anamatrix8[7], t[1]); - s[2] = MULA( _anamatrix8[2], t[2], - MULA( _anamatrix8[3], t[3], - MULA( _anamatrix8[4], t[5], - MUL( _anamatrix8[5], t[7])))); - s[3] = MUL( _anamatrix8[6], t[4]); - s[4] = MULA( _anamatrix8[3], t[2], - MULA(-_anamatrix8[5], t[3], - MULA(-_anamatrix8[2], t[5], - MUL( -_anamatrix8[4], t[7])))); - s[5] = MULA( _anamatrix8[4], t[2], - MULA(-_anamatrix8[2], t[3], - MULA( _anamatrix8[5], t[5], - MUL( _anamatrix8[3], t[7])))); - s[6] = MULA( _anamatrix8[1], t[0], - MUL( -_anamatrix8[0], t[6])); - s[7] = MULA( _anamatrix8[5], t[2], - MULA(-_anamatrix8[4], t[3], - MULA( _anamatrix8[3], t[5], - MUL( -_anamatrix8[2], t[7])))); - - out[0] = SCALE8_STAGE2( s[0] + s[1] + s[2] + s[3]); - out[1] = SCALE8_STAGE2( s[1] - s[3] + s[4] + s[6]); - out[2] = SCALE8_STAGE2( s[1] - s[3] + s[5] - s[6]); - out[3] = SCALE8_STAGE2(-s[0] + s[1] + s[3] + s[7]); - out[4] = SCALE8_STAGE2(-s[0] + s[1] + s[3] - s[7]); - out[5] = SCALE8_STAGE2( s[1] - s[3] - s[5] - s[6]); - out[6] = SCALE8_STAGE2( s[1] - s[3] - s[4] + s[6]); - out[7] = SCALE8_STAGE2( s[0] + s[1] - s[2] + s[3]); + FIXED_A t1[8]; + FIXED_T t2[8]; + int i, hop; + + /* rounding coefficient */ + t1[0] = t1[1] = t1[2] = t1[3] = t1[4] = t1[5] = t1[6] = t1[7] = + (FIXED_A) 1 << (SBC_PROTO_FIXED8_SCALE-1); + + /* low pass polyphase filter */ + for (hop = 0; hop < 80; hop += 16) { + t1[0] += (FIXED_A) in[hop] * _sbc_proto_fixed8[hop]; + t1[1] += (FIXED_A) in[hop + 1] * _sbc_proto_fixed8[hop + 1]; + t1[2] += (FIXED_A) in[hop + 2] * _sbc_proto_fixed8[hop + 2]; + t1[3] += (FIXED_A) in[hop + 3] * _sbc_proto_fixed8[hop + 3]; + t1[4] += (FIXED_A) in[hop + 4] * _sbc_proto_fixed8[hop + 4]; + t1[3] += (FIXED_A) in[hop + 5] * _sbc_proto_fixed8[hop + 5]; + t1[2] += (FIXED_A) in[hop + 6] * _sbc_proto_fixed8[hop + 6]; + t1[1] += (FIXED_A) in[hop + 7] * _sbc_proto_fixed8[hop + 7]; + t1[0] += (FIXED_A) in[hop + 8] * _sbc_proto_fixed8[hop + 8]; + t1[5] += (FIXED_A) in[hop + 9] * _sbc_proto_fixed8[hop + 9]; + t1[6] += (FIXED_A) in[hop + 10] * _sbc_proto_fixed8[hop + 10]; + t1[7] += (FIXED_A) in[hop + 11] * _sbc_proto_fixed8[hop + 11]; + t1[7] += (FIXED_A) in[hop + 13] * _sbc_proto_fixed8[hop + 13]; + t1[6] += (FIXED_A) in[hop + 14] * _sbc_proto_fixed8[hop + 14]; + t1[5] += (FIXED_A) in[hop + 15] * _sbc_proto_fixed8[hop + 15]; + } + + /* scaling */ + t2[0] = t1[0] >> SBC_PROTO_FIXED8_SCALE; + t2[1] = t1[1] >> SBC_PROTO_FIXED8_SCALE; + t2[2] = t1[2] >> SBC_PROTO_FIXED8_SCALE; + t2[3] = t1[3] >> SBC_PROTO_FIXED8_SCALE; + t2[4] = t1[4] >> SBC_PROTO_FIXED8_SCALE; + t2[5] = t1[5] >> SBC_PROTO_FIXED8_SCALE; + t2[6] = t1[6] >> SBC_PROTO_FIXED8_SCALE; + t2[7] = t1[7] >> SBC_PROTO_FIXED8_SCALE; + + /* do the cos transform */ + for (i = 0, hop = 0; i < 8; hop += 16, i++) { + out[i] = ((FIXED_A) t2[0] * cos_table_fixed_8[0 + hop] + + (FIXED_A) t2[1] * cos_table_fixed_8[1 + hop] + + (FIXED_A) t2[2] * cos_table_fixed_8[2 + hop] + + (FIXED_A) t2[3] * cos_table_fixed_8[3 + hop] + + (FIXED_A) t2[4] * cos_table_fixed_8[4 + hop] + + (FIXED_A) t2[5] * cos_table_fixed_8[9 + hop] + + (FIXED_A) t2[6] * cos_table_fixed_8[10 + hop] + + (FIXED_A) t2[7] * cos_table_fixed_8[11 + hop]) >> + (SBC_COS_TABLE_FIXED8_SCALE - SCALE_OUT_BITS); + } } static inline void sbc_analyze_eight(struct sbc_encoder_state *state, struct sbc_frame *frame, int ch, int blk) { - int32_t *x = &state->X[ch][state->position[ch]]; + int16_t *x = &state->X[ch][state->position[ch]]; int16_t *pcm = &frame->pcm_sample[ch][blk * 8]; /* Input 8 Audio Samples */ @@ -1004,7 +908,7 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) for (ch = 0; ch < frame->channels; ch++) { for (sb = 0; sb < frame->subbands; sb++) { frame->scale_factor[ch][sb] = 0; - scalefactor[ch][sb] = 2; + scalefactor[ch][sb] = 2 << SCALE_OUT_BITS; for (blk = 0; blk < frame->blocks; blk++) { while (scalefactor[ch][sb] < fabs(frame->sb_sample_f[blk][ch][sb])) { frame->scale_factor[ch][sb]++; @@ -1026,18 +930,18 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) for (sb = 0; sb < frame->subbands - 1; sb++) { scale_factor_j[0] = 0; - scalefactor_j[0] = 2; + scalefactor_j[0] = 2 << SCALE_OUT_BITS; scale_factor_j[1] = 0; - scalefactor_j[1] = 2; + scalefactor_j[1] = 2 << SCALE_OUT_BITS; for (blk = 0; blk < frame->blocks; blk++) { /* Calculate joint stereo signal */ sb_sample_j[blk][0] = - (frame->sb_sample_f[blk][0][sb] + - frame->sb_sample_f[blk][1][sb]) >> 1; + ASR(frame->sb_sample_f[blk][0][sb], 1) + + ASR(frame->sb_sample_f[blk][1][sb], 1); sb_sample_j[blk][1] = - (frame->sb_sample_f[blk][0][sb] - - frame->sb_sample_f[blk][1][sb]) >> 1; + ASR(frame->sb_sample_f[blk][0][sb], 1) - + ASR(frame->sb_sample_f[blk][1][sb], 1); /* calculate scale_factor_j and scalefactor_j for joint case */ while (scalefactor_j[0] < fabs(sb_sample_j[blk][0])) { @@ -1099,13 +1003,19 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len) for (blk = 0; blk < frame->blocks; blk++) { for (ch = 0; ch < frame->channels; ch++) { for (sb = 0; sb < frame->subbands; sb++) { - if (levels[ch][sb] > 0) { - audio_sample = - (uint16_t) (((((int64_t)frame->sb_sample_f[blk][ch][sb]*levels[ch][sb]) >> - (frame->scale_factor[ch][sb] + 1)) + - levels[ch][sb]) >> 1); - PUT_BITS(audio_sample & levels[ch][sb], bits[ch][sb]); - } + + if (bits[ch][sb] == 0) + continue; + + audio_sample = ((uint64_t) levels[ch][sb] * + (((uint32_t) 1 << + (frame->scale_factor[ch][sb] + + SCALE_OUT_BITS + 1)) + + frame->sb_sample_f[blk][ch][sb])) >> + (frame->scale_factor[ch][sb] + + SCALE_OUT_BITS + 2); + + PUT_BITS(audio_sample, bits[ch][sb]); } } } diff --git a/sbc/sbc_math.h b/sbc/sbc_math.h index b3d87a6..1503d75 100644 --- a/sbc/sbc_math.h +++ b/sbc/sbc_math.h @@ -29,31 +29,21 @@ #define ASR(val, bits) ((-2 >> 1 == -1) ? \ ((int32_t)(val)) >> (bits) : ((int32_t) (val)) / (1 << (bits))) -#define SCALE_PROTO4_TBL 15 -#define SCALE_ANA4_TBL 17 -#define SCALE_PROTO8_TBL 16 -#define SCALE_ANA8_TBL 17 +#define SCALE_OUT_BITS 15 + #define SCALE_SPROTO4_TBL 12 #define SCALE_SPROTO8_TBL 14 #define SCALE_NPROTO4_TBL 11 #define SCALE_NPROTO8_TBL 11 -#define SCALE4_STAGE1_BITS 15 -#define SCALE4_STAGE2_BITS 16 #define SCALE4_STAGED1_BITS 15 #define SCALE4_STAGED2_BITS 16 -#define SCALE8_STAGE1_BITS 15 -#define SCALE8_STAGE2_BITS 15 #define SCALE8_STAGED1_BITS 15 #define SCALE8_STAGED2_BITS 16 typedef int32_t sbc_fixed_t; -#define SCALE4_STAGE1(src) ASR(src, SCALE4_STAGE1_BITS) -#define SCALE4_STAGE2(src) ASR(src, SCALE4_STAGE2_BITS) #define SCALE4_STAGED1(src) ASR(src, SCALE4_STAGED1_BITS) #define SCALE4_STAGED2(src) ASR(src, SCALE4_STAGED2_BITS) -#define SCALE8_STAGE1(src) ASR(src, SCALE8_STAGE1_BITS) -#define SCALE8_STAGE2(src) ASR(src, SCALE8_STAGE2_BITS) #define SCALE8_STAGED1(src) ASR(src, SCALE8_STAGED1_BITS) #define SCALE8_STAGED2(src) ASR(src, SCALE8_STAGED2_BITS) diff --git a/sbc/sbc_tables.h b/sbc/sbc_tables.h index f5daaa7..8df8c1f 100644 --- a/sbc/sbc_tables.h +++ b/sbc/sbc_tables.h @@ -40,40 +40,11 @@ static const int sbc_offset8[4][8] = { }; -#define SP4(val) (((int32_t)(val))/17658) /* Used to be #define SP4(val) ASR(val, SCALE_PROTO4_TBL) but causes wrong gain */ -#define SA4(val) ASR(val, SCALE_ANA4_TBL) -#define SP8(val) (((int32_t)(val))/57740) /* Used to be #define SP8(val) ASR(val, SCALE_PROTO8_TBL) but causes wrong gain */ -#define SA8(val) ASR(val, SCALE_ANA8_TBL) #define SS4(val) ASR(val, SCALE_SPROTO4_TBL) #define SS8(val) ASR(val, SCALE_SPROTO8_TBL) #define SN4(val) ASR(val, SCALE_NPROTO4_TBL) #define SN8(val) ASR(val, SCALE_NPROTO8_TBL) -static const int32_t _sbc_proto_4[20] = { - SP4(0x02cb3e8c), SP4(0x22b63dc0), SP4(0x002329cc), SP4(0x053b7548), - SP4(0x31eab940), SP4(0xec1f5e60), SP4(0xff3773a8), SP4(0x0061c5a7), - SP4(0x07646680), SP4(0x3f239480), SP4(0xf89f23a8), SP4(0x007a4737), - SP4(0x00b32807), SP4(0x083ddc80), SP4(0x4825e480), SP4(0x0191e578), - SP4(0x00ff11ca), SP4(0x00fb7991), SP4(0x069fdc58), SP4(0x4b584000) -}; - -static const int32_t _anamatrix4[4] = { - SA4(0x2d413cc0), SA4(0x3b20d780), SA4(0x40000000), SA4(0x187de2a0) -}; - -static const int32_t _sbc_proto_8[40] = { - SP8(0x02e5cd20), SP8(0x22d0c200), SP8(0x006bfe27), SP8(0x07808930), - SP8(0x3f1c8800), SP8(0xf8810d70), SP8(0x002cfdc6), SP8(0x055acf28), - SP8(0x31f566c0), SP8(0xebfe57e0), SP8(0xff27c437), SP8(0x001485cc), - SP8(0x041c6e58), SP8(0x2a7cfa80), SP8(0xe4c4a240), SP8(0xfe359e4c), - SP8(0x0048b1f8), SP8(0x0686ce30), SP8(0x38eec5c0), SP8(0xf2a1b9f0), - SP8(0xffe8904a), SP8(0x0095698a), SP8(0x0824a480), SP8(0x443b3c00), - SP8(0xfd7badc8), SP8(0x00d3e2d9), SP8(0x00c183d2), SP8(0x084e1950), - SP8(0x4810d800), SP8(0x017f43fe), SP8(0x01056dd8), SP8(0x00e9cb9f), - SP8(0x07d7d090), SP8(0x4a708980), SP8(0x0488fae8), SP8(0x0113bd20), - SP8(0x0107b1a8), SP8(0x069fb3c0), SP8(0x4b3db200), SP8(0x00763f48) -}; - static const int32_t sbc_proto_4_40m0[] = { SS4(0x00000000), SS4(0xffa6982f), SS4(0xfba93848), SS4(0x0456c7b8), SS4(0x005967d1), SS4(0xfffb9ac7), SS4(0xff589157), SS4(0xf9c2a8d8), @@ -116,11 +87,6 @@ static const int32_t sbc_proto_8_80m1[] = { SS8(0x0d9daee0), SS8(0xeac182c0), SS8(0xfdf1c8d4), SS8(0xfff5bd1a) }; -static const int32_t _anamatrix8[8] = { - SA8(0x3b20d780), SA8(0x187de2a0), SA8(0x3ec52f80), SA8(0x3536cc40), - SA8(0x238e7680), SA8(0x0c7c5c20), SA8(0x2d413cc0), SA8(0x40000000) -}; - static const int32_t synmatrix4[8][4] = { { SN4(0x05a82798), SN4(0xfa57d868), SN4(0xfa57d868), SN4(0x05a82798) }, { SN4(0x030fbc54), SN4(0xf89be510), SN4(0x07641af0), SN4(0xfcf043ac) }, @@ -166,3 +132,216 @@ static const int32_t synmatrix8[16][8] = { { SN8(0xf9592678), SN8(0x018f8b84), SN8(0x07d8a5f0), SN8(0x0471ced0), SN8(0xfb8e3130), SN8(0xf8275a10), SN8(0xfe70747c), SN8(0x06a6d988) } }; + +/* Uncomment the following line to enable high precision build of SBC encoder */ + +/* #define SBC_HIGH_PRECISION */ + +#ifdef SBC_HIGH_PRECISION +#define FIXED_A int64_t /* data type for fixed point accumulator */ +#define FIXED_T int32_t /* data type for fixed point constants */ +#define SBC_FIXED_EXTRA_BITS 16 +#else +#define FIXED_A int32_t /* data type for fixed point accumulator */ +#define FIXED_T int16_t /* data type for fixed point constants */ +#define SBC_FIXED_EXTRA_BITS 0 +#endif + +/* A2DP specification: Section 12.8 Tables + * + * Original values are premultiplied by 2 for better precision (that is the + * maximum which is possible without overflows) + * + * Note: in each block of 8 numbers sign was changed for elements 2 and 7 + * in order to compensate the same change applied to cos_table_fixed_4 + */ +#define SBC_PROTO_FIXED4_SCALE \ + ((sizeof(FIXED_T) * CHAR_BIT - 1) - SBC_FIXED_EXTRA_BITS + 1) +#define F(x) (FIXED_A) ((x * 2) * \ + ((FIXED_A) 1 << (sizeof(FIXED_T) * CHAR_BIT - 1)) + 0.5) +static const FIXED_T _sbc_proto_fixed4[40] = { + F(0.00000000E+00), F(5.36548976E-04), + -F(1.49188357E-03), F(2.73370904E-03), + F(3.83720193E-03), F(3.89205149E-03), + F(1.86581691E-03), F(3.06012286E-03), + + F(1.09137620E-02), F(2.04385087E-02), + -F(2.88757392E-02), F(3.21939290E-02), + F(2.58767811E-02), F(6.13245186E-03), + -F(2.88217274E-02), F(7.76463494E-02), + + F(1.35593274E-01), F(1.94987841E-01), + -F(2.46636662E-01), F(2.81828203E-01), + F(2.94315332E-01), F(2.81828203E-01), + F(2.46636662E-01), -F(1.94987841E-01), + + -F(1.35593274E-01), -F(7.76463494E-02), + F(2.88217274E-02), F(6.13245186E-03), + F(2.58767811E-02), F(3.21939290E-02), + F(2.88757392E-02), -F(2.04385087E-02), + + -F(1.09137620E-02), -F(3.06012286E-03), + -F(1.86581691E-03), F(3.89205149E-03), + F(3.83720193E-03), F(2.73370904E-03), + F(1.49188357E-03), -F(5.36548976E-04), +}; +#undef F + +/* + * To produce this cosine matrix in Octave: + * + * b = zeros(4, 8); + * for i = 0:3 + * for j = 0:7 b(i+1, j+1) = cos((i + 0.5) * (j - 2) * (pi/4)) + * endfor + * endfor; + * printf("%.10f, ", b'); + * + * Note: in each block of 8 numbers sign was changed for elements 2 and 7 + * + * Change of sign for element 2 allows to replace constant 1.0 (not + * representable in Q15 format) with -1.0 (fine with Q15). + * Changed sign for element 7 allows to have more similar constants + * and simplify subband filter function code. + */ +#define SBC_COS_TABLE_FIXED4_SCALE \ + ((sizeof(FIXED_T) * CHAR_BIT - 1) + SBC_FIXED_EXTRA_BITS) +#define F(x) (FIXED_A) ((x) * \ + ((FIXED_A) 1 << (sizeof(FIXED_T) * CHAR_BIT - 1)) + 0.5) +static const FIXED_T cos_table_fixed_4[32] = { + F(0.7071067812), F(0.9238795325), -F(1.0000000000), F(0.9238795325), + F(0.7071067812), F(0.3826834324), F(0.0000000000), F(0.3826834324), + + -F(0.7071067812), F(0.3826834324), -F(1.0000000000), F(0.3826834324), + -F(0.7071067812), -F(0.9238795325), -F(0.0000000000), -F(0.9238795325), + + -F(0.7071067812), -F(0.3826834324), -F(1.0000000000), -F(0.3826834324), + -F(0.7071067812), F(0.9238795325), F(0.0000000000), F(0.9238795325), + + F(0.7071067812), -F(0.9238795325), -F(1.0000000000), -F(0.9238795325), + F(0.7071067812), -F(0.3826834324), -F(0.0000000000), -F(0.3826834324), +}; +#undef F + +/* A2DP specification: Section 12.8 Tables + * + * Original values are premultiplied by 4 for better precision (that is the + * maximum which is possible without overflows) + * + * Note: in each block of 16 numbers sign was changed for elements 4, 13, 14, 15 + * in order to compensate the same change applied to cos_table_fixed_8 + */ +#define SBC_PROTO_FIXED8_SCALE \ + ((sizeof(FIXED_T) * CHAR_BIT - 1) - SBC_FIXED_EXTRA_BITS + 2) +#define F(x) (FIXED_A) ((x * 4) * \ + ((FIXED_A) 1 << (sizeof(FIXED_T) * CHAR_BIT - 1)) + 0.5) +static const FIXED_T _sbc_proto_fixed8[80] = { + F(0.00000000E+00), F(1.56575398E-04), + F(3.43256425E-04), F(5.54620202E-04), + -F(8.23919506E-04), F(1.13992507E-03), + F(1.47640169E-03), F(1.78371725E-03), + F(2.01182542E-03), F(2.10371989E-03), + F(1.99454554E-03), F(1.61656283E-03), + F(9.02154502E-04), F(1.78805361E-04), + F(1.64973098E-03), F(3.49717454E-03), + + F(5.65949473E-03), F(8.02941163E-03), + F(1.04584443E-02), F(1.27472335E-02), + -F(1.46525263E-02), F(1.59045603E-02), + F(1.62208471E-02), F(1.53184106E-02), + F(1.29371806E-02), F(8.85757540E-03), + F(2.92408442E-03), -F(4.91578024E-03), + -F(1.46404076E-02), F(2.61098752E-02), + F(3.90751381E-02), F(5.31873032E-02), + + F(6.79989431E-02), F(8.29847578E-02), + F(9.75753918E-02), F(1.11196689E-01), + -F(1.23264548E-01), F(1.33264415E-01), + F(1.40753505E-01), F(1.45389847E-01), + F(1.46955068E-01), F(1.45389847E-01), + F(1.40753505E-01), F(1.33264415E-01), + F(1.23264548E-01), -F(1.11196689E-01), + -F(9.75753918E-02), -F(8.29847578E-02), + + -F(6.79989431E-02), -F(5.31873032E-02), + -F(3.90751381E-02), -F(2.61098752E-02), + F(1.46404076E-02), -F(4.91578024E-03), + F(2.92408442E-03), F(8.85757540E-03), + F(1.29371806E-02), F(1.53184106E-02), + F(1.62208471E-02), F(1.59045603E-02), + F(1.46525263E-02), -F(1.27472335E-02), + -F(1.04584443E-02), -F(8.02941163E-03), + + -F(5.65949473E-03), -F(3.49717454E-03), + -F(1.64973098E-03), -F(1.78805361E-04), + -F(9.02154502E-04), F(1.61656283E-03), + F(1.99454554E-03), F(2.10371989E-03), + F(2.01182542E-03), F(1.78371725E-03), + F(1.47640169E-03), F(1.13992507E-03), + F(8.23919506E-04), -F(5.54620202E-04), + -F(3.43256425E-04), -F(1.56575398E-04), +}; +#undef F + +/* + * To produce this cosine matrix in Octave: + * + * b = zeros(8, 16); + * for i = 0:7 + * for j = 0:15 b(i+1, j+1) = cos((i + 0.5) * (j - 4) * (pi/8)) + * endfor endfor; + * printf("%.10f, ", b'); + * + * Note: in each block of 16 numbers sign was changed for elements 4, 13, 14, 15 + * + * Change of sign for element 4 allows to replace constant 1.0 (not + * representable in Q15 format) with -1.0 (fine with Q15). + * Changed signs for elements 13, 14, 15 allow to have more similar constants + * and simplify subband filter function code. + */ +#define SBC_COS_TABLE_FIXED8_SCALE \ + ((sizeof(FIXED_T) * CHAR_BIT - 1) + SBC_FIXED_EXTRA_BITS) +#define F(x) (FIXED_A) ((x) * \ + ((FIXED_A) 1 << (sizeof(FIXED_T) * CHAR_BIT - 1)) + 0.5) +static const FIXED_T cos_table_fixed_8[128] = { + F(0.7071067812), F(0.8314696123), F(0.9238795325), F(0.9807852804), + -F(1.0000000000), F(0.9807852804), F(0.9238795325), F(0.8314696123), + F(0.7071067812), F(0.5555702330), F(0.3826834324), F(0.1950903220), + F(0.0000000000), F(0.1950903220), F(0.3826834324), F(0.5555702330), + + -F(0.7071067812), -F(0.1950903220), F(0.3826834324), F(0.8314696123), + -F(1.0000000000), F(0.8314696123), F(0.3826834324), -F(0.1950903220), + -F(0.7071067812), -F(0.9807852804), -F(0.9238795325), -F(0.5555702330), + -F(0.0000000000), -F(0.5555702330), -F(0.9238795325), -F(0.9807852804), + + -F(0.7071067812), -F(0.9807852804), -F(0.3826834324), F(0.5555702330), + -F(1.0000000000), F(0.5555702330), -F(0.3826834324), -F(0.9807852804), + -F(0.7071067812), F(0.1950903220), F(0.9238795325), F(0.8314696123), + F(0.0000000000), F(0.8314696123), F(0.9238795325), F(0.1950903220), + + F(0.7071067812), -F(0.5555702330), -F(0.9238795325), F(0.1950903220), + -F(1.0000000000), F(0.1950903220), -F(0.9238795325), -F(0.5555702330), + F(0.7071067812), F(0.8314696123), -F(0.3826834324), -F(0.9807852804), + -F(0.0000000000), -F(0.9807852804), -F(0.3826834324), F(0.8314696123), + + F(0.7071067812), F(0.5555702330), -F(0.9238795325), -F(0.1950903220), + -F(1.0000000000), -F(0.1950903220), -F(0.9238795325), F(0.5555702330), + F(0.7071067812), -F(0.8314696123), -F(0.3826834324), F(0.9807852804), + F(0.0000000000), F(0.9807852804), -F(0.3826834324), -F(0.8314696123), + + -F(0.7071067812), F(0.9807852804), -F(0.3826834324), -F(0.5555702330), + -F(1.0000000000), -F(0.5555702330), -F(0.3826834324), F(0.9807852804), + -F(0.7071067812), -F(0.1950903220), F(0.9238795325), -F(0.8314696123), + -F(0.0000000000), -F(0.8314696123), F(0.9238795325), -F(0.1950903220), + + -F(0.7071067812), F(0.1950903220), F(0.3826834324), -F(0.8314696123), + -F(1.0000000000), -F(0.8314696123), F(0.3826834324), F(0.1950903220), + -F(0.7071067812), F(0.9807852804), -F(0.9238795325), F(0.5555702330), + -F(0.0000000000), F(0.5555702330), -F(0.9238795325), F(0.9807852804), + + F(0.7071067812), -F(0.8314696123), F(0.9238795325), -F(0.9807852804), + -F(1.0000000000), -F(0.9807852804), F(0.9238795325), -F(0.8314696123), + F(0.7071067812), -F(0.5555702330), F(0.3826834324), -F(0.1950903220), + -F(0.0000000000), -F(0.1950903220), F(0.3826834324), -F(0.5555702330), +}; +#undef F -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding 2008-12-29 10:46 ` [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Siarhei Siamashka @ 2008-12-29 11:56 ` Marcel Holtmann 0 siblings, 0 replies; 35+ messages in thread From: Marcel Holtmann @ 2008-12-29 11:56 UTC (permalink / raw) To: Siarhei Siamashka; +Cc: ext Brad Midgley, Jaska Uimonen, linux-bluetooth Hi Siarhei, > > Right now I'm also doublechecking correctness of the code to ensure that > > there are no overflows of other problems related to audio quality. > > Here is (hopefully) the final revision of patch. I tested it with all the possible files > that I could find or generate myself and it seems to work fine in all cases. > > The code was reverted to use fabs macro. After all, looks like it is the right > way to handle scale factors and adheres to the specification. The weird > effects on sound quality related to the use of fabs and the modified macro > which was tried as a replacement have been apparently the side effects of > other bugs. > > Also I have to admit that the change from > http://marc.info/?l=linux-bluetooth&m=122946440507726&w=2 > is not needed and the original patch was in fact correct. The output of > quantizer is strictly a positive number (unless there are bugs or overflows). > This was also changed back. > > Rounding for the values in the final analysis function output was removed > (we already keep a lot of extra bits in output, so it does not matter for > precision). Also the change of SCALE_OUT_BITS to 15 has a nice feature > that the shifting of output values is not needed at all for 16-bit version > (and naturally rounding is not needed here too), this should be good for > performance. I applied your patch to the upstream tree. Thanks for your work. I know like to get feedback from other people if this helps us with the SBC conformance testing and the audio quality. Regards Marcel ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2009-01-01 14:29 UTC | newest] Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-11-28 13:35 [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Jaska Uimonen 2008-11-28 14:18 ` Marcel Holtmann 2008-11-28 14:24 ` Jelle de Jong 2008-11-28 15:20 ` Jaska Uimonen 2008-11-28 18:13 ` David Sainty 2008-11-28 15:14 ` Jaska Uimonen 2008-12-02 20:15 ` Jim Carter 2008-12-12 17:14 ` Siarhei Siamashka 2008-12-12 19:19 ` Brad Midgley 2008-12-15 12:54 ` Siarhei Siamashka 2008-12-15 15:16 ` Brad Midgley 2008-12-16 22:37 ` Siarhei Siamashka 2008-12-17 8:16 ` Jaska Uimonen 2008-12-19 22:12 ` Siarhei Siamashka 2008-12-22 23:30 ` Siarhei Siamashka 2008-12-23 1:00 ` Marcel Holtmann 2008-12-23 8:20 ` Jaska.Uimonen 2008-12-23 11:14 ` Siarhei Siamashka 2008-12-23 10:45 ` Siarhei Siamashka 2008-12-23 11:48 ` Marcel Holtmann 2008-12-29 9:16 ` Testing SBC filtering functions Christian Hoene 2008-12-29 10:00 ` Marcel Holtmann 2008-12-29 10:55 ` Christian Hoene 2008-12-29 12:03 ` Marcel Holtmann 2008-12-29 12:31 ` Christian Hoene 2008-12-29 12:41 ` Marcel Holtmann 2008-12-29 13:11 ` Christian Hoene 2008-12-29 13:17 ` Marcel Holtmann 2009-01-01 14:29 ` Testing SBC encoder correctness with sbctester works Christian Hoene 2008-12-29 11:06 ` Testing SBC filtering functions Siarhei Siamashka 2008-12-29 12:04 ` Marcel Holtmann 2008-12-29 14:36 ` Siarhei Siamashka 2008-12-29 15:04 ` Siarhei Siamashka 2008-12-29 10:46 ` [RFC/PATCH] sbc: new filtering function for 8 band fixed point encoding Siarhei Siamashka 2008-12-29 11:56 ` Marcel Holtmann
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.