Chapter 8: Using BasicMathFunctions in the ARM Official DSP Library (Part 1)


/* Loop unrolling */
blkCnt = blockSize >> 2u;
（4）
/* First part of the processing with loop unrolling. Compute 4 outputs at a time.
** a second loop below computes the remaining 1 to 3 samples. */
while(blkCnt > 0u)
{
/* C = |A| */
/* Calculate absolute and then store the results in the destination buffer. */
*pDst++ = fabsf(*pSrc++);
/* Decrement the loop counter */
blkCnt--;
}
}
January 15, 2015
Version: 1.0
Armfly (安富莱) DSP Tutorial
in3 = *pSrc++; in4 = *pSrc++;
UM403 STM32-V5 Development Board Manual, System Volume
*pDst++ = (in1 > 0) ? in1 : (q31_t)__QSUB(0, in1);
*pDst++ = (in2 > 0) ? in2 : (q31_t)__QSUB(0, in2);
*pDst++ = (in3 > 0) ? in3 : (q31_t)__QSUB(0, in3);
*pDst++ = (in4 > 0) ? in4 : (q31_t)__QSUB(0, in4);
（5）
/* Update source pointer to process next samples */
pSrc += 4u;
/* Update destination pointer to process next samples */
pDst += 4u;
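1. Here is a brief introduction to the general conventions followed by the functions in the DSP library. They will not be repeated in later chapters.
(1) Essentially all of the functions are reentrant.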
（6）（7）（8）
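(2) Most functions operate on a block of values. For example, arm_abs_f32 computes the absolute value of every element of an array. If you only need the absolute value of a few numbers, this library function offers no real advantage.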
/* C = |A| */
/* Calculate absolute value of the input (if -1 then saturated to 0x7fffffff) and then store the results in the destination buffer. */
in = *pSrc++;
*pDst++ = (in > 0) ? in : ((in == INT32_MIN) ? INT32_MAX : -in);
/* loop counter */
（2）
#ifndef ARM_MATH_CM0_FAMILY
（3）
/* Run the below code for Cortex-M4 and Cortex-M3 */
float32_t in1, in2, in3, in4; /* temporary variables */
/* Decrement the loop counter */
blkCnt--;
}
/* If the blockSize is not a multiple of 4, compute any remaining output samples here.
** No loop unrolling is used. */
blkCnt = blockSize % 0x4u;
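Chapter 8 Using BasicMathFunctions (Part 1)
This tutorial begins our study of ARM's official DSP library, starting with the basic math functions. It covers four operations: absolute value, addition, dot product, and multiplication.
8.1 Absolute Value (Vector Absolute Value)
8.2 Addition (Vector Addition)
8.3 Dot Product (Vector Dot Product)
8.4 Multiplication (Vector Multiplication)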
{
uint32_t blkCnt; /* loop counter */
q31_t in; /* Input value */
#ifndef ARM_MATH_CM0_FAMILY
/* Run the below code for Cortex-M4 and Cortex-M3 */
q31_t in1, in2, in3, in4;
8.1.2 arm_abs_q31
This function computes the absolute value of 32-bit fixed-point (Q31) numbers. The source code is analyzed below:
/**
* @brief Q31 vector absolute value.
* @param[in] *pSrc points to the input buffer
* @param[out] *pDst points to the output buffer
（2）
/* Decrement the loop counter */
blkCnt--;
}
/* If the blockSize is not a multiple of 4, compute any remaining output samples here.
** No loop unrolling is used. */
blkCnt = blockSize % 0x4u;
#else
/* Run the below code for Cortex-M0 */
/* Initialize blkCnt with number of samples */
blkCnt = blockSize;
#endif /* #ifndef ARM_MATH_CM0_FAMILY */
while(blkCnt > 0u) {
#else
/* Run the below code for Cortex-M0 */
/* Initialize blkCnt with number of samples */
blkCnt = blockSize;
#endif /* #ifndef ARM_MATH_CM0_FAMILY */
while(blkCnt > 0u) {
/* store result to destination */ *(pDst + 1) = in2;
/* store result to destination */ *(pDst + 2) = in3;
/* store result to destination */ *(pDst + 3) = in4;
/* find absolute value */ in2 = fabsf(in2);
/* read sample from source */ *pDst = in1;
/* find absolute value */ in3 = fabsf(in3);
/* find absolute value */ in4 = fabsf(in4);
/* C = |A| */ /* Calculate absolute and then store the results in the destination buffer. */ /* read sample from source */
/* Decrement the loop counter */ blkCnt--; }
}
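1. This function uses saturating arithmetic. It is not the only one: many of the functions covered later use it as well. For an explanation of saturating arithmetic, see section 4.3.6, "Assembly Language: Saturation Operations", in the Chinese edition of The Definitive Guide to the ARM Cortex-M3. For Q31 data, saturation turns 0x80000000 into 0x7fffffff. This value is a special case because its negation does not fit in 32 bits, so it is clamped to the largest positive value.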
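The special case can be sketched in portable C. The helper name `sat_abs_q31` is hypothetical and is not part of the DSP library. The expression follows the plain-C form quoted in this section rather than the `__QSUB` intrinsic, so it can be tried on a PC as well as on the target.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only (not a library function):
 * saturating absolute value of a single Q31 sample. */
static int32_t sat_abs_q31(int32_t in)
{
    if (in > 0) {
        return in;
    }
    /* -INT32_MIN (0x80000000) is not representable in 32 bits,
     * so that one input is clamped to INT32_MAX (0x7fffffff). */
    return (in == INT32_MIN) ? INT32_MAX : -in;
}
```

Every input other than 0x80000000 is negated exactly. Only that single value is clamped.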
in1 = *pSrc; in2 = *(pSrc + 1); in3 = *(pSrc + 2);
/* find absolute value */ in1 = fabsf(in1);
/* read sample from source */ in4 = *(pSrc + 3);
/**
* @brief Floating-point vector absolute value.
* @param[in] *pSrc points to the input buffer
* @param[out] *pDst points to the output buffer
* @param[in]
* The Q31 value -1 (0x80000000) will be saturated to the maximum allowable positive value 0x7FFFFFFF.
*/
void arm_abs_q31( q31_t * pSrc, q31_t * pDst, uint32_t blockSize)
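8.1 Absolute Value (Vector Absolute Value)
The functions in this group compute absolute values. The formula is:
pDst[n] = abs(pSrc[n]), 0 <= n < blockSize.
Note in particular that these functions allow the destination pointer and the source pointer to refer to the same buffer.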
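The formula and the in-place property can be illustrated with a minimal portable sketch. The helpers `ref_abs_f32` and `abs_in_place_demo` are hypothetical names written for this example. They only mirror the parameter order of `arm_abs_f32` and are not the library's implementation.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical reference helper (not part of the DSP library):
 * computes pDst[n] = |pSrc[n]| for 0 <= n < blockSize.
 * Each element is read before it is written, so pSrc and pDst
 * may point to the same buffer. */
static void ref_abs_f32(const float *pSrc, float *pDst, uint32_t blockSize)
{
    for (uint32_t n = 0u; n < blockSize; n++) {
        pDst[n] = fabsf(pSrc[n]);
    }
}

/* In-place usage: the source buffer doubles as the destination. */
static void abs_in_place_demo(float *buf, uint32_t len)
{
    ref_abs_f32(buf, buf, len);
}
```

On the target, the same call pattern would presumably apply to the library function itself, with the same buffer passed as both source and destination.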
8.1.1 arm_abs_f32
This function computes the absolute value of 32-bit floating-point numbers. The source code is analyzed below:
blockSize number of samples in each vector
* @return none.
*/
（1）
void arm_abs_f32( float32_t * pSrc, float32_t * pDst, uint32_t blockSize)
{ uint32_t blkCnt;
/* Loop unrolling */
blkCnt = blockSize >> 2u;
/* First part of the processing with loop unrolling. Compute 4 outputs at a time.
** a second loop below computes the remaining 1 to 3 samples. */
while(blkCnt > 0u)
{
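(3) The library functions generally support CM0, CM3, and CM4 (the latest DSP library also adds CM7 support).
(4) Data is generally processed in groups of four values. Any remainder of fewer than four values is handled separately.
(5) Most functions come in four formats: f32, Q31, Q15, and Q7.
2. Function parameters: the function accepts an array and computes the absolute value of each element.
3. This part of the code is for the CM3 and CM4 cores.
4. Shifting right by two bits (dividing by 4) gives the number of groups, so that the data is processed four values at a time.
5. fabsf: this function is not implemented with the DSP instructions supported by the Cortex-M4F. It is implemented in C and is wrapped by MDK.
6. Move on to the next group of data.
7. This part of the code is for CM0.
8. Used for the remaining samples when fewer than four are left, or for the CM0 core.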
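The "groups of four plus remainder" pattern described in notes 4, 6, and 8 can be sketched as follows. The function `unrolled_abs_f32` is a hypothetical illustration written for this example and is not the library source. It assumes only standard C and `fabsf` from `<math.h>`.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical sketch of the unrolled loop structure (not library code). */
static void unrolled_abs_f32(const float *pSrc, float *pDst, uint32_t blockSize)
{
    /* Number of complete groups of four samples. */
    uint32_t blkCnt = blockSize >> 2u;

    while (blkCnt > 0u) {
        /* Read all four inputs before writing, so in-place use stays safe. */
        float in1 = pSrc[0];
        float in2 = pSrc[1];
        float in3 = pSrc[2];
        float in4 = pSrc[3];

        pDst[0] = fabsf(in1);
        pDst[1] = fabsf(in2);
        pDst[2] = fabsf(in3);
        pDst[3] = fabsf(in4);

        /* Move on to the next group of four. */
        pSrc += 4u;
        pDst += 4u;
        blkCnt--;
    }

    /* Handle the 0 to 3 samples left over when blockSize is not a multiple of 4. */
    blkCnt = blockSize % 4u;
    while (blkCnt > 0u) {
        *pDst++ = fabsf(*pSrc++);
        blkCnt--;
    }
}
```

With a block size of 7, for instance, the first loop runs once (four samples) and the second loop handles the remaining three.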
/* C = |A| */
/* Calculate absolute of input (if -1 then saturated to 0x7fffffff) and then store the results in the destination buffer. */
in1 = *pSrc++;
in2 = *pSrc++;
* @param[in] blockSize number of samples in each vector
* @return none.
*
* <b>Scaling and Overflow Behavior:</b>
（1）
* \par
* The function uses saturating arithmetic.