[Commit-gnuradio] [gnuradio] 13/22: volk: add neon versions for 32i bitw
From: git
Subject: [Commit-gnuradio] [gnuradio] 13/22: volk: add neon versions for 32i bitwise operators
Date: Fri, 31 Oct 2014 19:22:31 +0000 (UTC)
This is an automated email from the git hooks/post-receive script.
jcorgan pushed a commit to branch master
in repository gnuradio.
commit 34670fd911d4f491d5b6c6140b703619283d1432
Author: Nathan West <address@hidden>
Date: Mon Oct 20 19:07:29 2014 -0500
volk: add neon versions for 32i bitwise operators
---
volk/kernels/volk/volk_32i_x2_and_32i.h | 34 +++++++++++++++++++++++++++++++++
volk/kernels/volk/volk_32i_x2_or_32i.h | 34 +++++++++++++++++++++++++++++++++
2 files changed, 68 insertions(+)
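
Note on the loop structure used by both kernels: each one processes four int32_t values per iteration with NEON intrinsics, then finishes the remaining num_points % 4 values in a scalar tail loop. The sketch below is not part of the commit. It shows one way such a kernel could be checked against a plain scalar loop for every length from 0 up to a bound, including a guard element that catches writes past the end. All sketch_* names and SKETCH_* macros are hypothetical, and the non-NEON branch is only a stand-in so the file also builds on other hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Use the real intrinsics when the compiler targets NEON (assumption:
 * the toolchain defines __ARM_NEON or __ARM_NEON__ in that case). */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#endif

#define SKETCH_MAX_LEN 64u

/* Scalar reference: c[i] = a[i] & b[i]. */
static void sketch_and_reference(int32_t* c, const int32_t* a,
                                 const int32_t* b, unsigned int num_points)
{
    unsigned int i;
    for (i = 0; i < num_points; i++) {
        c[i] = a[i] & b[i];
    }
}

/* Same shape as the kernel in the diff: blocks of four, then a tail. */
static void sketch_and_blocked(int32_t* c, const int32_t* a,
                               const int32_t* b, unsigned int num_points)
{
    unsigned int number;
    unsigned int quarter_points = num_points / 4;

    for (number = 0; number < quarter_points; number++) {
#ifdef SKETCH_HAVE_NEON
        vst1q_s32(c, vandq_s32(vld1q_s32(a), vld1q_s32(b)));
#else
        unsigned int lane; /* portable stand-in for one 4-lane operation */
        for (lane = 0; lane < 4; lane++) {
            c[lane] = a[lane] & b[lane];
        }
#endif
        a += 4;
        b += 4;
        c += 4;
    }

    /* Tail: the last num_points % 4 values. */
    for (number = quarter_points * 4; number < num_points; number++) {
        *c++ = (*a++) & (*b++);
    }
}

/* Runs both versions for every length in [0, max_len] and returns the
 * number of differing elements. Output buffers are pre-filled with a
 * sentinel, so a write past num_points also counts as a mismatch. */
static int sketch_count_mismatches(unsigned int max_len)
{
    int32_t a[SKETCH_MAX_LEN], b[SKETCH_MAX_LEN];
    int32_t expected[SKETCH_MAX_LEN + 1], actual[SKETCH_MAX_LEN + 1];
    const int32_t sentinel = 0x5A5A5A5A;
    unsigned int len, i;
    int mismatches = 0;

    assert(max_len <= SKETCH_MAX_LEN);

    /* Deterministic, arbitrary bit patterns for the inputs. */
    for (i = 0; i < SKETCH_MAX_LEN; i++) {
        a[i] = (int32_t)(i * 2654435761u);
        b[i] = (int32_t)~(i * 40503u);
    }

    for (len = 0; len <= max_len; len++) {
        for (i = 0; i <= SKETCH_MAX_LEN; i++) {
            expected[i] = actual[i] = sentinel;
        }
        sketch_and_reference(expected, a, b, len);
        sketch_and_blocked(actual, a, b, len);
        for (i = 0; i <= SKETCH_MAX_LEN; i++) {
            if (expected[i] != actual[i]) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Calling sketch_count_mismatches(37) from a small main() and checking for a return value of 0 exercises every tail length (0 to 3 leftover values) several times. Swapping & for | and vandq_s32 for vorrq_s32 gives the same check for the OR kernel.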
diff --git a/volk/kernels/volk/volk_32i_x2_and_32i.h b/volk/kernels/volk/volk_32i_x2_and_32i.h
index b33a60e..c138540 100644
--- a/volk/kernels/volk/volk_32i_x2_and_32i.h
+++ b/volk/kernels/volk/volk_32i_x2_and_32i.h
@@ -65,6 +65,40 @@ static inline void volk_32i_x2_and_32i_a_sse(int32_t* cVector, const int32_t* aV
}
#endif /* LV_HAVE_SSE */
+#ifdef LV_HAVE_NEON
+#include <arm_neon.h>
+/*!
+ \brief Ands the two input vectors and store their results in the third vector
+ \param cVector The vector where the results will be stored
+ \param aVector One of the vectors
+ \param bVector One of the vectors
+ \param num_points The number of values in aVector and bVector to be anded together and stored into cVector
+*/
+static inline void volk_32i_x2_and_32i_neon(int32_t* cVector, const int32_t* aVector, const int32_t* bVector, unsigned int num_points){
+    int32_t* cPtr = cVector;
+    const int32_t* aPtr = aVector;
+    const int32_t* bPtr = bVector;
+    unsigned int number = 0;
+    unsigned int quarter_points = num_points / 4;
+
+    int32x4_t a_val, b_val, c_val;
+
+    for(number = 0; number < quarter_points; number++){
+        a_val = vld1q_s32(aPtr);
+        b_val = vld1q_s32(bPtr);
+        c_val = vandq_s32(a_val, b_val);
+        vst1q_s32(cPtr, c_val);
+        aPtr += 4;
+        bPtr += 4;
+        cPtr += 4;
+    }
+
+    for(number = quarter_points * 4; number < num_points; number++){
+        *cPtr++ = (*aPtr++) & (*bPtr++);
+    }
+}
+#endif /* LV_HAVE_NEON */
+
#ifdef LV_HAVE_GENERIC
/*!
\brief Ands the two input vectors and store their results in the third vector
diff --git a/volk/kernels/volk/volk_32i_x2_or_32i.h b/volk/kernels/volk/volk_32i_x2_or_32i.h
index a8556a3..544a71c 100644
--- a/volk/kernels/volk/volk_32i_x2_or_32i.h
+++ b/volk/kernels/volk/volk_32i_x2_or_32i.h
@@ -65,6 +65,40 @@ static inline void volk_32i_x2_or_32i_a_sse(int32_t* cVector, const int32_t* aVe
}
#endif /* LV_HAVE_SSE */
+#ifdef LV_HAVE_NEON
+#include <arm_neon.h>
+/*!
+ \brief Ors the two input vectors and store their results in the third vector
+ \param cVector The vector where the results will be stored
+ \param aVector One of the vectors
+ \param bVector One of the vectors
+ \param num_points The number of values in aVector and bVector to be ored together and stored into cVector
+*/
+static inline void volk_32i_x2_or_32i_neon(int32_t* cVector, const int32_t* aVector, const int32_t* bVector, unsigned int num_points){
+    int32_t* cPtr = cVector;
+    const int32_t* aPtr = aVector;
+    const int32_t* bPtr = bVector;
+    unsigned int number = 0;
+    unsigned int quarter_points = num_points / 4;
+
+    int32x4_t a_val, b_val, c_val;
+
+    for(number = 0; number < quarter_points; number++){
+        a_val = vld1q_s32(aPtr);
+        b_val = vld1q_s32(bPtr);
+        c_val = vorrq_s32(a_val, b_val);
+        vst1q_s32(cPtr, c_val);
+        aPtr += 4;
+        bPtr += 4;
+        cPtr += 4;
+    }
+
+    for(number = quarter_points * 4; number < num_points; number++){
+        *cPtr++ = (*aPtr++) | (*bPtr++);
+    }
+}
+#endif /* LV_HAVE_NEON */
+
#ifdef LV_HAVE_GENERIC
/*!
\brief Ors the two input vectors and store their results in the third vector