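Basic idea: I have been reading the NCNN source code recently. Having finished OpenMP, the next thing to learn is NEON. My FPGA development board is still being soldered, so for now I will make do with a phone for learning and verifying the ARM NEON part~
First, download AIDA64.APK and confirm that the phone supports NEON instructions.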
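Besides AIDA64, the same information can usually be read from /proc/cpuinfo. The following is only a minimal sketch, not part of the original project: the function names `cpuinfoListsNeon` and `deviceListsNeon` are hypothetical, and it assumes the Linux kernel convention of listing `neon` (32-bit ARM kernels) or `asimd` (AArch64 kernels) on the `Features` line.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if a "Features" line in the given
// /proc/cpuinfo text lists "neon" (32-bit ARM) or "asimd" (AArch64).
bool cpuinfoListsNeon(const std::string& cpuinfoText) {
    std::istringstream lines(cpuinfoText);
    std::string line;
    while (std::getline(lines, line)) {
        // Only lines that start with "Features" are of interest.
        if (line.compare(0, 8, "Features") != 0) continue;
        std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;
        // Split the value part into whitespace-separated feature names.
        std::istringstream tokens(line.substr(colon + 1));
        std::string tok;
        while (tokens >> tok) {
            if (tok == "neon" || tok == "asimd") return true;
        }
    }
    return false;
}

// Hypothetical helper: reads /proc/cpuinfo on the device and applies the
// check above. Returns false if the file cannot be read.
bool deviceListsNeon() {
    std::ifstream in("/proc/cpuinfo");
    std::stringstream buf;
    buf << in.rdbuf();
    return cpuinfoListsNeon(buf.str());
}
```

Keeping the parsing separate from the file read makes the check easy to try on a desktop with a pasted cpuinfo dump before running it on the phone.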
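(1) Create an Android project and choose JNI (native C++) support. I also bundled the OpenCV SDK into it in case it is needed later (skip this if you do not need it).
Along the way I ran into an error I had not seen before: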
More than one file was found with OS independent path 'lib/arm64-v8a/libcoreMesh.so'
To resolve it, add the following to build.gradle (F:\neon\app\build.gradle):
packagingOptions {
    pickFirst "lib/arm64-v8a/libopencv_java4.so"
    pickFirst "lib/armeabi-v7a/libopencv_java4.so"
    pickFirst "lib/x86/libopencv_java4.so"
    pickFirst "lib/x86_64/libopencv_java4.so"
}
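(2) To test NEON on a real device from Android Studio, the following CMake arguments need to be added. Note that -mfloat-abi and -mfpu are compiler flags for armeabi-v7a rather than part of ANDROID_TOOLCHAIN; ANDROID_ARM_NEON already enables NEON for that ABI, and arm64-v8a always has it.
arguments "-DANDROID_STL=c++_shared", "-DANDROID_ARM_NEON=TRUE", "-DANDROID_TOOLCHAIN=clang"
The full modified build.gradle is below (I honestly do not know Android Studio well, only JNI):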
plugins {
    id 'com.android.application'
}

android {
    compileSdkVersion 30
    //buildToolsVersion "30.0.3"

    defaultConfig {
        applicationId "com.example.neon"
        minSdkVersion 21
        targetSdkVersion 30
        versionCode 1
        versionName "1.0"
        testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
        externalNativeBuild {
            cmake {
                cppFlags "-std=c++11"
                // ANDROID_ARM_NEON enables NEON for armeabi-v7a; arm64-v8a always has it.
                arguments "-DANDROID_STL=c++_shared", "-DANDROID_ARM_NEON=TRUE", "-DANDROID_TOOLCHAIN=clang"
                abiFilters 'armeabi-v7a', 'arm64-v8a' //,'x86','x86_64'
            }
        }
    }

    packagingOptions {
        pickFirst "lib/arm64-v8a/libopencv_java4.so"
        pickFirst "lib/armeabi-v7a/libopencv_java4.so"
        pickFirst "lib/x86/libopencv_java4.so"
        pickFirst "lib/x86_64/libopencv_java4.so"
    }

    sourceSets {
        main {
            jniLibs.srcDirs = ["src/main/jniLibs/libs"]
        }
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }
    }

    externalNativeBuild {
        cmake {
            path "src/main/cpp/CMakeLists.txt"
            version "3.10.2"
        }
    }

    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}

dependencies {
    implementation 'androidx.appcompat:appcompat:1.1.0'
    implementation 'com.google.android.material:material:1.1.0'
    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
    implementation project(path: ':sdk')
    testImplementation 'junit:junit:4.+'
    androidTestImplementation 'androidx.test.ext:junit:1.1.1'
    androidTestImplementation 'androidx.test.espresso:espresso-core:3.2.0'
}
(3) Modify the main activity and, while at it, print the phone's ARM architecture:
package com.example.neon;

import androidx.appcompat.app.AppCompatActivity;
import android.os.Bundle;
import android.util.Log;
import android.widget.TextView;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MainActivity extends AppCompatActivity {
    // Used to load the 'native-lib' library on application startup.
    static {
        System.loadLibrary("native-lib");
    }

    // Returns the value of the first "<field> : <value>" line in /proc/cpuinfo,
    // or null if no such line exists.
    public static String getFieldFromCpuinfo(String field) throws IOException {
        Pattern p = Pattern.compile(field + "\\s*:\\s*(.*)");
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(new FileReader("/proc/cpuinfo"))) {
            String line;
            while ((line = br.readLine()) != null) {
                Matcher m = p.matcher(line);
                if (m.matches()) {
                    return m.group(1);
                }
            }
        }
        return null;
    }

    // Returns true if the "Processor" field of /proc/cpuinfo mentions aarch64.
    public static boolean isCPU64() {
        String mProcessor = null;
        try {
            mProcessor = getFieldFromCpuinfo("Processor");
        } catch (IOException e) {
            e.printStackTrace();
        }
        if (mProcessor == null) {
            return false;
        }
        // D/CpuUtils: isCPU64 mProcessor = AArch64 Processor rev 4 (aarch64)
        Log.d("TAG", "isCPU64 mProcessor = " + mProcessor);
        return mProcessor.contains("aarch64");
    }

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        boolean aarch64 = isCPU64();
        if (aarch64) {
            Log.d("TAG", "YES ");
        } else {
            Log.d("TAG", "NO ");
        }
        // Example of a call to a native method
        TextView tv = findViewById(R.id.sample_text);
        tv.setText(stringFromJNI());
    }

    /**
     * A native method that is implemented by the 'native-lib' native library,
     * which is packaged with this application.
     */
    public native String stringFromJNI();
}
(4) Start testing a demo in the cpp file that backs the JNI call:
#include <jni.h>
#include <string>
#include <arm_neon.h>
#include <android/log.h>

#define LOG_TAG "TEST_NEON"
#define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, LOG_TAG, __VA_ARGS__)
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)

extern "C"
{
void test() {
    int16_t result[8];
    int8x8_t a = vdup_n_s8(121);  // eight int8 lanes, all 121
    int8x8_t b = vdup_n_s8(2);    // eight int8 lanes, all 2
    int16x8_t c;
    // Widening multiply: 121 * 2 = 242 does not fit in int8,
    // so vmull_s8 produces eight int16 lanes.
    c = vmull_s8(a, b);
    vst1q_s16(result, c);         // store the register back to memory
    for (int i = 0; i < 8; i++) {
        LOGD("data[%d] is %d ", i, result[i]);
    }
}
}

extern "C" JNIEXPORT jstring JNICALL
Java_com_example_neon_MainActivity_stringFromJNI(
        JNIEnv* env,
        jobject /* this */) {
    std::string hello = "Hello from C++";
    test();
    return env->NewStringUTF(hello.c_str());
}
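The demo relies on the widening multiply `vmull_s8`, because 121 * 2 = 242 is outside the int8 range. As a rough illustration of what would happen otherwise, the sketch below contrasts it with the non-widening `vmul_s8`. The helper names `mulNarrow` and `mulWide` are hypothetical, and the scalar fallback under `#else` is an assumption added only so the snippet also builds on machines without NEON.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: multiplies eight int8 lanes and keeps only the low
// 8 bits of each product, which is what vmul_s8 does.
void mulNarrow(const int8_t a[8], const int8_t b[8], int8_t out[8]) {
#if defined(__ARM_NEON)
    vst1_s8(out, vmul_s8(vld1_s8(a), vld1_s8(b)));
#else
    // Scalar stand-in: truncate each product back to int8.
    for (int i = 0; i < 8; ++i) out[i] = static_cast<int8_t>(a[i] * b[i]);
#endif
}

// Hypothetical helper: multiplies eight int8 lanes into eight int16 lanes,
// which is what vmull_s8 does.
void mulWide(const int8_t a[8], const int8_t b[8], int16_t out[8]) {
#if defined(__ARM_NEON)
    vst1q_s16(out, vmull_s8(vld1_s8(a), vld1_s8(b)));
#else
    // Scalar stand-in: keep the full 16-bit product.
    for (int i = 0; i < 8; ++i) out[i] = static_cast<int16_t>(a[i] * b[i]);
#endif
}
```

With every lane of `a` set to 121 and every lane of `b` set to 2, `mulWide` yields 242 in each lane, while `mulNarrow` wraps around to -14.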
Test results:
2021-07-14 15:51:29.484 31154-31154/com.example.neon D/TAG: isCPU64 mProcessor = AArch64 Processor rev 4 (aarch64)
2021-07-14 15:51:29.484 31154-31154/com.example.neon D/TAG: YES
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[0] is 242
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[1] is 242
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[2] is 242
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[3] is 242
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[4] is 242
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[5] is 242
2021-07-14 15:51:52.171 31154-31154/com.example.neon D/TEST_NEON: data[6] is 242
2021-07-14 15:51:52.172 31154-31154/com.example.neon D/TEST_NEON: data[7] is 242
2021-07-14 15:51:52.223 31154-31154/com.example.neon E/ANR_LOG: >>> msg's executing time is too long
2021-07-14 15:51:52.224 31154-31154/com.example.neon E/ANR_LOG: Blocked msg = { when=-34s273ms what=100 target=android.app.ActivityThread$H obj=ActivityRecord{8f43ed token=android.os.BinderProxy@a005222 {com.example.neon/com.example.neon.MainActivity}} } , cost = 23399 ms
2021-07-14 15:51:52.224 31154-31154/com.example.neon E/ANR_LOG: >>>Current msg List is:
2021-07-14 15:51:52.224 31154-31154/com.example.neon E/ANR_LOG: Current msg <1> = { when=-33s541ms what=149 target=android.app.ActivityThread$H obj=android.os.BinderProxy@a005222 }
2021-07-14 15:51:52.224 31154-31154/com.example.neon E/ANR_LOG: Current msg <2> = { when=-22s768ms what=0 target=android.os.Handler callback=androidx.core.content.res.ResourcesCompat$FontCallback$2 }
2021-07-14 15:51:52.224 31154-31154/com.example.neon E/ANR_LOG: Current msg <3> = { when=-22s745ms what=0 target=android.os.Handler callback=androidx.core.content.res.ResourcesCompat$FontCallback$2 }
2021-07-14 15:51:52.224 31154-31154/com.example.neon E/ANR_LOG: Current msg <4> = { when=-20ms what=0 target=android.view.ViewRootImpl$ViewRootHandler callback=android.view.ViewRootImpl$4 }
2021-07-14 15:51:52.225 31154-31154/com.example.neon E/ANR_LOG: Current msg <5> = { when=-4ms barrier=0 }
2021-07-14 15:51:52.225 31154-31154/com.example.neon E/ANR_LOG: >>>CURRENT MSG DUMP OVER<<<
2021-07-14 15:51:52.309 31154-31154/com.example.neon I/ViewConfigCompat: Could not find method getScaledScrollFactor() on ViewConfiguration
2021-07-14 15:51:52.313 31154-31255/com.example.neon I/Adreno: QUALCOMM build : e0ff253, I1b6e53de78
Build Date : 02/16/18
OpenGL ES Shader Compiler Version: XE031.09.00.04
Local Branch :
Remote Branch : quic/gfx-adreno.lnx.1.0.c15-rel
Remote Branch : NONE
Reconstruct Branch : NOTHING
2021-07-14 15:51:52.334 31154-31255/com.example.neon I/OpenGLRenderer: Initialized EGL, version 1.4
2021-07-14 15:51:52.334 31154-31255/com.example.neon D/OpenGLRenderer: Swap behavior 1
2021-07-14 15:53:51.635 31154-31255/com.example.neon E/OpenGLRenderer: hwui_debug::CanvasContext createSurface sur=0x0, isValid =0
2021-07-14 15:53:51.733 31154-31255/com.example.neon E/OpenGLRenderer: hwui_debug::CanvasContext createSurface sur=0x0, isValid =0
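The environment is now up and running~ The code is simple. From the reference manual at https://developer.arm.com/architectures/instruction-sets/intrinsics/ the pattern is clear: load data from memory into registers, do the computation in registers, then store the result from the registers back to memory, much like writing assembly.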
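The load, compute, store pattern can be sketched with an element-wise float addition. This is only an illustration under stated assumptions: the function name `addFloats` is hypothetical, and the scalar loop doubles as both the tail handler and a fallback when the code is built without NEON.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
void addFloats(const float* a, const float* b, float* out, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
    std::size_t i = 0;
#if defined(__ARM_NEON)
    // Process four floats per iteration in 128-bit registers.
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);   // memory -> register
        float32x4_t vb = vld1q_f32(b + i);   // memory -> register
        float32x4_t vc = vaddq_f32(va, vb);  // compute in registers
        vst1q_f32(out + i, vc);              // register -> memory
    }
#endif
    // Scalar loop: handles the leftover elements, or everything without NEON.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Handling the tail with a plain loop keeps the vector loop free of bounds checks when `n` is not a multiple of four.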
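(This suddenly reminded me of the courses taught by my teacher and mentor Lei: 《汇编语言程序设计教程》 by 雷印胜, 贾萍, 胡晓鹏 et al., and 《IBM-PC汇编语言程序设计》~)
I will work through a few small demos first, then start on nihui's code, learning and verifying as I go~ The uploader is the GOAT.
References: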
https://developer.arm.com/solutions/os/android/developer-guides/neon-intrinsics-getting-started-on-android
https://developer.android.com/ndk/guides/cpu-arm-neon#cmake
https://developer.arm.com/architectures/instruction-sets/intrinsics/