Deep Learning OCR Text Detection and Recognition (Python/C++) - CFANZ Programming Community

This program runs on Windows 10. It was compiled with VS2019 C++ and runs in Release x64 mode.

Environment: Windows 10 + OpenCV 3.4.4 + onnxruntime-gpu 1.10 + VS2019.

GPU: I use an NVIDIA GeForce GTX 1650 with 4 GB of video memory.

The ONNX models were trained with the following open-source PyTorch projects and then converted:

https://github.com/PaddlePaddle/PaddleOCR

https://github.com/WenmuZhou/PytorchOCR

https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec

The PyTorch version used is 1.8.1:

https://pytorch.org/

Packages for offline download and installation:

https://download.pytorch.org/whl/torch_stable.html

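If you need to train your own PyTorch models, download the projects above from GitHub and train them yourself.

You can also follow these courses to learn how to train the models: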

https://edu.51cto.com/course/28834.html

https://edu.51cto.com/course/32865.html

Once a model is trained, convert it to ONNX; the C++ program can then load the ONNX model and run inference with it.

Test data:


Test project:

Interface:

#pragma once


#ifndef _DEEPLEANING_OCR_INTERFACE_H_
#define _DEEPLEANING_OCR_INTERFACE_H_

#ifndef  OCR_EXPORTS
#define DEEPLEANING_OCR_API  __declspec(dllimport)
#else
#define DEEPLEANING_OCR_API  __declspec(dllexport)
#endif


#include <iostream>
#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include <vector>
#include <stdio.h>
#include <map>
#include <algorithm>
#include <string>

using namespace std;
namespace DL_OCR_LIB_CPP
{
  //Initialize the OCR models; returns the handle that is passed to every other call
  DEEPLEANING_OCR_API void* DL_OCR_Init(std::string modelpath, const int gpu_id);

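  //Full-text recognition: detected text-line rectangles and recognized strings are returned through the output parameters
  DEEPLEANING_OCR_API int DL_OCR_Fulltext_Rec(void*& modelHandles, cv::Mat img, std::vector<cv::Rect>& detectRect, std::vector<std::wstring>& recResult);
  //Full-text recognition, with the result returned as a single formatted string
  //Result format: text lines are separated by |@@|; within a line, |$$| separates the rect (x,y,w,h) from the text content
  //Single-line result: 59,75,937,131|$$|校公安部门要求本餐厅已安
  //Multi-line result: 59,75,937,131|$$|校公安部门要求本餐厅已安|@@|52,188,948,137|$$|监控录家及110报警系纸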
  DEEPLEANING_OCR_API std::wstring DL_OCR_Fulltext_Rec(void*& modelHandles, cv::Mat img);

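  //Recognize text. rec_type == 0: single-line general text; 1: single-line printed Chinese text; 2: single-line Chinese text in natural scenes; 3: single-line uppercase letters and digits; 4: single-line digits; 5: single-line date; 6: single-line ID card number;
  //7: telephone number; 8: license plate number; 100: multi-line general text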
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_Text(void*& modelHandles, cv::Mat img, int rec_type);

  //Recognize text in a batch of images; rec_type has the same meaning as in DL_OCR_Rec_Text
  DEEPLEANING_OCR_API std::vector<std::wstring> DL_OCR_Rec_Text_Batch(void*& modelHandles, std::vector<cv::Mat>& imgs, int rec_type);

  //Recognize printed Chinese text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_Chinese_Print(void*& modelHandles, cv::Mat img);

  //Recognize Chinese scene text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_Chinese_Scene(void*& modelHandles, cv::Mat img);

  //Recognize text made of uppercase letters and digits
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_Upper_Letter_Number(void*& modelHandles, cv::Mat img);

  //Recognize numeric text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_Number(void*& modelHandles, cv::Mat img);

  //Recognize date text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_Date(void*& modelHandles, cv::Mat img);

  //Recognize ID card number text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_sfzid(void*& modelHandles, cv::Mat img);

  //Recognize telephone number text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_telephone_number(void*& modelHandles, cv::Mat img);

  //Recognize license plate text
  DEEPLEANING_OCR_API std::wstring DL_OCR_Rec_carplate(void*& modelHandles, cv::Mat img);

  //Release the models
  DEEPLEANING_OCR_API void DL_OCR_Free(void*& modelHandles);

}

#endif //_DEEPLEANING_OCR_INTERFACE_H_
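
The string-returning overload of DL_OCR_Fulltext_Rec packs every text line into one std::wstring, so the caller has to split it again. Below is a minimal sketch of such a parser, based only on the format described in the header comments (|@@| between lines, |$$| between the x,y,w,h rect and the text). The names OcrLine, SplitOn and ParseFulltextResult are hypothetical helpers introduced here for illustration and are not part of the DLL; how the library itself reports empty or failed results is not documented, so malformed records are simply skipped.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type (not part of the DLL interface):
// one recognized text line with its bounding box.
struct OcrLine {
    int x = 0, y = 0, w = 0, h = 0;
    std::wstring text;
};

// Split `s` on every occurrence of `delim`; always returns at least one part.
static std::vector<std::wstring> SplitOn(const std::wstring& s,
                                         const std::wstring& delim) {
    std::vector<std::wstring> parts;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = s.find(delim, start);
        if (pos == std::wstring::npos) {
            parts.push_back(s.substr(start));
            return parts;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + delim.size();
    }
}

// Parse "x,y,w,h|$$|text|@@|x,y,w,h|$$|text..." into structured lines.
// Records that do not match this shape are skipped (an assumption of
// this sketch, not documented library behavior).
std::vector<OcrLine> ParseFulltextResult(const std::wstring& result) {
    const std::wstring kLineSep = L"|@@|";
    const std::wstring kFieldSep = L"|$$|";

    std::vector<OcrLine> lines;
    if (result.empty()) return lines;

    for (const std::wstring& record : SplitOn(result, kLineSep)) {
        const std::size_t sep = record.find(kFieldSep);
        if (sep == std::wstring::npos) continue;  // no rect/text separator

        const std::vector<std::wstring> nums =
            SplitOn(record.substr(0, sep), L",");
        if (nums.size() != 4) continue;           // rect must be x,y,w,h

        OcrLine line;
        try {
            line.x = std::stoi(nums[0]);
            line.y = std::stoi(nums[1]);
            line.w = std::stoi(nums[2]);
            line.h = std::stoi(nums[3]);
        } catch (const std::exception&) {
            continue;                             // non-numeric rect field
        }
        line.text = record.substr(sep + kFieldSep.size());
        lines.push_back(line);
    }
    return lines;
}
```

With the DLL linked and a handle obtained from DL_OCR_Init, the call would presumably look like ParseFulltextResult(DL_OCR_LIB_CPP::DL_OCR_Fulltext_Rec(handle, img)). If only the rectangles and strings are needed, the overload that fills detectRect and recResult avoids the parsing step entirely.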