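Captcha recognition is one application of image classification and is commonly used in areas such as automated testing and web crawling. This article shows how to build a simple image captcha recognition system in Java with the deep learning framework Deeplearning4j.
1. Set up the development environment
Create the project with Maven and add the following dependencies to pom.xml: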
<dependencies>
<dependency>
<groupId>org.deeplearning4j</groupId>
<artifactId>deeplearning4j-core</artifactId>
<version>1.0.0-beta7</version>
</dependency>
<dependency>
<groupId>org.nd4j</groupId>
<artifactId>nd4j-native-platform</artifactId>
<version>1.0.0-beta7</version>
</dependency>
<dependency>
<groupId>org.datavec</groupId>
<artifactId>datavec-api</artifactId>
<version>1.0.0-beta7</version>
</dependency>
</dependencies>
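2. Load the captcha data
Suppose you already have an image folder captcha_samples/ whose files are named like 7K2B_1.png, where 7K2B is the captcha text.
ParentPathLabelGenerator reads the label from the parent directory name, so it does not fit this layout. The label is instead taken from the file name by a custom CaptchaLabelGenerator (used in step 3). The code below only splits the files into a training set and a test set.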
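File parentDir = new File("captcha_samples");
FileSplit fileSplit = new FileSplit(parentDir, NativeImageLoader.ALLOWED_FORMATS);
// RandomPathFilter (org.datavec.api.io.filters) shuffles the files. BalancedPathFilter needs a
// label generator and balances per label, which is not useful when almost every captcha is unique.
RandomPathFilter pathFilter = new RandomPathFilter(new Random(), NativeImageLoader.ALLOWED_FORMATS);
// 80% of the files for training, 20% for testing
InputSplit[] inputSplits = fileSplit.sample(pathFilter, 0.8, 0.2);
InputSplit trainData = inputSplits[0];
InputSplit testData = inputSplits[1];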
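The file-name convention described above (captcha text, an underscore, then a running number) can be parsed with plain string handling. The following is a minimal sketch under that assumption; the class name CaptchaFileNames and the method labelOf are hypothetical and not part of Deeplearning4j.

```java
import java.io.File;
import java.util.Locale;

final class CaptchaFileNames {
    private CaptchaFileNames() {}

    /** Returns the captcha text encoded in a name such as "7K2B_1.png". */
    static String labelOf(String path) {
        String name = new File(path).getName();   // drop any directory part
        int underscore = name.indexOf('_');
        if (underscore <= 0) {
            throw new IllegalArgumentException("Expected <label>_<n>.<ext>, got: " + name);
        }
        // Normalize to upper case so the text matches a 0-9, A-Z alphabet.
        return name.substring(0, underscore).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(labelOf("captcha_samples/7K2B_1.png")); // prints 7K2B
    }
}
```

Rejecting names without an underscore early makes mislabeled files show up while the data is loaded rather than as poor accuracy later.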
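3. Image transformation and label encoding
int height = 60;
int width = 160;
int channels = 1;     // grayscale
int outputNum = 36;   // 0-9, A-Z
int labelLength = 4;  // characters per captcha
// ImageRecordReader already rescales every image to height x width, so no separate resize transform is needed.
ImageRecordReader recordReader = new ImageRecordReader(height, width, channels, new CaptchaLabelGenerator());
recordReader.initialize(trainData);
// regression(1) passes the label at record index 1 through unchanged. This assumes that
// CaptchaLabelGenerator returns a multi-hot NDArrayWritable of length outputNum * labelLength.
// classification(...) would one-hot encode a single integer and cannot represent four characters.
DataSetIterator trainIter = new RecordReaderDataSetIterator.Builder(recordReader, 32)
    .regression(1)
    .build();
// Scale pixel values to [0, 1] so that training matches the division by 255 used at prediction time.
// ImagePreProcessingScaler lives in org.nd4j.linalg.dataset.api.preprocessor.
trainIter.setPreProcessor(new ImagePreProcessingScaler(0, 1));
The custom CaptchaLabelGenerator extracts the index of each character from the file name. For the iterator above it is assumed to return those indices as one multi-hot vector with outputNum entries per character position.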
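One possible label encoding for such a generator is a multi-hot vector: each of the 4 character positions gets its own block of 36 entries, and the entry for the character at that position is set to 1. The sketch below shows only this encoding in plain Java. The class name CaptchaLabelEncoder is hypothetical, and wrapping the array in a DataVec writable inside a PathLabelGenerator is left out.

```java
final class CaptchaLabelEncoder {
    static final String ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final int LABEL_LENGTH = 4;

    private CaptchaLabelEncoder() {}

    /** Encodes e.g. "7K2B" as a vector of 4 * 36 floats with exactly four ones. */
    static float[] encode(String label) {
        if (label.length() != LABEL_LENGTH) {
            throw new IllegalArgumentException("Expected " + LABEL_LENGTH + " characters: " + label);
        }
        int classes = ALPHABET.length();
        float[] vector = new float[classes * LABEL_LENGTH];
        for (int pos = 0; pos < LABEL_LENGTH; pos++) {
            int idx = ALPHABET.indexOf(label.charAt(pos));
            if (idx < 0) {
                throw new IllegalArgumentException("Unsupported character: " + label.charAt(pos));
            }
            vector[pos * classes + idx] = 1f;   // block for this position, slot for this character
        }
        return vector;
    }

    public static void main(String[] args) {
        float[] v = encode("7K2B");
        // '7' -> 7, 'K' -> 36 + 20, '2' -> 72 + 2, 'B' -> 108 + 11
        System.out.println(v[7] + " " + v[56] + " " + v[74] + " " + v[119]); // prints 1.0 1.0 1.0 1.0
    }
}
```

Keeping the position blocks contiguous means the network output can later be split into 4 groups of 36 scores and decoded one group at a time.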
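4. Build the CNN model
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .updater(new Adam(0.001))
    .list()
    // Two convolution + max-pooling stages extract visual features.
    .layer(new ConvolutionLayer.Builder(3, 3).nIn(channels).nOut(32).activation(Activation.RELU).build())
    .layer(new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}).build())
    .layer(new ConvolutionLayer.Builder(3, 3).nOut(64).activation(Activation.RELU).build())
    .layer(new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}).build())
    .layer(new DenseLayer.Builder().nOut(256).activation(Activation.RELU).build())
    // The target has one active class per character position (labelLength ones in total), so each
    // output is scored independently with sigmoid + binary cross-entropy. A single softmax over all
    // outputNum * labelLength units would force the positions to share one probability mass.
    .layer(new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
        .activation(Activation.SIGMOID)
        .nOut(outputNum * labelLength).build())
    // The image record reader yields 4D image tensors, so use convolutional rather than convolutionalFlat.
    .setInputType(InputType.convolutional(height, width, channels))
    .build();
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
model.setListeners(new ScoreIterationListener(10)); // log the score every 10 iterations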
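To get a feeling for how large the dense layer becomes, the feature-map sizes of the network above can be worked out by hand. The sketch below does this arithmetic in plain Java. It assumes stride 1 and no padding for the 3x3 convolutions, stride 2 for the 2x2 pooling layers, and truncating division; these are assumptions about the framework defaults and are worth checking against the output of model.summary(). The class name FeatureMapSize is hypothetical.

```java
final class FeatureMapSize {
    private FeatureMapSize() {}

    /** Output length along one axis for an unpadded sliding window (truncating division). */
    static int out(int in, int kernel, int stride) {
        return (in - kernel) / stride + 1;
    }

    /** Number of values fed into the dense layer for a height x width grayscale input. */
    static int flattenedSize(int height, int width) {
        int h = height, w = width;
        h = out(h, 3, 1); w = out(w, 3, 1);   // conv 3x3:  60x160 -> 58x158
        h = out(h, 2, 2); w = out(w, 2, 2);   // pool 2x2:  58x158 -> 29x79
        h = out(h, 3, 1); w = out(w, 3, 1);   // conv 3x3:  29x79  -> 27x77
        h = out(h, 2, 2); w = out(w, 2, 2);   // pool 2x2:  27x77  -> 13x38
        int lastConvChannels = 64;            // nOut of the second convolution layer
        return h * w * lastConvChannels;
    }

    public static void main(String[] args) {
        System.out.println(flattenedSize(60, 160)); // prints 31616
    }
}
```

Under these assumptions the dense layer with 256 units has roughly 31616 * 256, or about 8.1 million, weights, which is where most of the model's parameters sit.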
5. Train the model
int epochs = 10;
for (int i = 0; i < epochs; i++) {
    model.fit(trainIter); // one full pass over the training data
}
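6. Test captcha recognition
NativeImageLoader loader = new NativeImageLoader(height, width, channels);
INDArray image = loader.asMatrix(new File("captcha_samples/7K2B_1.png")); // shape [1, channels, height, width]
image.divi(255); // scale to [0, 1] in place
INDArray output = model.output(image); // shape [1, outputNum * labelLength]
// Reshape to one row per character position, then take the best-scoring class in each row.
// Calling argMax on the unreshaped output would give a single index, not one per character.
int[] predictedIndices = output.reshape(labelLength, outputNum).argMax(1).toIntVector();
String[] chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ".split("");
StringBuilder sb = new StringBuilder();
for (int i = 0; i < labelLength; i++) {
    sb.append(chars[predictedIndices[i]]);
}
System.out.println("Predicted: " + sb.toString());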
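The decoding step can also be written without any tensor operations, which makes it easy to unit test. The sketch below takes the network output as a flat array of outputNum * labelLength scores, assumed to be laid out as one block of 36 scores per character position, and picks the highest score in each block. The class name CaptchaDecoder is hypothetical.

```java
final class CaptchaDecoder {
    static final String ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private CaptchaDecoder() {}

    /** Picks the best-scoring character in each block of ALPHABET.length() scores. */
    static String decode(float[] scores, int labelLength) {
        int classes = ALPHABET.length();
        if (scores.length != classes * labelLength) {
            throw new IllegalArgumentException("Expected " + classes * labelLength + " scores");
        }
        StringBuilder sb = new StringBuilder(labelLength);
        for (int pos = 0; pos < labelLength; pos++) {
            int best = 0;
            for (int k = 1; k < classes; k++) {
                if (scores[pos * classes + k] > scores[pos * classes + best]) {
                    best = k;
                }
            }
            sb.append(ALPHABET.charAt(best));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        float[] scores = new float[36 * 4];
        java.util.Arrays.fill(scores, 0.01f);
        scores[7] = 0.9f;     // '7' at position 0
        scores[56] = 0.8f;    // 'K' at position 1
        scores[74] = 0.7f;    // '2' at position 2
        scores[119] = 0.6f;   // 'B' at position 3
        System.out.println(decode(scores, 4)); // prints 7K2B
    }
}
```

Because the argmax is taken per block, the result does not depend on whether the scores in different blocks are on a comparable scale.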