Lab Week 04 Lab Report

Lab content:

Get familiar with the x86 assembly language programming environment on Linux
Verification experiments: Blum's book, sample programs in Chapters 08 and 10 (Basic Math Functions and Using Strings)

👽Preview

Basic knowledge

Size of numbers (or operands)

1. Many instructions that operate on numbers take a one-letter size suffix: b (byte, 8 bits), w (word, 16 bits), or l (doubleword, 32 bits).

2. The table below lists the general-purpose register (GPR) names for each operand size.

Experiment

Chapter08 Basic Math Functions

A. Integer Arithmetic

1.Addition

The ADD instruction

The ADD instruction format is

add source, destination

Notice:

The destination parameter can be either a register or a value stored in a memory location.
You cannot use a memory location for both the source and the destination at the same time.
The result of the addition is placed in the destination location.

Assembly Code:

# addtest1.s - An example of the ADD instruction
.section .data
data:
   .int 40
.section .text
.globl _start
_start:
   nop
   movl $0, %eax
   movl $0, %ebx
   movl $0, %ecx
   movb $20, %al # move 20 into the 8-bit register AL
   addb $10, %al # add 10 to AL; the result stays in AL
   movsx %al, %eax # movsx sign-extends AL into EAX
   movw $100, %cx 
   addw %cx, %bx 
   movsx %bx, %ebx  
   movl $100, %edx
   addl %edx, %edx
   addl data, %eax  # data is in memory; add its value to EAX, result in EAX
   addl %eax, data 	
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The additions were performed as expected.

Assembly Code:

# addtest2.s - An example of the ADD instruction and negative numbers
.section .data
data:
   .int -40
.section .text
.globl _start
_start:
   nop
   movl $-10, %eax
   movl $-200, %ebx
   movl $80, %ecx
   addl data, %eax
   addl %ecx, %eax
   addl %ebx, %eax
   addl %eax, data
   addl $210, data
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The results are easy to verify by hand. Nothing unusual happens: ADD works the same way with negative numbers.

Detecting a carry or overflow condition

With the ADDB instruction, the carry flag is set if the unsigned result is over 255; with the ADDW instruction, it is not set unless the result is over 65,535.
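
As a rough cross-check, the rule above can be modelled outside of assembly. The Python sketch below is only a model of an unsigned add; the helper name add_unsigned is made up for this illustration.

```python
def add_unsigned(a, b, bits):
    """Model an unsigned ADD: return (value kept in the register, carry flag)."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    # The register keeps only the low `bits` bits; CF records the lost bit.
    return total & mask, total > mask

print(add_unsigned(190, 100, 8))    # 8-bit add, like addb
print(add_unsigned(190, 100, 16))   # 16-bit add, like addw
```

The same operands carry in an 8-bit add but not in a 16-bit add, which is the difference between ADDB and ADDW described above.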

Assembly Code:

# addtest3.s  An example of detecting a carry condition
.section .text
.globl _start
_start:
   nop
   movl $0, %ebx
   movb $190, %bl 
   movb $100, %al
   addb %al, %bl # 190+100=290>255, exceeds the limit 255
   jc over # jump when Carry flag is set
   movl $1, %eax
   int $0x80
over:
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

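Run echo $? to check the exit status, which is the value left in EBX. The program returns the sum in BL unless the carry flag is set, in which case it jumps to over and returns 0. Here 190 + 100 = 290 exceeds 255, so the carry flag is set and the exit status is 0.

But if we change the code:

movb $190, %bl
movb $100, %al -> movb $10, %al

Let's see what happens: 190 + 10 = 200 fits in 8 bits, so the carry flag stays clear and the exit status is 200.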

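🤖 Aside: from the very beginning I was curious about what is really hidden inside a .o file, but I never looked. Now let's open one and see what it contains.

This is the .text section:

Here I used a command to open the relocatable object file addtest3.o and see what is written inside.

Seeing the contents was exciting. Let me pick out a few lines and try to analyze them.

 8:   b0 0a                   mov    $0xa,%al

Clearly, this moves 0xa, which is decimal 10, into the AL register.

c:   72 07                   jb     15 <over>

This line is the jump instruction; its target is offset 15, the over label. (jb is another mnemonic for jc: both jump when the carry flag is set.)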

Detecting an error in a signed integer addition

Assembly Code:

# addtest4.s - An example of detecting an overflow condition
.code32
.section .data
output:
   .asciz "The result is %d\n"
.section .text
.globl _start
_start:
   movl $-1590876934, %ebx
   movl $-1259230143, %eax
   addl %eax, %ebx
   jo over
   pushl %ebx
   pushl $output
   call printf
   add  $8, %esp
   pushl $0
   call exit
over:
   pushl $0
   pushl $output
   call printf
   add  $8, %esp
   pushl $0
   call exit

Result:

The sum of the two negative values is below the 32-bit signed minimum, so the overflow flag is set, jo over jumps to the over label, and printf prints 0.
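
To see why this particular sum overflows, here is a small Python model of a signed 32-bit add. It is a sketch only, and add_signed32 is a hypothetical helper name, not something from the book.

```python
def add_signed32(a, b):
    """Model a 32-bit ADD on signed values: return (wrapped result, overflow flag)."""
    total = a + b
    overflow = not (-2**31 <= total <= 2**31 - 1)
    # Wrap into the signed 32-bit range, as the register would.
    wrapped = (total + 2**31) % 2**32 - 2**31
    return wrapped, overflow

print(add_signed32(-1590876934, -1259230143))
print(add_signed32(-10, -40))
```

The first call uses the operands from addtest4.s: the true sum does not fit in 32 bits, so the wrapped value is meaningless and the flag is the only reliable signal.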

The ADC instruction

The ADC instruction can be used to add two unsigned or signed integer values, along with the value contained in the carry flag from a previous ADD instruction.

To add two 64-bit values using 32-bit registers, first add the low-order 32 bits. That addition may produce a carry, which must be added into the sum of the high-order 32 bits.

The ADC instruction does exactly this.
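
The two-step addition can be sketched in Python as follows. This is a model under the assumption of unsigned 64-bit inputs; the function name add64_with_32bit_halves is invented for the example.

```python
MASK32 = 0xFFFFFFFF

def add64_with_32bit_halves(x, y):
    """Add two unsigned 64-bit values the way ADD followed by ADC does."""
    x_lo, x_hi = x & MASK32, x >> 32
    y_lo, y_hi = y & MASK32, y >> 32
    lo = x_lo + y_lo              # ADD on the low halves
    carry = lo >> 32              # carry flag: 1 if the low sum overflowed 32 bits
    hi = x_hi + y_hi + carry      # ADC on the high halves
    return ((hi & MASK32) << 32) | (lo & MASK32)

print(add64_with_32bit_halves(7252051615, 5732348928))
```

The operands are the data1 and data2 values used in adctest.s.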

Assembly Code:

# adctest.s - An example of using the ADC instruction
.section .data
data1:
   .quad 7252051615
data2:
   .quad 5732348928
output:
   .asciz "The result is %qd\n"
.section .text
.globl _start
_start:
   movl data1, %ebx # low 32 bits of data1
   movl data1+4, %eax # 4 bytes further on: the high 32 bits (little-endian storage)
   movl data2, %edx
   movl data2+4, %ecx
   addl %ebx, %edx # add the low 32 bits first
   adcl %eax, %ecx # add the two high-order registers, along with the carry flag.
   pushl %ecx
   pushl %edx
   push $output
   call printf
   addl $12, %esp
   pushl $0
   call exit

Result:

Before ADD instruction:

After ADD instruction:

The program then prints the 64-bit sum, 12984400543.

2. Subtraction

The SUB instruction

Assembly Code:

# subtest1.s - An example of the SUB instruction
.section .data
data:
   .int 40
.section .text
.globl _start
_start:
   nop
   movl $0, %eax
   movl $0, %ebx
   movl $0, %ecx
   movb $20, %al
   subb $10, %al
   movsx %al, %eax
   movw $100, %cx
   subw %cx, %bx # subtracts the value in CX from the value in register BX
   movsx %bx, %ebx
   movl $100, %edx
   subl %eax, %edx
   subl data, %eax
   subl %eax, data
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

Take two examples:

subb $10, %al subtracts 10 from the value in register AL (20), so AL holds 10.

subl %eax, data, the last SUB instruction, subtracts the value in the EAX register (-30) from the value at the data memory location (40), giving 70.

Carry and overflow with subtraction

Assembly Code:

# subtest2.s - An example of a subtraction carry
.section .text
.globl _start
_start:
   nop
   movl $5, %eax  # binary 101
   movl $2, %ebx  # binary 010
   subl %eax, %ebx # 010 - 101 needs a borrow, so CF is set
   jc under
   movl $1, %eax
   int $0x80
under:
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The exit status (the value in EBX) is the difference unless CF is set; in that case jc under jumps to the under label, which sets EBX to 0. Here the exit status is 0.

Let's view the subtraction value in EBX.

The carry flag was set because the result is less than zero, which cannot be represented as an unsigned integer. For subtraction, CF is how you detect that subtracting unsigned integers produced a negative result.

Subtracting with signed integers

Assembly Code:

# subtest3.s - An example of an overflow condition in a SUB instruction
.code32
.section .data
output:
   .asciz "The result is %d\n"
.section .text
.globl _start
_start:
   movl $-1590876934, %ebx
   movl $1259230143, %eax
   subl %eax, %ebx  # -1590876934-1259230143  -> too large for the 32-bit EBX register
   jo over 
   pushl %ebx
   pushl $output
   call printf
   add  $8, %esp
   pushl $0
   call exit
over:
   pushl $0
   pushl $output
   call printf
   add  $8, %esp
   pushl $0
   call exit

Result:

The difference is below the 32-bit signed minimum, so the overflow flag is set, jo over jumps to the over label, and the program prints 0.

The SBB instruction

This instruction is the subtraction counterpart of the ADC instruction introduced in the previous section.

The format of the SBB instruction is:
sbb source, destination

Notice: both source and destination can be 8-, 16-, or 32-bit registers or values in memory, but you cannot use memory locations for both the source and destination values at the same time.

Usage:

When the previous SUB instruction produces a carry (a borrow), the SBB instruction subtracts that carry bit as well, so the subtraction continues correctly on the next, higher-order data pair. (As demonstrated in the figure above)
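
The borrow chain can be modelled in Python like this. It is a sketch, and sub64_with_32bit_halves is a made-up name; the final step interprets the 64-bit pattern as signed, which is what the %qd format does when the result is printed.

```python
MASK32 = 0xFFFFFFFF

def sub64_with_32bit_halves(x, y):
    """Compute x - y on 64-bit values the way SUB followed by SBB does."""
    x_lo, x_hi = x & MASK32, (x >> 32) & MASK32
    y_lo, y_hi = y & MASK32, (y >> 32) & MASK32
    borrow = 1 if x_lo < y_lo else 0        # carry flag after SUB on the low halves
    lo = (x_lo - y_lo) & MASK32
    hi = (x_hi - y_hi - borrow) & MASK32    # SBB on the high halves
    result = (hi << 32) | lo
    # Reinterpret the 64-bit pattern as a signed value.
    return result - (1 << 64) if result >> 63 else result

print(sub64_with_32bit_halves(5732348928, 7252051615))
```

Note the operand order: in sbbtest.s the destination registers hold data2, so the program computes data2 - data1, which is negative.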

Assembly Code:

# sbbtest.s - An example of using the SBB instruction
.section .data
data1:
   .quad 7252051615
data2:
   .quad 5732348928
output:
   .asciz "The result is %qd\n"
.section .text
.globl _start
_start:
   nop
   movl data1, %ebx
   movl data1+4, %eax
   movl data2, %edx
   movl data2+4, %ecx
   subl %ebx, %edx
   sbbl %eax, %ecx
   pushl %ecx
   pushl %edx
   push $output
   call printf
   add  $12, %esp
   pushl $0
   call exit

Result:

3. Multiplication

Unsigned integer multiplication using MUL

! The MUL instruction can only be used for unsigned integers.

The format for the MUL instruction is
mul source

The MUL instruction multiplies two unsigned integers.

? : How does the MUL instruction multiply two unsigned integers with just one operand?

Ans : The other operand is implied. It must already be placed in the AL, AX, or EAX register, depending on the size of the source operand.
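
A minimal Python model of the 32-bit form may help: the implied operand is EAX, and the 64-bit product is split across EDX (high half) and EAX (low half). The name mul32 is hypothetical.

```python
def mul32(eax, source):
    """Model `mull source`: unsigned EAX * source, returned as (EDX, EAX)."""
    product = (eax & 0xFFFFFFFF) * (source & 0xFFFFFFFF)
    return product >> 32, product & 0xFFFFFFFF

edx, eax = mul32(315814, 165432)    # the operands used in multest.s
print(edx, eax, (edx << 32) | eax)  # high half, low half, recombined product
```

Because the product of two 32-bit values can need up to 64 bits, both halves have to be read to get the full result.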

Assembly Code:

# multest.s - An example of using the MUL instruction
.section .data
data1:
   .int 315814
data2:
   .int 165432
result:
   .quad 0
output:
   .asciz "The result is %qd\n"
.section .text
.globl _start
_start:
   nop
   movl data1, %eax
   mull data2
   movl %eax, result # low 32 bits
   movl %edx, result+4 # high 32 bits
   pushl %edx
   pushl %eax
   pushl $output
   call printf
   add $12, %esp
   pushl $0
   call exit

Result:

In this case, the result from the EDX:EAX register pair is both loaded into a 64-bit memory location (using indexed memory access) and displayed using the printf C function.

Signed integer multiplication using IMUL

The IMUL instruction can be used with both signed and unsigned integers, provided the result does not use the most significant bit of the destination.

For larger values, the IMUL instruction is only valid for signed integers.

The IMUL instruction has three formats:

imul source

imul source, destination

imul multiplier, source, destination

I. For the first format, the source operand can be an 8-, 16-, or 32-bit register or value in memory, and it is multiplied with the implied operand located in the AL, AX, or EAX register (depending on the source operand size). The result is then placed in the AX register, the DX:AX register pair (32-bit), or the EDX:EAX register pair (64-bit).

II. For the second format, the source operand can be a 16- or 32-bit register or value in memory, and the destination must be a 16- or 32-bit GPR. You choose which register receives the result.

III. For the third format, multiplier is an immediate value, source is a 16- or 32-bit register or value in memory, and destination must be a general-purpose register. This performs a quick multiplication of a value (the source) by a signed integer (the multiplier), storing the result in a GPR (the destination).

Assembly Code:

# imultest.s - An example of the IMUL instruction formats
.section .data
value1:
   .int 10
value2:
   .int -35
value3:
   .int 400
.section .text
.globl _start
_start:
   nop
   movl value1, %ebx
   movl value2, %ecx
   imull %ebx, %ecx  # imul source, destination
   movl value3, %edx
   imull $2, %edx, %eax # imul multiplier, source, destination
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The results are what we would expect: ECX holds 10 * (-35) = -350, and EAX holds 2 * 400 = 800 until it is overwritten for the exit call.

Detecting overflows

Any multiplication can overflow its destination, and an overflowed result is silently wrong, so check the overflow flag before using the result.

Assembly Code:

# imultest2.s - An example of detecting an IMUL overflow
.section .text
.globl _start
_start:
   nop
   movw $680, %ax
   movw $100, %cx
   imulw %cx
   jo over
   movl $1, %eax
   movl $0, %ebx
   int $0x80
over:
   movl $1, %eax
   movl $1, %ebx
   int $0x80

Result:

As demonstrated in the figure above, the returned value is 1, which indicates that the multiplication overflowed: the jo over instruction jumped to the over label. The product 680 * 100 = 68,000 does not fit in the 16-bit AX register as a signed value (the signed 16-bit maximum is 32,767).
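
The overflow test for the one-operand 16-bit form can be sketched in Python. The helper imul16 is an invented name; it treats both operands as signed 16-bit values and reports overflow when the product no longer fits in AX alone.

```python
def imul16(ax, source):
    """Model one-operand `imulw source`: return (DX, AX, overflow flag)."""
    def signed16(v):
        v &= 0xFFFF
        return v - 0x10000 if v & 0x8000 else v

    product = signed16(ax) * signed16(source)
    overflow = not (-0x8000 <= product <= 0x7FFF)   # does not fit in AX as signed
    return (product >> 16) & 0xFFFF, product & 0xFFFF, overflow

print(imul16(680, 100))   # the operands used in imultest2.s
print(imul16(100, 100))
```

Even when the flag is set, the full product is still available in DX:AX; the flag only says that AX by itself is not enough.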

4. Division

Unsigned division

The format of the DIV instruction is
div divisor

where divisor is the value that is divided into the implied dividend, and can be an 8-, 16-, or 32-bit register or value in memory.

The dividend must already be stored in the AX register (for a 16-bit value), the DX:AX register pair (for a 32-bit value), or the EDX:EAX register pair (for a 64-bit value) before the DIV instruction is performed.

The result of the division is two separate numbers: the quotient and the remainder. Both values are stored in the same registers used for the dividend value.

The following table shows how this is set up.
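
For the 32-bit divisor case, the register layout can be modelled in Python: the dividend is EDX:EAX, the quotient ends up in EAX and the remainder in EDX. (The smaller forms follow the same pattern: AX gives AL and AH, DX:AX gives AX and DX.) This is a sketch with a hypothetical function name, div32, and Python exceptions standing in for the processor's divide error.

```python
def div32(edx, eax, divisor):
    """Model `divl divisor`: EDX:EAX / divisor -> (quotient for EAX, remainder for EDX)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder

print(div32(0, 8335, 25))   # the values used in divtest.s
```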

Assembly Code:

# divtest.s - An example of the DIV instruction
.section .data
dividend:
   .quad 8335
divisor:
   .int 25
quotient:
   .int 0
remainder:
   .int 0
output:
   .asciz "The quotient is %d, and the remainder is %d\n"
.section .text
.globl _start
_start:
   nop
   movl dividend, %eax
   movl dividend+4, %edx
   divl divisor                      # 32-bit divisor: divides EDX:EAX, quotient in EAX, remainder in EDX
   movl %eax, quotient
   movl %edx, remainder
   pushl remainder
   pushl quotient
   pushl $output
   call printf
   add  $12, %esp
   pushl $0
   call exit

Because the dividend is a 64-bit value, it must be loaded into the EDX:EAX pair: the low 32 bits in EAX and the high 32 bits in EDX.

Result:

B. Shift Instructions

1.Multiply by shifting

To multiply integers by a power of 2, use the shift-left instructions.

SAL (shift arithmetic left) and SHL (shift logical left) perform the same operation and are interchangeable. They have three different formats:

sal destination: shifts the destination value left one position.

sal %cl, destination: shifts the destination value left by the number of positions specified in the CL register.

sal shifter, destination: shifts the destination value left by the number of positions given by the immediate shifter value.

The destination operand can be an 8-, 16-, or 32-bit register or value in memory.
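
As a quick model of the 32-bit case, the Python sketch below shifts left and discards any bits pushed out of the register. The name sal32 is made up for the illustration, and the count is masked to 5 bits, as 32-bit shifts do on x86.

```python
def sal32(value, count=1):
    """Model `sall`: shift left inside a 32-bit register."""
    return (value << (count & 31)) & 0xFFFFFFFF

ebx = 10
ebx = sal32(ebx)       # times 2
ebx = sal32(ebx, 2)    # times 4
ebx = sal32(ebx, 2)    # times 4 again
print(ebx)
```

Each left shift by n multiplies by 2 to the power n, as long as no significant bits fall off the top.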

Assembly Code:

# saltest.s - An example of the SAL instruction
.section .data
value1:
   .int 25
.section .text
.globl _start
_start:
   nop
   movl $10, %ebx
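   sall %ebx # sal destination: shift left 1 bit, multiply by 2
   movb $2, %cl
   sall %cl, %ebx # sal %cl, destination: shift left 2 bits, multiply by 4
   sall $2, %ebx # sal shifter, destination: shift left 2 bits, multiply by 4
   sall value1 # sal destination: shift left 1 bit, multiply by 2
   sall $2, value1 # sal shifter, destination: shift left 2 bits, multiply by 4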
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

After all sall instructions were performed, the result in EBX and memory is displayed in the figure above.

C. Decimal Arithmetic

1.Unpacked BCD arithmetic

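This section covers unpacked BCD, which uses one byte to hold a single decimal digit.

Packed BCD, by contrast, uses 4 bits for each decimal digit, so one byte holds two decimal digits.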

Notice:

The AAA, AAS, and AAM instructions assume that the previous operation result is placed in the AL register, and convert that value to unpacked BCD format.

The AAD instruction assumes that the dividend value is placed in the AX register in unpacked BCD format, and converts it to binary format for the DIV instruction to handle.

The result is a proper unpacked BCD value, the quotient in the AL register, and the remainder in the AH register (in unpacked BCD format).
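
For addition, the adjust step can be modelled in Python. The sketch below mimics an ADC followed by AAA on each digit, least significant digit first; add_unpacked_bcd is a hypothetical name and the adjustment is a simplified model of what AAA does to AL and the carry flag.

```python
def add_unpacked_bcd(a_digits, b_digits):
    """Add two little-endian digit lists (one decimal digit per byte).

    Mirrors an ADC + AAA loop and returns (sum digits, final carry).
    """
    carry = 0
    out = []
    for a, b in zip(a_digits, b_digits):
        al = a + b + carry                           # ADC
        aux = (a & 0xF) + (b & 0xF) + carry > 0xF    # auxiliary carry out of the low nibble
        if (al & 0xF) > 9 or aux:                    # AAA: digit is not valid BCD
            al += 6
            carry = 1
        else:
            carry = 0
        out.append(al & 0xF)                         # AAA keeps only the low nibble
    return out, carry

# 28125 + 52933, digits stored least significant first
print(add_unpacked_bcd([5, 2, 1, 8, 2], [3, 3, 9, 2, 5]))
```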

Here is the figure for the following code.

Assembly Code:

# aaatest.s - An example of using the AAA instruction
.section .data
value1:
   .byte 0x05, 0x02, 0x01, 0x08, 0x02  # 28125
value2:
   .byte 0x03, 0x03, 0x09, 0x02, 0x05  # 52933
.section .bss
   .lcomm sum, 6
.section .text
.globl _start
_start:
   nop
   xor %edi, %edi # clear EDI (XOR with itself gives zero)
   movl $5, %ecx # loop count: 5 digits
   clc # CLC instruction before the loop to ensure that the carry flag is cleared
loop1:
   movb value1(, %edi, 1), %al # move one byte from value1 into AL (8-bit)
   adcb value2(, %edi, 1), %al # add the value2 digit to AL, plus the carry from the lower digit
   aaa # adjust the sum to unpacked BCD
   movb %al, sum(, %edi, 1)
   inc %edi
   loop loop1
   adcb $0, sum(, %edi, 1) # add the final carry into the sixth byte of sum (scale 1, since EDI counts bytes)
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

After the third time the ADC instruction is executed (the third value place), the AL register contains the value:

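This shows 0x01 + 0x09 = 0x0a (binary 1010).

After the aaa instruction executes:

The value in AL has been converted to unpacked BCD format, and a carry of 1 was generated.

Below is the result of the addition; every value in sum is in unpacked BCD format.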

Packed BCD arithmetic

Assembly Code:

# dastest.s - An example of using the DAS instruction
.section .data
value1:
   .byte 0x25, 0x81, 0x02
value2:
   .byte 0x33, 0x29, 0x05
.section .bss
   .lcomm result, 4
.section .text
.globl _start
_start:
   nop
   xor %edi, %edi
   movl $3, %ecx
loop1:
   movb value2(, %edi, 1), %al
   sbbb value1(, %edi, 1), %al
   das
   movb %al, result(, %edi, 1)
   inc %edi
   loop loop1
   sbbb $0, result(, %edi, 1) # apply the final borrow to the fourth byte of result (scale 1, since EDI counts bytes)
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

For example, after the first subtraction, the AL register holds the following value:

00110011 - 00100101 = 00001110 (0x0e)

The das instruction then converts it to packed BCD: 0x08.
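
The whole subtraction loop can be approximated in Python. This is a model of SBB followed by DAS on each byte, written from the documented DAS adjustment rules; sub_packed_bcd is an invented name and the flag handling is simplified to what the loop needs.

```python
def sub_packed_bcd(a_bytes, b_bytes):
    """Compute a - b on little-endian packed BCD bytes (SBB + DAS per byte).

    Returns (result bytes, final borrow).
    """
    carry = 0
    out = []
    for a, b in zip(a_bytes, b_bytes):
        al = (a - b - carry) & 0xFF                  # SBB
        old_cf = a - b - carry < 0                   # borrow out of the byte
        aux = (a & 0xF) - (b & 0xF) - carry < 0      # borrow out of the low digit
        old_al = al
        cf = False
        if (al & 0xF) > 9 or aux:                    # DAS: fix the low digit
            cf = old_cf or al < 6
            al = (al - 6) & 0xFF
        if old_al > 0x99 or old_cf:                  # DAS: fix the high digit
            al = (al - 0x60) & 0xFF
            cf = True
        carry = 1 if cf else 0
        out.append(al)
    return out, carry

# 052933 - 028125 with the byte layout used in dastest.s
result, borrow = sub_packed_bcd([0x33, 0x29, 0x05], [0x25, 0x81, 0x02])
print([hex(b) for b in result], borrow)
```

The first byte reproduces the step described above: 0x33 - 0x25 gives 0x0e, which the adjustment turns into 0x08.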

D. Logical Operations

Bit testing

Assembly Code:

# cpuidtest.s - An example of using the TEST instruction
.section .data
output_cpuid:
   .asciz "This processor supports the CPUID instruction\n"
output_nocpuid:
   .asciz "This processor does not support the CPUID instruction\n"
.section .text
.globl _start
_start:
   nop
   pushfl
   popl %eax
   movl %eax, %edx
   xor $0x00200000, %eax
   pushl %eax
   popfl
   pushfl
   popl %eax
   xor %edx, %eax
   test $0x00200000, %eax
   jnz cpuid
   pushl $output_nocpuid
   call printf
   add  $4, %esp
   pushl $0
   call exit
cpuid:
   pushl $output_cpuid
   call printf
   add  $4, %esp
   pushl $0
   call exit

Result:

Chapter10 Using Strings

1.Moving Strings

The MOVS instructions use implied source and destination operands:

source operand: the memory address in the ESI register

destination operand: the memory address in the EDI register

The LEA instruction loads the effective address of an object. For example,
leal output, %edi
loads the 32-bit memory address of the output label into the EDI register.

Assembly Code:

# movstest1.s - An example of the MOVS instructions
.section .data
value1:
   .ascii "This is a test string.\n"
.section .bss
   .lcomm output, 23
.section .text
.globl _start
_start:
   nop
   leal value1, %esi # source
   leal output, %edi # destination
   movsb # moves 1 byte from the address in ESI (value1) to the address in EDI (output)
   movsw # moves the next 2 bytes
   movsl # moves the next 4 bytes

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

Each character is 1 byte. movsb, movsw, and movsl move 1, 2, and 4 bytes respectively, that is, 1, 2, and 4 characters.

Using the DF flag (Direction Flag)

If the DF flag is cleared, the ESI and EDI registers are incremented after each MOVS instruction.

If the DF flag is set, the ESI and EDI registers are decremented after each MOVS instruction.
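
The effect of the flag can be modelled in Python with byte offsets standing in for ESI and EDI. The helper movs is hypothetical. Note that each move still copies its bytes in ascending address order; DF only changes which way the two registers step afterwards.

```python
def movs(src, dst, esi, edi, size, df):
    """Copy `size` bytes from src[esi:] to dst[edi:], then step ESI and EDI.

    df=False models a cleared DF (increment); df=True models a set DF (decrement).
    """
    dst[edi:edi + size] = src[esi:esi + size]
    step = -size if df else size
    return esi + step, edi + step

value1 = b"This is a test string.\n"
output = bytearray(len(value1))
esi = edi = 22                       # start at the last byte of the 23-byte string
for size in (1, 2, 4):               # movsb, movsw, movsl with DF set
    esi, edi = movs(value1, output, esi, edi, size, df=True)
print(bytes(output), esi, edi)
```

Starting from the last byte with DF set, the three moves overlap and only the final four bytes of the destination end up filled.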

Assembly Code:

# movstest2.s - A second example of the MOVS instructions
.section .data
value1:
   .ascii "This is a test string.\n"
.section .bss
   .lcomm output, 23
.section .text
.globl _start
_start:
   nop
   leal value1+22, %esi
   leal output+22, %edi

   std # set the DF flag: ESI and EDI are decremented after each MOVS instruction
   movsb
   movsw
   movsl

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

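Note that the code sets DF with std, so ESI and EDI are decremented after each movs instruction. The registers move from the end of the string toward the start, but each move still reads its bytes in the forward direction. After movsb, movsw, and movsl, ESI and EDI have been decremented by 1, 2, and 4 bytes respectively, which is why only 4 locations in output hold values.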

Copying a large string

A large string can be copied with a simple loop around a MOVS instruction, controlled by the ECX register set to the length of the string.

Assembly Code:

# movstest3.s - An example of moving an entire string
.section .data
value1:
   .ascii "This is a test string.\n"
.section .bss
   .lcomm output, 23
.section .text
.globl _start
_start:
   nop
   leal value1, %esi
   leal output, %edi
   movl $23, %ecx # loop count: the length of the string
   cld  # clear the DF flag: ESI and EDI are incremented
loop1:
   movsb  # moves one byte; ESI and EDI are incremented after each move
   loop loop1

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

2. The REP prefix

Moving a string byte by byte

rep is used to repeat a string instruction a specific number of times, controlled by the value in the ECX register.

Assembly Code:

# reptest1.s - An example of the REP instruction
.section .data
value1:
   .ascii "This is a test string.\n"
.section .bss
   .lcomm output, 23
.section .text
.globl _start
_start:
   nop
   leal value1, %esi
   leal output, %edi
   movl $23, %ecx # The size of the string
   cld
   rep movsb # move a single byte of data 23 times
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

Moving strings block by block

The rep prefix also works with the movsl instruction, which moves 4 characters at a time.

Assembly Code:

# reptest2.s - An incorrect example of using the REP instruction
.section .data
value1:
   .ascii "This is a test string.\n"
value2:
   .ascii "Oops"
.section .bss
   .lcomm output, 23
.section .text
.globl _start
_start:
   nop
   leal value1, %esi
   leal output, %edi
   movl $6, %ecx # loop 6 times to move blocks of 4 bytes of data
   cld # clear the DF flag: ESI and EDI are incremented
   rep movsl

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

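It erroneously picks up a byte from the next string defined: the copied data ends with 'O', the first character of value2.

(The string in value1 is 23 bytes long. The loop runs 6 times and moves 4 bytes each time, 24 bytes in total, so one extra character appears at the end.)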
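
One way to avoid the extra byte is to split the length into whole 4-byte blocks plus leftover bytes. A small Python sketch of that arithmetic (split_length is a made-up name):

```python
def split_length(length):
    """Split a byte count into whole 4-byte blocks plus leftover bytes."""
    return length >> 2, length & 3    # same as length // 4 and length % 4

print(split_length(23))   # the 23-byte string in reptest2.s
print(split_length(43))
```

For a 23-byte string this gives 5 doubleword moves followed by 3 single-byte moves, for exactly 23 bytes.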

Moving large strings

Assembly Code:

# reptest3.s - Moving a large string using MOVSL and MOVSB
.section .data
string1:
   .asciz "This is a test of the conversion program!\n"
length:
   .int 43
divisor:
   .int 4
.section .bss
   .lcomm buffer, 43
.section .text
.globl _start
_start:
   nop
   leal string1, %esi
   leal buffer, %edi
   movl length, %ecx
   shrl $2, %ecx # shift right 2 bits: ECX = length / 4, the number of whole doublewords
   cld
   rep movsl # repeat ECX times, moving 4 bytes each time
   movl length, %ecx
   andl $3, %ecx # ECX = length mod 4, the bytes left over
   rep movsb # move the remaining bytes one at a time

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

Moving a string in reverse order

Assembly Code:

# reptest4.s - An example of using REP backwards
.section .data
value1:
   .asciz "This is a test string.\n"
.section .bss
   .lcomm output, 24
.section .text
.globl _start
_start:
   nop
   leal value1+22, %esi
   leal output+22, %edi
   movl $23, %ecx
   std  # set the DF flag: ESI and EDI are decremented
   rep movsb 

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The move proceeds from the end of the string toward the beginning.

3. Storing and Loading Strings

The LODS instruction and the STOS instruction

Assembly Code:

# stostest1.s - An example of using the STOS instruction
.section .data
space:
   .ascii " "
.section .bss
   .lcomm buffer, 256
.section .text
.globl _start
_start:
   nop
   leal space, %esi
   leal buffer, %edi
   movl $256, %ecx
   cld
   lodsb # Loads a byte into the AL register
   rep stosb # repeat store

   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The value of EAX shows that it now holds the space character, whose ASCII code is 0x20. The lodsb instruction loaded it into AL; the rep stosb instruction then stores the value of AL into memory, here filling the 256-byte buffer.

Building your own string functions

Assembly Code:

# convert.s - Converting lower to upper case
.section .data
string1:
   .asciz "This is a TEST, of the conversion program!\n"
length:
   .int 43
.section .text
.globl _start
_start:
   nop
   leal string1, %esi
   movl %esi, %edi
   movl length, %ecx
   cld
loop1:
   lodsb
   cmpb $'a', %al # compare AL with 'a'
   jl skip  # skip if AL < 'a'
   cmpb $'z', %al # compare AL with 'z'
   jg skip  # skip if AL > 'z'
   subb $0x20, %al # lowercase to uppercase: subtract 0x20
skip:
   stosb # The STOSB instruction is run on each character, and then the code loops back for the next character until it runs out of characters in the string
   loop loop1
end:
   pushl $string1
   call printf
   addl $4, %esp # move the stack pointer back up past the printf argument
   pushl $0
   call exit

Result:

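The cmpb instructions check whether the character lies in the range a to z. If it does, 0x20 (decimal 32) is subtracted from it, which converts it to the corresponding uppercase letter.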
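
The same range check and subtraction can be written in Python as a cross-check. The function name to_upper is only for this sketch; it handles ASCII letters and leaves every other byte untouched, like the assembly loop.

```python
def to_upper(data):
    """Upper-case ASCII letters by subtracting 0x20 from bytes in 'a'..'z'."""
    out = bytearray(data)
    for i, byte in enumerate(out):
        if ord("a") <= byte <= ord("z"):
            out[i] = byte - 0x20
    return bytes(out)

print(to_upper(b"This is a TEST, of the conversion program!\n"))
```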

4. Comparing Strings

Assembly Code:

# cmpstest1.s - A simple example of the CMPS instruction
.section .data
value1:
   .ascii "Test" # 4 bytes
value2:
   .ascii "Test"  # 4 bytes
.section .text
.globl _start
_start:
   nop
   movl $1, %eax
   leal value1, %esi
   leal value2, %edi
   cld
   cmpsl # compares a doubleword (4 bytes) value
   je equal # jump if the two doublewords are equal
   movl $1, %ebx
   int $0x80
equal:
   movl $0, %ebx
   int $0x80

Result:

This shows that the two strings being compared are equal.

Using REP with CMPS

If a mismatch is detected, the ECX register holds the number of characters remaining after the mismatched one, which gives its position counting back from the end of the string.

Assembly Code:

# cmpstest2.s - An example of using the REPE CMPS instruction
.section .data
value1:
   .ascii "This is a test of the CMPS instructions"
value2:
   .ascii "This is a test of the CMPS Instructions"
.section .text
.globl _start
_start:
   nop
   movl $1, %eax
   lea value1, %esi
   leal value2, %edi
   movl $39, %ecx
   cld
   repe cmpsb
   je equal
   movl %ecx, %ebx
   int $0x80
equal:
   movl $0, %ebx
   int $0x80

Result:

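Here the strings are compared byte by byte. value1 and value2 differ only in i versus I. As soon as cmpsb finds two different characters it clears the zero flag, which stops repe; the je instruction jumps only when the zero flag is set, so the jump is not taken. The value returned is ECX (11 here), which locates the first mismatched character counting back from the end of the string.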
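
To make the ECX bookkeeping concrete, here is a Python model of the repeated compare. It is a sketch: repe_cmpsb is a hypothetical name, and for simplicity it reports the zero flag as set when the count is zero on entry.

```python
def repe_cmpsb(s1, s2, ecx):
    """Model `repe cmpsb`: return (ECX after the loop, zero flag)."""
    i = 0
    zf = True
    while ecx != 0:
        zf = s1[i] == s2[i]     # CMPSB sets ZF when the two bytes match
        i += 1
        ecx -= 1                # ECX is decremented on every comparison
        if not zf:
            break               # REPE stops at the first mismatch
    return ecx, zf

value1 = b"This is a test of the CMPS instructions"
value2 = b"This is a test of the CMPS Instructions"
print(repe_cmpsb(value1, value2, 39))
```

The mismatch is at the 28th of 39 characters, so 11 characters remain uncompared when the loop stops.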

String inequality

Assembly Code:

# strcomp.s - An example of comparing strings
.section .data
string1:
   .ascii "test"
length1:
   .int 4
string2:
   .ascii "test1"
length2:
   .int 5
.section .text
.globl _start
_start:
   nop
   lea string1, %esi
   lea string2, %edi
   movl length1, %ecx
   movl length2, %eax
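   cmpl %eax, %ecx # compare length1 (ECX) with length2 (EAX)
   jbe compare # ECX already holds the shorter length
   xchg %ecx, %eax # otherwise swap so that ECX holds the shorter length
compare:
   cld
   repe cmpsb
   je equal
   jg greater
less:
   movl $1, %eax
   movl $255, %ebx
   int $0x80
greater:
   movl $1, %eax
   movl $1, %ebx
   int $0x80
equal:
   movl length1, %ecx
   movl length2, %eax
   cmpl %eax, %ecx # same prefix: the longer string is the greater one
   jg greater
   jl less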
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

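Here "test" is compared with "test1". The cmpl instruction only compares the two lengths; the characters themselves are compared by repe cmpsb. The first string is less than the second, so the program ends up at the less label.

The exit status, 255, shows this.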

5. The SCAS instruction

The SCAS instructions use an implied destination operand of the EDI register. The EDI register must contain the memory address of the string to scan.

The AL, AX, or EAX register contains the search character. (The examples load it with LODS, which reads it from the address in ESI.)

When a match is found, the ECX register holds the number of characters remaining after it, which gives the position of the search character counted from the end of the string.

❑ REPE: Scans the string characters looking for a character that does not match the search character.

❑ REPNE: Scans the string characters looking for a character that matches the search character.
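
A Python model of the REPNE case shows how the position is recovered from ECX. The helper repne_scasb is an invented name, and the final subtraction stands in for the subw/neg pair that the sample program uses.

```python
def repne_scasb(data, al, ecx):
    """Model `repne scasb`: scan up to ECX bytes for AL; return (ECX after, zero flag)."""
    edi = 0
    zf = False
    while ecx != 0:
        zf = data[edi] == al    # SCASB compares AL with the byte at EDI
        edi += 1
        ecx -= 1
        if zf:
            break               # REPNE stops at the first match
    return ecx, zf

string1 = b"This is a test - a long text string to scan."
length = len(string1)
ecx, found = repne_scasb(string1, ord("-"), length)
position = length - ecx if found else 0   # 1-based position of the match
print(ecx, found, position)
```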

Assembly Code:

# scastest1.s - An example of the SCAS instruction
.section .data
string1:
   .ascii "This is a test - a long text string to scan."
length:
   .int 44
string2:
   .ascii "-"
.section .text
.globl _start
_start:
   nop
   leal string1, %edi 
   leal string2, %esi # the character to search for is '-'
   movl length, %ecx
   lodsb
   cld
   repne scasb
   jne notfound
   subw length, %cx
   neg %cx
   movl $1, %eax
   movl %ecx, %ebx
   int $0x80
notfound:
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

The program returns 16, the position of the target character (counting from 1).

Scanning for multiple characters

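The SCASW and SCASL instructions walk through the string looking for the character sequence held in AX or EAX, but they do not compare character by character. Instead, after each comparison the EDI register advances by 2 (for SCASW) or 4 (for SCASL) rather than by 1.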
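
The consequence is easy to see in a Python model that compares aligned 4-byte blocks only. This is a sketch with a hypothetical name, repne_scasl; the built-in bytes.find is shown next to it as the character-by-character search one might have expected.

```python
def repne_scasl(data, pattern, ecx):
    """Model `repne scasl`: compare 4-byte blocks with EAX, stepping EDI by 4."""
    for edi in range(0, 4 * ecx, 4):
        if data[edi:edi + 4] == pattern:
            return edi          # offset of the matching block
    return None                 # no aligned block matched

string1 = b"This is a test - a long text string to scan."
print(repne_scasl(string1, b"test", 11))   # block-wise scan
print(string1.find(b"test"))               # byte-wise search for comparison
```

The word starts at an offset that is not a multiple of 4, so none of the 11 blocks equals it and the block-wise scan reports no match.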

Assembly Code:

# scastest2.s - An example of incorrectly using the SCAS instruction
.section .data
string1:
   .ascii "This is a test - a long text string to scan."
length:
   .int 11
string2:
   .ascii "test"
.section .text
.globl _start
_start:
   nop
   leal string1, %edi
   leal string2, %esi
   movl length, %ecx
   lodsl
   cld
   repne scasl
   jne notfound
   subw length, %cx
   neg %cx
   movl $1, %eax
   movl %ecx, %ebx
   int $0x80
notfound:
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result:

Finding a string length

The length of the string is computed by scanning for the terminating zero.

Assembly Code:

# strsize.s - Finding the size of a string using the SCAS instruction
.section .data
string1:
   .asciz "Testing, one, two, three, testing.\n"
.section .text
.globl _start
_start:
   nop
   leal string1, %edi
   movl $0xffff, %ecx # The ECX register will keep track of how many iterations it takes to find the terminating zero in the string. 
   movb $0, %al
   cld
   repne scasb
   jne notfound
   subw $0xffff, %cx
   neg %cx
   dec %cx
   movl $1, %eax
   movl %ecx, %ebx
   int $0x80
notfound:
   movl $1, %eax
   movl $0, %ebx
   int $0x80

Result: