Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emulation of MOV instructions in M16C architecture do not get affected by preceding INDEX instructions. #7468

Closed
dikidera opened this issue Feb 10, 2025 · 6 comments
Assignees
Labels
Feature: Processor/M16C Status: Triage Information is being gathered Type: Bug Something isn't working

Comments

@dikidera
Copy link

The INDEXB,INDEXBS etc instruction modify the src operand of the next LOAD type instruction such as MOV. However when emulated, despite the INDEX instruction having correct semantics implemented, they do not actually affect the mov.

I have narrowed down the problem in my specific example to

INDEXBS.B  local_f:8[FB]
MOV.B:G    DAT_003200,local_f:8[FB]

and the corresponding two patterns in M16C/80 slaspec are

src5B: imm24_dat                    is indSrc=0 & useSrcByteIndexOffset=1 & b1_s5=0x3; b2_s5=0x2; imm24_dat             { ptr:3 = imm24_dat + **byteIndexOffset**; export *:1 ptr; }    # abs24 (+byteIndexOffset)
src5B: imm24_dat                    is indSrc=0 & b1_s5=0x3; b2_s5=0x2; imm24_dat                                       { export *:1 imm24_dat; }    # abs24 (special constant address case)

While byteIndexOffset is actually set, and useSrcByteIndexOffset is also globalset for the next instruction(the MOV), it seems like byteIndexOffset is not evaluated.
I have used the DebuGSleighInstructionParse plugin and produced this output

check pattern[1 of 2] src5B: {line# 356} <imm24_dat>
        (  byte pattern: mask=01110000.00110000.00000000.00000000
                   bytes[0-3]=10110010.11101011.00000000.00110010
                  match-value=00110000.00100000.00000000.00000000 Matched
        ) -and- (
           context pattern: mask=01000010.00000000.00000000.00000000
                  context(0..31)=01001000.00000000.00000000.00000000
                     match-value=01000000.00000000.00000000.00000000 Matched
                       indSrc(6,6) == 0x0 Match
                       ***useByteIndexOffset**(1,2) == 0x2 Match
                       **useSrcByteIndexOffset**(1,1) == 0x1 Match
        ) Matched
        imm24_dat: TokenField = 0x3200
  dst5B_afterSrc5: resolving...
     check pattern[1 of 1] dst5B_afterSrc5: {line# 505} <dst5B>
        byte pattern: always-Matched
           Set dstFollowsSrc(7,8) = 0x1
        dst5B: resolving...
           decide on instruction bits: byte-offset=0, bitrange=(4,6), value=0x1, bytes=1011(001)0
              decendent constructors for decision node (complete tree dump ordered by line number):
               dst5B: {line# 439} <dst5dsp8>[<b2_d5_regAxSF>]
               dst5B: {line# 440} [<dst5dsp8>[<b2_d5_regAxSF>]]
           check pattern[1 of 2] dst5B: {line# 440} [<dst5dsp8>[<b2_d5_regAxSF>]]
              (  byte pattern: mask=00001110.00000000.00000000.00000000
                         bytes[0-3]=10110010.11101011.00000000.00110010
                        match-value=00000010.00000000.00000000.00000000 Matched
              ) -and- (
                 context pattern: mask=00000100.00000000.00000000.00000000
                        context(0..31)=01001000.10000000.00000000.00000000
                           match-value=00000100.00000000.00000000.00000000 Failed
                             indDst(5,5) == 0x1 Failed (=0x0)
              ) Failed
           check pattern[2 of 2] dst5B: {line# 439} <dst5dsp8>[<b2_d5_regAxSF>]
              byte pattern: mask=00001110.00000000.00000000.00000000
                      bytes[0-3]=10110010.11101011.00000000.00110010
                     match-value=00000010.00000000.00000000.00000000 Matched
              dstIndexOffset: resolving...
                 decide on context bits: bitrange=(2,2), value=0x0, context=01(0)01000.10000000.00000000.00000000
                    decendent constructors for decision node (complete tree dump ordered by line number):
                     dstIndexOffset: {line# 419} 
                 check pattern[1 of 1] dstIndexOffset: {line# 419} 
                    context pattern: mask=00100000.00000000.00000000.00000000
                           context(0..31)=01001000.10000000.00000000.00000000
                              match-value=00000000.00000000.00000000.00000000 Matched
                                useDstByteIndexOffset(2,2) == 0x0 Match
                                *useByteIndexOffset(1,2) == 0x0 Failed (=0x2)
              b2_d5_regAxSF: resolving...
                 b2_d5_regAxSF: register SB (size:3)
              dst5dsp8: resolving...
                 check pattern[1 of 2] dst5dsp8: {line# 424} <simm8_dat>:8
                    byte pattern: mask=11000000.00000000.00000000.00000000
                            bytes[1-4]=11101011.00000000.00110010.00000000
                           match-value=11000000.00000000.00000000.00000000 Matched
                    b1_d5: TokenField = 0x1
                    skipBytesBeforeDst5: resolving...
                       decide on context bits: bitrange=(7,7), value=0x0, context=0100100(0).10000000.00000000.00000000
                          decendent constructors for decision node (complete tree dump ordered by line number):
                           skipBytesBeforeDst5: {line# 411} <b1_s5>, <b2_s5>
                           skipBytesBeforeDst5: {line# 412} <b2_s5>, <imm8_dat>
                           skipBytesBeforeDst5: {line# 413} <b2_s5>, <imm16_dat>
                           skipBytesBeforeDst5: {line# 414} <b2_s5>, <imm24_dat>
                           skipBytesBeforeDst5: {line# 415} <imm16_dat>
                       decide on instruction bits: byte-offset=0, bitrange=(2,3), value=0x3, bytes=10(11)0010
                          decendent constructors for decision node (complete tree dump ordered by line number):
                           skipBytesBeforeDst5: {line# 411} <b1_s5>, <b2_s5>
                           skipBytesBeforeDst5: {line# 414} <b2_s5>, <imm24_dat>
                           skipBytesBeforeDst5: {line# 415} <imm16_dat>
                       check pattern[1 of 3] skipBytesBeforeDst5: {line# 415} <imm16_dat>
                          (  byte pattern: mask=01110000.00110000.00000000.00000000
                                     bytes[0-3]=10110010.11101011.00000000.00110010
                                    match-value=00110000.00110000.00000000.00000000 Failed
                          ) -and- (
                             context pattern: mask=00000001.10000000.00000000.00000000
                                    context(0..31)=01001000.10000000.00000000.00000000
                                       match-value=00000000.10000000.00000000.00000000 Matched
                                         dstFollowsSrc(7,8) == 0x1 Match
                          ) Failed
                       check pattern[2 of 3] skipBytesBeforeDst5: {line# 414} <b2_s5>, <imm24_dat>
                          (  byte pattern: mask=01110000.00000000.00000000.00000000
                                     bytes[0-3]=10110010.11101011.00000000.00110010
                                    match-value=00110000.00000000.00000000.00000000 Matched
                          ) -and- (
                             context pattern: mask=00000001.10000000.00000000.00000000
                                    context(0..31)=01001000.10000000.00000000.00000000
                                       match-value=00000000.10000000.00000000.00000000 Matched
                                         dstFollowsSrc(7,8) == 0x1 Match
                          ) Matched
                          b2_s5: TokenField = 0x2
                          imm24_dat: TokenField = 0x3200
                    simm8_dat: TokenField = -0xc

Prototype parse successful: MOV.B:G 0x3200,-0xc:8[FB]
Instruction length = 6 bytes
Instr Mask: 11111111.00111111.00000000.00000000.00000000.00000000
Instr Value: 10110010.00101011.00000000.00000000.00000000.00000000
Op-0 Mask: 00000000.00000000.11111111.11111111.11111111.00000000
Op-0 Value: 00000000.00000000.00000000.00110010.00000000.00000000
Op-1 Mask: 00000000.11000000.00000000.00000000.00000000.11111111
Op-1 Value: 00000000.11000000.00000000.00000000.00000000.11110100
DebugSleighInstructionParse.java> Finished!

From which we see that byteIndexOffset is not added to the imm24_dat token, despite the correct pattern being selected.

@GhidorahRex GhidorahRex self-assigned this Feb 10, 2025
@GhidorahRex GhidorahRex added Type: Bug Something isn't working Status: Triage Information is being gathered Feature: Processor/M16C labels Feb 10, 2025
@GhidorahRex
Copy link
Collaborator

I would like to reproduce the issue. Can you share the bytes for the pair of instructions?

@dikidera
Copy link
Author

dikidera commented Feb 10, 2025

c2 c3 f5     ->          INDEXBS.B  local_f:8[FB]
b2 eb 00 32 00 f5   - >  MOV.B:G    DAT_003200,local_f:8[FB]

I am also not sure if worth mentioning, however the stack variables applied here are also displayed incorrect, and are off by 4 bytes. Though the actual executed instructions reference memory correctly, as evidenced by the printed sleigh debug output.

Basically local_f is supposed to be local_B. But it's just a visual error.

The execution of INDEXBS.B correctly sets byteIndexOffset to the expected value, but MOV.B:G never uses it, it only ever references the immediate value 003200.

@GhidorahRex
Copy link
Collaborator

the local_f is just a name given to the variable. When I look at the actual operands I see -0xb.

I also see byteIndexOffset being set by INDEXBS.B and MOV.B:G adding it to FB before loading the value in the pcode (I would suggest you check this yourself in Edit Listing Fields... above the Listing and enable the PCode field). However, in my debugSleighInstructionParse run I get line 355:

decide on instruction bits: byte-offset=0, bitrange=(11,11), value=0x0, bytes=10110010.111(0)1011
            descendant constructors for decision node (complete tree dump ordered by line number):
             src5B: {line# 355} <imm24_dat>
             src5B: {line# 356} <imm24_dat>
             src5B: {line# 357} [<imm24_dat>]
         decide on context bits: bitrange=(6,6), value=0x0, context=010010(0)0.00000000.00000000.00000000
            descendant constructors for decision node (complete tree dump ordered by line number):
             src5B: {line# 355} <imm24_dat>
             src5B: {line# 356} <imm24_dat>
         check pattern[1 of 2] src5B: {line# 355} <imm24_dat>
            (  byte pattern: mask=01110000.00110000.00000000.00000000
                       bytes[0-3]=10110010.11101011.00000000.00110010
                      match-value=00110000.00100000.00000000.00000000 Matched
            ) -and- (
               context pattern: mask=01000010.00000000.00000000.00000000
                      context(0..31)=01001000.00000000.00000000.00000000
                         match-value=01000000.00000000.00000000.00000000 Matched
                           indSrc(6,6) == 0x0 Match
                           *useByteIndexOffset(1,2) == 0x2 Match
                           useSrcByteIndexOffset(1,1) == 0x1 Match
            ) Matched
            imm24_dat: TokenField = 0x3200

There should probably be a useSrcByteIndexOffset=0 in the sub-constructor on line 356 to ensure it isn't erroneously matched.

@dikidera
Copy link
Author

dikidera commented Feb 10, 2025

It does not work for me even after the suggested change. And yes byteIndexOffset is set, but just never used. I am using GhidraEmu and had to significantly modify it to emulate the behaviour of INDEXBS and only in single stepping mode (emuhelper.step()).

No matter what the src value is of INDEXBS it is never added to the immediate value in the MOV, always resulting in LOADS from the base immediate value.

So if the base is 003200 then all reads are that base address instead of 003200 + byteIndexOffset and that is what is stored in the stack.

Otherwise, I do see it.

Image

In this case, byteIndexOffset = 0x8d. Added to 0x003200 = 0x00328D, which has the value 0x5d but the actual value loaded is from base 0x003200 where its 0x63.

@GhidorahRex
Copy link
Collaborator

In your screenshot, you can see the pcode that should be adding byteIndexOffset which is the register at 0x3009 in the sleigh file. (unique, 0x3800, 3) = INT_ADD (const, 0x3200, 3), (register, 0x3009, 3)

So the issue seems to be possibly in the GhidraEmu? I've never used that.

@dikidera
Copy link
Author

I wonder if the usage of the writeRegister apis before the actual emuhelper.step() is somehow overwriting part of the context/clearing the opcodes?

Or perhaps it's this

public void setEmuStackBytes() {
    byte[] dest = new byte[stackSize];
    try {
        program.getMemory().getBytes(stackStart, dest);
    } catch (MemoryAccessException e) {            
        e.printStackTrace();
    }
    emuHelper.writeMemory(stackStart, dest);
}

@GhidorahRex GhidorahRex converted this issue into discussion #7478 Feb 11, 2025

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
Feature: Processor/M16C Status: Triage Information is being gathered Type: Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants