Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GXR and beebjit #40

Open
ojwb opened this issue Dec 11, 2020 · 12 comments
Open

GXR and beebjit #40

ojwb opened this issue Dec 11, 2020 · 12 comments

Comments

@ojwb
Copy link
Collaborator

ojwb commented Dec 11, 2020

Currently we disable the GXR in Rocket mode (and the other long running modes) because apparently it makes things slow with beebjit. I tried it to see, but it seems to not affect the speed for me (here I patched in a different rocket emoji to run beebjit with the GXR loaded - compare the beebjit command lines in the log messages):

$ echo '🚀REP.P."POO";:U.0'|time node client try
[11/12/2020 13:57:33 ] [LOG]    Cli0: 
[11/12/2020 13:57:33 ] [LOG]    Cli0: Running try from @try
[11/12/2020 13:57:33 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'REP.P."POO";:U.0',
  isBASIC: true,
  rude: false
}
[11/12/2020 13:57:33 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[11/12/2020 13:57:51 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1900;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 11;writem 0001 19;writem 0002 11;writem 0003 19;writem 0012 11;writem 0013 19;c'
[11/12/2020 13:57:51 ] [LOG]    Cli0: beebjit DONE in 17.739s 
[11/12/2020 13:57:51 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[11/12/2020 13:57:51 ] [LOG]    Cli0: Ffmpeg DONE in 0.169s 
[11/12/2020 13:57:51 ] [LOG]    Cli0: Generated image/png
18.10user 0.20system 0:18.35elapsed 99%CPU (0avgtext+0avgdata 64812maxresident)k
0inputs+2856outputs (1major+26212minor)pagefaults 0swaps
$ echo '🙯REP.P."POO";:U.0'|time node client try
[11/12/2020 14:08:34 ] [LOG]    Cli0: 
[11/12/2020 14:08:34 ] [LOG]    Cli0: Running try from @try
[11/12/2020 14:08:34 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'REP.P."POO";:U.0',
  isBASIC: true,
  rude: false
}
[11/12/2020 14:08:34 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[11/12/2020 14:08:51 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1c00;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 11;writem 0001 1c;writem 0002 11;writem 0003 1c;writem 0012 11;writem 0013 1c;c'
[11/12/2020 14:08:51 ] [LOG]    Cli0: beebjit DONE in 17.194s 
[11/12/2020 14:08:51 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[11/12/2020 14:08:51 ] [LOG]    Cli0: Ffmpeg DONE in 0.215s 
[11/12/2020 14:08:51 ] [LOG]    Cli0: Generated image/png
17.71user 0.12system 0:17.82elapsed 100%CPU (0avgtext+0avgdata 67300maxresident)k
0inputs+2840outputs (1major+25819minor)pagefaults 0swaps

Perhaps this got resolved in beebjit? Or does it only affect certain cases?

@8bitkick
Copy link
Owner

8bitkick commented Dec 14, 2020

Try writing PLOT 69 pixels in graphics mode (i.e. non GXR graphics, but is still accessing the ROM) in the loop

@ojwb
Copy link
Collaborator Author

ojwb commented Dec 14, 2020

OK, I can see the problem now - 7.872 vs 1626.078 seconds:

$ echo '🚀MO.0:REP.PL.69,RND(1280),RND(1024):U.0'|time node client try
[15/12/2020 11:35:57 ] [LOG]    Cli0: 
[15/12/2020 11:35:57 ] [LOG]    Cli0: Running try from @try
[15/12/2020 11:35:57 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  isBASIC: true,
  rude: false
}
[15/12/2020 11:35:57 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[15/12/2020 11:36:05 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1900;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 20;writem 0001 19;writem 0002 20;writem 0003 19;writem 0012 20;writem 0013 19;c'
[15/12/2020 11:36:05 ] [LOG]    Cli0: beebjit DONE in 7.872s 
[15/12/2020 11:36:05 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[15/12/2020 11:36:05 ] [LOG]    Cli0: Ffmpeg DONE in 0.192s 
[15/12/2020 11:36:05 ] [LOG]    Cli0: Generated image/png
8.57user 0.11system 0:08.64elapsed 100%CPU (0avgtext+0avgdata 65136maxresident)k
320inputs+2728outputs (21major+29598minor)pagefaults 0swaps
$ echo '🙯MO.0:REP.PL.69,RND(1280),RND(1024):U.0'|time node client try
[15/12/2020 11:36:15 ] [LOG]    Cli0: 
[15/12/2020 11:36:15 ] [LOG]    Cli0: Running try from @try
[15/12/2020 11:36:15 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  isBASIC: true,
  rude: false
}
[15/12/2020 11:36:15 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[15/12/2020 12:03:21 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1c00;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 20;writem 0001 1c;writem 0002 20;writem 0003 1c;writem 0012 20;writem 0013 1c;c'
[15/12/2020 12:03:21 ] [LOG]    Cli0: beebjit DONE in 1626.078s 
[15/12/2020 12:03:21 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[15/12/2020 12:03:21 ] [LOG]    Cli0: Ffmpeg DONE in 0.189s 
[15/12/2020 12:03:21 ] [LOG]    Cli0: Generated image/png
1601.85user 5.06system 27:06.84elapsed 98%CPU (0avgtext+0avgdata 69832maxresident)k
0inputs+9176outputs (1major+28239minor)pagefaults 0swaps

@8bitkick
Copy link
Owner

Talking to @scarybeasts we need to benchmark a few other modes with GXR -inturbo or -interp and see how they compare.

@ojwb
Copy link
Collaborator Author

ojwb commented Nov 27, 2022

With a newer beebjit the performance is much better:

$ echo '🚀MO.0:REP.PL.69,RND(1280),RND(1024):U.0'|time node client try
[27/11/2022 19:06:11 ] [LOG]    Cli0: 
[27/11/2022 19:06:11 ] [LOG]    Cli0: Running try from https://localhost/@try
[27/11/2022 19:06:12 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[27/11/2022 19:06:12 ] [LOG]    Cli0: Parser output {
  emulator: 'beebjit',
  flags: '-cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[27/11/2022 19:06:12 ] [LOG]    Cli0: MO.0:REP.PL.69,RND(1280),RND(1024):U.0
[27/11/2022 19:06:12 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[27/11/2022 19:06:16 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1900;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 20;writem 0001 19;writem 0002 20;writem 0003 19;writem 0012 20;writem 0013 19;c'
[27/11/2022 19:06:16 ] [LOG]    Cli0: beebjit DONE in 4.835s 
[27/11/2022 19:06:16 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[27/11/2022 19:06:17 ] [LOG]    Cli0: Ffmpeg DONE in 0.175s 
[27/11/2022 19:06:17 ] [LOG]    Cli0: Generated image/png
5.48user 0.11system 0:05.47elapsed 102%CPU (0avgtext+0avgdata 68160maxresident)k
0inputs+2728outputs (39major+24514minor)pagefaults 0swaps

So it's now actually faster than non-rocket mode - presumably running slowly for 30 seconds now takes longer than running fast for 3 hours!

@scarybeasts
Copy link

I don't see the GXR rom loaded in the most recent example. It doesn't have a flag like:
-rom 7 roms/gxr.rom

@ojwb
Copy link
Collaborator Author

ojwb commented Nov 27, 2022

D'oh! Of course in the original test I must have patched the bot.

Trying again with a patch to use the GXR in rocket mode it's still slow unfortunately:

 echo '🚀MO.0:REP.PL.69,RND(1280),RND(1024):U.0'|time node client try
[28/11/2022 07:31:26 ] [LOG]    Cli0: 
[28/11/2022 07:31:26 ] [LOG]    Cli0: Running try from https://localhost/@try
[28/11/2022 07:31:26 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[28/11/2022 07:31:26 ] [LOG]    Cli0: Parser output {
  emulator: 'beebjit',
  flags: '-rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[28/11/2022 07:31:26 ] [LOG]    Cli0: MO.0:REP.PL.69,RND(1280),RND(1024):U.0
[28/11/2022 07:31:26 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[28/11/2022 08:08:37 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1c00;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 20;writem 0001 1c;writem 0002 20;writem 0003 1c;writem 0012 20;writem 0013 1c;c'
[28/11/2022 08:08:37 ] [LOG]    Cli0: beebjit DONE in 2230.315s 
[28/11/2022 08:08:37 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[28/11/2022 08:08:37 ] [LOG]    Cli0: Ffmpeg DONE in 0.169s 
[28/11/2022 08:08:37 ] [LOG]    Cli0: Generated image/png
2225.02user 1.55system 37:10.97elapsed 99%CPU (0avgtext+0avgdata 70732maxresident)k
312inputs+13688outputs (1880major+25757minor)pagefaults 0swaps

It actually seems slower than before. Shame, I was hoping the dynamic opcode changes I noticed in the beebjit history had solved this.

These latest tests are with beebjit commit 9fcaa1096857e919f6872856c7bbc77f38e77027 plus a cherry-pick of c00f1b736582beb8362dfd5e5ecc33e192ab3040 to fix a GCC12 warning that's an error due to use of -Werror as beebjit after than seems to introduce a timing difference I haven't fully investigated yet. I'll re-test with the latest beebjit.

@scarybeasts
Copy link

The JIT is slow with the GXR ROM because its presence causes MOS to page the BASIC <-> GXR ROMs excessively. ROM paging is very slow with the JIT because all the compiled JIT code in the ROM region must be carefully invalidated.

JIT + GXR isn't likely to get much faster in the forseeable future. -mode inturbo may be the best bet.

@ojwb
Copy link
Collaborator Author

ojwb commented Nov 27, 2022

OK - I was vaguely assuming it was some evil generated and/or self-modifying code, but the cause being ROM paging makes rather more sense as we see things which don't actually use GXR features run much more slowly too.

I suppose it'd be hard to have a separate JIT version of &8000-&BFFF for each ROM image and swap in which is live when the paged ROM is changed?

I'll try some tests with -mode inturbo.

Since I'd already started the test with latest git running, here's the result (essentially no significant change):

$ echo '🚀MO.0:REP.PL.69,RND(1280),RND(1024):U.0'|time node client try
[28/11/2022 08:44:45 ] [LOG]    Cli0: 
[28/11/2022 08:44:45 ] [LOG]    Cli0: Running try from https://localhost/@try
[28/11/2022 08:44:45 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[28/11/2022 08:44:45 ] [LOG]    Cli0: Parser output {
  emulator: 'beebjit',
  flags: '-rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[28/11/2022 08:44:45 ] [LOG]    Cli0: MO.0:REP.PL.69,RND(1280),RND(1024):U.0
[28/11/2022 08:44:45 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[28/11/2022 09:22:09 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1c00;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 20;writem 0001 1c;writem 0002 20;writem 0003 1c;writem 0012 20;writem 0013 1c;c'
[28/11/2022 09:22:09 ] [LOG]    Cli0: beebjit DONE in 2244.118s 
[28/11/2022 09:22:09 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[28/11/2022 09:22:10 ] [LOG]    Cli0: Ffmpeg DONE in 0.194s 
[28/11/2022 09:22:10 ] [LOG]    Cli0: Generated image/png
2239.70user 1.73system 37:24.74elapsed 99%CPU (0avgtext+0avgdata 68432maxresident)k
0inputs+13176outputs (1808major+25071minor)pagefaults 0swaps

@ojwb
Copy link
Collaborator Author

ojwb commented Nov 27, 2022

With -mode inturbo it takes less than 30 seconds:

$ echo '🚀MO.0:REP.PL.69,RND(1280),RND(1024):U.0'|time node client try
[28/11/2022 09:29:31 ] [LOG]    Cli0: 
[28/11/2022 09:29:31 ] [LOG]    Cli0: Running try from https://localhost/@try
[28/11/2022 09:29:31 ] [LOG]    Cli0: 
 {
  emulator: 'beebjit',
  flags: '-mode inturbo -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[28/11/2022 09:29:31 ] [LOG]    Cli0: Parser output {
  emulator: 'beebjit',
  flags: '-mode inturbo -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0',
  compressed: false,
  input: 'MO.0:REP.PL.69,RND(1280),RND(1024):U.0',
  mode: 2,
  isBASIC: true,
  rude: false
}
[28/11/2022 09:29:31 ] [LOG]    Cli0: MO.0:REP.PL.69,RND(1280),RND(1024):U.0
[28/11/2022 09:29:31 ] [LOG]    Cli0: Loading OS from roms/master/mos3.20
[28/11/2022 09:29:59 ] [LOG]    Cli0: cd beebjit && ./beebjit -fast -headless -frames-dir ../tmp/ -mode inturbo -rom 7 roms/gxr.rom -cycles 25600000000 -frame-cycles 21600000000 -opt video:border-chars=0 -commands 'breakat 725000;c;loadmem ../tmp/tweet.bas 1c00;loadmem ../tmp/keys.bin 03e0;writem 02e1 e4;writem 0000 20;writem 0001 1c;writem 0002 20;writem 0003 1c;writem 0012 20;writem 0013 1c;c'
[28/11/2022 09:29:59 ] [LOG]    Cli0: beebjit DONE in 27.686s 
[28/11/2022 09:29:59 ] [LOG]    Cli0: Captured 1 frames (1 unique) ./tmp/beebjit_frame_
[28/11/2022 09:29:59 ] [LOG]    Cli0: Ffmpeg DONE in 0.183s 
[28/11/2022 09:29:59 ] [LOG]    Cli0: Generated image/png
28.32user 0.18system 0:28.33elapsed 100%CPU (0avgtext+0avgdata 72764maxresident)k
0inputs+2744outputs (26major+24420minor)pagefaults 0swaps

That's with latest beebjit git.

A test which actually uses GXR features (MODE1:GCOL32,0 then the same) was essentially the same speed.

-mode interp (as suggested earlier) is significantly slower - just under 54 seconds for the 🚀MO.0:REP.PL.69,RND(1280),RND(1024):U.0 case.

Maybe we want to make GXR opt-in (e.g. 🎨 emoji selects it)? GXR imposes an annoyingly high speed overhead inside the emulation too, even for code that doesn't use any of its features, due to its interception of the VDU vector. You can unhook that with e.g. !526=!-9906 but that's pretty low-level hackery. Selection of GXR could work like DFS for Owlet (v1 vs v3) for compatibility with existing source links and could be specified for new source links like rocket mode is, but Owlet probably needs a button to easily toggle GXR on/off.

@scarybeasts
Copy link

Yeah, I've thought about having a separate JIT space for each ROM, and paging that around. I've stayed away because I'm wary of adding the extra complexity for a not-so-common use case.

@scarybeasts
Copy link

I like the idea of unhooking the VDU / other vectors, so that standard drawing commands would be fast. Could the GXR bot mode be defined so that you have to explicitly call the extended drawing routines -- is that possible?

@ojwb
Copy link
Collaborator Author

ojwb commented Nov 28, 2022

We could unhook the GXR from VDUV by default by poking bytes via writem debug commands, then provide some custom way to enable it from within the emulation, but that seems a significant deviation from running your program on a real BBC micro.

I think the cleaner approach to this would be for the bot not to load the GXR ROM by default, with an emoji to enable it which you could include with your program if you want to use the GXR (similar to how the rocket emoji enables taking a snapshot after 3 hours). Then the default bot is like using a BBC model B without the GXR (which was probably a much more common configuration of actual BBC model B machines). This would also mean more RAM to easily use from BASIC by default, since the GXR raises page by 768 bytes.

This approach seems more in line with running on real hardware - the emoji just selects which real hardware you're picking to use.

Then in combination with the rocket emoji this could load the GXR and select -mode inturbo (whereas rocket alone could stick with the faster JIT mode). That'd be harder to achieve with in-emulation approaches as we wouldn't know whether GXR would actually be used when we're picking the beebjit command line options.

Owlet could smooth the experience of working with this - click a button to enable the GXR in owlet when testing, and then owlet would know to include the emoji when you share the program.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants