Go for speed, but need many patches #70

Open: wants to merge 24 commits into base: main

AI-M-BOT

I was not allowed to share this code until now 😞. If you want to know why, just ask my boss.

The modified code is buggy, but it may give an idea of how to make DXcam a bit faster.
You can now take partial screenshots more efficiently.

Usage of the shot function:

# Create a camera first, of course
import numpy as np
from ctypes import c_char_p
...
# cut_h, cut_w: height and width of your_region
img = np.zeros((cut_h, cut_w, 4), dtype='uint8')
camera.shot(img.ctypes.data_as(c_char_p), region=your_region)
...

If any issues come up, we can discuss and fix them together.
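The usage above follows a caller-allocated buffer pattern: the caller owns the destination memory and hands over a raw pointer, so the capture call can write pixels in place instead of returning a new array. Below is a minimal standard-library sketch of that pattern. A bytearray stands in for the NumPy array, and `fill_region` is a hypothetical stand-in for `camera.shot`; its behavior is an assumption for illustration, not DXcam's actual implementation.

```python
import ctypes

BPP = 4  # bytes per pixel, assuming a BGRA layout


def fill_region(dst_ptr, frame, frame_w, region):
    """Hypothetical stand-in for camera.shot: copy `region` row by row
    from `frame` into memory owned by the caller."""
    left, top, right, bottom = region
    row_bytes = (right - left) * BPP
    dst = ctypes.cast(dst_ptr, ctypes.c_void_p).value  # raw destination address
    src = ctypes.addressof(frame)
    for i, y in enumerate(range(top, bottom)):
        offset = (y * frame_w + left) * BPP
        ctypes.memmove(dst + i * row_bytes, src + offset, row_bytes)


# Fake 8x6 "screen" filled with a recognizable byte pattern.
frame_w, frame_h = 8, 6
n = frame_w * frame_h * BPP
frame = (ctypes.c_ubyte * n)(*range(n))

region = (2, 1, 6, 4)  # (left, top, right, bottom)
cut_w, cut_h = region[2] - region[0], region[3] - region[1]

# The caller allocates the destination once and can reuse it for every grab.
img = bytearray(cut_h * cut_w * BPP)
img_buf = (ctypes.c_char * len(img)).from_buffer(img)
fill_region(ctypes.cast(img_buf, ctypes.c_char_p), frame, frame_w, region)
```

Because the destination is allocated once, repeated grabs of the same region avoid a fresh allocation per frame.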

Agade09 and others added 10 commits April 15, 2023 10:04
In profiling benchmark code, this takes processing overhead from ~14% to ~1.4% and ~0.06% respectively
…ller than the whole screen.

The idea is to ask ctypes.string_at() for as little memory as possible. Images are stored in memory with width as the fast index, so to grab a 480x640 region from a 1440x2560 screen we can ask ctypes.string_at() for a 480x2560 region instead of the whole frame. This reduces memory allocation and memcpy overhead in ctypes.string_at().
To grab a 480x640 region out of a 1440x2560 screen the profiler time spent went from ~24% to ~8%.
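The row-stripe idea from this commit message can be sketched as follows. This is an illustration on a small fake buffer, not the code from the commit. `grab_stripe` is a hypothetical helper, and the sketch assumes the row pitch equals `screen_w * BPP`, which may not hold for a real mapped surface.

```python
import ctypes

BPP = 4  # bytes per pixel, assuming BGRA


def grab_stripe(base_addr, screen_w, region):
    """Copy only the rows covered by `region` (at full screen width),
    then crop the wanted columns out of each row."""
    left, top, right, bottom = region
    pitch = screen_w * BPP  # assumption: no row padding
    # One contiguous copy of (bottom - top) full-width rows.
    stripe = ctypes.string_at(base_addr + top * pitch, (bottom - top) * pitch)
    rows = [
        stripe[r * pitch + left * BPP: r * pitch + right * BPP]
        for r in range(bottom - top)
    ]
    return stripe, rows


# Fake 16x12 screen filled with a byte pattern.
screen_w, screen_h = 16, 12
n = screen_w * screen_h * BPP
screen = (ctypes.c_ubyte * n)(*[i % 251 for i in range(n)])

stripe, rows = grab_stripe(ctypes.addressof(screen), screen_w, (4, 3, 10, 7))
full_frame_bytes = n          # what a whole-screen copy would request
stripe_bytes = len(stripe)    # what the stripe copy actually requested
```

Here the stripe copy requests 4 of the 12 rows, so a third of the bytes a whole-frame copy would need.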
…ere its width region matches the screen's width.
…rom_address API instead. In profiling a 1440x2560 grab, total time spent went from 20% in string_at() to almost 0% in from_address. My understanding is that string_at uses memmove, which is slower than the memcpy I suspect from_address uses.
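The difference between the two calls can be illustrated with a small standard-library sketch. As the ctypes documentation describes it, `string_at` returns a new bytes object, while `from_address` returns an instance that uses the memory at the given address, so no pixel data is copied at that point. The buffer below is a made-up stand-in for a mapped frame.

```python
import ctypes

# Stand-in for mapped frame memory.
src = (ctypes.c_ubyte * 16)(*range(16))
addr = ctypes.addressof(src)

copied = ctypes.string_at(addr, 16)                # independent bytes copy
view = (ctypes.c_ubyte * 16).from_address(addr)    # shares memory with src

# Mutating the source shows which object shares memory with it.
src[0] = 255
copy_sees_change = copied[0] == 255
view_sees_change = view[0] == 255
```

The trade-off is lifetime: a `from_address` view is only valid while the underlying memory stays mapped, so it must be consumed or copied before the frame is released.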
…that case self.region is used, and self.region was already validated when it was defined.

In profiling a max FPS benchmark with no region defined, this spares 3% of total execution time.
… if statement bypassed the call to self.process_cvtcolor(). Simplify code in process_cvtcolor since it no longer needs to handle 'BGRA'.

In profiling this spares 0.4% of total execution time in 'BGRA' mode.
…but it was producing a non-contiguous array, which changes the behavior of the library
Performance improvements
@Agade09

Agade09 commented Aug 6, 2023

Thanks for posting this. I used it in a branch of mine, but I couldn't fork from your fork to keep your commit: your fork is based on mine and we were both working on the main branch, so I didn't know how.
I believe I've fixed your code for nonzero rotation angles and made small performance improvements.

AI-M-BOT and others added 10 commits August 9, 2023 11:00
	modified:   dxcam/__init__.py
	modified:   dxcam/_libs/dxgi.py
	modified:   dxcam/core/duplicator.py
	modified:   dxcam/dxcam.py
	modified:   dxcam/processor/numpy_processor.py
	modified:   dxcam/core/duplicator.py
	modified:   dxcam/__init__.py
	modified:   dxcam/core/duplicator.py
	modified:   dxcam/dxcam.py
	modified:   dxcam/processor/numpy_processor.py
…e formats

check LastMouseUpdateTime for non-zero value first
allow for all shape formats
 	modified:   dxcam/core/duplicator.py
 	modified:   dxcam/processor/numpy_processor.py
check info.LastPresentTime to actually skip repeat frames even if DXGI_ERROR_WAIT_TIMEOUT was not hit.
Add functionality to get PointerShape
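The timestamp checks named in these commit messages can be sketched as below. In Microsoft's documentation for DXGI_OUTDUPL_FRAME_INFO, a zero LastPresentTime means the desktop image was not updated since the last acquired frame, and a zero LastMouseUpdateTime means there was no mouse update. `FrameInfo` is a hypothetical, simplified stand-in carrying only those two fields, and `classify_frame` is an illustrative helper, not code from this pull request.

```python
from collections import namedtuple

# Hypothetical stand-in for the two timestamp fields of DXGI_OUTDUPL_FRAME_INFO.
FrameInfo = namedtuple("FrameInfo", ["LastPresentTime", "LastMouseUpdateTime"])


def classify_frame(info):
    """Return (desktop_updated, mouse_updated) for an acquired frame."""
    desktop_updated = info.LastPresentTime != 0
    mouse_updated = info.LastMouseUpdateTime != 0
    return desktop_updated, mouse_updated


# A frame acquired without a timeout can still be a repeat of the last image,
# for example when only the cursor moved.
cursor_only = classify_frame(FrameInfo(LastPresentTime=0, LastMouseUpdateTime=123))
new_image = classify_frame(FrameInfo(LastPresentTime=456, LastMouseUpdateTime=0))
```

A capture loop could skip pixel processing when `desktop_updated` is false, and fetch the pointer shape only when `mouse_updated` is true.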
@tooichitake

Hi, can you check the NewFeatures branch by Agade09? I found that this version is quite a bit faster than the others, but nobody has mentioned it. In my tests, grab() on NewFeatures averages 0-1 ms per picture, while grab() on the main branch you provided takes 2-5 ms. Will you check it and merge it into this?

@crackwitz

crackwitz commented Jan 11, 2024

I would recommend reviewing all uses of ctypes.string_at(). That function is meant to be used on C strings, i.e. it stops at the first null byte (0x00). Also, it might actually make a copy of the data. There might be better types to use for handling blocks of bytes.
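A quick check of the null-byte concern, as a sketch with a made-up byte sequence. Per the ctypes documentation, `string_at(address, size=-1)` treats the memory as zero-terminated only when `size` is omitted. With an explicit `size` it returns exactly that many bytes, embedded zeros included.

```python
import ctypes

# Made-up data with embedded null bytes, as pixel data commonly has.
raw = bytes([1, 2, 0, 3, 4, 0, 5, 6])
buf = ctypes.create_string_buffer(raw, len(raw))
addr = ctypes.addressof(buf)

until_nul = ctypes.string_at(addr)        # size omitted: stops at first 0x00
sized = ctypes.string_at(addr, len(raw))  # explicit size: all 8 bytes
```

So calls that pass an explicit size should not truncate at a zero byte, though they do still return a copy, which matters for the performance discussion above.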

6 participants