
feat: windows support #63

Merged: 34 commits, May 27, 2024
Conversation

DifferentialityDevelopment (Contributor)

I was able to get it working on Windows; I still need to do some cleanup and more thorough testing.

DifferentialityDevelopment (Contributor Author)

I do need to make some alterations, as it doesn't currently build on Linux, but the changes should be minor.
I've confirmed that Windows seems to work correctly, at least.

b4rtaz (Owner) commented May 24, 2024

Nice job @DifferentialityDevelopment!

First thoughts:

  • we should not commit binary files,
  • I don't think we need the pthreads-w32 dependency; this project uses just two functions from pthread.h. You can easily replace them with CreateThread and WaitForSingleObject from windows.h. You can check how it's done in llama.cpp (a rough sketch follows this list).
  • I'm not sure we need the allocateAndLockMemory function. I didn't observe any significant improvement from using posix_memalign, so maybe it could be removed. Maybe calloc would be enough? Or maybe we need some tests here.
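
For reference, the pthread_create/pthread_join pair maps onto the Win32 calls roughly as below. This is only a hedged sketch with hypothetical names (dl_thread, threadStart, threadJoin), not the code that ended up in the project:

#ifdef _WIN32
#include <windows.h>

typedef HANDLE dl_thread;

// CreateThread expects a DWORD WINAPI routine(LPVOID), so on Windows the worker
// function has to use (or be wrapped in) that signature.
static int threadStart(dl_thread* thread, LPTHREAD_START_ROUTINE routine, void* arg) {
    *thread = CreateThread(NULL, 0, routine, arg, 0, NULL);
    return (*thread == NULL) ? 1 : 0;
}

static void threadJoin(dl_thread thread) {
    WaitForSingleObject(thread, INFINITE); // equivalent of pthread_join
    CloseHandle(thread);
}
#else
#include <pthread.h>

typedef pthread_t dl_thread;

static int threadStart(dl_thread* thread, void* (*routine)(void*), void* arg) {
    return pthread_create(thread, NULL, routine, arg);
}

static void threadJoin(dl_thread thread) {
    pthread_join(thread, NULL);
}
#endif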

DifferentialityDevelopment (Contributor Author)

Yeah, the changes in utils.cpp were messy; I haven't yet gotten around to working my way backwards so that it changes the least amount of code while still working.
Malloc just wouldn't work with the size of what needed to be allocated, and I wasn't having much luck with the Windows API functions for allocating large amounts of memory, so I ended up just using the simple vector approach, which worked like a charm (sketched below).

It's a first working draft; I can keep refining it to remove the dependency on pthreads-win32.
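
The vector approach essentially means letting std::vector own the large allocation instead of calling malloc or VirtualAlloc directly. A minimal sketch of the idea, with illustrative names rather than the actual utils.cpp code:

#include <cstddef>
#include <vector>

// The vector owns the multi-gigabyte allocation, zero-initializes it and frees it
// automatically; a raw pointer into it can be handed to the rest of the code.
struct LargeBuffer {
    std::vector<char> storage;
    char* data;
    LargeBuffer(size_t bytes) : storage(bytes), data(storage.data()) {}
};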

DifferentialityDevelopment (Contributor Author)

I still need to do some more testing and refinement, but it at least builds again now on both Linux and Windows; I also removed the pthreads-win32 dependency.

[Resolved review comments on src/socket.cpp (×3) and Makefile]
b4rtaz (Owner) commented May 25, 2024

Can we add Windows to .github/workflows/main.yml?

DifferentialityDevelopment (Contributor Author)

> Can we add Windows to .github/workflows/main.yml?

That would be great; then we'd know if a change breaks either platform :)
I'll do some refactoring on the code tonight to clean it up a bit.

DifferentialityDevelopment (Contributor Author)

I think the main things left to refactor are the changes in transformers.cpp & utils.cpp

DifferentialityDevelopment (Contributor Author) commented May 25, 2024

I refactored utils.cpp. gracefullyAllocateBuffer acts as a fallback for allocating the memory buffer, which also sorts out the weirdness that happens if you don't run distributed-llama as sudo.

I was able to run dllama on Linux without sudo using this approach:

./dllama inference --model /mnt/d/Meta-Llama-3-8B-Instruct-Distributed/dllama_original_q40.bin --tokenizer /mnt/d/Meta-Llama-3-8B-Instruct-Distributed/dllama-llama3-tokenizer.t --weights-float-type q40 --buffer-float-type q80 --nthreads 4 --steps 64 --prompt "Hello world"
💡 arch: llama2
💡 dim: 4096
💡 hiddenDim: 14336
💡 nLayers: 32
💡 nHeads: 32
💡 nKvHeads: 8
💡 vocabSize: 128256
💡 seqLen: 2048
💡 nSlices: 1
💡 ropeTheta: 500000.0
📄 bosId: 128000
📄 eosId: 128001
mmap succeeded. data = 0x7f8c7260a000
weights = 0x7f8c7260a060
🕒 ropeCache: 32768 kB
⏩ Loaded 6175568 kB
🔶 G 421 ms I 421 ms T 0 ms S 0 kB R 0 kB Hello
🔶 G 382 ms I 382 ms T 0 ms S 0 kB R 0 kB world
🔶 G 421 ms I 420 ms T 0 ms S 0 kB R 0 kB !
🔶 G 385 ms I 384 ms T 0 ms S 0 kB R 0 kB This
🔶 G 390 ms I 389 ms T 0 ms S 0 kB R 0 kB is
🔶 G 377 ms I 377 ms T 0 ms S 0 kB R 0 kB a
🔶 G 389 ms I 387 ms T 1 ms S 0 kB R 0 kB test
🔶 G 395 ms I 395 ms T 0 ms S 0 kB R 0 kB of
🔶 G 381 ms I 380 ms T 1 ms S 0 kB R 0 kB the
🔶 G 376 ms I 374 ms T 1 ms S 0 kB R 0 kB emergency
🔶 G 453 ms I 451 ms T 2 ms S 0 kB R 0 kB broadcast
🔶 G 421 ms I 420 ms T 1 ms S 0 kB R 0 kB system
🔶 G 423 ms I 421 ms T 1 ms S 0 kB R 0 kB .
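
For context, the usual reason locked allocations misbehave without sudo is the locked-memory limit, so a fallback that drops the lock rather than failing is enough. A rough sketch of that fallback idea, with hypothetical names rather than the actual gracefullyAllocateBuffer implementation:

#include <cstdio>
#include <cstdlib>
#ifndef _WIN32
#include <sys/mman.h>
#endif

static void* allocateBuffer(size_t bytes) {
#ifndef _WIN32
    void* ptr = NULL;
    if (posix_memalign(&ptr, 4096, bytes) == 0) {
        if (mlock(ptr, bytes) != 0) {
            // Without root (or a raised RLIMIT_MEMLOCK), mlock fails for large buffers;
            // keep the allocation anyway, just not pinned in RAM.
            fprintf(stderr, "mlock failed, continuing with unlocked memory\n");
        }
        return ptr;
    }
#endif
    return calloc(bytes, 1); // plain zeroed allocation as the last resort
}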

DifferentialityDevelopment (Contributor Author)

I've updated the readme as well.

b4rtaz (Owner) commented May 27, 2024

Please let me know when I can review again (for example, main.yml is still not updated).

DifferentialityDevelopment (Contributor Author)

> Please let me know when I can review again (for example, main.yml is still not updated).

You're welcome to review again.

I don't have much experience with GitHub workflows, but I'll try to update main.yml to include a Windows build 👍

[Resolved review comment on .github/workflows/main.yml]

DifferentialityDevelopment (Contributor Author)

Don't know how I swapped those two around! Must have been by accident.

b4rtaz changed the title from "feat: Windows support" to "feat: windows support" on May 27, 2024
b4rtaz (Owner) commented May 27, 2024

I reverted .github/workflows/main.yml to the previous approach. The new approach didn't test all Linux CPUs.

b4rtaz merged commit 2fa9d9f into b4rtaz:main on May 27, 2024 (3 checks passed)
DifferentialityDevelopment deleted the windows-support branch on May 28, 2024
DifferentialityDevelopment (Contributor Author) commented May 28, 2024

@b4rtaz It seems you removed a bit of code that was necessary in transformers.cpp

#ifdef _WIN32
#define ftell(fp) _ftelli64(fp)
#define fseek(fp, offset, origin) _fseeki64(fp, offset, origin)
#endif

Without it I am unable to load the model files (on Windows, long is 32-bit, so the standard ftell/fseek cannot handle offsets in files larger than 2 GiB).

./dllama-api.exe --model D:\openchat-3.6-8b-20240522-distributed\dllama_model_openchat-3.6-8b-20240522_q40.m --tokenizer D:\openchat-3.6-8b-20240522-distributed\dllama_tokenizer_llama3.t --weights-float-type q40 --buffer-float-type q80 --nthreads 8 --chat-template openchat3 --port 10111
💡 arch: llama
💡 hiddenAct: silu
💡 dim: 4096
💡 hiddenDim: 14336
💡 nLayers: 32
💡 nHeads: 32
💡 nKvHeads: 8
💡 vocabSize: 128256
💡 seqLen: 8192
💡 nSlices: 1
💡 ropeTheta: 500000.0
terminate called after throwing an instance of 'std::runtime_error'
what(): Error determining model file size

If I add it back in then it works.

b4rtaz (Owner) commented May 28, 2024

@DifferentialityDevelopment ah, sorry! But how is this able to compile now? If #define ftell(fp) _ftelli64(fp) overrides anything, I think it would be better to create our own function, like seekToEnd or similar (with #ifdef _WIN32 ... #else).

DifferentialityDevelopment (Contributor Author) commented May 28, 2024

> @DifferentialityDevelopment ah, sorry! But how is this able to compile now? If #define ftell(fp) _ftelli64(fp) overrides anything, I think it would be better to create our own function, like seekToEnd or similar (with #ifdef _WIN32 ... #else).

It compiles fine because the arguments of both function pairs are the same. On Windows it's usually better to use _ftelli64 and _fseeki64 instead of the standard C functions, since their offsets are 64-bit.

That said, if anything else happened to use the name ftell it would be rewritten by the macro, which I guess is where you're coming from.

Maybe something like:

#include <cstdio>
#include <cstdint>

#ifdef _WIN32
// _ftelli64/_fseeki64 take 64-bit offsets; plain ftell/fseek use a 32-bit long on Windows.
static inline int64_t fileGetLength(FILE* fp) {
    return _ftelli64(fp);
}
static inline int fileSeekEnd(FILE* fp, int64_t offset, int origin) {
    return _fseeki64(fp, offset, origin);
}
#else
// On 64-bit Linux/macOS, long is 64-bit, so the standard functions already handle large files.
static inline int64_t fileGetLength(FILE* fp) {
    return ftell(fp);
}
static inline int fileSeekEnd(FILE* fp, int64_t offset, int origin) {
    return fseek(fp, offset, origin);
}
#endif
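
With wrappers like these, getting the model file size would then look something like the following (a sketch only, not the actual model-loading code):

fileSeekEnd(fp, 0, SEEK_END);     // jump to the end of the model file
int64_t size = fileGetLength(fp); // 64-bit position works for files larger than 2 GiB
fileSeekEnd(fp, 0, SEEK_SET);     // rewind before reading the weights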

b4rtaz (Owner) commented May 28, 2024

Does this PR solve the problem?

DifferentialityDevelopment (Contributor Author)

> Does this PR solve the problem?

I'll give it a try, but it looks like it should work just fine 👍

DifferentialityDevelopment (Contributor Author)

> Does this PR solve the problem?
>
> I'll give it a try, but it looks like it should work just fine 👍

Not sure what's going on now; I've just tried it out, but I'm getting a "Cannot open file" error. I'll check if I did something wrong.

DifferentialityDevelopment (Contributor Author) commented May 28, 2024

@b4rtaz I figured it out: you forgot to add fclose at the end of loadSpecFromFile.

Other than that it works perfectly, thank you!
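
In other words, the fix is just closing the FILE* before the function returns. A simplified sketch of the shape of the function, not the actual loadSpecFromFile body:

#include <cstdio>
#include <stdexcept>

void loadSpec(const char* path) {
    FILE* fp = fopen(path, "rb");
    if (fp == NULL) throw std::runtime_error("Cannot open file");
    // ... read the spec/header fields ...
    fclose(fp); // the missing call; releases the file handle once the spec is loaded
}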

b4rtaz (Owner) commented May 28, 2024

@DifferentialityDevelopment thanks for the help, and sorry for the problem. Probably I need a Windows environment.
