
Could you please provide a training data format for custom dataset training? #1

Open
duducheng opened this issue Dec 16, 2019 · 2 comments

Comments

@duducheng

Hi Tang,

Thanks for your excellent work!

As the title says, could you please provide the training data format for custom dataset training? How should I preprocess my dataset to train the model?

Thanks!

Cheers,
Jiancheng

@tanghaotommy
Collaborator

Thanks for your interest!

The fastest way is to write your own dataset in PyTorch. If you take a look at dataset/brain_reader.py, you may follow its return types.

The brain_reader dataset returns a tuple of four elements. The first is the input 3D volume, a float32 torch tensor of shape [1, depth, height, width]. The second is a list of ground-truth bounding boxes for the objects in the volume, of shape [num_of_objects, 6]; the six values for each object are its z, y, x, depth, height, width. The third is a list of object category ids, giving the class of the corresponding bounding box in the second element. The final one is a one-hot-encoded segmentation mask covering all classes, of shape [num_of_classes, depth, height, width].
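A minimal sketch of a custom dataset following this return format. The class name, constructor arguments, and shapes here are illustrative assumptions based on the description above, not the actual repo code; in practice the class would subclass torch.utils.data.Dataset and return torch tensors, but numpy arrays are used to keep the sketch self-contained:

```python
import numpy as np

class CustomDetectionDataset:
    """Illustrative dataset mimicking brain_reader's return format.

    In real code this would subclass torch.utils.data.Dataset and
    __getitem__ would return torch tensors instead of numpy arrays.
    """

    def __init__(self, volumes, bboxes, labels, masks):
        # volumes: list of [1, D, H, W] float32 arrays (input 3D volumes)
        # bboxes:  list of [num_objects, 6] arrays (z, y, x, d, h, w per box)
        # labels:  list of [num_objects] category-id arrays
        # masks:   list of [num_classes, D, H, W] one-hot segmentation masks
        self.volumes, self.bboxes = volumes, bboxes
        self.labels, self.masks = labels, masks

    def __len__(self):
        return len(self.volumes)

    def __getitem__(self, idx):
        return (self.volumes[idx].astype(np.float32),
                self.bboxes[idx],
                self.labels[idx],
                self.masks[idx])


# Usage with one toy 8x8x8 volume containing a single object
vol = np.zeros((1, 8, 8, 8), dtype=np.float32)
bbox = np.array([[2, 2, 2, 3, 3, 3]])       # z, y, x, depth, height, width
label = np.array([1])                        # category id of that box
mask = np.zeros((2, 8, 8, 8), dtype=np.float32)
mask[1, 2:5, 2:5, 2:5] = 1                   # one-hot mask for class 1

ds = CustomDetectionDataset([vol], [bbox], [label], [mask])
volume, boxes, labels, masks = ds[0]
print(volume.shape, boxes.shape, labels.shape, masks.shape)
```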

@jmarsil

jmarsil commented May 29, 2020

Hi @tanghaotommy, I also want to run your pipeline on an internal dataset at our institution. We have exported all DICOMs, together with binary masks, to images in .nrrd format. What else will I need to do to train your model with our data?
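One way to bridge from exported binary masks to the bounding-box/one-hot format described above (a hedged sketch: the helper names are invented for illustration, file loading via pynrrd's `nrrd.read` is only noted in a comment, and a synthetic numpy mask stands in for the loaded file):

```python
import numpy as np

# Loading would use pynrrd, e.g.:  data, header = nrrd.read("mask.nrrd")
# Below, a synthetic binary mask stands in for the loaded file.

def mask_to_bbox(binary_mask):
    """Derive a (z, y, x, depth, height, width) box from a binary mask."""
    coords = np.argwhere(binary_mask > 0)
    if coords.size == 0:
        return None  # empty mask: no object present
    zmin, ymin, xmin = coords.min(axis=0)
    zmax, ymax, xmax = coords.max(axis=0)
    return np.array([zmin, ymin, xmin,
                     zmax - zmin + 1, ymax - ymin + 1, xmax - xmin + 1])

def binary_to_one_hot(binary_mask, num_classes=2, class_id=1):
    """Stack a background/foreground one-hot mask of shape [C, D, H, W]."""
    one_hot = np.zeros((num_classes,) + binary_mask.shape, dtype=np.float32)
    one_hot[0] = (binary_mask == 0)
    one_hot[class_id] = (binary_mask > 0)
    return one_hot

mask = np.zeros((8, 8, 8), dtype=np.uint8)
mask[3:6, 1:4, 2:7] = 1                       # one object in the volume
print(mask_to_bbox(mask))                     # [3 1 2 3 3 5]
print(binary_to_one_hot(mask).shape)          # (2, 8, 8, 8)
```

With one box and one class id per connected object, these outputs slot directly into the second and fourth elements of the dataset return tuple described earlier in the thread.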

Thanks for the help in advance!!
