A reusable pipeline for large-scale fiber segmentation on unidirectional fiber beds using fully convolutional neural networks

Alexandre Fioravante de Siqueira1,2, Daniela Mayumi Ushizima1,2, Stéfan J. van der Walt1,2

1 Berkeley Institute for Data Science, University of California, Berkeley, USA

2 Lawrence Berkeley National Laboratory, Berkeley, USA

  • This paper is available on: [Scientific Data] [arXiv]

  • Supplementary material available on: [GitHub]

  • Supplementary data available on: [Dryad]

  • If you find this material useful, please cite the accompanying paper:

Fioravante de Siqueira, A., Ushizima, D.M. & van der Walt, S.J. A reusable neural network pipeline for unidirectional fiber segmentation. Sci Data 9, 32 (2022). https://doi.org/10.1038/s41597-022-01119-6

In this study, we analyzed fibers in ex situ X-ray CT fiber beds from nine samples of Larson et al.'s (2019) datasets. To detect individual fibers in these samples, we tested four different fully convolutional neural networks: U-net, 3D U-net, Tiramisu, and 3D Tiramisu. When comparing our neural network approach to Larson et al.'s results, we obtained Dice and Matthews coefficients greater than 92.28 ± 9.65%, reaching up to 98.42 ± 0.03%. This shows that the network results are close to the human-supervised ones for these fiber beds, in some cases separating fibers that Larson et al.'s analysis could not identify. Here you will find the data resulting from our study.
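
For reference, the Dice and Matthews coefficients used in this comparison can be computed from a predicted mask and a gold standard as follows. This is an illustrative NumPy sketch, not the evaluation code of this repository:

import numpy as np

def dice_coefficient(pred, truth):
    """Dice coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2 * intersection / (pred.sum() + truth.sum())

def matthews_coefficient(pred, truth):
    """Matthews correlation coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = float(np.logical_and(pred, truth).sum())
    tn = float(np.logical_and(~pred, ~truth).sum())
    fp = float(np.logical_and(pred, ~truth).sum())
    fn = float(np.logical_and(~pred, truth).sum())
    denominator = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0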

Downloading Larson et al.'s data

This study uses neural networks to process fibers in fiber beds, using Larson et al.'s (2019) data. To reproduce our study, you need to download that data, which requires a login on the Globus platform.

Larson et al.'s dataset is available at this link. We used twelve different datasets in total. We keep the same file identifiers Larson et al. used in their study, for fast cross-reference:

  • "232p1":
    • wet: folder data/Recons/Bunch2WoPR/rec20160324_055424_232p1_wet_1cm_cont_4097im_1500ms_17keV_13_a.h5
  • "232p3":
    • wet: folder data/Recons/Bunch2WoPR/rec20160318_191511_232p3_2cm_cont__4097im_1500ms_ML17keV_6.h5
    • cured: folder data/Recons/Bunch2WoPR/rec20160323_093947_232p3_cured_1p5cm_cont_4097im_1500ms_17keV_10.h5
    • cured registered: folder data/Seg/Bunch2/rec20160323_093947_232p3_cured_1p5cm_cont_4097im_1500ms_17keV_10.h5/Registered/Bunch2WoPR
  • "235p1":
    • wet: folder data/Recons/Bunch2WoPR/rec20160324_123639_235p1_wet_0p7cm_cont_4097im_1500ms_17keV_14.h5
  • "235p4":
    • wet: folder data/Recons/Bunch2WoPR/rec20160326_175540_235p4_wet_1p15cm_cont_4097im_1500ex_17keV_20.h5
    • cured: folder data/Recons/Bunch2WoPR/rec20160327_003824_235p4_cured_1p15cm_cont_4097im_1500ex_17keV_22.h5
    • cured registered: folder data/Seg/Bunch2/rec20160327_003824_235p4_cured_1p15cm_cont_4097im_1500ex_17keV_22.h5/Registered/Bunch2WoPR
  • "244p1":
    • wet: folder data/Recons/Bunch2WoPR/rec20160318_223946_244p1_1p5cm_cont__4097im_1500ms_ML17keV_7.h5
    • cured: folder data/Recons/Bunch2WoPR/rec20160320_160251_244p1_1p5cm_cont_4097im_1500ms_ML17keV_9.h5
    • cured registered: folder data/Seg/Bunch2/rec20160320_160251_244p1_1p5cm_cont_4097im_1500ms_ML17keV_9.h5/Registered/Bunch2WoPR
  • "245p1":
    • wet: folder data/Recons/Bunch2WoPR/rec20160327_160624_245p1_wet_1cm_cont_4097im_1500ex_17keV_23.h5

The first three numeric characters correspond to a material sample, and the last character corresponds to different extrinsic factors, e.g. deformation. Despite coming from similar materials, the reconstructed files present several differences: varying amounts of ringing artifacts, intensity variation, noise, etc.

Larson et al.'s folder structure should be placed in the folder data. A copy of the folder structure is given in the Appendix, at the end of this file. For more information on the data, please refer to Larson et al. (2019).
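
As a quick sanity check, you can confirm that the reconstruction folders were placed where the scripts expect them. The snippet below is a minimal sketch based on the paths listed above; it only lists the slices found under data/Recons/Bunch2WoPR:

from pathlib import Path

# List each reconstruction folder and how many TIFF slices it contains.
base = Path('data/Recons/Bunch2WoPR')
for folder in sorted(base.glob('rec2016*.h5')):
    n_slices = len(list(folder.glob('*.tiff')))
    print(f'{folder.name}: {n_slices} slices')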

Preparing the PC to run the code locally

To download this repository to your machine, please use git, which can be downloaded freely from the project's page.

Once git is installed, the following command in a Linux/macOS Terminal or a Windows PowerShell downloads this repository to the subfolder fcn_microct in your current folder.

$ git clone https://github.com/alexdesiqueira/fcn_microct.git fcn_microct

The $ represents the Terminal prompt.


Using Git

For more information on how to use git, please check its documentation. Git Immersion is also a great — and extensive — tour through Git fundamentals.


You need Python installed to execute the code. We recommend the Anaconda distribution, which comes with all necessary tools pre-installed. For installation instructions and packages for different operating systems, please refer to their downloads page. The following command installs the necessary dependencies:

$ pip install -r requirements.txt

The $ represents the Terminal prompt. Now you are ready to use this repository.
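
The networks are implemented with TensorFlow and Keras (see the reproduction notes below), so it is worth checking that TensorFlow can see your GPU before training. A quick, optional check on the Python prompt:

>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')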

Training a neural network

After downloading Larson et al.'s original data and preparing the PC to run the code in this repository, you can use the script train.py to train the neural networks on the input data. For example, when in the folder fullconvnets, the command

$ python train.py -n 'tiramisu_3d' -t 'tiramisu-67' -w 'larson_tiramisu_3d-67.hdf5' -e 5 -b 2

will train a 3D Tiramisu-67 for 5 epochs with a batch size of 2, using Larson et al.'s input data. The resulting weights will be stored in larson_tiramisu_3d-67.hdf5.

Arguments

  • -n, --network : convolutional network to be used in the training. Available networks: 'tiramisu', 'tiramisu_3d', 'unet', 'unet_3d'.

  • -t, --tiramisu_model : when the network used is a tiramisu, the model to be used. Not necessary when using U-Nets. Available models: 'tiramisu-56', 'tiramisu-67'.

  • -v, --train_vars : JSON file containing the training variables 'target_size', 'folder_train', 'folder_validate', 'training_images', 'validation_images'. Defaults are based on constants.py, to train using Larson et al.'s samples.

An example of a JSON file follows:

{
    "target_size": [64, 64, 64],
    "folder_train": "data/train",
    "folder_validate": "data/validate",
    "training_images": 1000,
    "validation_images": 200
}
  • -b, --batch_size : size of the batches used in the training (optional). Default: 2.

  • -e, --epochs : how many epochs are used in the training (optional). Default: 5.

  • -w, --weights : output file to store the weight coefficients. Default: weights_<NETWORK>.hdf5.
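
As an example of combining these arguments, the hypothetical command below trains a 2D U-net for 10 epochs with batch size 4, reading the training variables from a custom JSON file named my_train_vars.json (a placeholder name):

$ python train.py -n 'unet' -v 'my_train_vars.json' -e 10 -b 4 -w 'my_unet.hdf5'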

Predicting on Larson et al.'s data

After training one of the architectures on the input data, or if you would like to use one of the weights we made available on Dryad, you can use the script predict.py to predict results, i.e., use the network to separate regions of interest in your data. For example, when in the folder fullconvnets, the command

$ python predict.py -n 'tiramisu_3d' -t 'tiramisu-67' -w 'larson_tiramisu_3d-67.hdf5'

will separate fibers in Larson et al.'s input data using a 3D Tiramisu-67, with weights contained in the file larson_tiramisu_3d-67.hdf5.

Arguments

  • -n, --network : convolutional network to be used in the prediction. Available networks: 'tiramisu', 'tiramisu_3d', 'unet', 'unet_3d'.

  • -t, --tiramisu_model : when the network used is a tiramisu, the model to be used. Not necessary when using U-Nets. Available models: 'tiramisu-56', 'tiramisu-67'.

  • -v, --train_vars : JSON file containing the variables 'folder', 'path', 'file_ext', 'has_goldstd', 'path_goldstd', 'segmentation_interval', 'registered_path'. Defaults are based on constants.py, to predict on Larson et al.'s samples.

An example of a JSON file follows:

{
    "folder": "data",
    "path": "data/test",
    "file_ext": ".tif",
    "has_goldstd": true,
    "path_goldstd": "data/test/label",
    "segmentation_interval": null,
    "registered_path": null
}
  • -w, --weights : file containing the weight coefficients to be used in the prediction.
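
As an example, the hypothetical command below predicts with a 2D U-net, reading the prediction variables from a custom JSON file named my_predict_vars.json (a placeholder name) and the weights from larson_unet.hdf5:

$ python predict.py -n 'unet' -v 'my_predict_vars.json' -w 'larson_unet.hdf5'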

HOW-TO: Reproducing our study

The following instructions can be used to reproduce the results from our manuscript. All CNN algorithms were implemented using TensorFlow and Keras on a computer with two Intel Xeon Gold 6134 processors and two Nvidia GeForce RTX 2080 graphics processing units (GPUs). Each GPU has 10 GB of RAM.

Preparing the training samples

After downloading Larson et al.'s data, in the folder fullconvnets, start a Python prompt, e.g. the Python interpreter, IPython, or a Jupyter Notebook. First, we import the module prepare.py:

>>> import prepare

After importing prepare, we copy the training samples we will use, as defined in constants.py. Use the function prepare.copy_training_samples():

>>> prepare.copy_training_samples()

Then, we crop the images to fit the network input. If you would like to train the 2D networks, the following statement crops the training images and their labels:

>>> prepare.crop_training_images()

To crop the training samples and their labels for the 3D networks, use the following statement:

>>> prepare.crop_training_chunks()
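
If you prefer to run the whole preparation at once, the three steps above can be combined in a small script; this is just a convenience sketch that calls the functions documented above:

import prepare

# Copy the training samples defined in constants.py.
prepare.copy_training_samples()
# Crop the training images and labels for the 2D networks.
prepare.crop_training_images()
# Crop the training chunks and labels for the 3D networks.
prepare.crop_training_chunks()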

Training the networks

The following commands in a Linux/Mac OS Terminal or a Windows PowerShell will train the four networks using the downloaded and prepared data, according to our study:

$ python train.py -n 'unet' -w 'larson_unet.hdf5' -e 5 -b 2
$ python train.py -n 'unet_3d' -w 'larson_unet_3d.hdf5' -e 5 -b 2
$ python train.py -n 'tiramisu' -t 'tiramisu-67' -w 'larson_tiramisu-67.hdf5' -e 5 -b 2
$ python train.py -n 'tiramisu_3d' -t 'tiramisu-67' -w 'larson_tiramisu_3d-67.hdf5' -e 5 -b 2

Predicting using the trained networks

The following commands in a Linux/Mac OS Terminal or a Windows PowerShell will predict results in the data using the four trained networks.

$ python predict.py -n 'unet' -w 'larson_unet.hdf5'
$ python predict.py -n 'unet_3d' -w 'larson_unet_3d.hdf5'
$ python predict.py -n 'tiramisu' -t 'tiramisu-67' -w 'larson_tiramisu-67.hdf5'
$ python predict.py -n 'tiramisu_3d' -t 'tiramisu-67' -w 'larson_tiramisu_3d-67.hdf5'

Here we assume that the weight files for unet, unet_3d, tiramisu, and tiramisu_3d are named larson_unet.hdf5, larson_unet_3d.hdf5, larson_tiramisu-67.hdf5, and larson_tiramisu_3d-67.hdf5, respectively. We also expect them to be in the same folder from which you are executing the code.

Another example: if you would like to predict results using the U-net network and your coefficients are in the folder coefficients/unet, you would use:

$ python predict.py -n 'unet' -w 'coefficients/unet/larson_unet.hdf5'

References

Larson, N. M., Cuellar, C. & Zok, F. W. X-ray computed tomography of microstructure evolution during matrix impregnation and curing in unidirectional fiber beds. Composites Part A: Applied Science and Manufacturing 117, 243–259 (2019)

Appendices

Structure of Larson et al.'s data

This is the structure of Larson et al.'s folders that we used in this study, for reference.

data/
└── Recons/
    └── Bunch2WoPR/
        ├── rec20160318_191511_232p3_2cm_cont__4097im_1500ms_ML17keV_6.h5/
        │   ├── rec_SFRR_2600_B0p2_00159.tiff
        │   ├── rec_SFRR_2600_B0p2_00160.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_01158.tiff
        ├── rec20160318_223946_244p1_1p5cm_cont__4097im_1500ms_ML17keV_7.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        ├── rec20160320_160251_244p1_1p5cm_cont_4097im_1500ms_ML17keV_9.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        ├── rec20160323_093947_232p3_cured_1p5cm_cont_4097im_1500ms_17keV_10.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        ├── rec20160324_055424_232p1_wet_1cm_cont_4097im_1500ms_17keV_13_a.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        ├── rec20160324_123639_235p1_wet_0p7cm_cont_4097im_1500ms_17keV_14.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        ├── rec20160326_175540_235p4_wet_1p15cm_cont_4097im_1500ex_17keV_20.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        ├── rec20160327_003824_235p4_cured_1p15cm_cont_4097im_1500ex_17keV_22.h5/
        │   ├── rec_SFRR_2600_B0p2_00000.tiff
        │   ├── rec_SFRR_2600_B0p2_00001.tiff
        │   ├── (...)
        │   └── rec_SFRR_2600_B0p2_02159.tiff
        └── rec20160327_160624_245p1_wet_1cm_cont_4097im_1500ex_17keV_23.h5/
            ├── rec_SFRR_2600_B0p2_00000.tiff
            ├── rec_SFRR_2600_B0p2_00001.tiff
            ├── (...)
            └── rec_SFRR_2600_B0p2_02159.tiff

data/
└── Seg/
    └── Bunch2/
        ├── rec20160320_160251_244p1_1p5cm_cont_4097im_1500ms_ML17keV_9.h5/
        │   └── Registered/
        │       └── Bunch2WoPR/
        │           ├── Reg_0001.tif
        │           ├── Reg_0002.tif
        │           ├── (...)
        │           └──
        ├── rec20160323_093947_232p3_cured_1p5cm_cont_4097im_1500ms_17keV_10.h5/
        │   └── Registered/
        │       └── Bunch2WoPR/
        │           ├── Reg_0001.tif
        │           ├── Reg_0002.tif
        │           ├── (...)
        │           └── Reg_2160.tif
        └── rec20160327_003824_235p4_cured_1p15cm_cont_4097im_1500ex_17keV_22.h5/
            └── Registered/
                └── Bunch2WoPR/
                    ├── Reg_0001.tif
                    ├── Reg_0002.tif
                    ├── (...)
                    └── Reg_2160.tif