parse names with a "-" #29

Open
manerotoni opened this issue Sep 24, 2020 · 10 comments

Comments

@manerotoni

Hello,

I found a small issue with the Define Dataset plugin [Plugins > BigStitcher > BatchProcessing > Define Dataset...]. I think it has something to do with how it parses channels and their names. I have a MetaMorph dataset with a global .nd file and several .stk files:

image_w1conf561-488dual_s1.stk, first channel, tile 1
image_w2conf488-488dual_s1.stk, 2nd channel, tile 1
image_w1conf561-488dual_s2.stk, first channel, tile 2
, ...

The automatic loader does not work properly for this. It does not ask which pattern corresponds to channels or tiles, and one obtains a result with 30 tiles and 4 channels instead of 30 tiles and 3 channels.
The manual loader can't parse channels named 1conf561-488dual or 2conf488-488dual. The problem is the "-", as it works with other names.
One could rename the channels, but all files are linked to the .nd metadata file, and renaming messes up reading with other tools.
Would it be possible to make the parser accept more special characters?

Thanks

Antonio

@hoerldavid
Contributor

Hi @manerotoni ,

I just tried to replicate your issue with some renamed TIFFs, but for me, the pattern was detected normally (one would just have to assign both the first and second pattern to channel):

image

So I'm thinking it might have something to do with the metamorph files. Did you also include the .nd file in the first step or just the .stk files?

Could you provide:

  • Screenshots of the first (with the list of included files) and second (the same I attached) windows of the dataset definition?
  • a small dataset for which the problem occurs, if it is not too big?

Best,
David

@manerotoni
Author

Dear David,
thanks for the quick reply. Depending on the naming, I can or cannot use "Automatic Loader (Bioformats based)" or "Manual Load (Bioformats based)". For simplicity, I will provide only examples where it does not work. The errors are independent of the MetaMorph format; it really looks like a parsing error. I will also try to provide the corresponding data, but I need to decrease its size first. Personally, I find the manual parser more flexible.

Failed automatic parsing: The automatic parsing sometimes fails because it tries to assign the tile index automatically and the resulting number is wrong. In this example, 12 tiles with 2 channels are recognized as 13 tiles with 2 channels (which does not make much sense for 24 files). In Pattern 2 there is no choice for tiles:
image
image

Failed manual parsing: The manual parsing fails because of parsing errors. If you remove the "-" sign, it works. BTW, in the current example the automatic parser works: it does not assign a random index to the tiles but asks for it.
automatic_parsing0
manual_parsing0
manual_parsing1
manual_parsing2

@manerotoni
Author

manerotoni commented Sep 25, 2020

Hello David,
I uploaded several examples.

https://owncloud.gwdg.de/index.php/s/2uojGuAKsDCZeuK

You will find 3 directories, each with an example of loading either tif or stk files.
I found that the problem with the Manual Loader and parsing "-" is not specific to stk or tif files.

The issue with the Automatic Loader identifying the wrong tiles and channels seems specific to stk files.
I could not downscale the stk files further, so the dataset is quite big (1.15 GB) and a little bit redundant.
Sorry :-(
There is also a README file for you to navigate.
Greetings

Antonio

@hoerldavid
Contributor

Hi Antonio,

Thanks for all the additional info and data!

The way I understand it now, these are probably 2 separate problems:

  • the manual loader is unable to use channel names containing "-": this is actually expected, as you can specify a range (e.g. 1-5) here. I will have a look at whether we can add some form of "escaping", e.g. if you put quotation marks around the name containing the minus, it would not be treated as a range.
  • the automatic loader tries to parse metadata from the .stk files, but this fails somehow (e.g. it shows 13 tiles found): on this, I will have a detailed look using your data and try to find a solution.
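
To illustrate the escaping idea from the first point, here is a minimal Python sketch. It is not BigStitcher's actual parser: the function name `parse_channel_spec`, the comma-separated input format, and the choice to reject unquoted names containing "-" are all assumptions made for illustration.

```python
import re

# A token is either a double-quoted string or a bare run of non-comma characters.
TOKEN = re.compile(r'\s*(?:"(?P<quoted>[^"]*)"|(?P<bare>[^,]+?))\s*(?:,|$)')
RANGE = re.compile(r"(\d+)-(\d+)")


def parse_channel_spec(spec):
    """Turn a comma-separated spec into a list of channel names.

    Assumed rules (hypothetical, for illustration only):
    - a bare token "a-b" with integer a and b expands to the range a..b
    - a double-quoted token is taken literally, hyphens included
    - any other bare token containing "-" is rejected as ambiguous
    """
    names = []
    for m in TOKEN.finditer(spec):
        if m.group("quoted") is not None:
            names.append(m.group("quoted"))  # quoted: never treated as a range
            continue
        token = m.group("bare")
        r = RANGE.fullmatch(token)
        if r:
            lo, hi = int(r.group(1)), int(r.group(2))
            names.extend(str(i) for i in range(lo, hi + 1))
        elif "-" in token:
            raise ValueError(f'ambiguous token {token!r}: quote it to use it as a name')
        else:
            names.append(token)
    return names


print(parse_channel_spec("1-3"))
# ['1', '2', '3']
print(parse_channel_spec('"1conf561-488dual","2conf488-488dual"'))
# ['1conf561-488dual', '2conf488-488dual']
```

Under these assumptions, existing range specifications keep working unchanged, and only names that contain a minus need quotes.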

Best,
David

@manerotoni
Author

Dear David,

yes, it is as you say. Now I understand why "-" is not allowed; it makes sense. What matters is whether one can still script the manual loader when "escaping" some of the characters.
I am also sometimes confused by the metadata contained in the stk files. It may be an issue with the files generated by the company or a more general MetaMorph format issue.

Best,

Antonio

@manerotoni
Author

Just a note: I am not sure whether the problem with the stk files is purely one of reading metadata; the names also seem to play a role.

For the automatic loading we have the following cases:

  • Stk files named test3_w1conf488_s1 and test3_w2conf405_s1 (folder success_manual_failed_automatic_stkndfiles) are interpreted as 4 channels and 2 tiles. There is no option to specify the index for tiles and channels. Having this option would help. For instance, one could tell the automatic loader not to interpret the indexes on its own.
  • Stk files named test3_w1conf488-405dual_s1 and test3_w2conf405-488dual_s1 can be loaded correctly with the automatic loader. In this case the loader offers a choice of which index corresponds to what. To test this case, just rename the previous files.
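
For reference, the channel and tile counts one would expect from names like these can be checked outside BigStitcher with a small Python sketch. The regular expression assumes the `<prefix>_w<index><label>_s<tile>` layout seen in the names above; the `_s2` entries and the helper name `summarize` are made up for illustration, and this is not how the automatic loader works internally.

```python
import re

# Assumed layout: <prefix>_w<channel index><channel label>_s<tile index>[.ext]
NAME = re.compile(
    r"^(?P<prefix>.+)_w(?P<channel>\d+)(?P<label>.*?)_s(?P<tile>\d+)(?:\.\w+)?$"
)


def summarize(names):
    """Return the sorted distinct channel names and tile indices found in names."""
    channels, tiles = set(), set()
    for name in names:
        m = NAME.match(name)
        if m is None:
            raise ValueError(f"unexpected file name: {name}")
        # Channel name = index plus label, e.g. "1conf488-405dual"
        channels.add(m.group("channel") + m.group("label"))
        tiles.add(int(m.group("tile")))
    return sorted(channels), sorted(tiles)


names = [
    "test3_w1conf488-405dual_s1",
    "test3_w2conf405-488dual_s1",
    "test3_w1conf488-405dual_s2",  # assumed second tile
    "test3_w2conf405-488dual_s2",  # assumed second tile
]
print(summarize(names))
# (['1conf488-405dual', '2conf405-488dual'], [1, 2])
```

Because the tile index is anchored to the trailing `_s<number>`, the "-" inside the channel label does not interfere with the split.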

Greetings

Antonio

@manerotoni
Author

Hello David,
another possibility is to have minimal pattern matching, e.g. using * for anything that does not matter, like in glob.
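
As a rough illustration of what glob-style matching would look like, here is a sketch using Python's standard `fnmatch` module on the file names from the first post. The helper `select` and the specific patterns are assumptions for illustration; BigStitcher does not necessarily support this syntax.

```python
from fnmatch import fnmatchcase

files = [
    "image_w1conf561-488dual_s1.stk",
    "image_w2conf488-488dual_s1.stk",
    "image_w1conf561-488dual_s2.stk",
]


def select(files, pattern):
    """Return the files matching a glob pattern; '*' matches any run of characters."""
    return [f for f in files if fnmatchcase(f, pattern)]


# All tiles of the first channel: the part between "w1" and "_s" does not matter.
print(select(files, "image_w1*_s*.stk"))
# ['image_w1conf561-488dual_s1.stk', 'image_w1conf561-488dual_s2.stk']

# All channels of tile 1.
print(select(files, "image_w*_s1.stk"))
# ['image_w1conf561-488dual_s1.stk', 'image_w2conf488-488dual_s1.stk']
```

In glob syntax "-" is only special inside brackets, so the channel names need no escaping here.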

Antonio

@hoerldavid
Contributor

Hi Antonio,

Just a quick update on the .stk files in the automatic loader.
I could indeed replicate your error, but I may have found a workaround: If you give the loader just the .nd file (you may have to set the size limit to 0), it seems to parse the files fine (as 2 channel and 2 tile/angle - you can select which in the next step):

image

this is how it looks in BDV after stitching:

image

Regardless, I will have a look what is going wrong if you give it the .stk files and also if we can read stage positions - that would reduce the amount of comparisons we have to do for stitching.
You also mentioned that it does not read pixel sizes correctly? For me, it read 0.166x0.166x0.5µm - is that correct?

Regarding the manual loader: I've checked the possibility to put channel names in quotes to escape the range specification - I think this is a minimal change that still fixes the problem without causing problems for other users:

image

I think this should also work in macros, where you would write something like: ... channels_=\"1conf488-405\",\"2conf405-488\" .... I will test it some more and let you know once an update is available.

Best,
David

@StephanPreibisch
Member

Hi @hoerldavid, can we close this, is it solved?

@imagejan
Contributor

@hoerldavid wrote:

the manual loader is unable to use channel names containing "-": this is actually expected, as you can specify a range (e.g. 1-5) here.

It would be great if the community could agree on using a common file name pattern implementation. For example, Bio-Formats provides a FilePattern class that serves the same purpose and is well-tested in various use cases.

IMHO, it's a pity that you (and others in other places) felt the need to re-implement the same thing, but all implementations differ slightly in how they work. That contributes to some degree of confusion on the user side.
