More informative failure messages #89

@mortonjt

Below is an example of one of these error messages:

Traceback (most recent call last):
  File "/home/centos/birdman/run_single_feature.py", line 110, in <module>
    model.fit_model(
  File "/home/centos/miniconda3/envs/birdman/lib/python3.11/site-packages/birdman/model_base.py", line 173, in fit_model
    self.fit = self.sm.sample(
               ^^^^^^^^^^^^^^^
  File "/home/centos/miniconda3/envs/birdman/lib/python3.11/site-packages/cmdstanpy/model.py", line 1201, in sample
    raise RuntimeError(msg)
RuntimeError: Error during sampling:
Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=x_TP; position=0; dims declared=(131); dims found=(149) (in '/home/centos/birdman/model.stan', line 10, column 2 to column 38)
Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=x_TP; position=0; dims declared=(131); dims found=(149) (in '/home/centos/birdman/model.stan', line 10, column 2 to column 38)
Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=x_TP; position=0; dims declared=(131); dims found=(149) (in '/home/centos/birdman/model.stan', line 10, column 2 to column 38)
Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=x_TP; position=0; dims declared=(131); dims found=(149) (in '/home/centos/birdman/model.stan', line 10, column 2 to column 38)
Command and output files:
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
 cmd (chain 1):
	['/home/centos/birdman/model', 'id=1', 'random', 'seed=0', 'data', 'file=/tmp/tmp199y9rwi/z_66ve5j.json', 'output', 'file=/tmp/tmp199y9rwi/modeln3c79wls/model-20230925191211_1.csv', 'method=sample', 'num_samples=500', 'num_warmup=500', 'algorithm=hmc', 'adapt', 'engaged=1']
 retcodes=[1, 1, 1, 1]
 per-chain output files (showing chain 1 only):
 csv_file:
	/tmp/tmp199y9rwi/modeln3c79wls/model-20230925191211_1.csv
 console_msgs (if any):
	/tmp/tmp199y9rwi/modeln3c79wls/model-20230925191211_0-stdout.txt

The error message reports a dimension mismatch: x_TP is declared with 131 entries, but 149 were found in the data. On closer investigation, the mismatch is between the sample IDs in the biom table and those in the metadata.

Given that this is a common use case, we could probably include a validation step in the ABC of FeatureModel that checks that the biom table and the metadata are properly synced, and gives a more informative error message if they aren't. Something like the following may suffice:

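    # Sample IDs present in both the metadata and the biom table
    common_ids = sorted(set(metadata.index) & set(table.ids()))
    # Fail early, before any subsetting, if nothing overlaps
    if len(common_ids) == 0:
        raise ValueError('Biom Table sample ids and sample metadata ids are not overlapping')
    # Restrict both objects to the shared samples, in the same order
    metadata = metadata.loc[common_ids]
    table.filter(metadata.index, inplace=True)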
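
The snippet above raises only when no sample IDs overlap at all, whereas the traceback shows a partial mismatch (131 versus 149). As a rough sketch of what a stricter check might look like, the helper below also reports how many IDs appear on only one side. The name check_sample_ids and the max_shown parameter are hypothetical and not part of BIRDMAn. It works on plain iterables of IDs so that it runs without biom or pandas installed.

```python
from typing import Iterable, List


def check_sample_ids(
    table_ids: Iterable[str],
    metadata_ids: Iterable[str],
    max_shown: int = 5,
) -> List[str]:
    """Return the sorted shared sample IDs, or raise a descriptive ValueError.

    Hypothetical helper for illustration only; not an existing BIRDMAn function.
    """
    table_set, metadata_set = set(table_ids), set(metadata_ids)
    shared = sorted(table_set & metadata_set)
    only_table = sorted(table_set - metadata_set)
    only_metadata = sorted(metadata_set - table_set)

    # Case 1: the two ID sets have nothing in common.
    if not shared:
        raise ValueError(
            "Biom table sample IDs and sample metadata IDs do not overlap "
            f"({len(table_set)} table IDs, {len(metadata_set)} metadata IDs)."
        )
    # Case 2: partial overlap; say how many IDs are unmatched and show a few.
    if only_table or only_metadata:
        raise ValueError(
            "Biom table and sample metadata are out of sync: "
            f"{len(only_table)} IDs only in the table "
            f"(e.g. {only_table[:max_shown]}), "
            f"{len(only_metadata)} IDs only in the metadata "
            f"(e.g. {only_metadata[:max_shown]}), "
            f"{len(shared)} shared."
        )
    return shared
```

With real objects this would presumably be called as check_sample_ids(table.ids(), metadata.index) before fitting. Whether a partial mismatch should raise, warn, or be silently subset as in the snippet above is a design choice left open here.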
