Try rare probabilistic masking to reinforce context-dependency #47

Open
source-data opened this issue Sep 1, 2019 · 2 comments

Comments

@source-data
Collaborator

Use the relevant bits from the anonym2 branch. Maybe mask with probability 0.1. Important: use a validation set with no masking. The idea is to monitor the effect of masking during training on generalization to unmasked data, and to compare with the usual setup (no masking in training, evaluated on unmasked data).
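
A minimal sketch of what per-entity probabilistic masking could look like, in case it helps the discussion. This is not the code from the anonym2 branch: the function name `mask_entities`, the `(start, end)` span format, and the `_` mask character are all assumptions made for illustration.

```python
import random


def mask_entities(text, spans, proba=0.1, mask_char="_", rng=None):
    """Mask each tagged span independently with probability `proba`.

    `spans` is a list of (start, end) character offsets (hypothetical format).
    The output has the same length as the input, so labels stay aligned.
    """
    rng = rng or random.Random()
    chars = list(text)
    for start, end in spans:
        # Each entity is masked independently of the others.
        if rng.random() < proba:
            chars[start:end] = mask_char * (end - start)
    return "".join(chars)


text = "EGFR binds EGF in HeLa cells"
spans = [(0, 4), (11, 14), (18, 22)]  # hypothetical entity offsets

# Training examples: rare masking (about 10% of entities).
train_example = mask_entities(text, spans, proba=0.1)

# Validation examples: never masked, as required above.
valid_example = mask_entities(text, spans, proba=0.0)
```

With a low probability, most entities stay visible, so the model still sees the surface form most of the time but is occasionally forced to rely on context alone.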

@tlemberger
Collaborator

proba masking

smtag-convert2th -c 190414 -f 5X_L1200_fig_article_embeddings_128_proba_masking -X5 -L1200 -E ".//fig/caption" -A ".//sd-tag[@type='molecule']",".//sd-tag[@type='gene']",".//sd-tag[@type='protein']",".//sd-tag[@type='subcellular']",".//sd-tag[@type='cell']",".//sd-tag[@type='tissue']",".//sd-tag[@type='organism']",".//sd-tag[@category='assay']"  --noocr --noviz -p rack

Build a composite training set where train/ is probabilistically masked (10%) and valid/ and test/ are unmasked.
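
One way the composite set could be assembled, sketched under the assumption that each converted dataset is a directory with train/, valid/ and test/ subdirectories. The helper name `build_composite` and the way paths are passed in are hypothetical; this is not an smtag command.

```python
import shutil
from pathlib import Path


def build_composite(masked_dir, unmasked_dir, out_dir):
    """Take train/ from the masked dataset and valid/, test/ from the unmasked one."""
    masked_dir, unmasked_dir, out_dir = Path(masked_dir), Path(unmasked_dir), Path(out_dir)
    sources = {
        "train": masked_dir,    # probabilistically masked examples
        "valid": unmasked_dir,  # evaluation stays unmasked
        "test": unmasked_dir,
    }
    for subset, src in sources.items():
        # copytree creates the destination and fails if it already exists,
        # which avoids silently mixing two datasets.
        shutil.copytree(src / subset, out_dir / subset)
    return out_dir


# Usage (paths are placeholders for wherever the converted datasets live):
# build_composite("<masked dataset dir>", "<unmasked dataset dir>",
#                 "composite_train_proba_valid_normal")
```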

Train

smtag-meta -f composite_train_proba_valid_normal -E50 -Z32 -R0.005 -D0.2 -o small_molecule,geneprod,subcellular,cell,tissue,organism,assay -k 7,7,7,7,7,7,7,7,7,7 -n 128,128,128,128,128,128,128,128,128,128 -g 3,3,3,3,3,3,3,3,3,3 -p rack

composite_train_proba_valid_normal_small_molecule_geneprod_subcellular_cell_tissue_organism_assay_2019-09-10-22-32.

compared to

smtag-meta -f 5X_L1200_article_embeddings_128 -E20 -Z32 -R0.005 -D0.2 -o small_molecule,geneprod,subcellular,cell,tissue,organism,assay -k 7,7,7,7,7,7,7,7,7,7 -n 128,128,128,128,128,128,128,128,128,128 -g 3,3,3,3,3,3,3,3,3,3

@tlemberger
Collaborator

no difference :-(

image
