rationale
names will be exposed in documentation for editors, curators, and general end-users who wish to understand the general patterns of the ontology, so it is important to use consistent, clear, non-ontologist language
these will also be exposed as headers in TSVs
many of the conventions that apply to programming languages and data models and schemas apply here
it is also useful to think of patterns as metaclasses, whose instances are owl classes. The pattern instances may also correspond to what scientists think of as 'entities'.
principles
always use meaningful names
- don't use var names like 'v'. Spell it out.
- use terms a domain scientist would understand
- use spaces, not underscores
- never use camel case
exception: filename / IRI should use underscores not spaces. however, for human-readable labels change the underscores to spaces
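The naming rules above could be checked mechanically. The following is a minimal sketch, not part of any existing tooling: the function names and the three-character minimum are assumptions chosen for illustration.

```python
import re


def name_problems(name: str) -> list[str]:
    """Return convention violations for a human-readable var or pattern label."""
    problems = []
    # Assumption: anything under 3 characters (e.g. 'v') is not meaningful.
    if len(name.strip()) < 3:
        problems.append("too short to be meaningful")
    if "_" in name:
        problems.append("contains underscore; use spaces in labels")
    # A lowercase letter directly followed by an uppercase one suggests camel case.
    if re.search(r"[a-z][A-Z]", name):
        problems.append("looks like camel case")
    return problems


def label_to_filename(label: str) -> str:
    """Filenames / IRIs use underscores instead of spaces."""
    return label.replace(" ", "_")


def filename_to_label(stem: str) -> str:
    """Human-readable labels use spaces instead of underscores."""
    return stem.replace("_", " ")
```

For example, `name_problems("parent disease")` returns an empty list, while `label_to_filename("leiomyosarcoma by location")` gives `leiomyosarcoma_by_location`.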
vars should be named by relationship or role of range, not range itself
e.g. if a disease pattern has a variable to specify the location and the range is anatomical structure, call it 'location', not 'anatomical structure'
rationale: later on you may want a sub-pattern where you have a separate var with the same range
genus vars should be named non-generically
E.g. disease by location, with 2 vars, one for the genus, the other for the location. Do not call the genus var 'disease'. Call it something like
- parent disease
- disease group
- disease class
- ...
also consider something like "morphological type" if that describes the genus relation
rationale: just 'disease' is too broad. See previous principle. later we may add a sub-pattern that references another disease
name the pattern by the identity criteria
the set of vars that are used in the equivalence axiom constitute the compound key. these are the identity criteria
e.g. a pattern for subtyping leiomyosarcomas by location. do not call this 'leiomyosarcoma'. call it 'leiomyosarcoma by location'
rationale: we may later add leiomyosarcomas subtyped by gene. we can't have two called 'leiomyosarcoma'
in general a good pattern is to name the pattern by the sequence of elements in the equivalence axiom, where the elements are the named classes (the things in single quotes) and var names.
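As a rough illustration of reading the elements off an equivalence axiom, here is a sketch that lists the single-quoted names and var names in the order they appear. It assumes the axiom is written as a template string with `%s` placeholders for vars; the function name and the example axiom text are hypothetical.

```python
import re


def axiom_elements(text: str, var_names: list[str]) -> list[str]:
    """List quoted names and var names in the order they occur in the axiom text."""
    remaining = list(var_names)
    elements = []
    # Match either a 'single quoted name' or a %s placeholder for a var.
    for match in re.finditer(r"'([^']+)'|%s", text):
        if match.group(1) is not None:
            elements.append(match.group(1))
        else:
            # Placeholders are filled by the vars in their declared order.
            elements.append(remaining.pop(0))
    return elements


# Hypothetical axiom text for the leiomyosarcoma example.
elements = axiom_elements(
    "'leiomyosarcoma' and ('disease has location' some %s)",
    ["location"],
)
```

Here `elements` is `['leiomyosarcoma', 'disease has location', 'location']`. Relation names are quoted too, so the list is only a starting point: an editor would still condense it to something like 'leiomyosarcoma by location'.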
Some recommended changes for mondo patterns
- adult => adult form of disease
use consistent vocabulary
e.g. "adult form of disease" is OK as a name. "adult variant of disease" is not good if we use variant to mean a non-subclass variant
long names are not necessarily bad
we don't pay for characters, don't worry too much about length, within reason
use the term specific as appropriate (TBD)
consider a pattern name 'cancer by location'. This is ambiguous. Do we mean:
- pattern with 2 vars: (1) cancer morphological type [genus] (2) location
- pattern with 1 var: location [the genus is fixed at 'disease']
consider prefixing with "specific"; e.g. the first would be called "specific cancer by location"; alternatively "cancer subtype by location"
Perhaps we should even call the 2nd "cancer (general) by location" (TBD; this is awkward)
avoid X in name
always use a meaningful name
use the same filename as pattern name
description should describe the pattern instances not the class instances
E.g.
mondo leiomyosarcoma
https://mondo.readthedocs.io/en/latest/editors-guide/patterns/leiomyosarcoma/
An uncommon, aggressive malignant smooth muscle neoplasm, usually occurring in post-menopausal women that is characterized by a proliferation of neoplastic spindle cells that is located in a specific anatomical location.
This is not a good pattern description: it describes leiomyosarcomas, not leiomyosarcoma classes
Instead:
This pattern is for classes representing leiomyosarcomas differentiated by where they are found in the body. Leiomyosarcomas are uncommon, aggressive malignant smooth muscle neoplasms.
include motivation
E.g. leiomyosarcomas can occur in different sites in the body so we include this pattern to...
include examples
As well as auto-examples, include manually selected examples that highlight key aspects
TODO: we should have a specific field for listing this. These should then be used as unit tests
include minimal metadata
- status
- contributors
- authors
- links to tickets
- date of creation
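A check for the minimal metadata could look like the sketch below. It assumes a pattern has been loaded into a plain dict; the key names (`status`, `contributors`, `authors`, `tickets`, `created`) are hypothetical and not an agreed schema.

```python
# Hypothetical key names for the minimal metadata listed above.
REQUIRED_METADATA = ("status", "contributors", "authors", "tickets", "created")


def missing_metadata(pattern: dict) -> list[str]:
    """Return the required metadata keys that are absent or empty."""
    return [key for key in REQUIRED_METADATA if not pattern.get(key)]


# Made-up pattern record with two fields left out.
example_pattern = {
    "pattern_name": "leiomyosarcoma by location",
    "status": "draft",
    "contributors": ["https://orcid.org/0000-0000-0000-0000"],
    "authors": [],
}
missing = missing_metadata(example_pattern)
```

With this record, `missing` is `['authors', 'tickets', 'created']`, since an empty list counts as missing.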
document rules
some patterns may be associated with rules: sparql, regexes, python, ... document these
be specific with range constraints
avoid owl:Thing
consider unions rather than going up the hierarchy if a specific class doesn't exist
challenges: for upper level terms we want to use cob but it is not yet ready
be careful not to make range constraints too specific and accidentally prevent some classes from being matched. This is why examples / unit tests (see above) are vital
patterns should be disjoint
this is more of an aspiration at the moment
consider 2 patterns
- cancer subtype by location; 2 vars: cancer_subtype, location
- cancer (grouping) by location, 1 var: location
any class that conforms to the 2nd will also conform to the first. Ideally we could extend dosdp to be able to say: the range of this class is a proper subclass of cancer
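The overlap can be seen in a toy sketch. This is not dosdp functionality: the hierarchy, the function names and the `proper` flag are all assumptions made to illustrate what a "proper subclass" range constraint would change.

```python
# Toy hierarchy, child -> parent (illustrative only).
PARENT = {
    "leiomyosarcoma": "sarcoma",
    "sarcoma": "cancer",
    "cancer": "disease",
}


def ancestors(cls: str) -> list[str]:
    """Walk up the toy hierarchy and collect all superclasses."""
    found = []
    while cls in PARENT:
        cls = PARENT[cls]
        found.append(cls)
    return found


def in_range(cls: str, range_cls: str, proper: bool = False) -> bool:
    """True if cls satisfies the range; with proper=True the range class itself is excluded."""
    if cls == range_cls:
        return not proper
    return range_cls in ancestors(cls)


# With an ordinary (reflexive) range, the genus 'cancer' fills cancer_subtype,
# so a 'cancer (grouping) by location' class also matches the 2-var pattern.
overlaps = in_range("cancer", "cancer")
# With a proper-subclass range, 'cancer' itself is rejected and the patterns are disjoint.
disjoint = not in_range("cancer", "cancer", proper=True)
```

Real subtypes such as 'leiomyosarcoma' still satisfy the range when `proper=True`, so only the grouping case is excluded from the first pattern.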