CelFiE will perform several random restarts and select
the one with the highest log-likelihood. Default 10.
```
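
The restart strategy described in the help text above can be sketched as follows. This is a minimal illustration, not CelFiE's actual code: `fit_once`, `best_of_restarts`, and the toy two-component Gaussian mixture are assumptions made for the example, standing in for a single CelFiE EM run. Only the idea of running several random restarts and keeping the highest log-likelihood, with a default of 10, comes from the text.

```python
import math
import random

def log_likelihood(data, means):
    """Log-likelihood under an equal-weight mixture of unit-variance Gaussians."""
    norm = len(means) * math.sqrt(2 * math.pi)
    return sum(
        math.log(sum(math.exp(-0.5 * (x - m) ** 2) for m in means) / norm)
        for x in data
    )

def fit_once(data, rng, n_iter=50):
    """One EM run from a random start (hypothetical stand-in for one CelFiE run)."""
    means = [rng.uniform(min(data), max(data)) for _ in range(2)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            w = [math.exp(-0.5 * (x - m) ** 2) for m in means]
            resp.append([w_k / sum(w) for w_k in w])
        # M-step: each mean becomes the responsibility-weighted average.
        means = [
            sum(r[k] * x for r, x in zip(resp, data)) / sum(r[k] for r in resp)
            for k in range(2)
        ]
    return means, log_likelihood(data, means)

def best_of_restarts(data, n_restarts=10, seed=0):
    """Run EM n_restarts times and keep the run with the highest log-likelihood."""
    rng = random.Random(seed)
    runs = [fit_once(data, rng) for _ in range(n_restarts)]
    return max(runs, key=lambda run: run[1])

# Synthetic data for illustration only: two clusters centered near 0 and 5.
data_rng = random.Random(1)
data = [data_rng.gauss(0, 1) for _ in range(50)] + [data_rng.gauss(5, 1) for _ in range(50)]
best_means, best_ll = best_of_restarts(data)
print(sorted(best_means), best_ll)
```

Because EM only finds a local optimum, more restarts can only match or improve the best log-likelihood found, at the cost of proportionally more run time.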
### Output

CelFiE outputs the tissue estimates for each sample in your input, i.e., the proportion of each reference tissue making up the cfDNA sample. See `celfie_demo/sample_output/1_tissue_proportions.txt` for an example of this output.
```
         tissue1  tissue2  ....  unknown
sample1  0.05     0.08     ....  0.1
sample2  0.7      0.12     ....  0.2
```
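
As a rough sketch of how a table in this layout might be read, the snippet below parses a small in-memory example with the standard library only. The function name `read_proportions`, the whitespace delimiter, and the numbers are assumptions made for illustration; they are not taken from CelFiE or its sample output.

```python
import io

def read_proportions(handle):
    """Parse a whitespace-delimited table: a header row of tissue names,
    then one row per sample starting with the sample label."""
    tissues = handle.readline().split()
    table = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        sample, values = fields[0], [float(v) for v in fields[1:]]
        if len(values) != len(tissues):
            raise ValueError(
                f"{sample}: expected {len(tissues)} values, got {len(values)}"
            )
        table[sample] = dict(zip(tissues, values))
    return table

# Made-up values for illustration only; not real CelFiE output.
example = io.StringIO(
    "tissue1 tissue2 unknown\n"
    "sample1 0.25 0.5 0.25\n"
    "sample2 0.5 0.25 0.25\n"
)
proportions = read_proportions(example)
for sample, row in proportions.items():
    dominant = max(row, key=row.get)  # tissue with the largest estimated share
    print(sample, dominant, sum(row.values()))
```

Since each row holds proportions, checking that a row sums to approximately 1 is a quick sanity test after loading.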
CelFiE also outputs the methylation proportions for each tissue, plus one column for each unknown that was estimated. This output looks like this:
```
      tissue1  tissue2  ...  unknown
CpG1  0.99     1.0      ...  0.3
CpG2  0.45     0.88     ...  0.1
```
Sample code for processing both of these outputs can be found in `demo.ipynb`.

### L1 projection method

We also developed a method to project estimates onto the L1 ball, based on Duchi et al. (2008). The code for this method is available at `EM/projection.py`. It can be run as:
```bash
python projection.py <output_dir> <replicate> <number of tissues> <number of sites> <number of individuals> <input depth> <reference depth> <tissue_proportions.pkl>
```
Sample tissue proportions are included at `EM/simulations/unknown_sim_0201_10people.pkl`.
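
For readers unfamiliar with the projection step, here is a minimal sketch of the sort-based Euclidean projection described by Duchi et al. (2008). It is a generic illustration under stated assumptions, not the contents of `EM/projection.py`: the function names `project_onto_simplex` and `project_onto_l1_ball` and the radius parameter `z` are chosen for this example.

```python
import math

def project_onto_simplex(v, z=1.0):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = z}."""
    if z <= 0:
        raise ValueError("z must be positive")
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumsum += u_j
        candidate = (cumsum - z) / j
        # The condition holds for a prefix of the sorted values; the last
        # index where it holds gives the final threshold theta.
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def project_onto_l1_ball(v, z=1.0):
    """Euclidean projection of v onto {w : sum(|w_i|) <= z}."""
    if sum(abs(x) for x in v) <= z:
        return list(v)  # already inside the ball
    w = project_onto_simplex([abs(x) for x in v], z)
    # Restore the original signs after projecting the magnitudes.
    return [math.copysign(w_i, x) for w_i, x in zip(w, v)]

# Illustrative input only: an estimate whose entries do not sum to 1.
estimate = [0.5, 0.5, 0.5]
print(project_onto_simplex(estimate))
```

Projecting onto the simplex yields non-negative values that sum to `z`, which is the constraint a vector of tissue proportions must satisfy when `z = 1`.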
## Tissue Informative Markers

In our paper, we identified a set of tissue informative markers (TIMs). We claim that these are a good set of CpGs to use for decomposition.
@@ -143,80 +217,6 @@ The pipeline can then be ran as
./tim.sh
```

## Code

### EM Script

After preparing data as above, you can run the EM script as follows:
## Figures

Jupyter notebooks to reproduce the figures and statistical analyses for the final version of this manuscript can be found in the `paper_figures` directory.