FitMSD #1
Comments
Hey @nikos-95, I will investigate. However, I want to mention that performing MSD analysis on tracks as short as 8 points is strongly discouraged. You probably need 50 to 100 points. The reference by Michalet in the tutorial gives some hints about how many time points you need in your tracks.
Also, could you share an excerpt of the data that generates negative R2 values?
Hi, about the track length: I'll keep it in mind and check the accuracy with simulations. I don't have many alternatives, though, so I will see whether the MSDs are fitted well enough and average out. Thanks anyway, I will read the paper.
(http://web.maths.unsw.edu.au/~adelle/Garvan/Assays/GoodnessOfFit.html) I do not really understand why this would happen, though, since we have a simple linear fit, so I will probably resort to reading out the normal r2. I hope that both r2 values give results that are ordered in the same way, so that only the threshold needs a slight adjustment. I also made another interesting observation in this example and hope that it is reproducible: when I filter the resulting diffusion coefficients by goodness of fit, for example r2fit > 0.6, I get a mean diffusion coefficient of 0.18 +- 0.138 (N=156 of 200). Really interesting to me and I appreciate your time!
After some testing I want to add the observation that the adjrsquare and rsquare values are not ordered in exactly the same way. As for the interpretation, I sadly have to pass.
Hi @nikos-95 I suggest in the issue to fix the |
Ok, so the negative r2 was just a misunderstanding from my side, no worries there. I now understand that the adjusted r2 diminishes for smaller numbers of fit points, which makes sense. Specifically for two fit points it even results in NaN, as such a low count is not reliable.
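The behaviour described above can be reproduced with a short sketch. It is written in plain Python rather than the project's MATLAB, and it assumes the usual definition of the adjusted value for a straight-line fit, adj = 1 - (1 - r2)(n - 1)/(n - 2), where n is the number of fitted points. The function name `line_fit_r2` is hypothetical and not part of fitMSD. With two points the fit is exact and no degrees of freedom remain, so the adjusted value is undefined; with three noisy points it can drop below zero even though the ordinary r2 cannot for a least-squares line with intercept.

```python
import math

def line_fit_r2(t, y):
    """Unweighted least-squares line y = a*t + b.

    Returns (a, b, r2, adj_r2). Assumes at least two distinct t values
    and a y that is not constant (otherwise a division by zero occurs).
    """
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    a = s_ty / s_tt
    b = y_mean - a * t_mean
    sse = sum((yi - (a * ti + b)) ** 2 for ti, yi in zip(t, y))  # residual sum of squares
    sst = sum((yi - y_mean) ** 2 for yi in y)                    # total sum of squares
    r2 = 1.0 - sse / sst
    dof = n - 2  # residual degrees of freedom for a two-parameter line
    # With n == 2 the formula reads 0/0, so it is reported as NaN here.
    adj_r2 = 1.0 - (sse / sst) * (n - 1) / dof if dof > 0 else math.nan
    return a, b, r2, adj_r2

# Three scattered points: ordinary r2 is small but positive, adjusted r2 is negative.
print(line_fit_r2([1, 2, 3], [1.0, 3.0, 2.0]))
# Two points: perfect fit, adjusted r2 undefined.
print(line_fit_r2([1, 2], [1.0, 3.0]))
```

Because the penalty factor (n - 1)/(n - 2) depends on the number of fitted points, curves fitted with different point counts need not rank the same under the two measures.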
Sorry, I don't quite get it. Do you mean that I should alter the code, or my data? Anyways, here is the rearranged code that I am now using (excerpt of fitMSD): `for i_spot = 1 : n_spots
end` In my experience it should work safely and correctly; feel free to adopt this change into the project.
Hello,
first of all great program!
When fitting trajectories of shorter length (both simulated and actual measurements), however, I noticed a lot of negative diffusion coefficients and r² values (which both afaik shouldn't be possible) and dug into the code of fitMSD.
If I interpret it correctly, the program determines how many time points correspond to 25% of the longest trajectory in the pool, and then applies that count to all other curves for their linear fit. This seems counterintuitive, as it includes very uncertain points from the shorter curves. The output even says "fitting the first 25% of EACH curve", but the program doesn't take into account all the NaN points in the MSD data.
This also leads to the following problem: when I have 100 tracks of length 8 each, the program will not fit any of them, because they are all shortened to 1 point prior to fitting. If I add just one track of, e.g., length 16, then suddenly all tracks are properly fitted from point 2 to 16*0.25=4. See the issue?
Edit: And yes, I realize that 25% of 8 points is not 3, but this is beside the point here.
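To make the difference concrete, here is a minimal sketch in plain Python (not the MATLAB code of fitMSD). It assumes that every MSD curve is stored on a common delay axis, that delays a short track never reaches are NaN, and that the first entry (zero delay) is always skipped. The helper names and the round-half-up rule are assumptions made for illustration.

```python
import math

def n_valid(msd):
    """Number of MSD values that are not NaN."""
    return sum(1 for v in msd if not math.isnan(v))

def round_half_up(x):
    """Round to the nearest integer, with halves going up (for non-negative x)."""
    return math.floor(x + 0.5)

def global_window(msd_curves, clip_factor):
    """One index range for every curve, derived from the longest curve in the pool."""
    longest = max(n_valid(m) for m in msd_curves)
    last = round_half_up(longest * clip_factor)
    return list(range(1, last))  # 0-based; index 0 is the zero delay

def per_curve_window(msd, clip_factor):
    """Index range derived from the valid length of this curve only."""
    last = round_half_up(n_valid(msd) * clip_factor)
    return list(range(1, last))

nan = float("nan")
short = [0.1 * i for i in range(8)] + [nan] * 8   # 8 valid points, padded to 16
long_ = [0.1 * i for i in range(16)]              # 16 valid points

print(global_window([short] * 100, 0.25))            # pool of short tracks only
print(global_window([short] * 100 + [long_], 0.25))  # one long track changes every window
print(per_curve_window(short, 0.25), per_curve_window(long_, 0.25))
```

With a per-curve window the result for a given track no longer depends on what else is in the pool. A caller would still have to skip curves whose window holds fewer than two points, or enforce a minimum point count, which is a separate design choice.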
Next, there is also another problem in regard to averaging the fits for a global diffusion coefficient. When I include ALL tracks, the estimate is about right (with a simulation of short tracks). However, when I include only the ones with r²>0.6 or so, a lot of the lower diffusion coefficients are discarded due to bad fit, causing an inflated estimate of total D. Maybe this will be solved with the problem above, when the fit can properly choose points and a correct r² is calculated for selection.
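One way to check this effect is a small simulation, sketched here in plain Python. Everything in it is an assumption chosen for illustration: 2D Brownian motion without localization noise (so MSD = 4 D t), 8-point tracks, an unweighted line fit over delays 1 to 3, a hypothetical true D of 0.1, and the r² > 0.6 threshold mentioned above. It compares the mean fitted D over all tracks with the mean over the tracks that pass the threshold.

```python
import random

def simulate_track(n_points, d_coeff, dt, rng):
    """2D Brownian track: each step is Gaussian with variance 2*D*dt per axis."""
    sigma = (2.0 * d_coeff * dt) ** 0.5
    x, y = [0.0], [0.0]
    for _ in range(n_points - 1):
        x.append(x[-1] + rng.gauss(0.0, sigma))
        y.append(y[-1] + rng.gauss(0.0, sigma))
    return x, y

def msd(x, y, lag):
    """Time-averaged squared displacement at one lag."""
    d2 = [(x[i + lag] - x[i]) ** 2 + (y[i + lag] - y[i]) ** 2
          for i in range(len(x) - lag)]
    return sum(d2) / len(d2)

def fit_d_and_r2(x, y, lags, dt):
    """Line fit of MSD against delay; D = slope / 4 in two dimensions."""
    t = [lag * dt for lag in lags]
    m = [msd(x, y, lag) for lag in lags]
    t_mean, m_mean = sum(t) / len(t), sum(m) / len(m)
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (mi - m_mean) for ti, mi in zip(t, m)) / s_tt
    icpt = m_mean - slope * t_mean
    sse = sum((mi - (slope * ti + icpt)) ** 2 for ti, mi in zip(t, m))
    sst = sum((mi - m_mean) ** 2 for mi in m)
    return slope / 4.0, 1.0 - sse / sst

def compare_means(n_tracks=2000, n_points=8, d_coeff=0.1, dt=1.0,
                  r2_min=0.6, seed=0):
    """Return (mean D over all tracks, mean D over tracks with r2 > r2_min, tracks kept)."""
    rng = random.Random(seed)
    fits = [fit_d_and_r2(*simulate_track(n_points, d_coeff, dt, rng), [1, 2, 3], dt)
            for _ in range(n_tracks)]
    all_d = [d for d, _ in fits]
    kept_d = [d for d, r2 in fits if r2 > r2_min]
    return sum(all_d) / len(all_d), sum(kept_d) / len(kept_d), len(kept_d)

mean_all, mean_kept, n_kept = compare_means()
print(mean_all, mean_kept, n_kept)
```

All simulated tracks share the same true D, so any gap between the two means comes from the selection itself: a nearly flat MSD curve yields both a small fitted slope and a low r², and those tracks are the ones the threshold removes.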
I think I came up with a coded solution for the point selection of each curve (not too hard) and can add it per request if you see the problem in the same way.
However, I have no idea yet on how to deal with the negative D's or averaging in regards to low r2 so that the global mean is accurate, and what math says about the negative r2.
Thank you for reading.