Saturday, 20 September 2008

datamining

datamining in genetic/genomic research

Finland, Helsinki
DM in LD mapping ... haplotype association etc. PDF

PUBLICATIONS BY HANNU TOIVONEN, UNI HELSINKI

Thursday, 18 September 2008

datamining intro

datamining course materials from the Australian National University (ANU),
- same as links below.
COURSE SLIDES - FROM slideshare.com

by Prof. Lanzi; some slides look identical to the slides provided by ANU

- but generally they look good and are COMPREHENSIBLE.



DATAMINING, UNSUPERVISED RECORD LINKAGE.

Markus Hegland, Australia

CHRISTEN
DATAMINING. CHALLENGES, MODELS, METHODS AND ALGORITHMS
year 2003
intro, mainly from the standpoint of computational science: algorithms
--
program - FEBRL, year 2008
Febrl - A Freely Available Record Linkage System with a Graphical User Interface. Peter Christen. Proceedings of the Australasian Workshop on Health Data and Knowledge Management (HDKM), Wollongong, January 2008.

- DISTANCE - EUCLIDEAN, PYTHAGOREAN ETC. (WOLFRAM MATH)



ASSOCIATION RULES

- Support, confidence

Support is the fraction of transactions in the dataset that contain the items of a rule, while confidence measures the strength of the rule. For a rule A => B, support is the probability of A and B occurring together, P(A and B), and confidence is the conditional probability P(B | A). Association rules are evaluated on these two measures.
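The two measures can be computed directly from a list of transactions. The following is a minimal sketch, not taken from any of the linked course materials; the function names and the toy `baskets` data are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and B) / support(A)."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    # Undefined when the antecedent never occurs in the data.
    return joint / base if base > 0 else float("nan")


# Hypothetical toy data: four shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule: bread => milk
rule_support = support(baskets, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
```

Here the rule bread => milk holds in half of all baskets (support) and in two thirds of the baskets that contain bread (confidence).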


Monday, 15 September 2008

c-statistic

Q: I would like to know whether we can calculate the C-statistic using SPSS 13.

A: If by the "c-statistic" you are referring to the measure of the discriminative power of the logistic equation, you can calculate it by saving the predicted probabilities from the logistic regression analysis and running a ROC curve with the predicted probabilities as the "test variable." The c-statistic is the area under the curve value.


HTH

In R/S-Plus you can just use the lrm function in the Design package or:

mean.rank <- mean(rank(x)[y == 1])
c.index <- (mean.rank - (n1 + 1)/2)/(n - n1)
(where n is the total number of observations and n1 is the number of observations with y=1)
or use the somers2 function in the Hmisc package.

Frank Harrell
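As a cross-check of the rank formula quoted above, here is a minimal Python sketch (not part of the original thread). It assumes `x` holds the predicted probabilities and `y` the 0/1 outcomes. The helper names are made up, and ties get average ranks, which is what R's `rank` does by default.

```python
def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def c_index(x, y):
    """C-statistic from the mean rank of the predictions with y == 1."""
    n = len(y)
    n1 = sum(1 for v in y if v == 1)
    if n1 == 0 or n1 == n:
        raise ValueError("need at least one observation in each class")
    ranks = midranks(x)
    mean_rank = sum(r for r, v in zip(ranks, y) if v == 1) / n1
    return (mean_rank - (n1 + 1) / 2) / (n - n1)


def c_index_pairs(x, y):
    """Same quantity by brute force: share of (y=1, y=0) pairs ranked correctly."""
    pos = [p for p, v in zip(x, y) if v == 1]
    neg = [p for p, v in zip(x, y) if v != 1]
    score = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return score / (len(pos) * len(neg))


# Hypothetical predictions and outcomes.
probs = [0.1, 0.4, 0.35, 0.8]
outcome = [0, 0, 1, 1]
c_rank = c_index(probs, outcome)
c_pair = c_index_pairs(probs, outcome)
```

Both functions should agree: the rank formula is a shortcut for counting the pairs in which the case with y=1 received the higher predicted probability, which is also the area under the ROC curve.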

Tuesday, 9 September 2008

statistics general

Q&A by Seaman /course in psychology/


Newsom, Portland Uni - course handouts, SPSS examples


STATNOTES - WEB LECTURES. Brief, explanatory, with formulas and examples of both manual calculation of basic tests and examples in SPSS.


squared semipartial correlation coefficient = R-squared change; semipartial correlation = part correlation






Dr. Newsom has a nice disclaimer on his page: "DISCLAIMER: I am not always right."






Sunday, 7 September 2008


by Venter


cf. Deinococcus radiodurans

Saturday, 6 September 2008

genetics statistics microarray spss

Genovese


Also see the 1995 article by Benjamini and Hochberg, the authors of the FDR (false discovery rate) method for controlling the multiple comparison problem.
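For orientation, the Benjamini-Hochberg step-up procedure can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the linked articles; the function name and the example p-values are made up.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose sorted p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p-values, even if some of them missed their own threshold.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from eight tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
decisions = benjamini_hochberg(pvals, q=0.05)
```

With these made-up values only the two smallest p-values are rejected, although five of the eight are below 0.05 when taken one at a time.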




SPSS tutorials at Texas

latex /superscripts/


SPSS macro basics at Raynald







Rodenburg

A framework to identify physiological responses in microarray-based gene expression studies: selection and interpretation of biologically relevant genes



A Bayesian Measure of the Probability of False Discovery in Genetic Epidemiology Studies
Jon Wakefield



Comprehensive Analysis of Affymetrix Exon Arrays Using BioConductor
Michał J. Okoniewski, Crispin J. Miller




Wednesday, 3 September 2008

SPSS statistics resources

Flash/audiovisual tutorials with screenshots and narration.

Wide range of topics, including advanced ones like nonlinear, segmented and robust regression, Box-Cox transformation, contour and surface plotting, and much more.

Uni TEXAS


Tuesday, 2 September 2008

enrichment analysis - combining different sources of information




cf. software package GSEA-P