TAMO.HT
HT.py -- Fast Interface to Tabular High-Throughput data (e.g. microarray)
CORE OBJECTS:
class Dataset
class metaDataset
Example:
Say you have a comma-separated file summarizing p-values for a large number of
high-throughput experiments in the form:
refseq_id, HNF4a_HepG2, HNF4a_Hepcyt, HNF1a_HepG2, HNF1a_Hepcyt, ....
NM_000345, 0.0001, 0.01, 0.343, 0.23,
NM_000347, 0.01, 0.443, 0.13, 0.5,
NM_000456, 0.21, 0.04, 1.0, 0.004,
.
.
.
Such files could represent enrichment ratios or p-values from expression data, ChIP-chip data, or
other high-throughput data.
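As a rough illustration of the layout described above, the following sketch reads such a table with the standard csv module into a nested dictionary. It is not how HT.py itself loads files; the function name parse_table and the dictionary layout are assumptions made for this example only.

```python
import csv
import io

# The three example rows from the table above (trailing columns omitted).
SAMPLE = """refseq_id, HNF4a_HepG2, HNF4a_Hepcyt, HNF1a_HepG2, HNF1a_Hepcyt
NM_000345, 0.0001, 0.01, 0.343, 0.23
NM_000347, 0.01, 0.443, 0.13, 0.5
NM_000456, 0.21, 0.04, 1.0, 0.004
"""

def parse_table(handle):
    """Return {experiment: {row_id: value}} (hypothetical helper, not part of TAMO)."""
    reader = csv.reader(handle, skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    experiments = header[1:]          # first column holds the row identifiers
    table = {name: {} for name in experiments}
    for row in reader:
        cells = [cell.strip() for cell in row]
        if not cells or not cells[0]:
            continue                  # skip blank lines
        row_id = cells[0]
        for name, cell in zip(experiments, cells[1:]):
            if cell:                  # ignore empty cells left by trailing commas
                table[name][row_id] = float(cell)
    return table

table = parse_table(io.StringIO(SAMPLE))
print(table["HNF4a_HepG2"]["NM_000345"])
```

Keying the outer dictionary by experiment name keeps each column together, which suits per-experiment threshold queries.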
Instantiate a Dataset object:
>>> from TAMO import HT
>>> DATA = HT.Dataset('human_chip_data.csv')
The first time a file is loaded, a cached '.dataset' file is generated for faster access later. You
must therefore have write permission in the directory of the original .csv file if it is being
instantiated into a Dataset object for the first time.
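The caching behaviour can be pictured with the general "parse once, then reuse" pattern sketched below. The on-disk format and exact file naming used by HT.py are not documented here, so the use of pickle, the csv_path + '.dataset' name, and the helper load_with_cache are all assumptions for illustration.

```python
import os
import pickle
import tempfile

def load_with_cache(csv_path, parse):
    """Parse csv_path once, then reuse a cache file next to it (hypothetical helper)."""
    cache_path = csv_path + ".dataset"        # assumed naming, for illustration only
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    with open(csv_path) as fh:
        data = parse(fh)
    # Writing the cache is the step that needs write permission in the directory.
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh)
    return data

calls = []

def parse(fh):
    """Toy parser that records how often it is invoked."""
    calls.append(1)
    return [line.strip().split(",") for line in fh if line.strip()]

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "example.csv")
    with open(path, "w") as fh:
        fh.write("refseq_id,exp1\nNM_000345,0.0001\n")
    first = load_with_cache(path, parse)      # parses the file and writes the cache
    second = load_with_cache(path, parse)     # served from the cache file
    cache_written = os.path.exists(path + ".dataset")

print(len(calls), cache_written)
```

In a read-only directory the second open(..., "wb") would raise an error, which mirrors the write-permission requirement stated above.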
Now you can ask questions:
>>> print DATA.bound('HNF4a_HepG2',threshold=0.001) #Produces ['NM_000345']
>>> print DATA.bound('HNF4a_HepG2',0.01) #Produces ['NM_000345', 'NM_000347']
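The two results above are consistent with an inclusive comparison (value <= threshold) against the first data column of the example table. The sketch below reproduces that logic on a plain dictionary; it is a stand-in written for this example, not the Dataset implementation, and the sorted return order is an assumption.

```python
# First data column of the example table, keyed by row identifier.
PVALUES = {
    "HNF4a_HepG2": {"NM_000345": 0.0001, "NM_000347": 0.01, "NM_000456": 0.21},
}

def bound(table, experiment, threshold):
    """Ids whose value is at or below threshold (hypothetical stand-in)."""
    column = table[experiment]
    return sorted(row_id for row_id, p in column.items() if p <= threshold)

strict = bound(PVALUES, "HNF4a_HepG2", 0.001)
loose = bound(PVALUES, "HNF4a_HepG2", 0.01)
print(strict)   # ['NM_000345']
print(loose)    # ['NM_000345', 'NM_000347']
```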
If the input file contains Expression data, the Dataset object can be queried for
overexpressed or underexpressed genes in terms of the ratios represented in the dataset:
>>> genes = DATA.ratioabove('yeast_heat',2.0) #With correct dataset, produces upregulated gene list
>>> genes = DATA.ratiobelow('yeast_heat',0.2) #With correct dataset, produces downregulated gene list
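For ratio data the same idea applies in both directions. The sketch below uses made-up gene names and ratios purely for illustration, and it assumes inclusive comparisons; whether HT.py treats the cutoff as inclusive is not stated here.

```python
# Invented expression ratios for a hypothetical 'yeast_heat' experiment.
RATIOS = {"yeast_heat": {"gene_a": 3.1, "gene_b": 1.0, "gene_c": 0.15}}

def ratioabove(table, experiment, cutoff):
    """Ids with ratio at or above cutoff (assumed inclusive; stand-in only)."""
    return sorted(g for g, r in table[experiment].items() if r >= cutoff)

def ratiobelow(table, experiment, cutoff):
    """Ids with ratio at or below cutoff (assumed inclusive; stand-in only)."""
    return sorted(g for g, r in table[experiment].items() if r <= cutoff)

up = ratioabove(RATIOS, "yeast_heat", 2.0)
down = ratiobelow(RATIOS, "yeast_heat", 0.2)
print(up, down)
```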
In conjunction with the ProbeSet object (in the MotifMetrics module), these genes may be
directly associated with sequences.
A 'metaDataset' provides a way to consider a collection of '.csv' files as a single dataset.
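One way to picture treating several files as a single dataset is to merge their per-experiment columns into one mapping, as sketched below. How metaDataset actually combines files (for example, how it handles duplicate experiment names) is not described here, so merge_tables and its duplicate check are assumptions.

```python
def merge_tables(tables):
    """Combine {experiment: {id: value}} mappings into one (hypothetical helper)."""
    merged = {}
    for table in tables:
        for experiment, column in table.items():
            if experiment in merged:
                # Assumed policy: refuse ambiguous experiment names.
                raise ValueError("duplicate experiment name: %s" % experiment)
            merged[experiment] = dict(column)
    return merged

file_one = {"HNF4a_HepG2": {"NM_000345": 0.0001, "NM_000347": 0.01}}
file_two = {"HNF1a_HepG2": {"NM_000345": 0.343, "NM_000347": 0.13}}

combined = merge_tables([file_one, file_two])
print(sorted(combined))
```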
Other member functions include:
boundq(experiment,id,threshold) # True or false: Bound (or ratiobelow) for this id/experiment condition?
boundby(id,threshold) # List of experiments in which 'id' is bound (or ratiobelow).
value(experiment,id) # Query values
values(experiment, idlist) # Query many values
scores(experiment) # Query all values, as (value, id) tuples
boundre(regexp,threshold) # Logical 'and' over all experiments whose names match regexp (bound or ratiobelow at threshold)
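The query functions listed above can be read as simple operations over a table of values. The sketch below mimics them on the example table using plain dictionaries. These are stand-ins written for illustration, not the Dataset methods: the inclusive comparison, the sorted return order, and the reading of boundre as "ids bound in every experiment whose name matches the pattern" are assumptions.

```python
import re

# The example table, keyed by experiment and then by row identifier.
PVALUES = {
    "HNF4a_HepG2":  {"NM_000345": 0.0001, "NM_000347": 0.01,  "NM_000456": 0.21},
    "HNF4a_Hepcyt": {"NM_000345": 0.01,   "NM_000347": 0.443, "NM_000456": 0.04},
    "HNF1a_HepG2":  {"NM_000345": 0.343,  "NM_000347": 0.13,  "NM_000456": 1.0},
    "HNF1a_Hepcyt": {"NM_000345": 0.23,   "NM_000347": 0.5,   "NM_000456": 0.004},
}

def boundq(table, experiment, row_id, threshold):
    """True if row_id is at or below threshold in experiment."""
    return table[experiment][row_id] <= threshold

def boundby(table, row_id, threshold):
    """Experiments in which row_id is at or below threshold."""
    return sorted(name for name, col in table.items()
                  if row_id in col and col[row_id] <= threshold)

def scores(table, experiment):
    """All values of one experiment as (value, id) tuples, sorted by value."""
    return sorted((value, row_id) for row_id, value in table[experiment].items())

def boundre(table, regexp, threshold):
    """Ids bound in every experiment whose name matches regexp."""
    pattern = re.compile(regexp)
    columns = [col for name, col in table.items() if pattern.search(name)]
    if not columns:
        return []
    ids = set(columns[0])
    for col in columns:
        ids &= {row_id for row_id, value in col.items() if value <= threshold}
    return sorted(ids)

print(boundq(PVALUES, "HNF4a_HepG2", "NM_000345", 0.001))
print(boundby(PVALUES, "NM_000345", 0.01))
print(scores(PVALUES, "HNF1a_Hepcyt"))
print(boundre(PVALUES, "HNF4a", 0.05))
```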
In the metaDataset object, there are the member functions:
highest_n(experiment,N,threshold) # Return a list of N ids with values above threshold
lowest_n(experiment,N,threshold) # Return a list of N ids with values below threshold
scores(experiment) # Same as for Dataset object
values(experiment,idlist) # Same as for Dataset object
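A plausible reading of highest_n and lowest_n is "filter by threshold, rank by value, keep the first N". The sketch below implements that reading on one column of the example table; the strict comparisons and the ranking order are assumptions, and the functions are stand-ins rather than the metaDataset methods.

```python
# Third data column of the example table.
VALUES = {
    "HNF1a_HepG2": {"NM_000345": 0.343, "NM_000347": 0.13, "NM_000456": 1.0},
}

def highest_n(table, experiment, n, threshold):
    """Up to n ids with values above threshold, largest value first."""
    ranked = sorted(((v, i) for i, v in table[experiment].items() if v > threshold),
                    reverse=True)
    return [row_id for _, row_id in ranked[:n]]

def lowest_n(table, experiment, n, threshold):
    """Up to n ids with values below threshold, smallest value first."""
    ranked = sorted((v, i) for i, v in table[experiment].items() if v < threshold)
    return [row_id for _, row_id in ranked[:n]]

top = highest_n(VALUES, "HNF1a_HepG2", 2, 0.2)
bottom = lowest_n(VALUES, "HNF1a_HepG2", 2, 0.5)
print(top)      # ['NM_000456', 'NM_000345']
print(bottom)   # ['NM_000347', 'NM_000345']
```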
Copyright (2005) Whitehead Institute for Biomedical Research
All Rights Reserved
Author: David Benjamin Gordon