Wednesday, 11 February 2009

Could you help with a Perl project?

I was wondering if anyone might be able to help a bit with a Perl-based project I'm currently working on. I have some peptide fragments from mass spec experiments (including duplicate experiments on the same protein), along with the gene ID. I'm looking to compare the peptides that were found with those that weren't picked up by mass spec, but that should have been created by the trypsin digest and could potentially have shown up in MS.


Given a text file with the structure

>GENEID1

PEPTIDE1, PEPTIDE2, PEPTIDE3

e.g.

>ENSG00000000971
ISEENETTCYMGK,SEENETTCYMGK,EENETTCYMGK,PPQIEHGTINSSR,PCSQPPQIEHGTINSSR,QPPQIEHGTINSSR,PQIEHGTINSSR
>ENSG00000005421
VTQVYAENGTVLQGSTVASVYK
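A minimal sketch of reading this layout, written in Python as runnable pseudocode (it would need porting to Perl, where a hash of arrays would play the same role). The function name `parse_peptide_file` is hypothetical, and the sketch assumes each `>` header line is followed by one or more lines of comma-separated peptides, possibly with spaces after the commas or blank lines in between.

```python
from io import StringIO


def parse_peptide_file(handle):
    """Return {gene_id: [peptide, ...]} from '>GENEID' headers and comma-separated peptide lines."""
    observed = {}
    gene_id = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            gene_id = line[1:].strip()
            observed.setdefault(gene_id, [])
        elif gene_id is not None:
            # Tolerate "PEPTIDE1, PEPTIDE2" as well as "PEPTIDE1,PEPTIDE2"
            peptides = [p.strip() for p in line.split(",") if p.strip()]
            observed[gene_id].extend(peptides)
    return observed


# The two records from the example above, supplied as an in-memory file.
example = StringIO(
    ">ENSG00000000971\n"
    "ISEENETTCYMGK,SEENETTCYMGK,EENETTCYMGK,PPQIEHGTINSSR,"
    "PCSQPPQIEHGTINSSR,QPPQIEHGTINSSR,PQIEHGTINSSR\n"
    ">ENSG00000005421\n"
    "VTQVYAENGTVLQGSTVASVYK\n"
)
observed = parse_peptide_file(example)
print(len(observed["ENSG00000000971"]))  # 7
print(observed["ENSG00000005421"])       # ['VTQVYAENGTVLQGSTVASVYK']
```

Keying on the gene ID keeps duplicate experiments on the same protein together, since peptides from a repeated header are appended to the same list.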

I'm looking to analyse this: to compare the "seen" peptides with the "unseen" peptides, e.g. by their physicochemical properties.
I'm wondering what properties might make certain peptides observable and others not.
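The "unseen" group is not in the file, so it would have to come from an in-silico trypsin digest of each protein. Below is a rough sketch in Python as runnable pseudocode, under several assumptions: the usual simplified trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and a predicted peptide counting as "seen" if any observed peptide is contained in it (the observed list includes ragged fragments such as SEENETTCYMGK). The function names and the toy protein are invented for illustration; the toy sequence is not the real ENSG00000000971 protein, which would have to be fetched separately by gene ID.

```python
def trypsin_digest(protein):
    """Cleave after K or R, except when the next residue is P (no missed cleavages)."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def split_seen_unseen(predicted, observed):
    """Assumption: a predicted peptide is 'seen' if any observed peptide lies within it."""
    seen, unseen = [], []
    for peptide in predicted:
        if any(obs in peptide for obs in observed):
            seen.append(peptide)
        else:
            unseen.append(peptide)
    return seen, unseen


# Toy protein, made up so that it contains two of the observed peptides.
toy_protein = "PCSQPPQIEHGTINSSRISEENETTCYMGKAKPLRWLVEKDF"
observed = ["ISEENETTCYMGK", "SEENETTCYMGK", "EENETTCYMGK", "PPQIEHGTINSSR",
            "PCSQPPQIEHGTINSSR", "QPPQIEHGTINSSR", "PQIEHGTINSSR"]

predicted = trypsin_digest(toy_protein)
seen, unseen = split_seen_unseen(predicted, observed)
print(seen)    # ['PCSQPPQIEHGTINSSR', 'ISEENETTCYMGK']
print(unseen)  # ['AKPLR', 'WLVEK', 'DF']
```

Very short predicted peptides such as DF would probably fall outside the instrument's mass range anyway, so a minimum-length filter on the unseen group may be worth adding before comparing properties.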

I'm looking to create a Perl script to determine the properties of these two groups, the observed peptides and the unseen peptides, and then compare the properties of the two groups (isoelectric point, length, MW, amino acid composition etc.).
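As a starting point for the comparison, here is a small sketch in Python as runnable pseudocode covering length, molecular weight and amino acid composition. It assumes standard average residue masses plus one water per peptide; isoelectric point is left out because it depends on which pKa set is chosen and is better taken from an existing tool. The function names and the "unseen" peptides are hypothetical placeholders.

```python
from collections import Counter
from statistics import mean

# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def peptide_properties(peptide):
    """Length, average molecular weight and residue counts for one peptide."""
    composition = Counter(peptide)
    mw = sum(AVERAGE_RESIDUE_MASS[aa] * n for aa, n in composition.items()) + WATER
    return {"length": len(peptide), "mw": mw, "composition": composition}


def summarise(group):
    """Mean length, mean MW and pooled residue fractions for a group of peptides."""
    props = [peptide_properties(p) for p in group]
    pooled = Counter()
    for p in props:
        pooled.update(p["composition"])
    total = sum(pooled.values())
    return {
        "n": len(props),
        "mean_length": mean(p["length"] for p in props),
        "mean_mw": mean(p["mw"] for p in props),
        "fraction": {aa: count / total for aa, count in pooled.items()},
    }


seen = ["ISEENETTCYMGK", "PCSQPPQIEHGTINSSR"]  # taken from the example file
unseen = ["AKPLR", "WLVEK"]                    # hypothetical placeholders

for name, group in (("seen", seen), ("unseen", unseen)):
    s = summarise(group)
    print(f"{name}: n={s['n']} mean_length={s['mean_length']} mean_mw={s['mean_mw']:.2f}")
```

The per-peptide dictionaries could equally be written out as a tab-separated table for R, with one row per peptide and a seen/unseen column.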

Any pointers? Or an idea through some pseudocode? I can update the post and include the code so far.

I can use BioPerl and EMBOSS tools such as pepstats and emowse, I'd imagine. Or I could collate the physicochemical properties in an array and then export to R, but I'm happy to just get figures and then do the visual analysis and some data chewing through Perl's graph tools.

1 comment:

Anonymous said...

Ok, so if I understand right, you first want to divide the peptides into 'seen' and 'unseen' groups, and then compare them.

Are you going to compare on a peptide vs peptide basis? i.e.
ISEENETTCYMGK vs SEENETTCYMGK?

Could you give an example of a comparison of these two, like you would do on paper? Or am I getting it all wrong?